The big change from GPT-3.5 is that OpenAI's fourth-generation language model is multimodal, which means it can process text, images and audio. In practice, you can show it images and it will respond to them alongside a text prompt.