OpenAI unveils GPT-4o: The next leap in human-like AI

Konrad Siwik, 14.05.2024 15:30

OpenAI, known for its innovative artificial intelligence solutions, has surprised the world with its latest model, GPT-4o. This new model can analyse audio, video, and text in real time, and the speed at which it responds to incoming audio prompts is particularly impressive.

AI enthusiasts eagerly awaited the OpenAI Spring Update, an event organised by the developers of ChatGPT. Much speculation circulated in the industry about the possible unveiling of a new AI-based search engine. However, OpenAI decided to focus on something else: the announcement and presentation of its new AI model, GPT-4o.

"We have been introduced to yet another form of voice assistant, and at this stage, it seems to be the most advanced. It appears that Sam Altman and his team at OpenAI did not set out to directly challenge Google in the search engine market. It is possible that they never intended to do so, and the rumours about a tool that is meant to surpass Google may have been spread solely to unsettle OpenAI's competitors," says Marcin Stypuła, the founder and CEO of Semcore, one of Poland's largest SEO agencies.

OpenAI introduces GPT-4o

At the conference, OpenAI not only showcased improvements to ChatGPT but also introduced a brand new model, GPT-4o. The best part? It will be accessible to all users, including those using the free version of ChatGPT. OpenAI reiterated its commitment to inclusivity, making advanced AI tools available for free, ensuring that no one is left behind in the AI revolution.

Interactions with ChatGPT are set to be a game-changer. The process will be transcription-free, significantly speeding up the conversation. But that's not all. The algorithm will empower users to interrupt ChatGPT's responses, fostering a dynamic and engaging interaction between humans and the AI model.

The new GPT-4o model will make use of the camera on the device it runs on. This will allow the algorithm to quickly assess the user's surroundings, provide better advice, and even recognise the user's emotions.

Real-time translation of conversations

OpenAI also demonstrated how GPT-4o can support conversations while travelling abroad. It showcased the program's ability to recognise sentences spoken in Italian by Mira Murati, the company's chief technology officer, and translate them into English in real time. The program also translated English responses into Italian, enabling seamless multilingual communication.

The conference ended with a demonstration of yet another new feature: GPT-4o can identify and label emotions based on facial expressions captured by the camera. During the conference, the AI recognised the smiling face of an OpenAI employee and asked about the reason for their happiness, which Sam Altman described as "magic."

"OpenAI's revelations certainly evoke emotion and provide considerable value to the world of new technology. We will definitely make use of the translator. The most pressing question for Polish users is when the Polish language will be available. Undoubtedly, businesses will need to consider how to utilise this tool in advertising. Emotions play a crucial role, and having real-time knowledge of potential customers' feelings seems priceless," comments Marcin Stypuła.

GPT-4o reacts as quickly as a human being

GPT-4o aims to make interactions more natural. OpenAI says GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is comparable to human response time in a conversation. Performance-wise, GPT-4o matches GPT-4 Turbo for English text and even surpasses it in other languages.

"GPT-4o excels in vision and audio understanding compared to existing models," claims OpenAI. So, what are the capabilities of GPT-4o? One notable demonstration is a recording where GPT-4o is asked to count from one to ten.

The recording above shows how quickly GPT-4o responds to commands to change pace, all in real time. Another recording showcases GPT-4o acting as a Spanish teacher, analysing objects visible through the camera.

When will GPT-4o be available?

"GPT-4o’s text and image capabilities are starting to roll out today in ChatGPT. We are making GPT-4o available in the free tier, and to Plus users with up to 5x higher message limits. We'll roll out a new version of Voice Mode with GPT-4o in alpha within ChatGPT Plus in the coming weeks," reads the OpenAI website.
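The rollout described above concerns ChatGPT, but GPT-4o is also exposed to developers through OpenAI's API. As a minimal sketch of what a multimodal (text plus image) request might look like, the payload below follows OpenAI's public Chat Completions conventions; the model identifier "gpt-4o", the example URL, and the prompt text are illustrative assumptions, not details from this article:

```python
import json

# Sketch of a Chat Completions request body targeting GPT-4o's
# multimodal input. "gpt-4o" is the assumed API model identifier;
# the image URL and prompt are placeholders for illustration.
request_body = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                # A text part and an image part in a single user message
                {"type": "text", "text": "What emotion does this face show?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/face.jpg"},
                },
            ],
        }
    ],
}

# Serialise the payload as it would be sent in an HTTP POST to the API.
print(json.dumps(request_body, indent=2))
```

Sending this body (with a valid API key) to the chat completions endpoint would return the model's answer; the point here is simply that one request can mix text and image content.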

It's important to note that OpenAI offers more than just ChatGPT. The upcoming Sora model will even enable users to generate videos, a feature that has been highly praised even by professional artists.
