OpenAI Launches GPT-4o: Real-Time AI for Speech and Vision Integration
OpenAI has introduced a new model that boasts real-time capabilities for processing speech and visual input.
Revealed in a livestream hosted by OpenAI, this new model, known as GPT-4o, responds instantly to both auditory and visual prompts. Mira Murati, the Chief Technology Officer, announced that GPT-4o will be available for free due to its enhanced efficiency compared to previous models. However, a paid version will offer users higher capacity limits.
During the presentation, GPT-4o demonstrated its ability to solve mathematical problems captured by an iPhone camera and to modify its speech output in response to verbal cues. It could also maintain a dialogue with the presenter, providing advice on relaxation techniques and analyzing breathing patterns, although there were moments when it misunderstood certain prompts, requiring rephrasing or repetition.
Murati emphasized the significant advancements in GPT-4o, stating, “GPT-4o matches GPT-4 in intelligence but exceeds it in speed, enhancing its functionality across text, vision, and audio.” She continued, “We have been working on enhancing the intelligence of our models for years, and they have become quite adept. However, GPT-4o marks a major breakthrough in usability.”
She added, “We’re entering a new era of human-machine interaction that aims for a seamless and more intuitive user experience, making this technology more accessible and simpler to use.”
After its debut in late 2022, ChatGPT rapidly became the fastest application to reach 100 million monthly users. The integration of search engine-like capabilities to provide immediate, accurate responses may position OpenAI ahead of its competitors. The demonstration using mobile technology also suggests a shift toward promoting the use of ChatGPT on smartphones.
OpenAI’s strategic timing for this announcement, just a day before Google’s annual developer conference, where the tech giant is expected to unveil its latest AI innovations, highlights the competitive landscape in AI development.