Investing.com -- OpenAI has launched image generation in GPT-4o, a capability built to produce precise, photorealistic visuals. The release is meant to make image generation more useful, a feature OpenAI considers a primary capability.
GPT-4o image generation is designed to accurately render text, follow prompts precisely, and leverage 4o’s inherent knowledge base and chat context. This includes the ability to transform uploaded images or use them as visual inspiration. The enhanced capabilities make it easier to create the exact image envisioned, thereby aiding effective communication through visuals.
The model was trained on the joint distribution of online images and text, which has resulted in a visually fluent model capable of generating useful, consistent, and context-aware images. It excels in text rendering, multi-turn generation, instruction following, in-context learning, and linking knowledge between text and images.
The ability to blend precise symbols with imagery turns image generation into a tool for visual communication. GPT-4o can build upon images and text in chat context, ensuring consistency throughout. It also follows detailed prompts closely and can handle up to 10 to 20 different objects.
GPT-4o can analyze and learn from user-uploaded images, seamlessly integrating their details into its context to inform image generation. Native image generation enables 4o to link its knowledge between text and images, resulting in a model that feels smarter and more efficient.
Despite its advanced capabilities, OpenAI acknowledges that the model has limitations and plans to address them through model improvements after the initial launch. Safety remains a key concern. OpenAI aims to maximize creative freedom by supporting valuable use cases like game development, historical exploration, and education while maintaining strong safety standards.
Generated images carry C2PA metadata, which identifies them as coming from GPT-4o, to provide transparency. OpenAI has also built an internal search tool that uses technical attributes of generations to help verify whether content came from its model.
The rollout of 4o image generation starts today for Plus, Pro, Team, and Free users as the default image generator in ChatGPT, with access coming soon to Enterprise and Edu. Developers will soon be able to generate images with GPT-4o via the API, with access rolling out in the next few weeks.
Creating and customizing images with GPT-4o is as simple as chatting. Users just need to describe what they need, including any specifics like aspect ratio, exact colors using hex codes, or a transparent background. Rendering an image can take up to one minute because of the level of detail involved.
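For readers who want to see what such a request might look like in practice, the following is a minimal sketch in Python that assembles a plain-text description with the specifics mentioned above. The helper name `compose_image_prompt` and its parameters are hypothetical and exist only for illustration. It does not call any OpenAI API, and the wording it produces is one possible phrasing, not a required format.

```python
import re

# Matches a six-digit hex color such as #FF6600 (assumption: six-digit form only).
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def compose_image_prompt(description, aspect_ratio=None, hex_colors=(),
                         transparent_background=False):
    """Build a plain-text image request (hypothetical helper, not an OpenAI API)."""
    # Start with the description, dropping any trailing period so sentences join cleanly.
    parts = [description.strip().rstrip(".")]
    if aspect_ratio:
        parts.append(f"Aspect ratio {aspect_ratio}")
    if hex_colors:
        # Reject anything that is not a six-digit hex code before it reaches the prompt.
        for color in hex_colors:
            if not HEX_COLOR.fullmatch(color):
                raise ValueError(f"not a 6-digit hex color: {color!r}")
        parts.append("Use these exact colors: " + ", ".join(hex_colors))
    if transparent_background:
        parts.append("Transparent background")
    return ". ".join(parts) + "."


prompt = compose_image_prompt(
    "A minimalist fox logo",
    aspect_ratio="16:9",
    hex_colors=["#FF6600", "#1A1A1A"],
    transparent_background=True,
)
print(prompt)
```

The resulting text could then be pasted into a chat as the image request. Validating the hex codes up front is a design choice of this sketch, intended to catch typos before the request is sent.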