Subscribe to our daily and weekly newsletters to receive the latest updates and exclusive content on industry-leading AI reporting. Learn more
On the ninth day of the holiday-themed product announcement series known as “12 Days of OpenAI,” OpenAI introduces its most advanced model, o1, for third-party developers via its application programming interface (API).
This represents a major step forward for developers looking to develop new advanced AI applications or integrate the most advanced OpenAI technology into their existing apps and workflows, be they enterprise or consumer.
If you don’t know OpenAI’s o1 series yet, you can find the overview here: It was already announced in September 2024, the first in a new “family” of models from the ChatGPT company, which goes beyond the large language models (LLMs) of the GPT family range and offers “reasoning” functions.
Basically, the models in the o1 family – o1 and o1 mini – take longer to respond to a user’s prompts, but check themselves as they formulate an answer to see if they are correct and to avoid hallucinations. At the time, OpenAI said that o1 could handle more complex problems at the PhD level – somewhat This is also confirmed by real users.
While developers previously had access to a preview version of o1 on which they could build their own apps – for example as a PhD advisor or lab assistant – the production-ready release of the full o1 model via the API brings improved performance and lower latency and new features that facilitate integration into real applications.
OpenAI o1 had already made available to consumers about two and a half weeks ago via its ChatGPT Plus and Pro plansThe ability to analyze and respond to images and files uploaded by users has also been added to models.
Alongside today’s launch, OpenAI announced significant updates to its real-time API, as well as price reductions and a new tuning method that gives developers greater control over their models.
The full o1 model is now available to developers via OpenAI’s API
The new o1 model, available as o1-2024-12-17, is designed to handle complex, multi-stage argumentation tasks. Compared to the previous o1 preview version, this version improves accuracy, efficiency and flexibility.
OpenAI reports significant progress on a number of benchmarks, including coding, math, and visual reasoning tasks.
For example, coding scores on the SWE Bench Verified increased from 41.3 to 48.9, while performance on the mathematically focused AIME test increased from 42 to 79.2. These improvements make o1 well-suited for developing tools that streamline customer support, optimize logistics, or solve sophisticated analytical problems.
Several new features expand o1’s functionality for developers. Structured outputs enable responses to reliably match custom formats such as JSON schemas, ensuring consistency when interacting with external systems. Function calls simplify the process of connecting o1 to APIs and databases. And the ability to reason about visual input opens up use cases in manufacturing, science and coding.
Developers can also tune o1’s behavior using the new reasoning_effort parameter, which controls how long the model spends on a task to balance performance and response time.
OpenAI’s real-time API gets a boost to support intelligent, conversational voice/audio AI assistants
OpenAI also announced updates to its real-time API designed to enable natural, low-latency conversational experiences such as voice assistants, live translation tools or virtual tutors.
A new WebRTC integration simplifies the creation of voice-based apps by providing direct support for audio streaming, noise reduction and congestion control. Developers can now integrate real-time features with minimal setup, even under variable network conditions.
OpenAI is also introducing new pricing for its real-time API, reducing the cost of GPT-4o audio by 60% to $40 per million input tokens and $80 per million output tokens.
The cost of cached audio input is reduced by 87.5% and is now $2.50 per million input tokens. To further improve affordability, OpenAI is adding GPT-4o mini, a smaller, cost-effective model priced at $10 per million input tokens and $20 per million output tokens.
Text token prices for GPT-4o mini are also significantly lower, starting at $0.60 for input tokens and $2.40 for output tokens.
Beyond pricing, OpenAI gives developers more control over the responses in the Realtime API. Features like simultaneous out-of-band responses allow background tasks like content moderation to be performed without interrupting the user experience. Developers can also customize input contexts to focus on specific parts of a conversation and control when voice responses are triggered to enable more accurate and seamless interactions.
Fine-tuning preferences offers new customization options
Another important addition is Fine-tune preferencesa method for customizing models based on user and developer preferences.
Unlike supervised fine-tuning, which relies on exact input-output pairs, preference fine-tuning uses pairwise comparisons to teach the model which responses are preferred. This approach is particularly effective for subjective tasks such as summaries, creative writing, or scenarios where tone and style matter.
Initial tests with partners such as Rogo AI, which develops assistants for financial analysts, show promising results. Rogo reported that preference fine-tuning helped their model handle complex, non-distributed queries better than traditional fine-tuning, improving task accuracy by over 5%. The feature is now available for gpt-4o-2024-08-06 and gpt-4o-mini-2024-07-18, with plans to expand support to newer models early next year.
New SDKs for Go and Java developers
To optimize integration, OpenAI is expanding its official SDK offering with beta versions for Go and Java. These SDKs complement the existing Python, Node.js, and .NET libraries, making it easier for developers to interact with the OpenAI models in more programming environments. The Go SDK is particularly useful for building scalable backend systems, while the Java SDK is tailored for enterprise applications based on strong typing and robust ecosystems.
With these updates, OpenAI provides developers with an expanded toolkit for building advanced, customizable AI-powered applications. Whether through o1’s enhanced reasoning capabilities, real-time API improvements, or fine-tuning options, OpenAI’s latest offerings aim to provide both improved performance and cost effectiveness for companies pushing the boundaries of AI integration.
Source link