OpenAI, beneath growing aggressive strain from Google and Anthropic, has debuted a brand new AI mannequin, GPT-5.2, that it says beats all present fashions by a considerable margin throughout a variety of duties.
The brand new mannequin, which is being launched lower than a month after OpenAI debuted its predecessor, GPT-5.1, carried out notably properly on a benchmark of difficult skilled duties throughout a variety of “information work”—from regulation to accounting to finance—in addition to on evaluations involving coding and mathematical reasoning, in response to information OpenAI launched.
Fidji Simo, the previous InstaCart CEO who now serves as OpenAI’s CEO of purposes, informed reporters that the mannequin mustn’t been seen as a direct response to Google’s Gemini 3 Professional AI mannequin, which was launched final month. That launch prompted OpenAI CEO Sam Altman to subject a “code crimson,” delaying the rollout of a number of initiatives with a view to focus extra employees and computing assets on bettering its core product, ChatGPT.
“I’d say that [the Code Red] helps with the discharge of this mannequin, however that’s not the explanation it’s popping out this week particularly, it has been within the works for some time,” she mentioned.
She mentioned the corporate had been constructing GPT-5.2 “for a lot of months.” “We don’t flip round these fashions in only a week. It’s the results of numerous work,” she mentioned. The mannequin had been identified internally by the code identify “Garlic,” in response to a narrative in The Info. The day earlier than the mannequin’s launch Altman teased its imminent rollout by posting to social media a video clip of him cooking a dish with a considerable amount of garlic.
OpenAI executives mentioned that the mannequin had been within the fingers of “Alpha clients” who assist check its efficiency for “a number of weeks”—a time interval that might imply the mannequin was accomplished previous to Altman’s “code crimson” declaration.
These testers included authorized AI startup Harvey, note-taking app Notion, and file-management software program firm Field, in addition to Shopify and Zoom.
OpenAI mentioned these clients discovered GPT-5.2 demonstrated a “state-of-the-art” skill to make use of different software program instruments to finish duties, in addition to excelling at writing and debugging code.
Coding has develop into some of the aggressive use circumstances for AI mannequin deployment inside firms. Though OpenAI had an early lead within the area, Anthropic’s Claude mannequin has proved particularly standard amongst enterprises, exceeding OpenAI’s marketshare in response to some figures. OpenAI is little doubt hoping to persuade clients to show again to its fashions for coding with GPT-5.2.
Simo mentioned the “Code Crimson” was serving to OpenAI give attention to bettering ChatGPT. “Code Crimson is known as a sign to the corporate that we wish to marshal assets in a single specific space, and that’s a strategy to actually outline priorities and outline issues that may be deprioritized,” she mentioned. “So we’ve got had a rise in assets targeted on ChatGPT typically.”
The corporate additionally mentioned its new mannequin is healthier than the corporate’s earlier ones at offering “secure completions”—which it defines as offering customers with useful solutions whereas not saying issues that may contribute to or worsen psychological well being crises.
“On the protection facet, as you noticed by the benchmarks, we’re bettering on just about each dimension of security, whether or not that’s self hurt, whether or not that’s several types of psychological well being, whether or not that’s emotional reliance,” Simo mentioned. “We’re very pleased with the work that we’re doing right here. It’s a high precedence for us, and we solely launch fashions after we’re assured that the protection protocols have been adopted, and we really feel pleased with our work.”
The discharge of the brand new mannequin got here on the identical day a brand new lawsuit was filed towards the corporate alleging that ChatGPT’s interactions with a psychologically troubled consumer had contributed to a murder-suicide in Connecticut. The corporate additionally faces a number of different lawsuits alleging ChatGPT contributed to individuals’s suicides. The corporate referred to as the Connecticut murder-suicide “extremely heartbreaking” and mentioned it’s persevering with to enhance “ChatGPT’s coaching to acknowledge and reply to indicators of psychological or emotional misery, de-escalate conversations and information individuals towards real-world help.”
GPT-5.2 confirmed a big bounce in efficiency throughout a number of benchmark assessments of curiosity to enterprise clients. It met or exceeded human skilled efficiency on a variety of adverse skilled duties, as measured by OpenAI’s GDPval benchmark, 70.9% of the time. That compares to only 38.8% of the time for GPT-5, a mannequin that OpenAI launched in August; 59.6% for Anthropic’s Claude Opus 4.5; and 53.3% for Google’s Gemini 3 Professional.
On the software program improvement benchmark, SWE-Bench Professional, GPT-5.2 scored 55.6%, which was virtually 5 proportion factors higher than its predecessor, GPT-5.1, and greater than 12% higher than Gemini 3 Professional.
OpenAI’s Aidan Clark, vp of analysis (coaching), declined to reply questions on precisely what coaching strategies had been used to improve GPT-5.2’s efficiency, though he mentioned that the corporate had made enhancements throughout the board, together with in pretraining, the preliminary step in creating an AI mannequin.
When Google launched its Gemini 3 Professional mannequin final month, its researchers additionally mentioned the corporate had made enhancements in pretraining in addition to post-training. This shocked some within the discipline who believed that AI firms had largely exhausted the power to wring substantial enhancements out of the pretraining stage of mannequin constructing, and it was speculated that OpenAI could have been caught off guard by Google’s progress on this space.