OpenAI debuts GPT‑5.1-Codex-Max coding model and it already completed a 24-hour task internally
Scoopico
Last updated: November 19, 2025 9:12 pm
Published: November 19, 2025
Contents
  • Performance Benchmarks: Incremental Gains Across Key Tasks
  • Technical Architecture: Long-Horizon Reasoning via Compaction
  • Platform Integration and Use Cases
  • Cybersecurity and Safety Constraints
  • Deployment Context and Developer Usage
  • Outlook

OpenAI has introduced GPT‑5.1-Codex-Max, a new frontier agentic coding model now available in its Codex developer environment. The release marks a significant step forward in AI-assisted software engineering, offering improved long-horizon reasoning, efficiency, and real-time interactive capabilities. GPT‑5.1-Codex-Max will now replace GPT‑5.1-Codex as the default model across Codex-integrated surfaces.

The new model is designed to serve as a persistent, high-context software development agent, capable of managing complex refactors, debugging workflows, and project-scale tasks across multiple context windows.

It comes on the heels of Google releasing its powerful new Gemini 3 Pro model yesterday, yet still outperforms or matches it on key coding benchmarks:

On SWE-Bench Verified, GPT‑5.1-Codex-Max achieved 77.9% accuracy at extra-high reasoning effort, edging past Gemini 3 Pro’s 76.2%.

It also led on Terminal-Bench 2.0, with 58.1% accuracy versus Gemini’s 54.2%, and matched Gemini’s score of 2,439 on LiveCodeBench Pro, a competitive coding Elo benchmark.

When measured against Gemini 3 Pro’s most advanced configuration, its Deep Thinking model, Codex-Max also holds a slight edge in agentic coding benchmarks.

Performance Benchmarks: Incremental Gains Across Key Tasks

GPT‑5.1-Codex-Max demonstrates measurable improvements over GPT‑5.1-Codex across a range of standard software engineering benchmarks.

On SWE-Lancer IC SWE, it achieved 79.9% accuracy, a significant increase from GPT‑5.1-Codex’s 66.3%. On SWE-Bench Verified (n=500), it reached 77.9% accuracy at extra-high reasoning effort, outperforming GPT‑5.1-Codex’s 73.7%.

Performance on Terminal-Bench 2.0 (n=89) showed more modest improvements, with GPT‑5.1-Codex-Max achieving 58.1% accuracy compared to 52.8% for GPT‑5.1-Codex.

All evaluations were run with compaction and extra-high reasoning effort enabled.
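
To make the size of these gains easier to compare, the short sketch below computes the absolute (percentage-point) and relative improvement for each benchmark. It uses only the scores quoted above; the helper name `gains` is an illustrative choice and not part of any OpenAI tooling.

```python
# Scores quoted in the article, as (GPT-5.1-Codex, GPT-5.1-Codex-Max) percent accuracy.
scores = {
    "SWE-Lancer IC SWE": (66.3, 79.9),
    "SWE-Bench Verified": (73.7, 77.9),
    "Terminal-Bench 2.0": (52.8, 58.1),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    abs_gain = round(new - old, 1)
    rel_gain = round(100 * (new - old) / old, 1)
    return abs_gain, rel_gain


for name, (old, new) in scores.items():
    abs_gain, rel_gain = gains(old, new)
    print(f"{name}: +{abs_gain} points ({rel_gain}% relative)")
```

Read this way, the SWE-Lancer jump is by far the largest of the three, while the SWE-Bench Verified and Terminal-Bench 2.0 gains are in the single digits in percentage-point terms.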

These results indicate that the new model offers a higher ceiling on both benchmarked correctness and real-world usability under extended reasoning loads.

Technical Architecture: Long-Horizon Reasoning via Compaction

A major architectural improvement in GPT‑5.1-Codex-Max is its ability to reason effectively over extended input-output sessions using a mechanism called compaction.

This enables the model to retain key contextual information while discarding irrelevant details as it nears its context window limit, effectively allowing for continuous work across millions of tokens without performance degradation.
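
OpenAI has not published how compaction is implemented, so the following is only a toy illustration of the general idea described above: when the running history approaches a token limit, older entries are folded into a short summary and the most recent entries are kept verbatim. Every name here (`CompactingContext`, `summarize`, `count_tokens`), the word-count "tokenizer", and the truncation-based summarizer are assumptions made for the sketch; a real system would presumably have the model write the summary itself.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def summarize(messages: list[str], budget: int = 12) -> str:
    """Hypothetical summarizer: keep only the first few words of each message."""
    per_msg = max(1, budget // max(1, len(messages)))
    parts = []
    for m in messages:
        words = m.removeprefix("[summary] ").split()
        parts.append(" ".join(words[:per_msg]))
    return "[summary] " + "; ".join(parts)


@dataclass
class CompactingContext:
    limit: int                 # assumed context window size, in toy tokens
    threshold: float = 0.8     # compact once usage exceeds this fraction of the limit
    keep_recent: int = 2       # number of newest messages kept verbatim
    messages: list[str] = field(default_factory=list)

    def used(self) -> int:
        return sum(count_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        if self.used() > self.threshold * self.limit:
            self.compact()

    def compact(self) -> None:
        split = len(self.messages) - self.keep_recent
        if split <= 0:
            return  # nothing old enough to fold away
        old, recent = self.messages[:split], self.messages[split:]
        self.messages = [summarize(old)] + recent


ctx = CompactingContext(limit=50)
for i in range(20):
    ctx.add(" ".join([f"step{i}"] * 10))  # each message is 10 toy tokens
print(len(ctx.messages), "messages,", ctx.used(), "tokens in context")
```

The design choice worth noting is the split between a lossy summary of old work and a verbatim tail of recent work: the session can keep going indefinitely while the context stays under the limit, at the cost of detail about earlier steps.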

The model has been internally observed to complete tasks lasting more than 24 hours, including multi-step refactors, test-driven iteration, and autonomous debugging.

Compaction also improves token efficiency. At medium reasoning effort, GPT‑5.1-Codex-Max used approximately 30% fewer thinking tokens than GPT‑5.1-Codex for comparable or better accuracy, which has implications for both cost and latency.
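
As a rough illustration of what a 30% reduction in thinking tokens could mean in practice, the sketch below works through the arithmetic. Only the 30% figure comes from the article; the baseline token count and the price per million tokens are made-up placeholders, not OpenAI pricing.

```python
def thinking_token_savings(
    baseline_tokens: int,
    reduction: float = 0.30,          # the ~30% figure quoted in the article
    price_per_million: float = 10.0,  # hypothetical price in dollars, for illustration only
) -> tuple[int, int, float]:
    """Return (tokens_used, tokens_saved, dollars_saved) after the reduction."""
    saved = round(baseline_tokens * reduction)
    used = baseline_tokens - saved
    dollars_saved = saved / 1_000_000 * price_per_million
    return used, saved, dollars_saved


# Hypothetical task that needed 1,000,000 thinking tokens on the older model.
used, saved, dollars = thinking_token_savings(1_000_000)
print(f"used={used}, saved={saved}, dollars_saved={dollars:.2f}")
```

Because output tokens are typically generated one after another, fewer thinking tokens would also tend to shorten wall-clock time, which is the latency implication mentioned above.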

Platform Integration and Use Cases

GPT‑5.1-Codex-Max is currently available across multiple Codex-based environments, which refer to OpenAI’s own integrated tools and interfaces built specifically for code-focused AI agents. These include:

  • Codex CLI, OpenAI’s official command-line tool (@openai/codex), where GPT‑5.1-Codex-Max is already live.

  • IDE extensions, likely developed or maintained by OpenAI, though no specific third-party IDE integrations were named.

  • Interactive coding environments, such as those used to demonstrate frontend simulation apps like CartPole or Snell’s Law Explorer.

  • Internal code review tooling, used by OpenAI’s engineering teams.

For now, GPT‑5.1-Codex-Max is not yet available via public API, though OpenAI states this is coming soon. Users who want to work with the model in terminal environments today can do so by installing and using the Codex CLI.

It is not currently confirmed whether or how the model will integrate into third-party IDEs unless they are built on top of the CLI or future API.

The model is capable of interacting with live tools and simulations. Examples shown in the release include:

  • An interactive CartPole policy gradient simulator, which visualizes reinforcement learning training and activations.

  • A Snell’s Law optics explorer, supporting dynamic ray tracing across refractive indices.

These interfaces exemplify the model’s ability to reason in real time while sustaining an interactive development session, effectively bridging computation, visualization, and implementation within a single loop.
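
For readers unfamiliar with the optics demo, the core calculation behind any Snell’s Law explorer is the relation n1·sin(θ1) = n2·sin(θ2). The sketch below is not code from OpenAI’s demo; it is a minimal, independent illustration of that formula, and the function name `refraction_angle` is an assumption.

```python
import math
from typing import Optional


def refraction_angle(theta1_deg: float, n1: float, n2: float) -> Optional[float]:
    """Refracted angle in degrees from Snell's law, n1*sin(t1) = n2*sin(t2).

    Returns None when no refracted ray exists (total internal reflection).
    """
    s = n1 / n2 * math.sin(math.radians(theta1_deg))
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


# Light passing from air (n ~ 1.00) into water (n ~ 1.33) at 30 degrees incidence.
print(refraction_angle(30.0, 1.00, 1.33))

# Going from glass (n ~ 1.5) to air at a steep angle gives no refracted ray.
print(refraction_angle(60.0, 1.5, 1.0))
```

An interactive explorer would re-run this calculation as the user drags the incident ray or changes the refractive indices, then redraw the refracted ray.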

Cybersecurity and Safety Constraints

While GPT‑5.1-Codex-Max does not meet OpenAI’s “High” capability threshold for cybersecurity under its Preparedness Framework, it is currently the most capable cybersecurity model OpenAI has deployed. It supports use cases such as automated vulnerability detection and remediation, but with strict sandboxing and disabled network access by default.

OpenAI reports no increase in scaled malicious use but has introduced enhanced monitoring systems, including activity routing and disruption mechanisms for suspicious behavior. Codex remains isolated to a local workspace unless developers opt in to broader access, mitigating risks like prompt injection from untrusted content.

Deployment Context and Developer Usage

GPT‑5.1-Codex-Max is currently available to users on ChatGPT Plus, Pro, Business, Edu, and Enterprise plans. It will also become the new default in Codex-based environments, replacing GPT‑5.1-Codex, which was a more general-purpose model.

OpenAI states that 95% of its internal engineers use Codex weekly, and since adoption, these engineers have shipped ~70% more pull requests on average, highlighting the tool’s impact on internal development velocity.

Despite its autonomy and persistence, OpenAI stresses that Codex-Max should be treated as a coding assistant, not a replacement for human review. The model produces terminal logs, test citations, and tool call outputs to support transparency in generated code.

Outlook

GPT‑5.1-Codex-Max represents a significant evolution in OpenAI’s strategy toward agentic development tools, offering greater reasoning depth, token efficiency, and interactive capabilities across software engineering tasks. By extending its context management and compaction strategies, the model is positioned to handle tasks at the scale of full repositories, rather than individual files or snippets.

With continued emphasis on agentic workflows, secure sandboxes, and real-world evaluation metrics, Codex-Max sets the stage for the next generation of AI-assisted programming environments, while underscoring the importance of oversight in increasingly autonomous systems.
