At VentureBeat’s Transform 2025 conference, Olivier Godement, Head of Product for OpenAI’s API platform, offered a behind-the-scenes look at how enterprise teams are adopting and deploying AI agents at scale.
In a 20-minute panel discussion I hosted exclusively with Godement, the former Stripe researcher and current OpenAI API boss unpacked OpenAI’s latest developer tools, the Responses API and the Agents SDK, while highlighting real-world patterns, security considerations, and cost-and-return examples from early adopters like Stripe and Box.
For enterprise leaders unable to attend the session live, here are the eight most important takeaways:
Agents Are Rapidly Moving From Prototype to Production
According to Godement, 2025 marks a real shift in how AI is being deployed at scale. With more than one million monthly active developers now using OpenAI’s API platform globally, and token usage up 700% year over year, AI is moving beyond experimentation.
“It’s been five years since we launched essentially GPT-3… and man, the past five years has been pretty wild.”
Godement emphasized that current demand isn’t just about chatbots anymore. “AI use cases are moving from simple Q&A to use cases where the application, the agent, can actually do stuff for you.”
This shift prompted OpenAI to launch two major developer-facing tools in March: the Responses API and the Agents SDK.
When to Use Single Agents vs. Sub-Agent Architectures
A major theme was architectural choice. Godement noted that single-agent loops, which encapsulate full tool access and context in a single model, are conceptually elegant but often impractical at scale.
“Building accurate and reliable single agents is hard. Like, it’s really hard.”
As complexity increases, with more tools, more potential user inputs, and more logic, teams often move toward modular architectures with specialized sub-agents.
“A practice which has emerged is to essentially break down the agents into multiple sub-agents… You’d do separation of concerns like in software.”
These sub-agents function like roles on a small team: a triage agent classifies intent, tier-one agents handle routine issues, and others escalate or resolve edge cases.
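As a rough illustration of that pattern (not code shown in the session), here is a minimal sketch using the handoff mechanism in OpenAI’s openai-agents Python package; the agent names, instructions, and sample input are placeholders.

```python
# pip install openai-agents
# Minimal triage-and-handoff sketch; agent names and instructions are illustrative.
from agents import Agent, Runner

tier_one = Agent(
    name="Tier 1 support",
    instructions="Resolve routine account and billing questions concisely.",
)

escalation = Agent(
    name="Escalation",
    instructions="Handle edge cases Tier 1 cannot resolve and summarize them for a human reviewer.",
)

# The triage agent only classifies intent, then hands off to a specialist.
triage = Agent(
    name="Triage",
    instructions="Classify the user's request and hand off to the appropriate sub-agent.",
    handoffs=[tier_one, escalation],
)

result = Runner.run_sync(triage, "I was charged twice for last month's invoice.")
print(result.final_output)
```

Each sub-agent keeps a narrow instruction set and tool surface, which mirrors the separation of concerns Godement describes.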
Why the Responses API Is a Step Change
Godement positioned the Responses API as a foundational evolution in developer tooling. Previously, developers manually orchestrated sequences of model calls. Now, that orchestration is handled internally.
“The Responses API is probably the biggest new layer of abstraction we introduced since pretty much GPT-3.”
It lets developers express intent, not just configure model flows. “You care about returning a really good response to the customer… the Responses API essentially handles that loop.”
It also includes built-in capabilities for knowledge retrieval, web search, and function calling, tools that enterprises need for real-world agent workflows.
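For orientation, here is a minimal sketch of a single Responses API call with one built-in tool enabled; the model name, tool choice, and prompt are illustrative, and tool availability depends on your account.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-4.1",                         # placeholder model name
    tools=[{"type": "web_search_preview"}],  # built-in web search tool
    input="Summarize this week's changes to our payment provider's API.",
)

# The API runs the tool-use loop internally and returns the final answer.
print(response.output_text)
```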
Observability and Security Are Built In
Security and compliance were top of mind. Godement cited key guardrails that make OpenAI’s stack viable for regulated sectors like finance and healthcare:
- Policy-based refusals
- SOC 2 logging
- Data residency support
Evaluation is where Godement sees the biggest gap between demo and production.
“My hot take is that model evaluation is probably the biggest bottleneck to massive AI adoption.”
OpenAI now includes tracing and eval tools with the API stack to help teams define what success looks like and monitor how agents perform over time.
“Unless you invest in evaluation… it’s really hard to build that trust, that confidence that the model is being accurate, reliable.”
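OpenAI’s hosted tracing and eval tooling isn’t reproduced here, but as a rough, framework-agnostic illustration of the practice Godement describes, a regression-style eval loop can start as simply as the sketch below; the cases, grading rule, and run_agent entry point are all placeholders.

```python
# Minimal, framework-agnostic agent eval loop (illustrative only).
from dataclasses import dataclass

@dataclass
class Case:
    prompt: str
    must_contain: str  # crude pass/fail rule, for illustration only

CASES = [
    Case("Customer asks to cancel an unused subscription.", "cancel"),
    Case("Customer reports a duplicate charge on an invoice.", "refund"),
]

def run_agent(prompt: str) -> str:
    # Stand-in for whatever agent entry point you are testing.
    raise NotImplementedError("call your agent here")

def evaluate() -> float:
    passed = sum(
        case.must_contain.lower() in run_agent(case.prompt).lower()
        for case in CASES
    )
    return passed / len(CASES)

if __name__ == "__main__":
    print(f"pass rate: {evaluate():.0%}")
```

Tracking a pass rate like this over time is what lets a team say whether an agent is getting more accurate and reliable, rather than relying on spot checks.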
Early ROI Is Visible in Specific Capabilities
Some enterprise use cases are already delivering measurable gains. Godement shared examples from:
- Stripe, which uses agents to accelerate invoice handling, reporting “35% faster invoice resolution”
- Box, which launched knowledge assistants that enable “zero-touch ticket triage”
Other high-value use cases include customer support (including voice), internal governance, and knowledge assistants for navigating dense documentation.
What It Takes to Launch in Production
Godement emphasized the human factor in successful deployments.
“There’s a small fraction of very high-end people who, whenever they see a problem and see a technology, they run at it.”
These internal champions don’t always come from engineering. What unites them is persistence.
“Their first reaction is, OK, how can I make it work?”
OpenAI sees many initial deployments driven by this group: people who pushed early ChatGPT use in the enterprise and are now experimenting with full agent systems.
He also pointed out a gap many overlook: domain expertise. “The knowledge in an enterprise… doesn’t lie with engineers. It lies with the ops teams.”
Making agent-building tools accessible to non-developers is a challenge OpenAI aims to address.
What’s Next for Enterprise Agents
Godement offered a glimpse into the roadmap. OpenAI is actively working on:
- Multimodal agents that can interact via text, voice, images, and structured data
- Long-term memory for retaining knowledge across sessions
- Cross-cloud orchestration to support complex, distributed IT environments
These aren’t radical changes, but iterative layers that expand what’s already possible. “Once we have models that can think not just for a few seconds but for minutes, for hours… that’s going to enable some pretty mind-blowing use cases.”
Final Word: Reasoning Models Are Underhyped
Godement closed the session by reaffirming his belief that reasoning-capable models, those that can reflect before responding, will be the true enablers of long-term transformation.
“I still have conviction that we’re pretty much at the GPT-2 or GPT-3 level of maturity of those models…. We’re still scratching the surface on what reasoning models can do.”
For enterprise decision makers, the message is clear: the infrastructure for agentic automation is here. What matters now is building a focused use case, empowering cross-functional teams, and being ready to iterate. The next phase of value creation lies not in novel demos, but in durable systems shaped by real-world needs and the operational discipline to make them reliable.