Arcee's U.S.-made, open source Trinity Large and 10T-checkpoint offer rare look at raw model intelligence

Published: January 30, 2026

San Francisco-based AI lab Arcee made waves last year as one of the few U.S. companies to train large language models (LLMs) from scratch and release them to the public under open or partially open source licenses, enabling developers, solo entrepreneurs, and even medium-to-large enterprises to use the powerful AI models for free and customize them at will.

Now Arcee is back this week with the release of its largest, most performant open language model to date: Trinity Large, a 400-billion-parameter mixture-of-experts (MoE) model, available now in preview.

Alongside the flagship release, Arcee is shipping a "raw" checkpoint model, Trinity-Large-TrueBase, that allows researchers to study what a 400B sparse MoE learns from raw data alone, before instruction tuning and reinforcement learning have been applied.

By providing a clean slate at the 10-trillion-token mark, Arcee enables AI builders in highly regulated industries to perform authentic audits and conduct their own specialized alignments without inheriting the "black box" biases or formatting quirks of a general-purpose chat model. This transparency allows for a deeper understanding of the distinction between a model's intrinsic reasoning capabilities and the helpful behaviors dialed in during the final stages of post-training.

This launch arrives as powerful Chinese open-source LLM alternatives from the likes of Alibaba (Qwen), z.AI (Zhipu), DeepSeek, Moonshot, and Baidu have flooded the market, effectively leading the category with high-efficiency architectures.

Trinity Large also comes after Meta has notably retreated from the frontier open-source landscape: the April 2025 debut of Llama 4 was met with a mixed reception, and former Meta AI researcher Yann LeCun later admitted the company used multiple specialized versions of the model to inflate scores on third-party benchmarks.

Amidst this domestic vacuum, only OpenAI—with its gpt-oss family released in the summer of 2025—and Arcee are currently carrying the mantle of new U.S.-made open-source models trained entirely from scratch.

As sparse as they come

Trinity Large is noteworthy for the extreme sparsity of its mixture-of-experts design. In an MoE architecture, "sparsity" refers to the model's ability to selectively activate only a tiny fraction of its total parameters for any given task.

While Trinity Large houses 400B total parameters, only about 13B (roughly 3%) are active for any given token, since just 4 of its 256 experts (1.56%) are routed at a time.

This architectural choice is significant because it allows the model to possess the "knowledge" of a massive system while maintaining the inference speed and operational efficiency of a much smaller one, running roughly 2–3x faster than its peers on the same hardware.
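
To make the arithmetic concrete, the short Python sketch below relates the figures quoted above (400B total parameters, 4-of-256 routing, roughly 13B active). The note about shared components reflects how MoE models typically work rather than a published breakdown of Trinity Large.

    # Back-of-the-envelope sketch relating the reported Trinity Large figures:
    # 400B total parameters, 4-of-256 expert routing, ~13B parameters active
    # per token. Only the relationships between these numbers are computed.
    TOTAL_PARAMS      = 400e9   # total parameters
    ACTIVE_PARAMS     = 13e9    # parameters used for any single token
    NUM_EXPERTS       = 256     # routed experts per MoE layer
    EXPERTS_PER_TOKEN = 4       # experts selected per token

    # Fraction of experts consulted per token: 4 / 256 = 1.5625%
    print(f"experts active per token:    {EXPERTS_PER_TOKEN / NUM_EXPERTS:.2%}")

    # Fraction of total parameters active per token: 13B / 400B is about 3.25%,
    # higher than 1.56% because attention layers, embeddings, and any shared
    # components run for every token on top of the four routed experts.
    print(f"parameters active per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.2%}")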

Sovereignty and the "TrueBase" philosophy

The most significant contribution of this release to the research community is Trinity-Large-TrueBase—a raw, 10-trillion-token checkpoint.

Unlike nearly every other "open" release, which arrives after being "warped" by instruction tuning and reinforcement learning, TrueBase offers a rare, unspoiled look at foundational intelligence.

In the rush to make models helpful, most labs apply supervised fine-tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) before the weights are released. While this makes the model a better conversationalist, it can mask underlying knowledge distributions.

TrueBase provides an "OG base model" that has not yet undergone the learning rate anneals or the phase two and three pre-training where instruction data is typically introduced.

For researchers and enterprises in highly regulated industries, starting from TrueBase allows for authentic audits and custom alignment. As Lucas Atkins, Arcee’s CTO, noted in a video call with VentureBeat: "It's interesting like that checkpoint itself is already one of the best performing base models in the world".

Technology: engineering through constraint

The creation of Trinity Large was not a product of infinite resources, but rather what Atkins calls "engineering through constraint".

Trained for approximately $20 million over just 33 days, the model represents a masterclass in capital efficiency.

Arcee, a team of only 30 people, operated on a total capital of just under $50 million, making the $20 million training run a "back the company" bet.

"I've always believed that having a constraint, whether financially or personnel or whatever, is extremely important for creativity," Atkins explained. "When you just have an unlimited budget, you inherently don't have to engineer your way out of complex problems".

Architecture: 4-of-256 Sparsity and SMEBU

Trinity Large utilizes a 4-of-256 sparse MoE architecture, meaning it activates only 4 out of its 256 experts for every token.

This high degree of sparsity—one of the highest ever successfully trained—created significant stability challenges during pre-training.

To solve this, Arcee developed Soft-clamped Momentum Expert Bias Updates (SMEBU). This mechanism ensures that experts are specialized and routed evenly across a general web corpus, preventing a few experts from becoming "winners" while others remain untrained "dead weight".
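
Arcee has not published the exact SMEBU update rule, but the Python sketch below illustrates the general family it belongs to: a per-expert routing bias, nudged with momentum toward a uniform expert load and passed through a soft clamp so it can never overwhelm the router's own scores. The constants, the tanh clamp, and the update rate are illustrative assumptions, not Arcee's implementation.

    import numpy as np

    NUM_EXPERTS, TOP_K = 256, 4
    bias     = np.zeros(NUM_EXPERTS)    # per-expert routing bias (affects selection only)
    momentum = np.zeros(NUM_EXPERTS)    # smoothed estimate of load imbalance
    BETA, STEP, CLAMP = 0.9, 0.01, 1.0  # smoothing, update rate, soft-clamp bound

    def route(router_logits):
        """Pick the top-4 experts per token from router logits plus the bias."""
        scores = router_logits + bias
        return np.argsort(scores, axis=-1)[:, -TOP_K:]   # (tokens, 4) expert ids

    def update_bias(chosen, num_tokens):
        """Raise the bias of under-used experts and lower it for over-used ones,
        with momentum smoothing and a tanh soft clamp on the result."""
        global bias, momentum
        load = np.bincount(chosen.ravel(), minlength=NUM_EXPERTS) / (num_tokens * TOP_K)
        imbalance = 1.0 / NUM_EXPERTS - load              # > 0 means the expert is starved
        momentum = BETA * momentum + (1 - BETA) * imbalance
        bias = CLAMP * np.tanh((bias + STEP * momentum) / CLAMP)

    # Toy usage on a batch of 8 tokens with random router scores.
    chosen = route(np.random.randn(8, NUM_EXPERTS))
    update_bias(chosen, num_tokens=8)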

The speed of the training run was facilitated by Arcee’s early access to Nvidia B300 GPUs (Blackwell). These chips provided roughly twice the speed of the previous Hopper generation and significant memory increases.

"Pre-training was 33 days," Atkins noted. "We could have done it on Hopper, and probably would have taken two to three months. And by that point, we're in a completely new generation of models".

In partnership with DatologyAI, Arcee utilized over 8 trillion tokens of synthetic data. However, this was not typical "imitation" synthetic data where a smaller model learns to talk like a larger one.

Instead, the intent was to take raw web text—such as blogs or Wikipedia articles—and synthetically rewrite it to condense the information into a smaller number of total tokens. This process helped the model learn to reason over information rather than just memorizing exact token strings.

The architectural design also incorporates alternating local (sliding-window) and global attention layers in a 3:1 ratio. This hybrid approach allows the model to be highly efficient in long-context scenarios. While trained at a 256k sequence length, Trinity Large natively supports 512k context, and evaluations suggest it remains performant even at the 1-million-token horizon.
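
The 3:1 interleaving is easy to picture; the small Python sketch below lays out such a schedule. The total depth and window size are hypothetical placeholders, and only the 3:1 local-to-global ratio comes from the release.

    NUM_LAYERS  = 64      # hypothetical depth, for illustration only
    WINDOW_SIZE = 4096    # hypothetical sliding-window span, in tokens

    def attention_layout(num_layers, local_per_global=3):
        """Return each layer's attention type: three local (sliding-window)
        layers for every one global (full-context) layer."""
        cycle = ["local"] * local_per_global + ["global"]
        return [cycle[i % len(cycle)] for i in range(num_layers)]

    layout = attention_layout(NUM_LAYERS)
    print(layout[:8])   # ['local', 'local', 'local', 'global', 'local', ...]
    # Local layers attend only to the previous WINDOW_SIZE tokens, so their cost
    # stays flat as the context grows; the sparser global layers preserve
    # long-range recall across the full context window.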

Technical comparison: Trinity Large vs. gpt-oss-120b

As an American-made open model, Trinity Large invites comparison with OpenAI's gpt-oss-120b.

While both models utilize sparse architectures to achieve frontier-level performance under permissive licenses, they serve different operational roles.

gpt-oss-120b currently holds an edge on specific reasoning and math benchmarks, while Trinity Large offers a significant advantage in context capacity and raw parameter depth for complex, multi-step agentic workflows.

Sovereignty: filling the vacuum

The release of Trinity Large is as much a geopolitical statement as a technical one. CEO Mark McQuade noted to VentureBeat in the same interview that the vacuum of American open-source models at the frontier level forced a pivot in Arcee’s strategy.

"There became this kind of shift where US based or Western players stopped open sourcing these models," McQuade said. "We're relying on these models to then go into organizations and take them further… but the Chinese labs just started… producing frontier state of the art models and open sourcing them".

For McQuade, this created a dependency that American enterprises were increasingly uncomfortable with. "Especially in conversation we're having with large organizations, they were unable to use Chinese based architectures," he explained. "We want to be that champion in the US. [It] actually doesn't exist right now".

By releasing under the Apache 2.0 license, Arcee provides the gold-standard permissive framework that allows companies to "own" the model layer entirely. This is critical for industries like finance and defense, where utilizing a model hosted by a third party or a restrictive cloud provider is a non-starter.

Balancing intelligence with utility

Arcee's current focus is a "thinking" model that will transition Trinity Large from a general instruct model into a full reasoning model. The team is wrestling with the balance between "intelligence vs. usefulness", striving to create a model that excels on benchmarks without becoming "yappy" or inefficient in actual production applications.

"We built Trinity so you can own it," the team states, signaling a return to the foundational values of the American open-source movement. As the industry moves toward agentic workflows and massive context requirements, Trinity Large positions itself not as a "wrapper," but as a sovereign infrastructure layer that developers can finally control.
