Multiverse launches HyperNova 60B quantum-inspired LLM, 50% compressed

Multiverse Computing today announced the release of HyperNova 60B 2602, a 50% compressed version of OpenAI’s gpt-oss-120B, now freely available on Hugging Face. The Spanish AI compression leader has reduced the model’s memory requirements from 61GB to 32GB while maintaining near-parity in tool-calling performance, demonstrating significant gains on agentic benchmarks like Tau2-Bench and Terminal Bench Hard. This release signals a move toward efficiency-led AI, particularly relevant as European policymakers prioritize sovereign AI and address infrastructure limitations. “The launch of HyperNova 60B 2602 demonstrates compression as an iterative process of improvement, not a one-time optimization,” said Enrique Lizaso Olmos, CEO of Multiverse Computing, highlighting the company’s commitment to accessible, high-performance AI.

HyperNova 60B 2602: 50% Compression of gpt-oss-120B

A 50% reduction in size doesn’t have to mean a compromise in performance, according to Multiverse Computing’s latest release, HyperNova 60B 2602. This isn’t simply about shrinking models; it’s about fundamentally altering the trajectory of AI development, prioritizing efficiency alongside raw computational power. Initial benchmarks reveal substantial gains in agentic performance, with a five-fold improvement on the Tau2-Bench and a doubling of performance on the Terminal Bench Hard. This advancement arrives at a critical juncture, as European policymakers increasingly emphasize sovereign AI and address existing infrastructure limitations.

Multiverse Computing’s approach, leveraging its proprietary CompactifAI technology—which utilizes quantum-inspired mathematics—offers a pathway to reduce both compute costs and the carbon footprint of large language models. CompactifAI preserves the most vital components of neural networks, achieving up to 95% size reduction with minimal accuracy loss, a significant leap beyond the typical 20-30% accuracy loss seen with conventional 50-60% compression. The release builds upon the success of HyperNova 60B, launched in January 2026, and incorporates developer feedback to enhance tool-calling capabilities.
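As a sanity check on the headline figures, the arithmetic is easy to reproduce. The sizes below come from the article; the helper function is our own and purely illustrative:

```python
def compression_ratio(original_gb: float, compressed_gb: float) -> float:
    """Fraction of the original memory footprint removed by compression."""
    return 1.0 - compressed_gb / original_gb

# gpt-oss-120B full model vs. HyperNova 60B 2602 (figures from the article)
original_gb, compressed_gb = 61.0, 32.0
ratio = compression_ratio(original_gb, compressed_gb)
print(f"Size reduction: {ratio:.0%}")                      # 48%, marketed as ~50%
print(f"Memory saved: {original_gb - compressed_gb:.0f} GB")
```

The 61GB-to-32GB reduction works out to about 48%, which the company rounds to 50%.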

Specifically, HyperNova 60B 2602 shows a 1.5x improvement on the BFCL v4 benchmark for function calling, further solidifying its potential for advanced agentic workflows and coding applications. According to Multiverse, the model maintains near-parity in tool-calling performance with the full-size gpt-oss-120B, proving that substantial compression doesn’t necessitate sacrificing intelligence.
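For readers unfamiliar with function-calling benchmarks like BFCL: they measure whether a model, given a tool schema, emits a call whose name and arguments actually conform to it. A minimal sketch of that check, using a made-up weather tool in the common OpenAI-style schema format (none of this is from the benchmark itself):

```python
import json

# A hypothetical tool schema of the kind function-calling benchmarks exercise.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

def is_valid_call(call: dict, tool: dict) -> bool:
    """Crude structural check that a model-emitted tool call matches the schema."""
    fn = tool["function"]
    if call.get("name") != fn["name"]:
        return False
    args = json.loads(call.get("arguments", "{}"))
    params = fn["parameters"]
    if any(key not in args for key in params.get("required", [])):
        return False  # a required argument is missing
    return all(key in params["properties"] for key in args)

# A call the model might emit for "What's the weather in Bilbao?"
model_call = {"name": "get_weather", "arguments": '{"city": "Bilbao"}'}
print(is_valid_call(model_call, get_weather_tool))  # True
```

Real benchmarks also score argument values and multi-step tool use, but this is the shape of the task the 1.5x BFCL v4 improvement refers to.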

CompactifAI Technology & Quantum-Inspired Model Compression

Multiverse Computing is pioneering a new approach to artificial intelligence, moving beyond simply scaling up model size to focus on radical efficiency gains through its CompactifAI technology. This compression method, rooted in quantum-inspired mathematics, doesn’t merely shrink large language models (LLMs); it fundamentally restructures them, preserving core intelligence while dramatically reducing computational demands. This contrasts sharply with conventional compression techniques, which typically sacrifice 20-30% accuracy for a 50-60% size reduction. The impact is substantial, enabling deployment on less powerful hardware and reducing the energy footprint of increasingly complex AI systems.

The company’s commitment to open access, releasing models like HyperNova 60B 2602 freely on Hugging Face, is intended to foster wider experimentation and validation within the developer community. This move aligns with growing European interest in sovereign AI and addressing infrastructure limitations, offering a viable path toward accessible, high-performance AI for a broader range of applications.

Agentic Benchmarks Show 5x Gains on Tau2-Bench

Multiverse Computing, a Spanish firm specializing in AI compression, is demonstrating substantial performance leaps with its newly released HyperNova 60B 2602 model. Beyond simply reducing model size, the company is achieving significant gains in how these compressed models perform – specifically in agentic tasks. This isn’t merely academic; it suggests a viable path toward powerful AI accessible with considerably fewer computational resources. The core of this advancement lies in Multiverse’s CompactifAI technology, which employs quantum-inspired mathematics to streamline neural networks.

This allows for a 50% reduction in model size – shrinking the memory requirement from 61GB to 32GB for the gpt-oss-120B base model – while maintaining near-identical tool-calling performance. Further validation comes from the BFCL v4 benchmark, where the model achieved a 1.5x improvement in function calling. Multiverse emphasizes that this isn’t about sacrificing accuracy for size; its compression maintains precision within a narrow 2-3% margin, a significant advantage over industry standards that often see 20-30% accuracy loss at similar compression levels. Lizaso Olmos adds, “By continuing to refine HyperNova and releasing it openly, we’re answering developer feedback and giving developers the tools to experiment, validate, and deploy efficient AI without massive infrastructure investments.”

Unlike conventional compression techniques that rely on weight quantization or structured pruning, which simply discard information judged statistically redundant, CompactifAI applies quantum-inspired tensor networks to factor the model’s weights, followed by a short retraining (“healing”) pass to recover accuracy. This approach separates high-utility knowledge components from sheer size overhead, and that preservation strategy is key to maintaining robust performance on complex reasoning tasks beyond what traditional lossy compression methods achieve.
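CompactifAI’s actual method is proprietary, but the flavor of quantum-inspired factorization can be sketched with its simplest relative: a truncated SVD of a single weight matrix, which keeps the dominant components and discards the rest. All sizes here are toy values, and a random matrix compresses far worse than trained weights, which have much more low-rank structure:

```python
import numpy as np

rng = np.random.default_rng(0)
d, rank = 512, 64                        # toy layer width and retained rank
W = rng.standard_normal((d, d))          # stand-in for one trained weight matrix

# Truncated SVD: W ~= A @ B, keeping only the top `rank` singular directions.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * S[:rank]               # shape (d, rank)
B = Vt[:rank, :]                         # shape (rank, d)

params_before = W.size
params_after = A.size + B.size
print(f"parameters: {params_before} -> {params_after} "
      f"({1 - params_after / params_before:.0%} smaller)")

# The compressed layer applies B then A instead of W.
W_approx = A @ B
err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.2f}")
```

Tensor-network methods generalize this idea to higher-order factorizations, and the subsequent “healing” retraining recovers much of the accuracy a raw factorization loses.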

From an operational standpoint, shrinking the model footprint from 61GB to 32GB is not merely an academic achievement; it democratizes access to frontier LLM capabilities. It enables deployment on considerably lower-tier GPU clusters, embedded edge devices, and enterprise hardware that previously lacked the necessary VRAM capacity. This efficiency is crucial for globalizing advanced AI workflows, moving complex agentic reasoning out of hyperscale cloud environments and into localized, cost-effective edge compute infrastructure.
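The deployment claim can be made concrete with a toy fit check. The model footprints below are from the article; the device list and VRAM figures are illustrative, and a real check would also budget headroom for the KV cache and activations:

```python
MODELS = {"gpt-oss-120B": 61, "HyperNova 60B 2602": 32}   # GB, from the article
DEVICES = {                                               # GB VRAM, illustrative
    "RTX 4090": 24,
    "A100 40GB": 40,
    "A100 80GB": 80,
}

for model, need_gb in MODELS.items():
    fits = [name for name, vram in DEVICES.items() if vram >= need_gb]
    print(f"{model} ({need_gb} GB) fits on: {', '.join(fits) or 'none listed'}")
```

On this list the full gpt-oss-120B requires an 80GB-class accelerator, while the compressed model drops into the 40GB tier, which is the practical meaning of the 61GB-to-32GB reduction.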

However, the field of compressed LLMs still faces critical challenges, particularly regarding extreme long-context window processing and multi-modal coherence. While HyperNova 60B 2602 excels in tool-calling metrics, researchers are now focusing on preserving state fidelity when the model must maintain context across millions of tokens, particularly when integrating visual or temporal data streams. Future iterative improvements must address these multi-modal and longitudinal consistency constraints.

The launch of HyperNova 60B 2602 demonstrates compression as an iterative process of improvement, not a one-time optimization. Each generation of our compressed models pushes the boundaries of what’s possible with efficient AI.

Enrique Lizaso Olmos, CEO of Multiverse Computing

Dr. Donovan

Dr. Donovan is a futurist and technology writer covering the quantum revolution. Where classical computers manipulate bits that are either on or off, quantum machines exploit superposition and entanglement to process information in ways that classical physics cannot. Dr. Donovan tracks the full quantum landscape: fault-tolerant computing, photonic and superconducting architectures, post-quantum cryptography, and the geopolitical race between nations and corporations to achieve quantum advantage. The decisions being made now, in research labs and government offices around the world, will determine who controls the most powerful computers ever built.
