At NeurIPS, NVIDIA is expanding its collection of open AI models and tools to support advancements in digital and physical AI research. The company unveiled NVIDIA DRIVE Alpamayo-R1, the world's first open, industry-scale reasoning vision language action (VLA) model for autonomous driving, which integrates chain-of-thought AI reasoning with path planning to improve autonomous vehicle safety. It also released new models and datasets for speech and AI safety. These releases deepen NVIDIA's commitment to open source, recently recognized by Artificial Analysis, whose index benchmarks AI openness based on model licenses, data transparency, and technical detail, and rates the NVIDIA Nemotron family among the most open in the AI ecosystem.
NVIDIA Advances Open Model Development
A key advancement unveiled at NeurIPS is NVIDIA DRIVE Alpamayo-R1, the first industry-scale open reasoning vision language action (VLA) model designed for autonomous driving research. The model integrates chain-of-thought AI reasoning with path planning, enabling more human-like decision-making in complex driving scenarios such as navigating pedestrian-heavy intersections or lane closures.
The open foundation of Alpamayo-R1, built on NVIDIA Cosmos Reason, allows researchers to customize the model for non-commercial applications and benchmarking. Post-training with reinforcement learning has demonstrated significant improvements in reasoning capabilities. NVIDIA is making Alpamayo-R1 available on GitHub and Hugging Face, along with a subset of the training data and the open-source AlpaSim evaluation framework, furthering accessibility for the research community.
Beyond autonomous driving, NVIDIA is also releasing new tools for digital AI, including MultiTalker Parakeet for multi-speaker speech recognition, and Nemotron Content Safety Reasoning for AI safety. The NeMo Data Designer Library, now open-sourced, facilitates the creation of high-quality synthetic datasets for generative AI development. These additions, alongside the Nemotron family of models, are recognized for their openness and transparency by the Artificial Analysis Open Index.
NVIDIA DRIVE Alpamayo-R1 for Autonomous Driving
NVIDIA DRIVE Alpamayo-R1 (AR1) is the world's first open, industry-scale reasoning vision language action (VLA) model designed for autonomous driving research. It integrates chain-of-thought AI reasoning with path planning, which is crucial for improving AV safety in complex scenarios and enabling level 4 autonomy. AR1 addresses limitations of previous self-driving models by letting vehicles reason through scenarios, such as navigating pedestrian-heavy intersections or lane closures, and drive more like humans.
AR1’s foundation, built on NVIDIA Cosmos Reason, allows researchers to customize the model for non-commercial applications, including benchmarking and building experimental AV systems. Post-training with reinforcement learning has shown significant improvements in AR1’s reasoning capabilities. The model, along with related data and the AlpaSim evaluation framework, will be available on GitHub and Hugging Face, and a subset of training data is available within the NVIDIA Physical AI Open Datasets.
The model works by breaking down scenarios and reasoning through each step, considering possible trajectories and using contextual data to choose the best route. For example, in a busy area with pedestrians and bikes, AR1 can process data, incorporate reasoning traces, and plan a trajectory that avoids potential hazards, like moving away from the bike lane or stopping for jaywalkers. This approach aims to create more robust and human-like autonomous driving systems.
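The "reason, then plan" loop described above can be sketched in a toy form: enumerate candidate trajectories, reject those that violate a safety constraint, and score the rest while recording a reasoning trace. This is an illustrative sketch only; the trajectory fields, clearance threshold, and scoring function are hypothetical and do not reflect the Alpamayo-R1 API or architecture.

```python
# Toy "reason, then plan" loop in the spirit of the description above.
# All names and numbers here are hypothetical, not the Alpamayo-R1 API.
from dataclasses import dataclass

@dataclass
class Trajectory:
    name: str
    min_clearance_m: float   # closest approach to any detected hazard
    progress_m: float        # forward progress along the route

def reason_and_plan(trajectories):
    """Score candidate trajectories, recording a reasoning trace."""
    traces = []
    best, best_score = None, float("-inf")
    for traj in trajectories:
        # Hard constraint: reject paths that pass too close to pedestrians or bikes.
        if traj.min_clearance_m < 1.5:
            traces.append(f"reject {traj.name}: clearance {traj.min_clearance_m} m < 1.5 m")
            continue
        # Soft trade-off: prefer progress, but reward extra clearance.
        score = traj.progress_m + 2.0 * traj.min_clearance_m
        traces.append(f"consider {traj.name}: score {score:.1f}")
        if score > best_score:
            best, best_score = traj, score
    return best, traces

# A busy street with a cyclist in the bike lane and a jaywalker ahead.
candidates = [
    Trajectory("hug bike lane", min_clearance_m=0.8, progress_m=12.0),
    Trajectory("shift left, slow down", min_clearance_m=2.5, progress_m=10.0),
]
plan, trace = reason_and_plan(candidates)
```

Here the faster path is rejected outright for insufficient clearance, so the planner chooses the slower, safer maneuver, and the trace records why.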
Prolonged reinforcement learning, or ProRL, is a technique that extends reinforcement learning training over a much longer horizon. In a NeurIPS poster, NVIDIA researchers describe how this methodology yields models that consistently outperform their base models on reasoning tasks.
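The core intuition, that the same update rule run for many more steps can reach a meaningfully better optimum, can be shown with a deliberately simple one-dimensional example. This is a toy illustration of extended training budgets, not the ProRL recipe or its actual objective.

```python
# Toy illustration: the same gradient-ascent update, run for more steps,
# reaches a better point on the reward landscape. The 1-D quadratic reward
# is hypothetical and stands in for a real RL objective.

def reward(theta):
    """Concave reward with its maximum at theta = 3."""
    return -(theta - 3.0) ** 2

def train(steps, lr=0.01, theta=0.0):
    """Plain gradient ascent on reward(theta)."""
    for _ in range(steps):
        grad = -2.0 * (theta - 3.0)   # d(reward)/d(theta)
        theta += lr * grad
    return theta

base = train(steps=50)         # "base" training budget
prolonged = train(steps=500)   # prolonged training budget
```

With the short budget the parameter is still far from the optimum; the prolonged run converges to it, so the prolonged model scores strictly higher under the same reward.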
Cosmos for Broad Physical AI Applications
NVIDIA Cosmos serves as a foundation for customizing models across a range of physical AI use cases. Developers can use the Cosmos Cookbook, a comprehensive guide, to navigate data curation, synthetic data generation, and model evaluation. Examples include LidarGen for AV simulation and Cosmos Policy for creating robust robot behaviors, demonstrating the platform's versatility.
NVIDIA DRIVE Alpamayo-R1 (AR1) illustrates this approach in the AV domain: its open foundation is built on NVIDIA Cosmos Reason, letting researchers customize the model, while reinforcement learning post-training further improves the chain-of-thought reasoning it combines with path planning.
The NVIDIA Cosmos world foundation models (WFMs) are being adopted by ecosystem partners for advanced applications. AV developer Voxel51 is contributing recipes to the Cosmos Cookbook, while companies like 1X, Figure AI, and others are utilizing WFMs for their latest physical AI developments. Researchers at ETH Zurich are also exploring Cosmos models for realistic 3D scene creation, highlighting the broad impact and adaptability of the platform.
New Digital AI Models and Datasets Released
NVIDIA is expanding its open AI model collection with several new releases unveiled at NeurIPS. Alongside NVIDIA DRIVE Alpamayo-R1 for autonomous driving, new digital AI models and datasets are available for speech and AI safety, bolstering tools for research and development. This commitment to open source is recognized by Artificial Analysis, which rated the NVIDIA Nemotron family highly for openness based on licensing, data transparency, and technical detail.
The newly released Alpamayo-R1 (AR1) integrates chain-of-thought AI reasoning with path planning, aiming to improve AV safety in complex scenarios. Unlike previous models, AR1 can reason through situations, such as pedestrian intersections or lane closures, mimicking human driving common sense. Researchers can customize AR1 via its NVIDIA Cosmos Reason foundation, and reinforcement learning has proven effective at improving its reasoning capabilities. AR1 and supporting tools, including AlpaSim and datasets, are available on GitHub and Hugging Face.
NVIDIA also released tools for digital AI development, including MultiTalker Parakeet for multi-speaker speech recognition and Sortformer for accurate speaker diarization. Nemotron Content Safety Reasoning and the Nemotron Safety Audio Dataset address AI safety, while NeMo Gym and the NeMo Data Designer Library simplify reinforcement learning and synthetic data generation. These additions aim to give developers tools for creating secure and specialized AI agents, with partners like CrowdStrike, Palantir, and ServiceNow already leveraging these technologies.
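To make the synthetic-data-generation task concrete, here is a minimal, generic sketch of templated example generation, the kind of workflow a synthetic dataset library targets. This deliberately does not use the NeMo Data Designer API; the template, field names, and content are hypothetical.

```python
# Generic templated synthetic data generation, sketched for illustration.
# This is NOT the NeMo Data Designer API; all names here are hypothetical.
import random

TEMPLATE = "Translate to {language}: {sentence}"

def generate_examples(n, seed=0):
    """Produce n prompt records by sampling slots into a fixed template."""
    rng = random.Random(seed)   # seeded so the dataset is reproducible
    languages = ["French", "German", "Spanish"]
    sentences = ["The car stops.", "The light is green.", "Pedestrians cross."]
    return [
        {"prompt": TEMPLATE.format(language=rng.choice(languages),
                                   sentence=rng.choice(sentences))}
        for _ in range(n)
    ]

dataset = generate_examples(100)
```

Seeding the generator is the key design choice: the same seed always yields the same dataset, which keeps downstream training runs comparable.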
NVIDIA Research and Ecosystem Partnerships
NVIDIA is expanding its open AI offerings, unveiling new models, datasets, and tools at NeurIPS, headlined by NVIDIA DRIVE Alpamayo-R1, the world's first open, industry-scale reasoning vision language action (VLA) model for autonomous driving. By integrating chain-of-thought AI reasoning with path planning, the model aims to improve AV safety in complex scenarios and enable level 4 autonomy, allowing vehicles to reason more like humans. The model and associated data are available on GitHub and Hugging Face.
NVIDIA’s commitment to open source is recognized by Artificial Analysis’s Open Index, which rates the NVIDIA Nemotron family of technologies among the most open in the AI ecosystem. Beyond autonomous driving, NVIDIA released multi-speaker speech AI models—including MultiTalker Parakeet and Sortformer—and tools for AI safety, like Nemotron Content Safety Reasoning, bolstering the digital AI developer toolkit. These releases extend to datasets for training and evaluation, such as the Nemotron Safety Audio Dataset.
Several ecosystem partners are actively leveraging NVIDIA's Cosmos world foundation models (WFMs) and Nemotron and NeMo tools. Voxel51 is contributing recipes to the Cosmos Cookbook, while companies including 1X, Figure AI, and Foretellix are using WFMs in their physical AI applications. Researchers at ETH Zurich are also presenting work at NeurIPS that uses Cosmos models for realistic 3D scene creation, demonstrating broad adoption across the research community.
