
▲DeepSeek R1 Support in NVIDIA NIM / (Image: NVIDIA)
R1, NIM Microservices Preview Available
NVIDIA announced on the 3rd that it has begun supporting DeepSeek R1 on NVIDIA NIM.
DeepSeek R1 is an open model with state-of-the-art reasoning capabilities. Instead of providing a direct answer, reasoning models like DeepSeek R1 perform multiple inference passes over a query, generating the best answer through chain-of-thought, consensus, and search methods.
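The consensus step mentioned above (often called self-consistency) can be sketched as a majority vote over several sampled reasoning passes. The sketch below is purely illustrative; the sampled answers stand in for real model completions, which the article does not show.

```python
from collections import Counter

# Minimal sketch of the "consensus" step: sample several chain-of-thought
# completions, extract each final answer, and return the majority vote.
# The sample list below is a stand-in for real model calls.
def consensus_answer(sampled_answers: list[str]) -> str:
    """Return the most common final answer among the sampled completions."""
    counts = Counter(sampled_answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# e.g. five reasoning passes whose chains of thought ended in these answers:
samples = ["9.8", "9.8", "9.11", "9.8", "9.11"]
print(consensus_answer(samples))  # -> 9.8
```

In practice each sampled answer would come from a separate high-temperature generation, which is why this approach multiplies the test-time compute required per query.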
R1 is a prime example of test-time scaling, demonstrating why accelerated computing matters for the demands of agentic AI inference: as the model is allowed to iteratively 'think' through a problem, it produces more output tokens and longer generation cycles, and model quality continues to scale.
Implementing both real-time inference and high-quality responses in such inference models requires significant test-time compute. To enable developers to safely experiment with these capabilities and build their own expert agents, the 671 billion-parameter DeepSeek R1 model is now available as a preview in the NVIDIA NIM microservices.
The DeepSeek R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system. Developers can now test and experiment with the application programming interface (API), which will be delivered as a NIM microservice as part of the NVIDIA AI Enterprise software platform.
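A request to the preview API might look like the following sketch. NIM microservices expose an OpenAI-compatible chat-completions interface; the endpoint URL and model identifier here are assumptions for illustration, not details confirmed by the article.

```python
import json

# Assumed values: NIM microservices are OpenAI-compatible, but verify the
# actual endpoint and model name in NVIDIA's documentation before use.
NIM_ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "deepseek-ai/deepseek-r1"

def build_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build an OpenAI-style chat-completions payload for the NIM endpoint."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # Reasoning models emit long chains of thought before the final
        # answer, so streaming is usually preferable for interactive use.
        "stream": False,
    }

if __name__ == "__main__":
    payload = build_request("Which number is larger, 9.11 or 9.8?")
    print(json.dumps(payload, indent=2))
```

Sending this payload (with an API key in the `Authorization` header) via any HTTP client would return a standard chat-completions response containing the model's reasoning and final answer.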
Additionally, enterprises can use NVIDIA AI Foundry with NVIDIA NeMo software to create custom DeepSeek-R1 NIM microservices for specialized AI agents.