Nvidia’s latest GeForce RTX 5090 delivers much faster DeepSeek R1 inference performance than AMD’s RX 7900 XTX, credited to its new fifth-generation Tensor Cores.
Accessing DeepSeek’s reasoning models on the new NVIDIA RTX GPUs is now quite easy, and with top-tier performance at that.
Well, it seems that consumer GPUs may be one of the best ways to run high-end LLMs on local machines, as both Nvidia and AMD are determined to provide suitable environments for doing so. We recently saw AMD showcase the prowess of its flagship RDNA 3 GPU on the DeepSeek R1 LLM, and now Team Green has responded with inference benchmarks running on its new RTX Blackwell GPUs, and the figures show the GeForce RTX 5090 dominating.


Across several DeepSeek R1 models, the GeForce RTX 5090 shows a clear lead over the Radeon RX 7900 XTX and even over its previous-generation counterpart. The GPU managed up to 200 tokens per second in Distill Qwen 7B and Distill Llama 8B, almost twice what AMD’s RX 7900 XTX achieves. This shows how dominant AI performance on NVIDIA GPUs is shaping up to be, and with the vast “RTX on AI” support, we will likely see edge AI on consumer PCs much more frequently.
For those looking to run DeepSeek R1 on NVIDIA RTX GPUs, the company has published a dedicated blog post to guide users, and interestingly, it is as simple as running any chatbot on the internet. Here’s how you can access it:
To help developers securely experiment with these capabilities and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as a preview on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.
Developers can test and experiment with the application programming interface (API), which should soon be available as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform.
The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure.
– NVIDIA
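As a rough illustration, here is a minimal sketch of what querying the hosted preview could look like, assuming it exposes the OpenAI-compatible endpoint that NVIDIA’s API catalog generally uses; the base URL, model id, and API key placeholder below are assumptions to verify against the catalog entry on build.nvidia.com.

```python
# Minimal sketch: querying the DeepSeek-R1 preview via an OpenAI-compatible
# client. Base URL, model id, and key are assumptions, not confirmed values.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed catalog endpoint
    api_key="nvapi-...",  # personal key generated on build.nvidia.com
)

# Stream the chat completion so the model's reasoning tokens print as they arrive.
stream = client.chat.completions.create(
    model="deepseek-ai/deepseek-r1",  # assumed catalog model id
    messages=[{"role": "user", "content": "Which number is larger, 9.11 or 9.8?"}],
    temperature=0.6,
    max_tokens=1024,
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```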
With NVIDIA NIM, developers and enthusiasts can easily try the AI model on their local setups, which means that not only does your data stay on your machine, but local execution can also deliver improved performance, provided your hardware is capable enough.
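For the local route, here is a similarly hedged sketch that times throughput against a locally hosted OpenAI-compatible server (a downloaded NIM, llama.cpp, or Ollama would all fit); the port, path, and model name are assumptions that depend on your particular setup.

```python
# Rough sketch: measuring local tokens-per-second against an OpenAI-compatible
# server assumed to be running on localhost:8000. URL and model name are
# placeholders for whatever your local deployment actually serves.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

start = time.perf_counter()
response = client.chat.completions.create(
    model="deepseek-r1-distill-llama-8b",  # hypothetical local model name
    messages=[{"role": "user", "content": "Summarize the rules of chess."}],
    max_tokens=512,
)
elapsed = time.perf_counter() - start

# completion_tokens / wall time approximates the tokens-per-second figure
# quoted in GPU benchmarks like the ones above.
tokens = response.usage.completion_tokens
print(f"{tokens} tokens in {elapsed:.1f}s -> {tokens / elapsed:.1f} tok/s")
```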