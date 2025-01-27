Following last week's CUDA/OpenCL/OptiX benchmarks that began the NVIDIA Blackwell Linux compute testing with the GeForce RTX 5090, a number of readers asked about AI performance and, in particular, Llama.cpp performance on the RTX 5090 flagship graphics card. Here are some initial benchmarks looking at GeForce RTX 5090 performance in Llama.cpp compared to prior RTX 40 and RTX 30 series graphics cards.

Over the weekend I ran some initial Llama.cpp tests and also re-tested the higher-end GeForce RTX 30 and RTX 40 graphics cards. All tests were carried out using the NVIDIA 570.86.10 Linux driver on Ubuntu 24.10 with the Linux 6.11 kernel. The graphics cards tested were:

- GeForce RTX 3090

- GeForce RTX 4070

- GeForce RTX 4070 SUPER

- GeForce RTX 4080

- GeForce RTX 4080 SUPER

- GeForce RTX 4090

- GeForce RTX 5090

Llama.cpp with Llama 3.1 and Mistral 7B was used for the initial runs, covering both text generation and prompt processing. More Llama.cpp benchmarks on the NVIDIA GeForce RTX 50 series graphics cards will come if there is enough reader interest. For now, let's continue on with this initial look.
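
For readers wanting to try something similar on their own hardware, below is a minimal sketch of how a prompt processing plus text generation run could be driven with the `llama-bench` tool that ships with Llama.cpp. This is not necessarily how the numbers in this article were produced: the model path is a hypothetical placeholder, and the token counts, GPU layer count, and repetition count are assumptions chosen for illustration. The small helper at the end simply shows how a tokens-per-second figure follows from a token count and an elapsed time.

```python
import shlex


def build_bench_cmd(model_path, n_prompt=512, n_gen=128, gpu_layers=99, reps=5):
    """Build an argv list for llama-bench.

    All default values here are illustrative assumptions, not the
    settings used for the article's results.
    """
    return [
        "llama-bench",
        "-m", model_path,        # GGUF model file (hypothetical path)
        "-p", str(n_prompt),     # prompt processing test: tokens in the prompt
        "-n", str(n_gen),        # text generation test: tokens to generate
        "-ngl", str(gpu_layers), # number of layers to offload to the GPU
        "-r", str(reps),         # repetitions per test
        "-o", "json",            # machine-readable output
    ]


def tokens_per_second(n_tokens, elapsed_seconds):
    """Throughput is simply tokens handled divided by wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return n_tokens / elapsed_seconds


if __name__ == "__main__":
    # Print the command rather than running it, so this sketch works
    # even without llama-bench installed.
    cmd = build_bench_cmd("models/example-model.gguf")
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell on a system with a GPU-enabled Llama.cpp build; higher tokens per second is better for both the prompt processing and text generation tests.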