NVIDIA Releases TensorRT 8.0 With Big Performance Improvements

Written by Michael Larabel in NVIDIA on 20 July 2021 at 09:00 AM EDT.

NVIDIA today is making available a much faster version of TensorRT, its SDK for optimized deep learning inference on its GPUs.

With TensorRT 8, made public today, NVIDIA is reporting "2x performance" relative to the existing TensorRT 7 release. That 2x figure comes from transformer optimizations, and NVIDIA is also claiming 2x accuracy against TensorRT 7 when using INT8 with quantization-aware training.
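For readers unfamiliar with quantization-aware training, the core idea is that the training forward pass rounds values to the INT8 grid and immediately converts them back to floating point, so the network learns to tolerate the rounding error it will see at inference time. The following is only a minimal sketch of that "fake quantization" step in plain Python. The function names and the symmetric per-tensor scaling scheme are assumptions chosen for illustration; this is not TensorRT's API or NVIDIA's implementation.

```python
def symmetric_scale(values):
    """Pick a per-tensor scale so the largest magnitude maps to 127.

    Assumption for illustration: symmetric, per-tensor scaling.
    """
    max_abs = max(abs(v) for v in values)
    return max_abs / 127 if max_abs > 0 else 1.0


def fake_quantize_int8(values, scale):
    """Quantize to the signed 8-bit range, then dequantize back to float."""
    simulated = []
    for v in values:
        q = round(v / scale)          # snap to the nearest integer step
        q = max(-128, min(127, q))    # clamp to the INT8 range
        simulated.append(q * scale)   # return to floating point
    return simulated


# Hypothetical weights, used only to show the rounding effect.
weights = [0.5, -1.27, 0.003, 1.0]
scale = symmetric_scale(weights)
simulated = fake_quantize_int8(weights, scale)
```

In this sketch the very small weight 0.003 falls below half a quantization step and collapses to zero, which is the kind of error a network trained with fake quantization learns to absorb.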

TensorRT 8 also brings the BERT-Large inference time down to 1.2 ms on a V100, which is 2.5x faster than TensorRT 7. The release additionally adds sparsity support for Ampere GPUs, among other improvements.
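As background not stated in the announcement itself: the sparsity that Ampere tensor cores accelerate is NVIDIA's 2:4 structured pattern, where at most two of every four consecutive weights are non-zero. A rough sketch of pruning a weight row to that pattern is shown below. The helper name `prune_2_of_4` and the magnitude-based selection rule are assumptions for illustration, not part of TensorRT.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four.

    Illustrative helper (hypothetical name); assumes the row length
    is a multiple of four.
    """
    if len(weights) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")

    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


# Hypothetical weight row, used only to show the resulting pattern.
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse_row = prune_2_of_4(row)
```

Each group of four in `sparse_row` keeps exactly two of its original values, which is the regular layout that lets the hardware skip the zeroed positions.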

TensorRT 8.0 should be available shortly via developer.nvidia.com.


About The Author

Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience. Michael has written more than 20,000 articles covering the state of Linux hardware support, Linux performance, graphics drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated benchmarking software. He can be followed via Twitter, LinkedIn, or contacted via MichaelLarabel.com.