NVIDIA Releases TensorRT 2; TensorRT 3 Being Prepped For Volta
Written by Michael Larabel in NVIDIA on 27 June 2017 at 12:52 PM EDT.
NVIDIA has made its TensorRT 2 library publicly available today as the newest major update to its deep-learning inference optimizer and runtime.

With TensorRT 2, NVIDIA is reporting 45x faster inference at under 7ms real-time latency when using INT8 precision. Besides being much faster, TensorRT 2 allows user-defined layers to be added as plug-ins through TensorRT's Custom Layer API. The inference optimizer and runtime engine also supports sequence-based models built from LSTM and RNN layers, for use-cases such as image captioning and language translation.

Deep learning developers can download TensorRT 2 via developer.nvidia.com.

NVIDIA also revealed in the TensorRT 2 announcement that TensorRT 3 is in development for Volta GPUs. TensorRT 3 is expected to be around 3.5x faster for inference on Tesla V100 hardware compared to Tesla P100. More details on that via the above link.