NVIDIA's CUDA/OpenCL PTX Back-End In LLVM 3.2

Posted by Michael Larabel on December 16, 2012

In preparation for the imminent release of LLVM 3.2, another feature worth going over is the NVPTX back-end that has been merged for this forthcoming release of the open-source compiler infrastructure. The NVPTX LLVM back-end is what NVIDIA's closed-source driver uses for its CUDA and OpenCL compiler.

NVIDIA's "NVPTX" Parallel Thread Execution back-end replaces the earlier PTX back-end that previously lived within LLVM. NVPTX is what NVIDIA opened up from its NVCC CUDA and OpenCL compiler, so it's rather high quality and in very good shape. NVPTX is compatible with PTX 3.1 and SM 3.5, supports the NVVM intrinsics of the NVIDIA Compiler SDK, is fully compatible with the old PTX back-end, and has much greater coverage when going from LLVM IR to PTX code.

NVIDIA published this new PTX back-end back in April. Parallel Thread Execution is an assembly-like language that NVIDIA's graphics driver then translates into binary code for the respective hardware. With the NVPTX back-end now open-sourced and part of LLVM, new possibilities open up. This work won't directly benefit the open-source Nouveau graphics driver project, though, since Nouveau doesn't deal with PTX: its current implementation takes LLVM IR and converts it into Gallium3D TGSI for use by its existing compiler.

On the other side of the table, the Radeon R600 back-end was recently merged into LLVM, but it won't appear in an official release until LLVM 3.3 next year.

LLVM 3.2 also brings improved CPU support for everything from Apple's A6 SoC in the iPhone 5 to better AVX2 handling on Intel Haswell CPUs, an automatic loop vectorizer, better Polly optimizations, and much more.

Look for the LLVM 3.2 release to happen any time now. The original plan was to release LLVM 3.2 today (16 December), but there's been no word on whether that is still happening or has been pushed back by a few days.
