NVIDIA's CUDA/OpenCL PTX Back-End In LLVM 3.2
NVIDIA's "NVPTX" Parallel Thread Execution back-end replaces the earlier PTX back-end that previously lived within LLVM. NVPTX is what NVIDIA opened up from their NVCC CUDA and OpenCL compiler, so it's rather high quality and in very good shape. NVPTX is compatible with PTX 3.1 and SM 3.5, supports the NVVM intrinsics of the NVIDIA Compiler SDK, is fully compatible with the old PTX back-end, and covers much more of the translation from LLVM IR to PTX code.
NVIDIA published this new PTX back-end back in April. Parallel Thread Execution is an assembly-like language that NVIDIA's graphics driver then translates into binary code for the respective hardware. With this NVPTX back-end now open-sourced and part of LLVM, new possibilities open up. This work won't directly benefit the open-source Nouveau graphics driver project, though, since Nouveau doesn't deal with PTX: the current Nouveau implementation takes LLVM IR and converts it into Gallium3D TGSI for use by their existing compiler.
On the other side of the table, the Radeon R600 back-end was recently merged into LLVM, but it won't appear in an official release until LLVM 3.3 next year.
LLVM 3.2 also improves CPU support for everything from Apple's A6 SoC in the iPhone 5 to better AVX2 handling on Intel Haswell CPUs, introduces an automatic loop vectorizer, brings better Polly optimizations, and much more.
Look for the LLVM 3.2 release to happen any time now... The original plan was to release LLVM 3.2 today (16 December), but there's been no word on whether that's still happening or has been pushed back by a few days.