PathScale's ENZO Compiler To Speed Code On GPU
PathScale, the company focused on high-performance compilers, hopes to speed up traditional software packages by automatically offloading work to the graphics processor when the software is built with the PathScale ENZO compiler.
After writing this weekend about the PathScale EKOPath 5.0 Beta, Christopher Bergström, the CTO of PathScale, commented in our forums about some of their current work.
First of all, one of the interesting features Bergström notes about EKOPath 5.0 is that it leverages some of Clang. Clang is the C/C++ compiler front-end for LLVM, and while EKOPath isn't based upon LLVM, PathScale has adapted some of the BSD-style-licensed Clang code for its compiler. "EKOPath 5 is really a *BIG* difference behind the scenes. For example if you do pathcc -show hello.c # You'll notice that we're using a modified clang as part of the process. This is almost certainly what allowed us build those additional benchmarks. (To clarify a bit - we're not using the llvm backend or any llvm ir. In the past we were using a modified gnu cc1, but that and all other gnu code has been removed.)"
He also revealed that the next EKOPath release (likely EKOPath 5.5) will feature a new compiler back-end. "EKOPath 5.5 will have a new backend we've been working on, but pushing out both of those big changes at the same time just wasn't possible."
Bergström then commented on ENZO, PathScale's GPGPU and multi-core compiler product, which supports HMPP directives as well as Fortran and C/C++. ENZO features true GPGPU network zero copy, the PSCNV open-source Tesla compute driver (PathScale's fork of the Nouveau driver), PathAS assembler support for GPGPU, compatibility with CUDA code, and PathScale's own C++ template/class libraries for GPGPU. This "sister compiler to EKOPath", he explains, can now accelerate more code on the GPU. "While it's not possible speed-up every code on the GPU - We have put a huge amount of work in the programming models available for ENZO and it's backend performance. Personally, I don't get as excited (or worried) about 5-10% CPU performance when we can offer 30% gains to 10x with the GPU. I can't make promises, but we may try to drop a few OpenACC pragma around those benchmarks and post numbers on a Tesla 2050."
Responding to a Phoronix reader's comment about GPU usage, he added: "I'd certainly recommend you test EKOPath and Intel compilers if you don't have a GPU. If you can get access to a system with a GPU (Tesla 2050, 2070 or 2090) *and* you're willing to add some pragma or directives to your code ENZO may be interesting. (The performance gains can be well worth the effort) We're working on support for -autogpu which like autovectorization or other automatic optimizations requires zero code changes. This isn't ready for production and just 'noteworthy' at this point. (Honestly, give us a couple more months)."
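For readers wondering what "adding some pragma" looks like in practice, here is a minimal sketch of an OpenACC-annotated loop in C. The directive and its data clauses come from the OpenACC standard; the function name saxpy and its signature are only illustrative, and whether ENZO accepts this exact form is an assumption, not something Bergström confirmed. A compiler without OpenACC support simply ignores the pragma and runs the loop serially on the CPU.

```c
/* Illustrative only: saxpy is a hypothetical example function,
 * not code taken from PathScale or from the benchmarks discussed. */

/* Computes y[i] = a * x[i] + y[i] for every i in [0, n). */
void saxpy(int n, float a, const float *x, float *y)
{
    /* Ask an OpenACC-capable compiler to offload the loop.
     * copyin: x is only read on the accelerator.
     * copy:   y is sent to the accelerator and copied back afterwards.
     * Compilers that do not know OpenACC skip this line. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The appeal of the directive approach is that the loop body stays ordinary C, so the same source still builds and produces the same results with a CPU-only compiler.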
It looks like interesting times are ahead for compilers exploiting the massively parallel capabilities of graphics processors. On the open-source side, there has been work on automatic GPGPU code generation for LLVM, although nothing too exciting for end-users yet. OpenACC is an open industry standard meant to simplify parallel programming on CPUs and GPUs; unfortunately, it isn't widely implemented by open-source compilers at the moment.