BTW, if I remember correctly, the only generated code in the ROCm stack other than compiled binaries is (a) register headers in the kernel driver (an upstream community project) and (b) code generated by Tensile (a compute kernel optimizer) as part of some of the math libraries (maybe just BLAS). I don't think Tensile-generated code is visible except at runtime during a debug session. Is there some other generated code you didn't like?
One last question for now, I hope... the ROCm stack is basically made up of:
- low level drivers (mostly upstream)
- HIP/Clang compiler/runtime
- OpenCL compiler/runtime
- OpenMP compiler/runtime
- a lot of optimized libraries
- porting/tuning of ML frameworks and HPC apps (with the changes going into the upstream versions of those projects)
The compiler/runtime portions are built around upstream LLVM and Clang, along with back-end code that connects to the low level drivers.
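To make the HIP/Clang layer concrete, here is a minimal sketch (my example, not from the thread) of the kind of code that layer handles: hipcc/Clang compiles the kernel to GPU ISA, and the HIP runtime dispatches it down through the low-level drivers. It assumes a working ROCm install with an AMD GPU.

```cpp
// Minimal HIP vector-add sketch; build with `hipcc vadd.cpp -o vadd`.
// Illustrative only -- error checking omitted for brevity.
#include <hip/hip_runtime.h>
#include <vector>
#include <cstdio>

__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);
    float *da, *db, *dc;
    hipMalloc(&da, n * sizeof(float));
    hipMalloc(&db, n * sizeof(float));
    hipMalloc(&dc, n * sizeof(float));
    hipMemcpy(da, ha.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(db, hb.data(), n * sizeof(float), hipMemcpyHostToDevice);
    // Launch through the HIP runtime, which sits on top of the kernel driver.
    hipLaunchKernelGGL(vector_add, dim3(n / 256), dim3(256), 0, 0, da, db, dc, n);
    hipMemcpy(hc.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);
    printf("c[0] = %f\n", hc[0]);
    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
```

The same source compiles for NVIDIA targets as well, which is the portability point of HIP.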
Are there specific parts of the stack where you think the code is particularly bad? My concern is that you may be conflating a historically clunky build/install experience with the actual source code.
Thanks!