GFX1013 Target Added To LLVM 13.0 For RDNA2 APUs
-
Originally posted by hoohoo:
Two points:
- LLVM compilers have nothing to do with virtual machines per se.
Unusable for serious business.
-
Originally posted by StillStuckOnSI:
Yeah, the VM in LLVM should be mostly ignored as a curiosity. It's an IR manipulation and code generation framework first and foremost.
I would agree that it is maybe okay enough for such GPGPU compute tasks, but that is about where the line of acceptable LLVM use ends.
For GPGPU, a purpose-made compiler will always be faster.
For normal compile tasks, a proper compiler will always be better; pseudo bytecode and a low-level virtual machine are not acceptable where reliability, performance, or any sane reasoning are important.
-
Originally posted by Alexmitter:
I would not call the VM in LLVM a curiosity. After all, its generated pseudo bytecode will be executed by this VM; it's the main (anti-)feature of LLVM.
<snip>
If so then I have to disagree. It is *possible* for clang/LLVM to generate bytecode, but the most common use is to generate native ISA, whether that be x86, ARM, or GPU shader code. That is what we do in all of the open source drivers.
EDIT - it occurred to me that this would also imply that the bytecode interpreter would have to run on the shader core (with a thread for every simultaneously processed pixel/vertex), which, while not impossible, would certainly be unusual.
Last edited by bridgman; 21 June 2021, 10:22 AM.
-
Originally posted by Alexmitter:
Yes sure, the compiler itself has nothing to do with a VM; the VM is part of the final "binary", alongside the generated pseudo byte-code that the VM will then execute at runtime. <snip>.
Look at what ROCm is doing when you run a compute kernel: it invokes LLVM to generate GPU-specific instructions. You can see it doing this over and over when you use something like TensorFlow or Caffe and run a new or modified model. That's not a VM-style scheme: you can see the arch-specific object files under (IIRC) ~/.cache/rocm*
-
If I were to believe all the very harshly worded and confidently presented claims by Alexmitter, that would mean that both Intel and AMD have no competent developers who can see these "obvious" downsides of LLVM (which are so obvious that some random dude on a forum can identify them immediately) as both companies make extensive use of LLVM in their graphics stacks. That sounds somewhat unlikely to me. Also, he does not seem to be entirely clear when it comes to what LLVM compilers actually do...
-
Originally posted by GruenSein:
If I were to believe all the very harshly worded and confidently presented claims by Alexmitter
..... some random dude on a forum ...
-
The "LLVM compiler" is not used to generate VM bytecode that is interpreted by some VM on the target ISA. The idea is completely different:
LLVM has frontends and backends and an intermediate, ISA-neutral "IR" bytecode representation.
Typically you would write program code in the language of your choice, generate IR from it, run optimizations on that IR, and then generate ISA-specific instructions from that IR, running further ISA-specific optimizations in the process.
The IR is only used as an intermediate representation here; that way you don't have to write "C -> x86, C -> arm64, Fortran -> x86, Fortran -> arm64" compilers, but you only write
C -> IR, Fortran -> IR frontends
and
IR -> x86
IR -> arm64
Backends..
You can furthermore reuse all optimizations that run at the IR stage.
It should be obvious that this saves a lot of work with every ISA and programming language added.
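A minimal sketch of that frontend/IR/backend split, assuming the LLVM tools clang and llc are on the PATH (the file names here are made up for illustration):

```shell
# One frontend run: C source -> ISA-neutral LLVM IR (add.ll)
cat > add.c <<'EOF'
int add(int a, int b) { return a + b; }
EOF
clang -S -emit-llvm add.c -o add.ll

# Two backend runs over the *same* IR: IR -> x86-64 and IR -> arm64 assembly
llc -mtriple=x86_64-linux-gnu add.ll -o add_x86.s
llc -mtriple=aarch64-linux-gnu add.ll -o add_arm64.s
```

Adding a new language only needs a new "language -> IR" frontend, and adding a new ISA only needs a new "IR -> ISA" backend; the IR-level optimization passes are shared by every combination.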
It's possible to run IR in an interpreter/VM, but that is very rarely done.
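For completeness, this rarely-used path does exist: LLVM ships an interpreter/JIT tool, lli, that executes IR directly instead of lowering it to native ISA first (sketch, assuming clang and lli are installed; file names are illustrative):

```shell
# Frontend as before: C -> LLVM IR
cat > ret.c <<'EOF'
int main(void) { return 42; }
EOF
clang -S -emit-llvm ret.c -o ret.ll

# Execute the IR directly with lli; the process exit code is main's return value
code=0
lli ret.ll || code=$?
echo "exit code: $code"   # prints "exit code: 42"
```

This is a developer/testing convenience, not how compiled programs are normally shipped, which is the point being made above.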
By the way, it's good news that the GFX1013 ISA has been added to LLVM, as this means we will probably see RDNA (gfx1010) / RDNA2 (gfx1013) support in ROCm pretty soon.
Also worth noting: ROCm uses hand-optimized assembly code for the specific ISA/card for a lot of functions, like BLAS and machine-learning kernels such as the Winograd convolution, as compilers don't deliver the fastest code possible.
Last edited by Spacefish; 21 June 2021, 08:00 PM.
-
Originally posted by Spacefish:
B.t.w. ROCm uses hand-optimized assembly code for the specific ISA/card for a lot of functions, like BLAS and machine-learning kernels such as the Winograd convolution, as compilers don't deliver the fastest code possible