AMD Radeon RX 480 On Linux
Originally posted by bridgman:
That's pretty much the big hairy question of the compiler world AFAICS. Winning with toolchains seems to be simple in principle - identify the perfect IR from both code and dev tool perspective, implement toolchains around that IR, profit - but in practice there seem to be conflicting pressures on the IR from the code (complex & non-flat is better) and dev tool (simple and flat is better) perspective and the representations on both sides of the IR keep changing over time.
Having two levels of IR (one source-oriented and one target-oriented) is one solution but even that gets less than ideal when either source or target is changing rapidly. It's probably fair to say that source changes less (some variant of C++) and target changes more (OMG) these days. The other complication is the usual huge gap between processor speeds and memory latencies, which is managed pretty well in single-thread environments (caches tend to be big enough for working sets these days) but which becomes more complex in highly parallel implementations where you need to fit (#threads x working set of registers & heavily used variables) into RF+cache or performance plummets.
I hope the new compiler gets better with register allocation and that it will allow/take hints from the programmer.
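To put rough numbers on the (#threads x working set) point from the quote, here is a back-of-envelope sketch in Python. The budget of 256 registers per lane and the cap of 10 resident waves are illustrative assumptions, not figures from this thread, and `resident_waves` is a made-up helper name.

```python
def resident_waves(regs_per_thread, regs_budget_per_lane=256, max_waves=10):
    """Estimate how many waves can stay resident at once.

    Assumes (hypothetically) that each lane has a fixed register budget
    shared by all resident waves, and that the hardware caps the number
    of resident waves regardless of how few registers each one needs.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    return min(max_waves, regs_budget_per_lane // regs_per_thread)


# More registers per thread means fewer waves left to hide memory latency.
for regs in (24, 32, 64, 128):
    print(f"{regs:>3} regs/thread -> {resident_waves(regs)} resident waves")
```

Under those assumptions, a kernel that needs 128 registers per thread leaves only 2 waves in flight to cover memory latency, which is why better register allocation (or a programmer hint that caps register use) matters so much here.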