New Heterogeneous Memory Management For Linux, Will Be Supported By NVIDIA/Nouveau
-
Originally posted by boxie
Yeah but don't let the facts get in the way of a good rant!
to c117152:
There is a time and a place for optimizing sw explicitly for a single purpose on a single hw platform. And yeah, you can extract that last bit of perf by pinning all your pages and explicitly controlling what data lives in what memory when, using hand-written asm, etc.
But there are a lot of cases where you want some sw to run on a lot of different hw platforms and it is not economically feasible to hand-tune for each one, yet you still want the benefit of gpu offload when possible. That might be 10% slower (made up number) than something specifically tuned for some particular hw, but it is still a lot faster than the alternative ;-)
- Likes 3
-
Originally posted by robclark
fair 'nuf :-P
- Likes 1
-
Originally posted by droste
The world is not black and white. It has its use cases: https://en.wikipedia.org/wiki/C_dyna...on#Type_safety
-
Originally posted by karolherbst
and now if you explain how you fit this into his "This madness is why people still get paid for writing assembly." you get a cookie from me!
-
Originally posted by robclark
umm, high frequency trading?
(not that I'm an advocate of that.. it's just an example of where the economics are highly skewed toward optimizing for a highly specific problem + hw.. I'm sure those folks are getting paid a lot of $$$ for writing asm and doing whatever else is possible to squeeze out a few percent better perf.. to the benefit of a very few..)
-
Originally posted by karolherbst
uhh, I meant more like ... I forgot anyway, so it doesn't matter.
anyways, the original point I was trying to make is that there is a lot of value in unlocking gpu perf for a whole new class of problems.. the fact that you could hypothetically extract slightly more perf by optimizing for a particular problem + hw combo doesn't change that.. that was the only point I was trying to make to the original original (original? I lost track..) poster :-)