however the team that worked on larrabee weren't allowed to tell anyone how bad the performance was: it was not until jeff bush replicated that work in nyuzi and made the full design source code public that it was possible to determine *EXACTLY* where the performance was lacking.
i've spent a lot of time talking with jeff (he's really an amazing guy), and he pointed out things such as: if nyuzi / larrabee had had a single instruction for converting 4-wide floating-point ARGB vectors into four 32-bit pixels, that would knock something like... i can't remember exactly... let's say 20% off the time spent per pixel on rendering. then the next highest priority to target would be... X (whatever).
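to give a concrete idea of what such an instruction would replace (this is purely my own illustrative sketch in C, not jeff's code), here is roughly the scalar work a software renderer otherwise has to do for every single pixel; a hypothetical fused instruction would do four pixels' worth of this in one go:

    #include <stdint.h>

    /* pack one 4-wide float ARGB colour (each channel 0.0..1.0) into a
     * single 32-bit pixel. without a dedicated instruction this costs a
     * clamp, a multiply, a float->int convert and a shift/OR per
     * channel: roughly a dozen operations in the rasteriser's innermost
     * loop. a fused instruction would handle four pixels at once. */
    static inline uint32_t pack_argb(const float c[4])
    {
        uint32_t pixel = 0;
        for (int i = 0; i < 4; i++) {
            float v = c[i];
            if (v < 0.0f) v = 0.0f;   /* clamp to [0, 1] */
            if (v > 1.0f) v = 1.0f;
            pixel = (pixel << 8) | (uint32_t)(v * 255.0f + 0.5f);
        }
        return pixel;                 /* 0xAARRGGBB */
    }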
basically his paper lays the groundwork for profiling a software-rendered design (which is a LOT easier than profiling a hybrid hardware-software design), giving you the statistics needed to decide where to focus time and effort.
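the kind of statistics involved look something like the following toy sketch (again my own illustration, not code from the paper): the emulator bumps a counter per instruction category as it executes, and ranking the totals tells you which category a new fused instruction would pay off in most:

    #include <stdio.h>

    /* accumulate cycle counts per instruction category during
     * emulation, then report each category's share of the total. */
    enum insn_cat { CAT_ALU, CAT_FPU, CAT_LOADSTORE, CAT_BRANCH, CAT_COUNT };

    static unsigned long long cycles[CAT_COUNT];
    static const char *names[CAT_COUNT] = { "alu", "fpu", "load/store", "branch" };

    void account(enum insn_cat cat, unsigned cost) { cycles[cat] += cost; }

    void report(void)
    {
        unsigned long long total = 0;
        for (int i = 0; i < CAT_COUNT; i++) total += cycles[i];
        for (int i = 0; i < CAT_COUNT; i++)
            printf("%-10s %6.2f%%\n", names[i],
                   total ? 100.0 * cycles[i] / total : 0.0);
    }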
and, as the design is based on RISC-V, for which software emulators already exist (qemu and spike), the process of iterative development, adding *in* experimental custom instructions to see what does and does not work, can be much more rapid than would otherwise be expected.
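for anyone wondering how that works in practice: RISC-V reserves the custom-0 major opcode (0x0b) precisely for experiments like this, and the GNU assembler's .insn directive lets you emit such an encoding from C without touching the toolchain. the funct3/funct7 values and the operand semantics below are made up for illustration; the matching decode logic would be patched into spike or qemu so the instruction actually does the pack in one step:

    #include <stdint.h>

    /* hypothetical wrapper for an experimental "pack argb" custom
     * instruction in the custom-0 opcode space. encoding fields
     * (funct3=0, funct7=0) are invented for this sketch; the emulator
     * would be modified to recognise the same encoding. */
    static inline uint32_t packargb(uint64_t lo, uint64_t hi)
    {
        uint32_t rd;
        asm volatile(".insn r 0x0b, 0x0, 0x00, %0, %1, %2"
                     : "=r"(rd) : "r"(lo), "r"(hi));
        return rd;
    }

once the emulator understands the new encoding, you re-run the profile, see how much the category shifted, and decide whether the instruction earns its place in the hardware.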
bottom line: we're aware of larrabee, and nyuzi, and have a strategy in place *thanks* to that work.