Libre RISC-V Open-Source Effort Now Looking At POWER Instead Of RISC-V
-
Originally posted by DMJC View Post
I just wish we could get the code to SGI IRIX opened up. Now THAT would make MIPS a much more interesting platform. I'm a bit surprised that they (Libre) haven't chosen MIPS for a GPU since it has a very long/proven track record of usage in graphics.
yes, for due diligence, we really do need to look at MIPS. sigh. so much to do
-
Originally posted by lkcl View Post
last time i tried to contact the MIPS open foundation the website had been taken down (or the page they referred to was 404). basically they haven't the infrastructure in place, just a change in "licensing" arrangement. the Open Power Foundation by contrast has been established for years.
yes, for due diligence, we really do need to look at MIPS. sigh. so much to do
-
One thing I've often wondered: why haven't x86 CPUs let assembly and compilers issue micro-op instructions directly? If all x86 CPUs just convert to micro-ops anyway, couldn't we cut out the middleman and let people write straight micro-op code? Sure, the decoders would be useless, but assuming you could do it, that extra unused (for micro-op code) silicon would just help with the high heat density we run into now.
-
Originally posted by madscientist159 View Post
I'm certain we could free up a few POWER machines to the development team here, though we'd like a bit more focus on potential 4k / PCIe support as that would eliminate one of the last remaining binary blobs in a typically built desktop / workstation POWER system (namely the GPU)...
4k / PCIe will (unless we get sponsors / customers with USD $2m+ budgets) be on the table for a Revision 3. the critical first milestone is to prove the architecture on a minimum budget, so that iterations can be done cheaply. this is why the NLNet Grants last month went in for a *180nm* ASIC (USD $600 per sq.mm, we'll need around 20 sq.mm for a single-core chip) because it's peanuts, and the RTL doesn't care if it's running in 180nm or 14nm. we could do a hundred test revisions at 180nm for the cost of a single 14nm ASIC.
_then_ we ramp up through the geometries, _then_ we ramp up with high-end peripherals and high-end performance. reduce risk, get something done. the LIP6.fr team is doing a 360nm tape-out (early next year), all using alliance / coriolis2.
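As a rough check of the arithmetic here, a small Python sketch: the only figures used are the ones quoted in the post above ($600/sq.mm and ~20 sq.mm at 180nm); the 14nm line is just the "hundred test revisions" comparison run backwards, not a quoted price.

```python
# Back-of-envelope check of the 180nm test-chip numbers quoted above.
# The per-sq.mm price and die area come from the post; the 14nm figure
# is only what the "hundred test revisions" comparison would imply.

COST_PER_SQMM_180NM = 600   # USD per sq.mm (from the post)
DIE_AREA_SQMM = 20          # single-core test chip (from the post)

cost_per_180nm_revision = COST_PER_SQMM_180NM * DIE_AREA_SQMM
print(f"one 180nm revision: ${cost_per_180nm_revision:,}")        # $12,000

# "a hundred test revisions at 180nm for the cost of a single 14nm ASIC"
implied_14nm_cost = 100 * cost_per_180nm_revision
print(f"implied 14nm ASIC cost: ${implied_14nm_cost:,}")           # $1,200,000
```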
-
Originally posted by tului View Post
One thing I've often wondered: why haven't x86 CPUs let assembly and compilers issue micro-op instructions directly? If all x86 CPUs just convert to micro-ops anyway, couldn't we cut out the middleman and let people write straight micro-op code? Sure, the decoders would be useless, but assuming you could do it, that extra unused (for micro-op code) silicon would just help with the high heat density we run into now.
The instruction decoder is a tiny part of the chip. With something like POWER that doesn't have these concerns, it might even be possible to make a CPU that allows custom development-only instructions via open microcode (with those instructions then being submitted for inclusion in the actual ISA over time). But as far as x86 being able to do this, let alone being a legal choice for new development outside of Intel or AMD, the answer is a hard no.
-
Originally posted by lkcl View Post
appreciated: do bear in mind that as we're doing this pretty much from-scratch (not entirely, you know what i mean), if when we talk to any potential RTL licensees they say "oh you'll need a proprietary blob for that" we'll just put the phone down on them and find something else.
Originally posted by lkcl View Post
4k / PCIe will (unless we get sponsors / customers with USD $2m+ budgets) be on the table for a Revision 3. the critical first milestone is to prove the architecture on a minimum budget, so that iterations can be done cheaply. this is why the NLNet Grants last month went in for a *180nm* ASIC (USD $600 per sq.mm, we'll need around 20 sq.mm for a single-core chip) because it's peanuts, and the RTL doesn't care if it's running in 180nm or 14nm. we could do a hundred test revisions at 180nm for the cost of a single 14nm ASIC.
For reference, our current display output for blob-free systems is a 1080p single digital output with zero 3D capability aside from LLVMPipe on the host CPU. A few 4k unaccelerated framebuffers, especially on a wide bus like CAPI or multi-lane PCIe, would be a big step up from where we are right now -- and any 3D / accelerator capability would be a significant bonus even if it's fundamentally mismatched in raw performance with the 4k framebuffers.
CAPI does neatly sidestep the PCIe issues, since the RTL etc. is open. Would be humorous on some level to have Rev 1 support CAPI but not support PCIe.
Last edited by madscientist159; 20 October 2019, 07:31 PM.
-
Originally posted by madscientist159 View Post
So do we!
Let me talk some with folks on my side. If we got you a blob-free PCIe core, somehow, even if it was just PCIe 2.0 or 3.0, would that be possible to include in Rev. 1?
if however you can get hold of a PCIe PHY, then yes, we can put it in. however not for the test chip, because it's 180nm.
what *would* work would be to use a Lattice ECP5G as a gateway (communicating using some form of parallel bus e.g. xSPI or SDRAM). the ECP5G already has the balanced differential PHY drivers needed to do PCIe, and someone is actually working on it: https://github.com/enjoy-digital/litepcie/issues/20
Originally posted by madscientist159 View Post
Also, when I say 4k, I just mean the display side for the first generation -- i.e. 4k raster across a few outputs, not necessarily a 3D engine that would actually be rendering decent FPS to a canvas of that size.
For reference, our current display output for blob-free systems is a 1080p single digital output with zero 3D capability aside from LLVMPipe on the host CPU. A few 4k unaccelerated framebuffers, especially on a wide bus like CAPI or multi-lane PCIe, would be a big step up from where we are right now -- and any 3D / accelerator capability would be a significant bonus even if it's fundamentally mismatched in raw performance with the 4k framebuffers.
CAPI does neatly sidestep the PCIe issues, since the RTL etc. is open. Would be humorous on some level to have Rev 1 support CAPI but not support PCIe.
the problem is that the framebuffer has to be a Bus Master (yes, really, the processor is *not* the Bus Master here). this is because you absolutely cannot have the scan lines of the video pause at any time.
when you compute the data transfer rate generated by 4k, it's 8.3 million pixels per frame. let's say 30 fps: that's 250 million pixels per second. let's say 16 bpp (2 bytes per pixel): that's 500 mbytes/sec, just for the pixel data.
DDR3 @ 800mhz is a nice low-cost RAM rate, 32 bits wide, with a power budget of around 300mW with DDR3L. each transfer is 4 bytes, so that's 3200 mbytes/sec. FIFTEEN PERCENT of that data bandwidth is taken up by a 4k framebuffer @ 30 fps, 16bpp!
if you went to 60fps, it would be 30%. if you went to 60fps 32bpp, it would be a whopping SIXTY PERCENT of the data bandwidth taken up just feeding the framebuffer, at 2000 mbytes/sec.
at least with 1080p60@32bpp the data rate is 4x less so it's back down to (only!!) 15% of the total bandwidth.
this is why most (power-hungry) systems now have 2x 32-bit DRAM channels @ minimum 1666mhz DDR3/4 rates, and, unfortunately, those are looking at a 3 to 5 watt power budget, just for the DRAM.
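The same arithmetic as a small Python sketch, assuming "4k" means 3840x2160; the DDR3 figure is the raw peak rate from the post, so real sustained bandwidth would be somewhat lower and the percentages somewhat higher.

```python
# Framebuffer scan-out bandwidth vs. a 32-bit DDR3-800 channel, using the
# figures in the post. Peak numbers only: no refresh/turnaround overhead
# or other memory traffic is accounted for.

def scanout_mb_per_s(width, height, fps, bytes_per_pixel):
    """Pixel data rate needed just to feed the display, in megabytes/sec."""
    return width * height * fps * bytes_per_pixel / 1e6

DDR3_800_32BIT_PEAK_MB = 800e6 * 4 / 1e6   # 800 MT/s x 4 bytes = 3200 MB/s

for label, w, h, fps, bpp in [
    ("4k @ 30 fps, 16bpp",    3840, 2160, 30, 2),
    ("4k @ 60 fps, 16bpp",    3840, 2160, 60, 2),
    ("4k @ 60 fps, 32bpp",    3840, 2160, 60, 4),
    ("1080p @ 60 fps, 32bpp", 1920, 1080, 60, 4),
]:
    mb = scanout_mb_per_s(w, h, fps, bpp)
    share = 100 * mb / DDR3_800_32BIT_PEAK_MB
    print(f"{label}: {mb:4.0f} MB/s = {share:2.0f}% of the channel")
```

Running this gives roughly 500 / 1000 / 2000 / 500 MB/s, i.e. about 15%, 30%, 60% and 15% of the channel, matching the rounded figures above.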
honestly it would be better, at this early stage, to use an FPGA as a gateway IC to provide PCIe (something that enjoy-digital already supports), and to have a conversion bus (a parallel bus) in something dreadful but very simple: multiple xSPI, overclocked SDRAM, or overclocked IDE/AT.
-
Originally posted by lkcl View Post
this is why most (power-hungry) systems now have 2x 32-bit DRAM channels @ minimum 1666mhz DDR3/4 rates, and, unfortunately, those are looking at a 3 to 5 watt power budget, just for the DRAM.
Originally posted by lkcl View Post
honestly it would be better, at this early stage, to use an FPGA as a gateway IC to provide PCIe (something that enjoy-digital already supports), and to have a conversion bus (a parallel bus) in something dreadful but very simple: multiple xSPI, overclocked SDRAM, or overclocked IDE/AT.
-
Originally posted by Qaridarium
I think 14nm will be very cheap in 2020, because by then IBM POWER10 will be on 7nm and Intel will also have 7nm for all products, so all the 14nm fabs will be free for low-cost manufacturing.
You might have more of a point with e.g. 32nm, which isn't a bad place to be honestly, but you're not going to compete with the big boys (AMD/Intel/IBM/ARM) on node size until your volumes are high enough that you can recover your manufacturing costs at the smaller node sizes.
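A minimal sketch of that volume argument, with entirely assumed NRE figures (the point is how amortisation behaves, not the exact dollar amounts, which are not taken from this thread):

```python
# Illustration of why volume matters when choosing a node: one-time NRE
# (masks, tooling) amortised per chip. All dollar figures below are
# made-up ballpark assumptions for the sake of the example.

ASSUMED_NRE_USD = {"180nm": 50_000, "32nm": 500_000, "14nm": 5_000_000}

for volume in (1_000, 100_000, 10_000_000):
    per_chip = ", ".join(
        f"{node}: ${nre / volume:,.2f}" for node, nre in ASSUMED_NRE_USD.items()
    )
    print(f"{volume:>10,} units -> NRE per chip: {per_chip}")
```

At small volumes the per-chip NRE of an advanced node dwarfs the silicon cost itself, which is why an older, cheaper node makes sense until volumes rise.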