Libre RISC-V Snags $50k EUR Grant To Work On Its RISC-V 3D GPU Chip


  • EMPFIRE
    replied
    1.

    RISC-V based linear algebra accelerator for SoC designs
    Claiming to be faster than a GPU

    Tell me, what is this technology all about? What exactly is it useful for, and what is its purpose?
    [obviously beyond what's already said in the presentation]
    What happened to the proposal to fold this into the formal specs - was it approved?
    Why do you not use this, or do you?
    What's your stand on this?


    2.

    Originally posted by lkcl View Post
    ha! cool! jacob's talking in that bugreport about the idea of *auto-generating* the actual texturisation HDL (nmigen) directly from the Vulkan Texturisation API formats, at the same time as developing the SPIR-V to LLVM IR conversion, that will have the very texturisation opcodes in it that are also auto-generated. cool!
    the only fly in the ointment being, it's a frickin lot of work.

    I'm talking about a second-stage card - amplified, so to speak - after the card funded by the 50,000 EUR grant is delivered, and about super-scaling those processors directly on an FPGA alone. Both are to be addressed with the same questions:
    - What are the expectations in terms of power, performance, etc., in detail?
    - How much of the biggest Xilinx defense-grade Virtex family FPGA would that utilise?
    - How many would be required to run in a cluster for really high-performance tasks [CAD, gaming], if that is feasible at all - and if not, what will it be capable of, and what needs to be done to get there?


    Any other comments on the matter?



  • lkcl
    replied
    Originally posted by lkcl View Post
    after speaking with Mitch Alsup (who designed the Samsung GPU Texture opcodes), we will almost certainly be doing texturisation instructions. follow the trail here: http://bugs.libre-riscv.org/show_bug.cgi?id=91
    ha! cool! jacob's talking in that bugreport about the idea of *auto-generating* the actual texturisation HDL (nmigen) directly from the Vulkan Texturisation API formats, at the same time as developing the SPIR-V to LLVM IR conversion, that will have the very texturisation opcodes in it that are also auto-generated. cool!

    the only fly in the ointment being, it's a frickin lot of work.
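    To make the auto-generation idea a little more concrete, here is a minimal Python/nmigen sketch of what driving HDL from a machine-readable Vulkan format table could look like. The table rows, module name and signal widths are invented for illustration; the actual scheme being discussed in the bugreport may differ.

    # Hypothetical sketch: one table of Vulkan texture formats drives an
    # auto-generated nmigen decoder. The same table could, in principle, also
    # feed the SPIR-V to LLVM IR lowering mentioned above.
    from nmigen import Elaboratable, Module, Signal

    # (name, VkFormat code, bits per texel) -- a tiny illustrative subset
    VK_FORMAT_TABLE = [
        ("R8G8B8A8_UNORM",       37, 32),
        ("R5G6B5_UNORM_PACK16",   4, 16),
        ("R16G16B16A16_SFLOAT",  97, 64),
    ]

    class TexelSizeDecoder(Elaboratable):
        """Auto-generated from VK_FORMAT_TABLE: format code -> texel width."""
        def __init__(self):
            self.code = Signal(8)        # format code from the texture descriptor
            self.texel_bits = Signal(8)  # decoded texel size in bits

        def elaborate(self, platform):
            m = Module()
            with m.Switch(self.code):
                for _name, code, bits in VK_FORMAT_TABLE:
                    with m.Case(code):   # one case per table row
                        m.d.comb += self.texel_bits.eq(bits)
                with m.Default():
                    m.d.comb += self.texel_bits.eq(0)  # unknown format
            return m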



  • lkcl
    replied
    Originally posted by oiaohm View Post
    So I would not be so sure it's a whole other world. You need to go back to basics and compare performance to silicon area, with allowances for fabric costs. They are not as far apart as it first appears.
    this concurs with my estimates. if we scaled up to say 256 cores, we'd easily be around the 150W mark, and also be at 64x the performance. so if we managed to hit 12 GFLOPs in the current design within the 2.5W budget (@28nm), that ramps up to 768 GFLOPs @ around the 150W mark (@28nm), which is not shabby at all.

    i've got openpiton as an open bugreport to investigate its use as the NoC http://bugs.libre-riscv.org/show_bug.cgi?id=69 which would give us potential scalability up to 500,000 cores (on and off chip). of course, now we'd also need the memory controller(s) to be able to cope with that... we're into multi-tens-of-millions-of-dollars territory, and i'd rather get the basics up and running first, on this (much smaller) budget. walk before run.
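    For the curious, the arithmetic behind that estimate can be sketched in a few lines of Python. The 64x factor is assumed here to mean 256 cores versus a 4-core base design; that assumption, and the round numbers, come from the post above rather than from any published spec.

    # Back-of-the-envelope sketch of the scaling estimate quoted above.
    # Assumption: the 64x figure reflects 256 cores versus a 4-core base design.
    base_gflops = 12.0   # claimed for the current design within the 2.5W budget (@28nm)
    base_watts = 2.5     # claimed power budget for the current design
    cores_now = 4        # assumed base core count
    cores_big = 256      # hypothetical scaled-up part

    scale = cores_big / cores_now               # 64x
    print(f"{base_gflops * scale:.0f} GFLOPS")  # 768 GFLOPS
    print(f"{base_watts * scale:.0f} W")        # 160 W -- "around the 150W mark"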



  • lkcl
    replied
    Originally posted by log0 View Post
    You'll never get the GFLOP/Watt of a modern GPU just by slapping a bunch of Risc-V cores with some simd/vector capability together.
    absolutely. Jeff Bush's Nyuzi paper is the canonical reference, here, which was a research project to find out precisely *why* Larrabee failed (Intel's team were prevented and prohibited by their Marketing Dept from speaking up, hence why you saw Larrabee used as a high-performance Computing Cluster ASIC, *NOT* in GPUs).

    after speaking with Mitch Alsup (who designed the Samsung GPU Texture opcodes), we will almost certainly be doing texturisation instructions. follow the trail here: http://bugs.libre-riscv.org/show_bug.cgi?id=91 - we learned from him that texturisation is done through *massive* regularly-sized Vulkan texture maps, and that the Floating-Point pixel coordinates are used as a lookup system. if the FP number is not an integer, you need to also look up the *neighbouring* textures and perform interpolation. if done in software alone it's really quite horrendous, and interacts with the LOAD/STORE system in a way that means it's best done not as "standard" LD/ST but as its own "thing", bypassing the standard LD/ST checks needed for general-purpose computing.
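    As a rough software illustration of that lookup-plus-interpolation step (plain Python, invented for this post - not the project's hardware datapath), bilinear sampling of a texture from a floating-point coordinate looks like this:

    # Minimal bilinear texture sampling: the integer part of the FP coordinate
    # selects the base texel, the fractional part weights the four neighbours.
    def bilinear_sample(texture, u, v):
        """texture: 2D list of floats; (u, v): floating-point texel coordinates."""
        x0, y0 = int(u), int(v)          # integer part -> base texel
        fx, fy = u - x0, v - y0          # fractional part -> blend weights
        x1 = min(x0 + 1, len(texture[0]) - 1)
        y1 = min(y0 + 1, len(texture) - 1)
        # look up the four neighbouring texels, blend horizontally then vertically
        top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
        bot = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
        return top * (1 - fy) + bot * fy

    # e.g. bilinear_sample([[0.0, 1.0], [1.0, 0.0]], 0.5, 0.5) == 0.5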

    these kinds of decisions are just not needed - at all - in a standard "Parallel Compute Cluster" ASIC.




  • oiaohm
    replied
    Originally posted by log0 View Post
    You'll never get the GFLOP/Watt of a modern GPU just by slapping a bunch of Risc-V cores with some simd/vector capability together.

    Just look at AMDs GCN. A compute unit has 64KB local data share, 4KB L1 cache, 16 load/store units, 4 texture filtering units, 4 SIMD units. Each SIMD unit has 16 lanes backed by a huge 64KB register file.

    That's a whole 'nother world compared to this RISC-V "GPU" project.
    https://www.crowdsupply.com/libre-ri...ure-by-osmosis

    Except this Libre RISC-V GPU project is not a normal RISC-V core. You don't find 128 FP and 128 integer registers in a normal RISC-V core; the norm is 16 or 32 registers.

    You are right that normal RISC-V cores would never give the GFLOP/watt of a modern GPU. But once you realise that the GFLOP figure targeted at 28 nm is a 64-bit GFLOP figure, and scale for die size, it can certainly compete against AMD's GCN. GCN uses a lot more silicon area: the SIMD units in GCN are larger than the targeted Libre RISC-V GPU cores, but not larger in GFLOPs of processing. So per unit of GCN silicon area you could fit at least four of the Libre RISC-V GPU cores.

    So I would not be so sure it's a whole other world. You need to go back to basics and compare performance to silicon area, with allowances for fabric costs. They are not as far apart as it first appears.



  • log0
    replied
    Originally posted by oiaohm View Post
    Once you have a functional core, duplicating it multiple times and doing the interconnect to use it is not labour intensive.
    You'll never get the GFLOP/Watt of a modern GPU just by slapping a bunch of Risc-V cores with some simd/vector capability together.

    Just look at AMDs GCN. A compute unit has 64KB local data share, 4KB L1 cache, 16 load/store units, 4 texture filtering units, 4 SIMD units. Each SIMD unit has 16 lanes backed by a huge 64KB register file.

    That's a whole 'nother world compared to this RISC-V "GPU" project.
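    For a sense of scale, here is a quick sketch of the peak throughput those figures imply for a single GCN compute unit. The 1.0 GHz clock is an arbitrary illustrative value, and the one-FMA-per-lane-per-clock assumption is mine, not from the post.

    # Peak FP32 throughput of one GCN compute unit, from the figures above:
    # 4 SIMD units x 16 lanes, each lane assumed to do one FMA (2 FLOPs) per clock.
    simd_units = 4
    lanes_per_simd = 16
    flops_per_lane = 2   # fused multiply-add counts as two FLOPs
    clock_ghz = 1.0      # arbitrary illustrative clock

    peak_gflops = simd_units * lanes_per_simd * flops_per_lane * clock_ghz
    print(f"~{peak_gflops:.0f} GFLOPS (FP32) per compute unit at {clock_ghz} GHz")
    # -> ~128 GFLOPS per CU, and a mid-range GCN GPU has dozens of such CUs.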



  • oiaohm
    replied
    Originally posted by the_scx View Post
    Again, currently HiSilicon uses the Cortex-A core designs, as well as Mali GPUs, licensed from ARM Holdings. They have never designed a single CPU core or GPU. They have no experience in this at all.
    You have made a big mistake there. HiSilicon has designed accelerators for the bitcoin-mining market that contain RISC-V and ARM in a single design.

    https://www.phoronix.com/scan.php?pa...ming-Linux-5.1
    The chip above is not HiSilicon's, but it uses a closely related design method. HiSilicon has never designed a CPU totally from scratch; they have taken MIPS and RISC-V and made versions of them with custom extensions for different clients, including a 1024-core version.

    HiSilicon has currently been using Cortex-A + RISC-V + Mali GPU + their own CRT controller. In fact this means that if you drop Cortex-A and Mali, you still have 2D graphics output.

    If you have a recent Huawei phone, the encryption accelerator inside it is a RISC-V core with a custom instruction set. So the idea that they have never designed a single CPU core is totally talking out your ass. They have been doing custom CPU cores for custom accelerators for quite some time; it is only fairly recently that they have started using RISC-V for this. Now they can move this work out of the custom accelerators to the primary CPU and GPU. Remember, high-speed RAM access and everything else about a CPU is basically the same for a high-performance accelerator.

    Originally posted by the_scx View Post
    If you make a woman pregnant, you can expect a child for 9 months (well, maybe two or more if you are "lucky", but it's not the point).
    If you somehow managed to make 9 women pregnant at once (you know, one of those crazy nights...), you will not get a child in one month. It just doesn't work like that.
    You can't expect that anyone is able to speed up a decade or two of work into half a year or just two months. It is just impossible.
    I am not arguing along that line. The amount of development work needed to multiply up a core is not that great. The growth-of-a-child analogy is kind of right, as far as it goes.

    Once you have a functional core, duplicating it multiple times and doing the interconnect to use it is not labour intensive. Now, you may need to do 500 different designs to test your choices of fabric between those CPUs. Please note this is not designing 500 different fabrics either; it is just picking from a stack of off-the-shelf reference-design fabrics for accelerators.

    Has HiSilicon been making accelerators for markets other than GPUs? Yes. Has HiSilicon been using RISC-V cores to do this? Yes. Does HiSilicon have their own CRT controller IP? Yes (that is the part that drives the LCD screen itself). So Mali is just an accelerator. What HiSilicon has not had is a GPU accelerator core with decent GFLOP/watt at a given process node to drop into their accelerator design flow; of course, that could all change next year if the RISC-V GPU project delivers what it promised.

    Something to remember: ARM's IP licensing does not allow you to make custom accelerators using their tech.

    The horrible reality is that basically everything needed to make a high-performance GPU, bar the core itself, has already been designed as open hardware IP.

    The reality is that, from a tech point of view, HiSilicon is basically at the nine-months-pregnant stage of being able to spit out a complete RISC-V CPU/GPU. There is just one little piece missing.

    You cannot in fact license a Mali GPU without licensing an ARM CPU core, and all the hardware test cases for checking that the Mali GPU is functioning are provided as ARM binaries. The horrible reality is that even if you can make a decent RISC-V CPU chip, you cannot use Mali with it directly and stay within the ARM IP licence.



  • the_scx
    replied
    Originally posted by oiaohm View Post
    2 to 6 months is not a wild guess; this is how fast it happened in the past with a different chip design and the same amount of resources HiSilicon has.
    Again, currently HiSilicon uses the Cortex-A core designs, as well as Mali GPUs, licensed from ARM Holdings. They have never designed a single CPU core or GPU. They have no experience in this at all.

    Originally posted by oiaohm View Post
    If Huawei goes all in on RISC-V using HiSilicon, they could basically complete two decades' worth of development at current speeds in 6 months or less, due to a massively increased iteration rate. So it is highly likely as a possibility. It is not absolutely certain they will go this way, but it cannot be ruled out.
    If you make a woman pregnant, you can expect a child for 9 months (well, maybe two or more if you are "lucky", but it's not the point).
    If you somehow managed to make 9 women pregnant at once (you know, one of those crazy nights...), you will not get a child in one month. It just doesn't work like that.
    You can't expect that anyone is able to speed up a decade or two of work into half a year or just two months. It is just impossible.



  • the_scx
    replied
    Originally posted by lkcl View Post
    we could indeed have chosen to write a direct OpenGL driver. it would be a multi-man-year effort. then we would have to turn our attention to OpenCL. that would be a multi-man-year effort, too.
    It is not just a "multi-man-year effort". OpenGL is extremely complex. I would say that it is relatively easy to write a basic OpenGL driver, but at the same time it is super hard to provide a full implementation.
    Back in the day, there was MiniGL - an incomplete OpenGL implementation provided by several graphics card hardware companies, including 3dfx, PowerVR and Rendition, in the late 1990s.
    Today, all major 3D card manufacturers claim to provide complete OpenGL implementations. However, if we look at open source drivers, we see that many of them are far from implementing the entire API.
    Before Valve started the whole Linux thing (Steam client for Linux, Steam Runtime, SteamOS, Steam Machines, etc.), the situation was just terrible. Mesa 8.0 (early 2012) didn't fully support OpenGL 3.1 from 2009, not to mention OpenGL 4.0 from 2010 - and the latest version at the time was OpenGL 4.2 from 2011. Even Mesa 10.6 from mid-2015 was still unable to fully support OpenGL 4.0, when the latest version was OpenGL 4.5 from 2014. Fortunately, in 2015-2017 the situation began to change. Today we have full support for OpenGL 4.5, and in the near future we can expect that OpenGL 4.6 from 2017 will be fully supported as well. It wouldn't be possible without a huge effort from both AMD and Valve. Did you know that Valve & RADV developers were topping contributions to Mesa in 2018? The second and third most active contributors to Mesa were Timothy and Samuel from Valve's Linux GPU driver team, where they primarily work on the RadeonSI and RADV drivers.
    Unfortunately, that doesn't mean that all Mesa drivers are in the same state. Although Nouveau claims to support OpenGL up to version 4.5, it is a very buggy (even unstable) driver, and its performance is just terrible.
    Then we have Virgl (sometimes called Virgil 3D). The idea behind it is to have a guest GPU that is fully independent of the host GPU, but that is still able to expose host capabilities to the guest. It should be easy, right? Unfortunately, it is not. Red Hat has invested a lot of money into it, but even after several years of development the results were just horrible: barely any support for OpenGL 2, buggy, unstable, with terrible performance (like 5%-10% of the native speed). Recently they have made some noticeable progress, but that is only thanks to Google. They are interested in Virgl because of their Project Crostini, which allows running Linux applications on ChromeOS. But still, it will be at least several years before Virgl is suitable for modern games (even very simple ones).
    We also have software renderers: Softpipe, LLVMpipe and SWR. None of them is able to fully support anything above OpenGL 3.3.
    Freedreno, the best open source driver for a mobile GPU, is even worse, because it fully supports only OpenGL 3.1. The best open source driver for a mobile ARM GPU! In Mesa we also have Lima, which is even worse! Much worse!

    My point is that even if you manage to achieve your goals in 2020, we will still have to wait at least a few more years for OpenGL, not to mention other APIs (have you noticed that OpenCL in Mesa has been "WIP" for years, that 1.2 is still in progress, and that 2.x does not exist at all?).

    It's funny because it reminds me of the situation with Windows 10 on ARM. You will be able to use your USB adapters if the manufacturer provides ARM drivers for them. You will be able to run old games if you are lucky enough to make dgVoodoo 2 work. You will be able to run new games if you can find 32-bit x86 executables (just forget about native games on ARM - they almost don't exist in the Windows desktop world). You will be able to work on battery for a long time if you manage to get software that runs natively on ARM. And so on. So many "IFs" there... too much for the average customer.

    Originally posted by lkcl View Post
    by focussing on Vulkan, *other people* take care of the OpenCL-to-Vulkan adaptor. *other people* take care of the OpenGL-to-Vulkan adaptor.
    I am not sure if it is even possible to implement the full OpenGL stack over Vulkan. A few years ago there was a discussion about whether it is possible to create a Vulkan state tracker for Gallium. The conclusion was that it does not make any sense, because "Gallium3D is higher-level than Vulkan, so such a state tracker wouldn't really work out". Then a new idea appeared - Germanium. It was supposed to be "a superset of Vulkan with a similar API and exposing additional state that was found in OpenGL but not with Vulkan". So, in one possible future, new drivers might be coded against "the Vulkan-like Germanium interface, and target both GL/Vulkan simultaneously". Jason Ekstrand from Intel, one of the main developers of the Mesa-based Vulkan driver, also gave us a few ideas about implementing OpenGL on top of Vulkan, but there were some obstacles. It would require adding a few extensions to Vulkan, and even then there would still be some unresolved problems.
    Today we know that the idea of Germanium didn't really work out. However, several projects for Direct3D over Vulkan appeared: VK9 (D3D9 → Vulkan), DXVK (D3D10/D3D11 → Vulkan), VKD3D (D3D12 → Vulkan). We also have dgVoodoo 2 (Glide 2-3 & DD 1-7 & D3D 2-8 → D3D11) and DXUP (D3D10 → D3D11), but they don't matter here. We also know that some time ago a few Vulkan extensions intended to help these projects appeared, e.g. VK_EXT_transform_feedback, VK_GOOGLE_hlsl_functionality1, etc.
    But what about OpenGL (ES) over Vulkan? There are several project about this:
    - Google ANGLE (Almost Native Graphics Layer Engine): OpenGL ES over OpenGL, D3D 9/11 or Vulkan (OpenGL ES 2.0 in progress, everything above has not started yet).
    - VKGL: OpenGL ES 2.0 over Vulkan
    - GLOVE: OpenGL ES over Vulkan (again, only ES 2.0)
    - Zink: OpenGL over Vulkan (early stage, "started off with basic OpenGL 2.1 and since then has advanced to OpenGL 3.0")
    However, all these projects seem to progress more slowly than their D3D-over-Vulkan counterparts. There is no such pressure as in the case of Direct3D over Vulkan, because there is no strong need for OpenGL over Vulkan on the Linux desktop (we already have a good implementation of OpenGL 4.5 in Mesa). And I am still not sure whether it is even possible to provide a complete OpenGL 4.6 implementation over Vulkan. Even if it is, it would take many years to complete this goal.
    I understand why you would like to focus on Vulkan, but at the same time I don't believe that we will see a fully functional open hardware GPU (with proper support for OpenGL, OpenCL, etc.) in 2020. And as I said, it will take years before it can be used for something more than just hobby projects. But of course, maybe around 2030 (or later) it will eventually be useful for some serious applications (e.g. as a base for a domestic GPU, for national security, or as an open hardware project that could be produced directly on Mars for manned or unmanned missions).

    Originally posted by lkcl View Post
    nothing... except have you tried contacting intel to license their GPUs?
    What about you? Have you tried to live as an open-source game developer who targets exclusively open hardware platforms?

    Anyway, I asked him as a user. We basically have several types of users who complain about current GPUs or drivers for them in Linux:
    - Gamers. They complain about performance. While the NVIDIA drivers are in very good shape, AMD's drivers for Linux are still behind their drivers for Windows. As I said before, there has been enormous progress in recent years, but there is still a gap. We also have Intel drivers here. While they provide a really good desktop experience, they are not so good when it comes to games. Anyway, complaining here is fully understandable.
    - Desktop users who complain about proprietary drivers (especially the kernel-side parts). It is not because these drivers are closed source, but because for some reason they are unable to integrate deeply with Linux kernel mechanisms. For example, they can use neither DRI3 nor the KMS implementation in the Linux kernel (they can provide their own KMS implementation, but that's not the point). These things have some impact on the so-called desktop experience. This is why we still don't have native resolution on the tty text console or support for PRIME render offloading (PRIME output offloading already works pretty well on NVIDIA hardware). Of course, NVIDIA is trying to figure something out. We already have GLVND and GLXVND, but the whole thing is still in progress. And this doesn't apply only to NVIDIA binary drivers, but to all closed source drivers. Anyway, there is some kind of mass hysteria about proprietary drivers in the Linux community, which I really do not share, but I understand their point of view: proper support for Optimus should have appeared a long time ago, in one way or another.
    - Users of rare hardware, e.g. S3G/VIA and SiS/XGI GPUs, or Adreno and Mali GPUs on the Linux desktop. There are reasons to complain here. A lot of them. But to be honest, you cannot blame only the vendors here. For example, back in 2011 VIA released full documentation for their chipsets with Unichrome and Chrome9 GPUs. This includes the latest chipset with a GPU from the Chrome9 family - VX900 (Chrome9 HD). Documentation for CX700 (Unichrome Pro II), VX800 (Chrome9 HC3) and VX855 (Chrome9 HCM) was available even earlier. This documentation covers the 2D, 3D and video engines of these integrated graphics processors. However, even today 3D acceleration for Chrome9 (HC/HC3/HCM/HD) is not implemented at all in the open source OpenChrome driver. It also lacks proper support for video decoding/encoding. So as you can see, providing documentation is not enough.
    - Finally, the last group. They hate NVIDIA and Intel. They claim that they only care about freedom and will burn their non-free hardware as soon as the first libre RISC-V SoC appears on the market. However, they have already had many chances to replace their Intel-based PCs with something at least a little bit more open. For years we have had computers with Loongson processors. Richard Stallman "has even used and promoted the Loongson-based Lemote Yeeloong netbook since it can run 100% free software down to the BIOS level" (Phoronix). Now we have POWER9 workstations, such as the Raptor Talos II, which is an open-source product, and we already know that the PPC ISA is much more open than x86, since it is "open for licensing and modification by the OpenPOWER Foundation member" (Wikipedia). But these so-called freedom supporters don't seem to care about it, or about any computer recommended by the FSF. Even an RPi would be a better choice if they are so afraid of IME (Intel Management Engine) or PSP (Platform Security Processor). But they did absolutely nothing to get rid of their current non-free hardware, which "harms their freedom". At least most of them didn't.
    I really appreciate open source projects. In addition, I believe that the idea of open hardware can be useful in the long run. However, I also try to be a pragmatist. I don't believe that in the near future there will be open hardware able to fully replace our desktops and smartphones. It is just impossible. There is too big a technological gap to catch up in such a short time. And I am really sick of people who claim that in the next 1-2 years (or even within a few months!) there will be a big libre revolution when it comes to hardware. Sorry, but that's not gonna happen.
    Even you... You still use Intel hardware (or at least something based on x86), but you don't have to. You could replace it with a POWER workstation, a Loongson-based netbook or an RPi, which are at least a little bit more open, but you didn't. Why? Because you don't care about your freedom? Or because you want to be a pragmatist as well? And yes, I know that LLVMpipe is currently an x86-/PPC-only thing, but as I said, you could use the Raptor Talos II workstation if you really wanted to.



  • kpedersen
    replied
    Originally posted by the_scx View Post
    Is 1280x720 really the native resolution of your monitor? I don't know what to say. I feel sorry for you...
    Close; I run 1280x768. You don't need to feel sorry for me, I don't run Gnome 3. I probably have more usable screenspace than most people here XD.

    Originally posted by the_scx View Post
    BTW: What's wrong with Intel graphics processors?
    Currently I use them, and for the short term I am certainly very interested in the dedicated card that Intel is bringing out, so I am not tied to their processors. However, like most things in the computing world, libre is more important in the long run. We need independence from proprietary hardware vendors if we are to avoid being locked into a walled/streaming garden with nothing but consumption devices in about 50 years.

    Soon, crap like the Google Stadia will be the only thing you will be able to buy. There is more to computers than the internet, Google docs and streaming movies and games!

