Libre RISC-V Snags $50k EUR Grant To Work On Its RISC-V 3D GPU Chip

  • #31
    Originally posted by oiaohm View Post

    "Mathematically sound" in applied security is broader than most who have done coursework in mathematics think. So you are alone in thinking that doing maths is not practical or related to the real world.
    Anyone with a bachelor's degree or higher in maths will tell you otherwise, me included.

    First, "mathematically sound" is a term used in logic, the philosophy behind deductive reasoning in mathematical terms. Further, you can't prove anything at all in the observational sciences; you can only prove things about the mathematical model, which is an approximation of the world.
    The issue here is the approximation: you simply can't do maths on anything other than abstract models. Making sure your approximation is right is key to making that math relevant, but this requires "not-math", which together with math is still "not-math".

    Originally posted by oiaohm View Post
    Mathematically Verified Silicon you start seeing coming out of CSIRO data61 related projects. This is where the design is being put through full formal mathematical proofs of function.
    You are way too vague with your wording: you have formal verification, not mathematical verification.
    Mathematically speaking, "almost all" you write is wrong (please look that term up before you think it's an insult).

    Originally posted by oiaohm View Post
    Sorry, the separation between practical work and your maths is basically gone on the security side. If you say something is secure, you need to put up a mathematical proof covering all the possible problems, proving they don't exist. CSIRO and others employ a lot of mathematicians to do exactly these insanely complex proof systems.
    formal proof, and necessary verification of the models involved.

    Originally posted by oiaohm View Post
    Yes of course Physics people lay out the basic structures that the mathematician has to base proof around.
    You know, you got me to write way more than I was expecting, to counter an ironic remark that maths is disconnected from the real world...

    The area of cryptography (which IS pure math, for a change) is totally separate from software, from materials science, and from making sure that you don't expose more than you want.

    Comment


    • #32
      Originally posted by the_scx View Post

      This GPU (or rather the "Vulkan accelerator") does not even have the performance of NVIDIA graphics chips from 15 years ago.
      it has to be pointed out that if you're comparing against desktop GPUs, you're just plainly not familiar with the embedded GPU market (and haven't read what i wrote only a couple of comments ago).

      we're doing an *achievable* design, that requires *less money* to tape out on an MVP Programme, and *planning* the infrastructure - now - to be able to scale up to desktop level *later*.

      Please just look at the spec:
      - [email protected] -
      - 100 Mpixels/sec -
      - 30 Mtriangles/sec -
      - 5-6 GFLOPs -
      these are the numbers from Vivante GC800, which costs USD $250,000 to license, and has staggeringly good power efficiency.

      you simply cannot compare an embedded GPU, which uses a shared memory bus with the main core *and* the LCD frame-driver *and* with all other peripherals, with a dedicated GPU that has special parallel GDDR RAM, and hope to get any kind of meaningful or useful numbers.

      if you are not familiar with this type of design I can recommend some SoCs that you can study.

      Adreno 640 has 898.56 GFLOPS in FP16 and 449.28 GFLOPS in FP32!
      Adreno has a decade of incremental development behind it, and the financial resources of a billion-dollar company behind it (Qualcomm) that allow them to make 5 MILLION dollar iterative mistakes at a time, on failed tape-outs.

      ST Micro gets away with that kind of costs through its employees having back-to-back business cards with the local university, an arrangement that allows them to apply for whopping great EU Grants that are then thrown at the Foundry, repeatedly, until at least one revision of the ASIC actually works. this was how ST developed one of the first commercial dual-core ARM Cortex A9s to come out.

      it would be financially irresponsible of us to follow these Corporations' leads here. you missed it, twice, so i will say it again: we are *deliberately* going for lower performance (with a lower power budget), because the resultant reduction in square millimetres, which tracks the reduced performance directly (and near-linearly), costs proportionately less to get taped out on the Multi-Vendor Programme available from Foundries.

      if we had USD $1m available, we would have the choice of blowing the entire lot on a single tape-out (of a larger ASIC).... *or* we could do up to TEN tape-outs of a 4mm^2 ASIC in 28nm for the same money.

      which do you think a small team doing their very first design should consider doing?

      or, do you have a source of money that you can get at which would allow us to do what Qualcomm and ST do? if not, *stop complaining*!

      Maybe 10 years ago 5-6 GFLOPs wouldn't look so bad on the smartphone market, but today it only makes people laugh.
      As you can see, it is definitely not suitable for any kind of modern smartphone, even a low-end (~100 USD). And it is not even finished and won't be before 2020!

      Maybe this performance would be enough for cheap smartwatches. However, power consumption (2.5 W) is definitely too high.
      that could be solved easily by the exponential application of money. USD $1m gets you 45nm production masks; $2m gets you 28nm production masks; $4m gets you 22nm production masks, and so on. each shrink in geometry uses 1/sqrt(2) of the power (approximately a 30% reduction per node shrink).

      so 7nm would reduce that 2.5 watts down to a scant 0.6 watts - 600mW.
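      a rough sanity check of that claim, in a few lines of python (the flat 1/sqrt(2) factor per node step and the 28nm starting point are the simplifying assumptions from this post, not foundry data):

```python
import math

# Simplifying assumption from the post above: each process-node shrink
# cuts power by a factor of 1/sqrt(2) (roughly a 30% reduction).
SHRINK_FACTOR = 1 / math.sqrt(2)

def power_after_shrinks(watts, num_shrinks):
    """Estimate power after applying the per-node reduction repeatedly."""
    return watts * SHRINK_FACTOR ** num_shrinks

# 28nm -> 22nm -> 14nm -> 10nm -> 7nm: four node steps in this model.
for steps, node in enumerate(["28nm", "22nm", "14nm", "10nm", "7nm"]):
    print(f"{node}: {power_after_shrinks(2.5, steps):.2f} W")
```

      four steps from 28nm lands at roughly 0.62 W, which is where the 0.6 W figure above comes from.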

      The 38mm variant of the Apple Watch has a 3.8 V 0.78 W·h (205 mA·h) battery which is able to power the device for many hours!
      probably because, just as you can see from the above calculations, Apple has put a tiny fraction of some of the billions of dollars it makes every year into paying for a 14, 10 or even a 7nm tapeout of the processor in that watch. they'll also keep a close eye on what the OS is allowed to do.

      i've seen this kind of unrealistic comparison done many many times before. a startup design with a strategy for ramping up is compared against the absolute latest-and-greatest mass-produced item made by a billion-dollar-backed Corporation, and oh look! what a surprise! it's a failure! it's a non-starter! they're dooomed, dooomed i tell you! hear meee, hear meee, oh yaee, oh yaee, the harbinger of dooom and gloooom...

      We have one more problem here - there is no mature software mobile platform on RISC-V at all.
      this does not bother the client that wanted the embedded design that i spec'd out, at all. the application they have in mind will require siiignificant customisation, for which a hybrid and, importantly, *libre* CPU/VPU/GPU is far better suited than any proprietary solution, as their 3D projection requirements are *non-standard*.

      All current and planned Linux solutions are tied to x86 and ARM CPUs. This includes Tizen, Sailfish OS, Ubuntu Touch, KaiOS and PureOS. What is worse, a port to RISC-V wouldn't be so easy. For example, the PureOS Store is supposed to be based around flatpaks. However, the Freedesktop runtime (as well as its derivatives: GNOME and KDE) doesn't support RISC-V.
      oink?

      GNOME on RISC-V: https://fedoraproject.org/wiki/Archi...xpansion_board

      Plasma Mobile on RISC-V https://www.reddit.com/r/kde/comment...iscv_hardware/

      where you are absolutely right, however, is that certain key strategic pieces of software have not been completed yet. LLVM still has patches outstanding, and java (the JIT) has a *lot* of work needed. the key here, however, is: *it's software*. it can be done *later*. we do not have to hold up the hardware development because of it.

      I am not saying that this RISC-V solution is completely useless. I believe that there are some applications where it could fit. In my opinion, this initiative makes more sense than OGP (Open Graphics Project). However, do not expect mass adoption in commercial devices. It just won't happen.
      i hear these things (and take them on-board because it would be foolish not to), and yet i am not going to stop. i've been sitting around waiting for a Corporation - any Corporation - to produce an embedded SoC - any SoC - that is libre to the bedrock, the CPU, GPU and VPU all supported by full software libre stacks, and for twelve YEARS they have failed to do so.

      i'm fed up with it. absolutely sick of it. so at that point i had two choices: continue to feel pissed off, or get off my fat ass and *do* something. i'm still exhausted, nearly every day, yet now i am *doing* something that i've always wanted to do for nearly 25 years (design and build a processor), and because it's Libre, i know that even if we don't succeed completely, our efforts will get *someone* - somewhere - that much closer to a properly libre design.

      so please: stop complaining and making apples-vs-oranges comparisons, because doing so undermines the chances of success of the *second* iteration which might actually meet the performance characteristics that you'd actually like to see! we do however have to prove that we can do what we set out to do, first, as that will give future investors the confidence to drop USD $10m at us, on the basis that we achieved - with a tiny budget - what would normally require a large Corporation 10 to 100x more money to *fail* to deliver.

      we'll get there, ok?

      Comment


      • #33
        Originally posted by starshipeleven View Post
        I take great pride in being "directly responsible for the ongoing environmental damage that EOMA68 *would* have reduced if it had been possible to complete earlier".
        you're a bully, and a sadist. you enjoy causing other people pain. other people's suffering gives you great pleasure, doesn't it? it makes you feel better to watch someone else suffer, doesn't it? so i reiterate: feel free to continue to do what you find to be most useful to you. your sadistic and destructive desires do serve a purpose in certain special circumstances. people who want to be tortured will come and find you.

        Comment


        • #34
          Originally posted by discordian View Post
          Anyone with a bachelor's degree or higher in maths will tell you otherwise, me included.
          Really, stop. The terms I am using come from CSIRO Australia. They are terms used by PhD-and-above specialists in secure hardware.

          Originally posted by discordian View Post
          First, "mathematically sound" is a term used in logic, the philosophy behind deductive reasoning in mathematical terms. Further, you can't prove anything at all in the observational sciences; you can only prove things about the mathematical model, which is an approximation of the world.
          The issue here is the approximation: you simply can't do maths on anything other than abstract models. Making sure your approximation is right is key to making that math relevant, but this requires "not-math", which together with math is still "not-math".
          Absolutely correct. Mathematical verification uses a mathematical model of the hardware to validate the design. Yes, it is only an approximation of the world, but it is an approximation in which you can look for edge cases more effectively. There is also the fact that whatever you produce in silicon will only ever be an approximation of your design.

          Originally posted by discordian View Post
          You are way too vague with your wording: you have formal verification, not mathematical verification.
          Mathematically speaking, "almost all" you write is wrong (please look that term up before you think it's an insult).

          formal proof, and necessary verification of the models involved.
          I am not talking about what is called a formal proof.
          A formal proof or derivation is a finite sequence of sentences (called well-formed formulas in the case of a formal language), each of which is an axiom, an assumption, or follows from the preceding sentences in the sequence by a rule of inference.
          The key problem here is the word finite.

          The mathematical verification of silicon or software that I am referring to has to deal with infinity problems.

          Why is a standard formal proof not of much use when dealing with modern silicon? To make the small structures in silicon, you need multiple passes of light application. So the fabrication process gives you an almost infinite number of possible defect variations across produced chips. The general formal proof does not cut it; the system that works at this stage goes by the generic name of mathematical verification, and it is a horrible dive into solving for infinity, looking for defects in any possible chip that could be produced, no matter how rare. This gives very different answers from the formal-proof model, which uses only finite sequences.

          Comment


          • #35
            Originally posted by uid313 View Post
            Okay, well in that case; will it support HDMI,
            it's an embedded SoC. HDMI would be expected to cost around $50,000 to $100,000 to license from a third party, or we could spend the entire budget developing a libre version of an HDMI interface (and not deliver an SoC).

            we chose instead to utilise Richard Herveille's superb RGB/TTL frame driver interface (because it is libre, and because i happen to have full linux kernel source for it) and if people want external HDMI they can use the TFP410a from Texas Instruments, or any DVI / HDMI converter IC such as the Chrontel 7036.

            this puts the output GPIO driving requirement safely into the 150 MHz range (using Single Data Rate) for the RGB/TTL interface, which is reasonably achievable by us for a first-time ASIC, without having to be seriously concerned about on-chip EMI and cross-talk on the GPIO.

            DisplayPort,
            likewise. another external converter IC with a different spec will cover that. MIPI is covered by the Solomon SSD2828 (a converter IC that has linux kernel driver support).

            OpenGL 4.6,
            this is an embedded-style GPU, providing a *Vulkan*-compatible driver. because of the high number of converter / adapter APIs that have been written, we don't have to support OpenGL 4.6 directly.

            so if *someone else* writes a general-purpose Vulkan-to-OpenGL 4.6 converter / adapter, we get to support OpenGL 4.6.... *without our small team having to pay for that development*.

            if *you* would like to write such a converter / adapter, feel free to do so.

            OpenCL,
            again, Vulkan to OpenCL interoperability layer. not our department. see https://www.phoronix.com/scan.php?pa...L-Interop-2019

            ASTC, ETC2,
            don't know, don't know.

            VESA Adaptive-Sync,
            https://en.wikipedia.org/wiki/FreeSync - it will have to be part of the Vulkan API, communicated to the display driver. google "vulkan adaptive sync" and you'll find the answers.

            tessellation
            google "Vulkan tessellation"

            raytracing?
            google "Vulkan raytracing"

            Comment


            • #36
              Originally posted by brent View Post
              Unfortunately that's peanuts. Designing a capable GPU is a year-long effort for a team of skilled engineers. 50K only pays a single engineer for a few months. And I still don't trust any efforts lead by lkcl after what has happened in the past. Let's hope the money will be put to good use, but I'm very skeptical about it.
              But, if I understood it right, the idea is to create software that transforms a RISC-V core into a GPU capable of running Vulkan...

              Comment


              • #37
                Originally posted by oiaohm View Post
                Why is a standard formal proof not of much use when dealing with modern silicon? To make the small structures in silicon, you need multiple passes of light application. So the fabrication process gives you an almost infinite number of possible defect variations across produced chips. The general formal proof does not cut it; the system that works at this stage goes by the generic name of mathematical verification, and it is a horrible dive into solving for infinity, looking for defects in any possible chip that could be produced, no matter how rare. This gives very different answers from the formal-proof model, which uses only finite sequences.
                ah, it's worthwhile pointing out that we'll be using yosys to perform the formal proofs. actually, yosys doesn't even do the proofs: it acts as a gateway to various back-ends such as z3 solver, yices2 and so on. symbiyosys - https://symbiyosys.readthedocs.io/en...uickstart.html

                basically these are boolean logic and *numerical* provers, some of which can do K-Induction (whatever that is), some can do Bounded Model Checks (run a few iterations with random or semi-random data), and a few others that i don't understand, hence the reason for putting in another grant application with a large enough budget to cover donations to a suitably well-informed mathematician-programmer or two. or five or ten if they're from India.

                what we are *not* doing is formal mathematical proofs of the *actual silicon*. we will likely have to do some SPICE simulations (and other simulations that check for latency and other things), however that is much later.

                what the formal proofs are for is to prove that, if we write a priority picker (source code here https://git.libre-riscv.org/?p=soc.g...f3;hb=HEAD#l49), it is *formally proven* that it *will* generate the correct output for *all* possible combinations of inputs.

                samuel falvo, the developer of the Kestrel family of RISC-V processors, first introduced me to this technique. here is an example:
                http://chiselapp.com/user/kc5tja/rep...ea88450c1a3fc6

                go to the bottom of that file and you can see the unit tests that he runs through the hooks that nmigen provides into yosys. note in the class MStatusFormal that he uses something called "Past()". that's a *formal* mathematical declaration telling the provers to check that the *previous* value (from the past clock cycle) matches the desired mathematical expression.

                for anyone who has had to write extensive unit tests, you *know* how much work goes into writing a Finite State Machine that would do that same job, and, worse, you *know* that you would not be confident that you had covered all the corner-cases.

                this is what we believe will save us drastic amounts of effort, catch many potential errors that would otherwise be missed, and at the same time leave an extremely clean and readable codebase.
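                as a toy illustration of what "correct output for *all* possible combinations of inputs" means, here is a plain-python sketch (hypothetical code, not the project's actual nmigen source, and not how yosys / z3 work internally - the solvers reason symbolically rather than enumerating) that exhaustively checks the defining properties of a simple priority picker:

```python
def priority_pick(requests, width):
    """Grant exactly one of the competing requests: return a one-hot
    value keeping only the lowest (highest-priority) set bit."""
    for bit in range(width):
        if requests & (1 << bit):
            return 1 << bit
    return 0

# Brute-force analogue of a formal proof: check every possible input.
WIDTH = 8
for reqs in range(1 << WIDTH):
    out = priority_pick(reqs, WIDTH)
    # the output is one-hot, or zero when nothing is requested
    assert out == 0 or (out & (out - 1)) == 0
    # the output is zero exactly when no request is present
    assert (out == 0) == (reqs == 0)
    # a granted request must actually have been requested
    assert out & reqs == out
    # no higher-priority (lower-numbered) request was skipped
    assert out == 0 or (reqs & (out - 1)) == 0
```

                for an 8-bit picker the input space is only 256 patterns, so enumeration is trivial; the point of the formal provers is that they establish the same properties for *any* width, without enumerating anything.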

                Comment


                • #38
                  Originally posted by lkcl View Post
                  you enjoy causing other people pain.
                  I enjoy justice. People like you make the whole open-source and open-hardware community look bad.

                  Because that's what people will see, or think of, when someone mentions the concepts again: all the cash grabs for dud hardware that isn't even shipped anyway.

                  I enjoy drinking your tears when you fail.

                  Comment


                  • #39
                    Originally posted by rastersoft View Post

                    But, if I understood it right, the idea is to create software that transforms a RISC-V core into a GPU capable of running Vulkan...
                    ok. so it's a little more complex than that. a standard RISC-V core, compliant with all official extensions (including RVV Vectorisation), is simply not capable of the required performance for GPU workloads, at least not within a reasonable power budget.

                    i mentioned before, in other posts here, that it is absolutely essential to add some custom opcodes and extensions. i spent several months corresponding with Jeff Bush, of Nyuzi, and he taught me about power consumption, and the importance of getting through the L1/L2 cache barrier and *not* transferring data back and forth just because there aren't enough registers, for example.

                    i was absolutely stunned to find, for example, that even a lowly MALI400 has a whopping *128* registers. this is because, by the time you start issuing instructions in 4-vectors, a standard 32-entry register file is very quickly exhausted.

                    then there are things like FP32 A,R,G,B pixel conversion to ARGB8,8,8,8 (32-bit INT). this is *expensive* in software because it is not a straight linear conversion: Vulkan requires a non-linear transformation, due to colour-distortion effects that are discernible by human eyes.

                    thus, for better parallelism and also to get the CPU cycles back, we *have* to do that one as a custom instruction.
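                    to illustrate why that conversion is not a straight multiply, here is a sketch of the standard sRGB transfer function (the non-linear encoding that Vulkan's *_SRGB formats specify for colour channels; which formats and channels it applies to is a detail of the real pipeline - this is just the maths):

```python
def srgb_encode(linear):
    """Convert one linear FP32 colour channel in [0.0, 1.0] to an
    8-bit sRGB value: linear near black, a power curve elsewhere."""
    linear = min(max(linear, 0.0), 1.0)  # clamp to [0, 1]
    if linear <= 0.0031308:
        encoded = 12.92 * linear
    else:
        encoded = 1.055 * linear ** (1 / 2.4) - 0.055
    return round(encoded * 255)

# the non-linearity: 50% linear intensity does NOT map to 128
print(srgb_encode(0.0), srgb_encode(0.5), srgb_encode(1.0))  # 0 188 255
```

                    doing that branch-and-power-function per channel, per pixel, in software is exactly the cost a custom instruction avoids.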

                    likewise we learned today, from Mitch Alsup, that it would be a really good idea to have custom Texturing instructions to perform the FP-to-Address calculation and matching interpolation.

                    likewise, Z-Buffers are expensive if done in pure software... that costs you pixels and if not done in hardware costs you power/performance....

                    all the little things add up. as Larrabee and Nyuzi demonstrate, if you only have a good Vector Engine, you *still only reach 25% of the performance / watt of a GPU in the same class/category*.

                    that research by Jeff Bush, on Nyuzi, was *not* a failure, it was an extremely important exercise to *show* people *exactly* why Software 3D Rendering is slow, and, thus, how and precisely where to do better, in hardware.

                    so, "the idea is to create a software that transforms a RISC-V in a GPU capable of running Vulkan" - not quite: the idea is to write a Vulkan driver (actually mostly a Shader Compiler which compiles SPIRV into LLVM IR) *and* develop the hardware and the minimum custom accelerated opcodes and infrastructure needed to get us reasonable 3D performance *on the same CPU*. that CPU will happen to be based around a RISC-V compatible instruction set.

                    Comment


                    • #40
                      Cool, a good start for a third-party project. Esperanto Technologies prototyped a RISC-V-based shader and they were able to get that working pretty quickly. Making a competitive RISC-V shader core would take a sizable instruction-set extension, but I wonder how big the extension would really need to be to make it more useful than software rendering.

                      Comment
