OpenBLAS 0.3.20 Adds Support For Russia's Elbrus E2000, Arm Neoverse N2/V1 CPUs


  • #71
    Originally posted by In_Mint_Condition
    Better be thought a fool than open your mouth and confirm it.
    Who's the idiot now? He started the genocide a few minutes ago.


    • #72
      Originally posted by coder
      No, I'm pretty sure the last several iterations of IA64 CPUs were only by contractual obligation. I think they gave up on IA64 after about the second generation CPU.
      It could be, but the last version, Kittson, was actually released in 2017:


      • #73
        Originally posted by coder
        What I really want is for Michael to keep covering all tech, whether it's from China, Russia, or anywhere else. I don't think he's easily dissuaded by arguments in the forums, but it would be nice if we can also have intelligent discussions about this tech, and not get bogged down in politics.

        Thanks to you and other Russians for being here and helping us understand your cool CPUs.
        Agree on that, we love technology!


        • #74
          Originally posted by mSparks

          Yes, I got that, I just didn't get what you think is the difference between M1 long instruction words that contain multiple RISC instructions that run in parallel and Elbrus VLIW that contain multiple RISC instructions that run in parallel.

          You keep mentioning out of order, but these are all parallel instructions, the whole point is there is no order.
          No, there are no "M1 long instruction words that contain multiple RISC instructions that run in parallel". Just look at any aarch64 manual. It's a "standard" RISC ISA, not VLIW. They are executed in parallel as the CPU uses the OoO machinery to dynamically figure out register dependencies, as described in Tomasulo's 1967 paper.
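To make "dynamically figure out register dependencies" a bit more concrete, here is a toy sketch. It is not Tomasulo's actual algorithm (that also involves reservation stations and register renaming), and the instruction tuples and register names are made up for illustration. It only shows how a plain sequential instruction stream already contains enough information to find which instructions can issue together, with no long instruction word in the ISA.

```python
# Toy model: each instruction is (destination_register, [source_registers]).
# Hypothetical stream for illustration; not real aarch64 encoding.
program = [
    ("x1", ["x0"]),        # i0: x1 = f(x0)
    ("x2", ["x0"]),        # i1: x2 = g(x0)     independent of i0
    ("x3", ["x1", "x2"]),  # i2: x3 = x1 + x2   needs i0 and i1
    ("x4", ["x0"]),        # i3: x4 = h(x0)     independent of everything above
]

def issue_cycles(instrs):
    """Earliest cycle each instruction could issue.

    Assumptions (all simplifications): unit latency, unlimited execution
    units, and only read-after-write dependencies matter (i.e. renaming
    is assumed to remove WAR/WAW hazards).
    """
    ready_at = {}  # register -> cycle at which its latest value is available
    cycles = []
    for dest, srcs in instrs:
        # Registers never written in this stream are assumed ready at cycle 0.
        cycle = max((ready_at.get(r, 0) for r in srcs), default=0)
        cycles.append(cycle)
        ready_at[dest] = cycle + 1
    return cycles

print(issue_cycles(program))  # [0, 0, 1, 0]
```

In this sketch i0, i1 and i3 can all go in cycle 0 and only i2 has to wait, and the hardware works that out at run time from the register names alone.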

          Now, there have been CPUs that use a VLIW-like backend, with a frontend converting the instructions of a "traditional" ISA to VLIW-like internal instructions. Like Transmeta and NVIDIA Denver, IIRC. But I have seen no indications anywhere that Apple M1 would be anything like that. Everything I've seen suggests the M1 microarchitecture is a "normal" OoO core design.


          • #75
            Originally posted by coder
            "APB (automatic prefetch buffer) programmable to deliver RAM contents at given patterns into L2 cache predictably"

            Prefetching is essential, even for modern, out-of-order cores. Because even they don't have big enough reorder buffers to hide the latency of a read that has to go all the way out to DRAM. And deep reorder buffers presume you can even find enough work to do that doesn't depend on the missing data.
            Prefetching can help, but it won't . I mean, if prefetching were possible all the time, we wouldn't need any caches; we would just load a register 1000 cycles before using it.

            The tragedy of VLIW is that any single unexpected cache miss stops the thread from executing for an unknown number of cycles. It has all the cons of OoO, like needing some other work to do while waiting for data, but all these problems are amplified by VLIW. The number of independent instructions is limited. OoO can use them when they are needed: on an L1 cache miss it only has to fill the L2 delay. With VLIW you have to predict data availability ahead of time, so you have to fill up to the full RAM access delay. And what if there are two dependent memory accesses, like ptr1->parent.data? Will the compiler have enough independent instructions to hide 2 RAM accesses?
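The cache-miss argument can be sketched with a toy timing model. Everything here is an assumption made for illustration: the 100-cycle miss latency, the unlimited issue width, the idealised OoO window, and the instruction mix. It does not model Elbrus or any real core. It only compares an in-order machine that stalls on the first instruction whose input is missing against one that may issue any instruction whose inputs are ready.

```python
def finish_time(program, in_order):
    """Cycle at which the last result becomes available.

    program: list of (latency, [indices of producer instructions]),
             given in program order.
    in_order=True : an instruction may not issue before its predecessor
                    has issued (stall-on-use, unlimited issue width).
    in_order=False: an instruction issues as soon as its inputs are ready
                    (idealised OoO with an unlimited window).
    """
    done = []        # cycle at which each instruction's result is ready
    prev_issue = 0   # issue cycle of the previous instruction
    for latency, deps in program:
        issue = max((done[d] for d in deps), default=0)
        if in_order:
            issue = max(issue, prev_issue)
        prev_issue = issue
        done.append(issue + latency)
    return max(done)

MISS = 100  # assumed DRAM latency in cycles, purely illustrative

# i0: a load that unexpectedly misses all caches
# i1: the consumer of that load, scheduled right behind it
# i2..i51: a chain of 50 one-cycle ops that do not depend on the load
program = [(MISS, []), (1, [0]), (1, [])] + [(1, [i]) for i in range(2, 51)]

print(finish_time(program, in_order=True))   # 150
print(finish_time(program, in_order=False))  # 101
```

In this model the statically ordered machine sits idle behind the stalled consumer and only then starts the independent chain, while the dynamic one overlaps the chain with the miss. A compiler could of course hoist the chain above the consumer, but only if it knew in advance that the load would miss, which is the prediction problem described above.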

            VLIW proponents usually talk about a magic compiler which will somehow solve all problems. But that is nonsense. Any compiler can only guess code behavior with some amount of certainty. You are talking about PGO, so you mean it; PGO just allows the compiler to evaluate probabilities more accurately. But the unforgiving nature of VLIW forces the compiler to be more pessimistic. That means that if someone is able to create a state-of-the-art compiler that produces near-perfect code for a 16-way VLIW, then this compiler can easily be adapted to be more optimistic and to rearrange scalar code in such a way that some 24-way OoO superscalar CPU will be able to fill all its pipelines.

            OoO is always better than VLIW. And the second tragedy of VLIW: it has no real advantages. The idea behind it was to pack more computational power into a CPU. The classic ALU + control design looked inefficient, so let's add more ALUs. But this problem has already been solved, partly by superscalar, partly by SIMD. VLIW has nothing to offer to justify the problems it brings.


            • #76

              Originally posted by coder
              Let's say you're right. If the point of your posts is primarily to feed their algorithms, it might be self-defeating. You could just end up informing them of which facts to refute.
              Their propaganda machine doesn't present or refute any facts, they just present an agenda and back it up with make-believe stories like Osama bin Laden in Afghanistan, Iraq weapons of mass destruction hitting London in 90 minutes or the Syria gas attack.

              Presenting some facts would be a nice move in the right direction.

              Originally posted by jabl

              No, there are no "M1 long instruction words that contain multiple RISC instructions that run in parallel". Just look at any aarch64 manual.
              So your position is Apple aren't doing with the M1 what they announced they are doing with an M1, and to prove it to myself I should look at a manual for an entirely different processor.

              Perhaps you should look at "why the M1 is so fast".

              e.g.

              https://www.linkedin.com/pulse/refle...ric-kolotyluk/

              I have since learned that Apple's M1 incorporates some of the innovations from this product, in particular, the Very Long Instruction Word architecture
              Last edited by mSparks; 24 February 2022, 08:01 AM.


              • #77
                Originally posted by caligula
                Who's the idiot now? He started the genocide a few minutes ago.
                Clearly you don't understand the word genocide. I would ask if you even knew the Donbass to be independent and not part of Ukraine since 2015 but it would be a waste of time.
                My time is too precious to talk to such ignorant idiots!


                • #78
                  Originally posted by mSparks
                  ...and if people are accepting that CNN/FOX/BBC/ABC etc agenda....
                  But accepting Russia Today's agenda is completely legitimate?


                  • #79
                    Originally posted by In_Mint_Condition
                    I would ask if you even knew the Donbass to be independent and not part of Ukraine since 2015...
                    If that were true (clearly it's not, since Russia only just recently recognized Donbass as independent), then there would not be any need for the Minsk agreements.
                    But surely you knew that?


                    • #80
                      Originally posted by tomas

                      But accepting Russia Today's agenda is completely legitimate?
                      Listening to what people said as a source of knowing what they said is perfectly legitimate, do you not think?
