Intel Has A Single-Chip Cloud Computer


  • #11
    hmm.. what's stronger: 48x 8086 or 1x 80386?

    This really doesn't mean much.

    Comment


    • #12
      Originally posted by Louise View Post
      Yes, and Sony have finally realised that ...

      So my guess is that the PS4 will have a Larrabee solution, as Sony likes to try out new technologies, and XBox 720 will have an AMD solution, since nVidia is very likely not even interested in working with MS again after the XBox 1, where MS trashed millions of SouthBridges, giving nVidia a loss in one quarter.
      Microsoft and AMD together. Yet another reason not to buy products from either of them.

      Comment


      • #13
        AMD Core Counts and Bulldozer: Preparing for an APU World


        Who are these guys? And where do they get those wonderful toys from?

        AMD did add that eventually, in a matter of 3 - 5 years, most floating point workloads would be moved off of the CPU and onto the GPU. At that point you could even argue against including any sort of FP logic on the "CPU" at all. It's clear that AMD's design direction with Bulldozer is to prepare for that future.
        I wonder what that would mean for programmers.
        Last edited by Louise; 03 December 2009, 06:46 PM.

        Comment


        • #14
          I'm not sure moving *all* of the floating point logic off the CPU would ever make sense, since all kinds of programs use floating point variables and you still want those programs to run without falling back to SW floating point emulation.

          I believe the discussion is more about the SIMD instruction extensions, which work on explicitly vectorized code.
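
          To make "explicitly vectorized" concrete, here's a minimal sketch (assuming SSE intrinsics; the function name is just for illustration) of adding two float arrays four lanes per instruction instead of one element at a time:
          Code:
          /* Sketch only: explicitly vectorized add using SSE intrinsics.
             Assumes n is a multiple of 4; names are illustrative. */
          #include <xmmintrin.h>

          void add_floats(float *dst, const float *a, const float *b, int n)
          {
              for (int i = 0; i < n; i += 4) {
                  __m128 va = _mm_loadu_ps(a + i);             /* load 4 floats from a */
                  __m128 vb = _mm_loadu_ps(b + i);             /* load 4 floats from b */
                  _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));  /* 4 adds in one instruction */
              }
          }
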
          Last edited by bridgman; 03 December 2009, 07:16 PM.

          Comment


          • #15
            Originally posted by bridgman View Post
            I'm not sure moving *all* of the floating point logic off the CPU would ever make sense, since all kinds of programs use floating point variables and you still want those programs to run without falling back to SW floating point emulation.
            Hopefully for Linux programmers it will just be a case of
            Code:
            sed -i 's/%f/%f_gpu/g' *


            Are there other things that GPUs are unquestionably better at than CPUs?

            I suppose that performing FP calculations is very parallelizable?

            Originally posted by bridgman View Post
            I believe the discussion is more about the SIMD instruction extensions, which work on explicitly vectorized code.
            So something like the Cell architecture?

            Comment


            • #16
              Are there other things that GPUs are unquestionably better at than CPUs?
              It's typically highly parallelizable low-precision floating point code, but over the last few years a lot of progress has been made to improve the GPU in other areas and I'm sure that will continue.

              I suppose that performing FP calculations is very parallelizable?
              Not intrinsically, no. Multiplying 2.5 * 3.5 is no more parallel than multiplying 2 * 3. However, many of the applications that heavily use fp hardware are - anything that benefits from SSE support, for example, is probably at least somewhat parallelizable.
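
              For instance (just a sketch, names made up), a loop like this is exactly the kind of fp code SSE helps with: every iteration is independent of the others, so SIMD can do several per instruction and a GPU can spread them across thousands of threads:
              Code:
              /* Illustrative only: each iteration is independent of the others,
                 so the work can be split across SIMD lanes or GPU threads. */
              void saxpy(float *y, const float *x, float a, int n)
              {
                  for (int i = 0; i < n; i++)
                      y[i] = a * x[i] + y[i];   /* one multiply-add per element */
              }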

              Also, GPU hardware is built from the ground up around floating point operations, while most of the code that goes through a cpu is integer based. Floating point hardware is a lot more complicated (and therefore expensive) than integer hardware, so the amount of fp resources that can be justified on a cpu are fairly limited.

              I believe they're talking about having a single gpu on chip to handle multiple cpu cores (threads) at a time, so that will let the hardware stretch its legs a little, and of course the compilers will probably be tuned to try to schedule as many fp operations at a time as they can.

              So something like the Cell architecture?
              It is sort of like the Cell architecture in that they're talking about having different kinds of specialized hardware on the same chip. Cell had a weak general purpose core with a bunch of highly specialized SPUs, while AMD is talking about multiple x86 cores along with a gpu that would handle most of the fp load.

              I'm just guessing here, but I think the idea would be for the hardware to automatically do all the offloading here, which is different than Cell. With Cell, you had to be very careful about what you were programming where, and I would assume that this stuff from AMD would just take normal code and have the cpu fetch/decode logic forward fp calls through to another part of the chip automatically for you.
              Last edited by smitty3268; 03 December 2009, 09:48 PM.

              Comment


              • #17
                Sorry, when I mentioned SIMD instruction extensions I was talking about the SIMD extensions which are already built into x86 processors today. SIMD extensions are how most of the serious floating point work is done on x86 today, but unless you're writing math libraries or game engines you probably don't see them.

                Originally x86 processors only handled integer work, and a separate x87 coprocessor handled floating point operations (or you trapped down to a software emulation library). The coprocessor had its own set of registers and other state information, but it pulled instructions out of the same stream as the rest of your program. The floating point coprocessor moved onto the same die as the CPU around the 486 days, but the separate instructions and registers remained (and are still there today AFAIK).

                Enter SIMD. The Intel MMX extensions added integer SIMD functions -- the ability to process multiple sets of data with a single instruction. AMD's 3DNow! added the first floating point SIMD extensions, targeted at 3D game geometry but useful in other areas as well. Intel's SSE extensions (Streaming SIMD aka Screaming Sindy) added more floating point capabilities along with a third set of registers.

                In general compilers don't directly use the SIMD extensions - you get at them through math libraries or calls into hand-tweaked assembler routines. This is starting to change but we're still at the early stages - OpenCL is one attempt to make the SIMD extensions on CPUs generally accessible.
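
                As a rough sketch of what that looks like, the same per-element work written as an OpenCL C kernel (the kernel name is made up) leaves it to the runtime to map work-items onto the CPU's SIMD lanes or onto a GPU:
                Code:
                /* Hypothetical OpenCL C kernel: one work-item per array element.
                   The runtime decides how work-items map onto SIMD lanes or GPU threads. */
                __kernel void saxpy(__global float *y, __global const float *x, float a)
                {
                    int i = get_global_id(0);   /* index of this work-item */
                    y[i] = a * x[i] + y[i];
                }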

                While all this was happening, GPUs were gradually evolving into floating point SIMD engines as well, but with a higher degree of streaming (trading off cache coherence for throughput) and parallelism than CPUs (HD5870 has 20 SIMD engines each executing 80 floating point ops per clock, i.e. 1600 operations per clock or 3200 FLOPs/clock for MAD).

                This obviously reminds one of Seymour Cray's comment about the emerging conflict between highly parallel microprocessor-based systems and optimized supercomputers:

                If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?
                It took a lot of years, but the chickens are finally starting to win.

                The GPU vs CPU discussion is primarily about whether the work currently handled by the CPU's SIMD extensions (SSEx) can be handled as well or better by a GPU-type architecture instead. So far it looks pretty good - the biggest application of SIMD floating point was 3D geometry and that moved to the GPU quite a few years ago. Math libraries come next, and many of them have been ported to GPUs. APIs like OpenCL and DirectCompute are designed to pick up most of the remaining workload and isolate it from the hardware specifics.
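
                Just to sketch what "isolating it from the hardware specifics" means at the library level (all names here are invented): the caller sees one function, and the build picks whichever backend the hardware actually has:
                Code:
                /* Sketch of a math-library entry point that hides the hardware
                   details from the caller; function names are invented. */
                #include <stddef.h>
                #if defined(__SSE__)
                #include <xmmintrin.h>
                #endif

                void vec_scale(float *v, float s, size_t n)
                {
                    size_t i = 0;
                #if defined(__SSE__)
                    __m128 vs = _mm_set1_ps(s);
                    for (; i + 4 <= n; i += 4)     /* SSE path: 4 floats per instruction */
                        _mm_storeu_ps(v + i, _mm_mul_ps(_mm_loadu_ps(v + i), vs));
                #endif
                    for (; i < n; i++)             /* scalar fallback */
                        v[i] *= s;
                }
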
                Last edited by bridgman; 03 December 2009, 11:23 PM.

                Comment


                • #18
                  FYI, Larrabee is dead. (As I said was going to happen so long ago).


                  Comment


                  • #19
                    Originally posted by deanjo View Post
                    FYI, Larrabee is dead. (As I said was going to happen so long ago).
                    From what I've read it's not officially dead, they're just not releasing the first version because it's not competitive; which is hardly unusual with new architectures in the consumer graphics market. Of course they may decide never to build a second version with the lessons they've learned from this one, but they haven't killed Itanium yet so Larrabee may still be with us a decade or two from now.

                    Personally I've never thought that Larrabee made much sense -- why put x86 instructions into a massively parallel architecture if you could use a new instruction set and eliminate all the complex instruction decoding? -- but I wouldn't write it off yet.

                    Comment


                    • #20
                      Originally posted by movieman View Post
                      From what I've read it's not officially dead, they're just not releasing the first version because it's not competitive; which is hardly unusual with new architectures in the consumer graphics market. Of course they may decide never to build a second version with the lessons they've learned from this one, but they haven't killed Itanium yet so Larrabee may still be with us a decade or two from now.
                      Really, in a decade or two there would be no point, as by that time CPUs should be massively parallel with possibly hundreds (maybe thousands) of cores, making an x86-based graphics card a relatively moot product.

                      Comment
