NVIDIA Announces The GeForce GTX 1060, Linux Tests Happening


  • #31
    Originally posted by siavashserver
    I heard you like optical illusions, so here you go :)
    Ha, ha, marketing does that always. I don't claim AMD doesn't do it too ... I'm just pointing out what people can really expect. Hopefully the claimed 15% advantage is not measured against the slowest 4GB RX 480 tested, and not in some Nvidia-optimized titles with drivers tweaked so heavily for release that they break soon afterwards.



    • #32
      Originally posted by Passso View Post
      So this will be 480 VS 1060. At last a real battle, the winner will get my money.

      Fight!

      I don't think there should be any doubt that Nvidia will win hands down with the closed drivers.
      More than performance, it is a matter of driver choice.

      Personally I'd go with the RX 480, since I'm really enjoying the Mesa drivers.
      While they don't offer the best performance, they're still good enough and I have never had a stability issue with them.
      That's something I couldn't say about either fglrx or the Nvidia closed drivers.

      On Windows they should perform equally despite the Nvidia marketing slides (yes, marketing lies...)
      Last edited by sonnet; 07 July 2016, 01:10 PM.



      • #33
        Originally posted by duby229 View Post

        Are you sure they are in order? The diagram for a compute unit doesn't look at all like an in order architecture. Look at the diagram in the post above this one and then compare it to an out of order architecture. They look very similar.
        https://www.amd.com/Documents/GCN_Ar...whitepaper.pdf

        Search for "in-order": "To preserve in-order execution, each instruction must also come from a different wavefront;" in the CU FRONT-END section. This is about instruction execution.
        Search for "out-of-order": "Tasks complete out-of-order, which releases resources earlier, but they must be tracked in the ACE for correctness." in the SYSTEM ARCHITECTURE section. This is about Asynchronous Compute.

        I don't think there is any throughput-optimized architecture that executes instructions out-of-order. In fact, the whole point of building a throughput-optimized architecture is to spend more transistors on calculation instead of on tracking instruction execution order.



        • #34
          Originally posted by duby229 View Post

          I wrote a reply, but it's in the mod queue.

          EDIT: Basically the gist was that for AMD architectures it takes a compute unit to be a core.
          http://images.anandtech.com/doci/4455/GCN-CUTh.png

          Take a good look at this diagram, you can clearly see how the front end, fetch, and decode are at the compute unit. The stream processors by themselves aren't capable of doing anything. The logic to function exists at the compute unit level, which means the compute unit is the core.
          I know, that's what I meant. Both NVIDIA and AMD call each ALU a core. The RX 480 has 2304 "cores", the same as my GTX 780. They should be called ALUs because that's what they are.
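To make the naming point concrete, here is a rough sketch of how the marketing "core" counts break down into units that actually have a front end. The per-unit figures (36 compute units with 4 SIMDs of 16 lanes each on the RX 480, 12 SMX units with 192 CUDA cores each on the GTX 780) are assumptions taken from public spec sheets, not from this thread, and `alu_count` is just a name made up for the example.

```python
def alu_count(units, alus_per_unit):
    """Marketing "cores" = units with their own front end times ALUs per unit."""
    return units * alus_per_unit

# Assumed layout: RX 480 has 36 compute units, each with 4 SIMDs of 16 lanes.
rx_480_alus = alu_count(units=36, alus_per_unit=4 * 16)

# Assumed layout: GTX 780 has 12 SMX units, each with 192 CUDA cores.
gtx_780_alus = alu_count(units=12, alus_per_unit=192)

print("RX 480 :", rx_480_alus, "ALUs in 36 compute units")
print("GTX 780:", gtx_780_alus, "ALUs in 12 SMX units")
```

Both cards end up with the same ALU count, but with a very different number of units that can fetch and decode instructions on their own.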


          Originally posted by Passso View Post
          So this will be 480 VS 1060. At last a real battle, the winner will get my money.

          Fight!

          HADOUKEN!



          • #35
            Originally posted by sonnet View Post
            On windows they should perform equally despite the Nvidia marketing slides
            GTX 1060: 1280 ALUs * 1.5GHz = 1900
            RX 480: 2304 ALUs * 1.2GHz = 2700

            Equal performance in Windows would mean GTX 1060 has 2700/1900=1.4 IPC advantage over RX 480.
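The same back-of-the-envelope arithmetic, written out without rounding. This is only a sketch: it assumes every ALU retires one operation per clock and uses the ALU counts and clocks quoted in this post, and `alu_throughput` is a name invented for the example.

```python
def alu_throughput(alus, clock_ghz):
    """ALU count times clock in GHz, i.e. billions of ALU operations per second.

    Assumes one operation per ALU per cycle, which is a simplification.
    """
    return alus * clock_ghz

gtx_1060 = alu_throughput(1280, 1.5)
rx_480 = alu_throughput(2304, 1.2)

# Per-ALU, per-clock advantage the GTX 1060 would need for equal performance.
required_advantage = rx_480 / gtx_1060

print(f"GTX 1060: {gtx_1060:.0f}")
print(f"RX 480  : {rx_480:.0f}")
print(f"required advantage: {required_advantage:.2f}x")
```

Without the rounding to 1900 and 2700 the required ratio comes out slightly higher than 1.4.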



            • #36
              Originally posted by atomsymbol View Post
              GTX 1060: 1280 ALUs * 1.5GHz = 1900
              RX 480: 2304 ALUs * 1.2GHz = 2700

              Equal performance in Windows would mean GTX 1060 has 2700/1900=1.4 IPC advantage over RX 480.
              Actually, the 1060 runs at 1.7 GHz.
              However, the 480 is still in front, obviously, in terms of FLOPS. It has been like this for a while now, but it hasn't been a big problem for AMD because they were able to compensate for it with wider GPUs. By 'wider' I basically mean more ALUs. AMD's GPUs are also denser, so they don't get much bigger (= more expensive).
              Now Nvidia has this Maxwell shrink and raises clocks to a completely new dimension without extraordinarily high voltages, while AMD has its GCN shrink and can't raise clocks much even with quite high voltages. That's what really hurts them, imho. And that's what I don't understand. They had this problem before, and now it is even worse. Did they think 14 LPP would fix all that automatically?
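For reference, a quick sketch of the peak FLOPS comparison with the 1.7 GHz clock plugged in. It assumes the usual convention of counting a fused multiply-add as 2 FLOPs per ALU per cycle, and it takes the RX 480 clock of 1.2 GHz from the quoted post; `peak_gflops` is a hypothetical helper, not anything from a vendor tool.

```python
def peak_gflops(alus, clock_ghz, flops_per_alu_per_cycle=2):
    """Theoretical peak in GFLOPS; the default of 2 assumes one FMA per cycle."""
    return alus * clock_ghz * flops_per_alu_per_cycle

gtx_1060_gflops = peak_gflops(1280, 1.7)
rx_480_gflops = peak_gflops(2304, 1.2)
flops_ratio = rx_480_gflops / gtx_1060_gflops

print(f"GTX 1060: {gtx_1060_gflops:.0f} GFLOPS")
print(f"RX 480  : {rx_480_gflops:.0f} GFLOPS")
print(f"RX 480 / GTX 1060: {flops_ratio:.2f}")
```

So even at the higher clock the 480 keeps a clear lead on paper, which is the gap Nvidia would have to close with better per-ALU efficiency.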
              Last edited by juno; 07 July 2016, 01:34 PM.



              • #37
                Originally posted by dungeon View Post

                Ha, ha, marketing does that always
                One of the reasons I do my best to ignore marketing and wait for actual product launches and reviews.
                Edit: Yes, I know most reviews start by regurgitating all the marketing slides I try to ignore.



                • #38
                  Originally posted by atomsymbol View Post
                  I don't understand the meaning of in-order and out-of-order in the context of a SIMD processor. Radeon GPUs cannot execute other instructions while waiting for data to arrive from memory for example?
                  They can not execute other instructions from the same instruction stream (the usual definition of out-of-order processing), but they can switch to another thread on the next clock and execute from that instruction stream instead. Each SIMD has 10 program counters associated with it (10 threads) for a total of 40 threads per CU.

                  What makes the terminology tricky is that the other thread may be executing the same shader program but with different data and a different program counter, but IMO that does not count as "out of order processing"... "block multithreading" is probably a good description.
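A toy model of that behaviour, purely as a sketch: one SIMD issues at most one instruction per cycle, every wavefront runs strictly in order, and a wavefront that issues a load cannot issue again until the data is back. The instruction mix, the latency of 10 cycles and the "lowest-numbered ready wavefront wins" policy are all made-up assumptions, not how the real GCN arbiter works.

```python
def simulate(num_wavefronts, instructions_per_wave, load_every, load_latency):
    """Return (cycles, utilization) for one SIMD with in-order wavefronts.

    Every `load_every`-th instruction of a wavefront is treated as a load
    that blocks that wavefront for `load_latency` cycles (assumed model).
    """
    pc = [0] * num_wavefronts        # one program counter per wavefront
    ready_at = [0] * num_wavefronts  # first cycle each wavefront may issue again
    total = num_wavefronts * instructions_per_wave
    cycle = 0
    issued = 0
    while issued < total:
        # Pick the first wavefront that still has work and is not stalled.
        for w in range(num_wavefronts):
            if pc[w] < instructions_per_wave and ready_at[w] <= cycle:
                pc[w] += 1
                issued += 1
                is_load = pc[w] % load_every == 0
                ready_at[w] = cycle + (load_latency if is_load else 1)
                break
        # If nothing was ready, this cycle is simply an idle slot.
        cycle += 1
    return cycle, issued / cycle

single = simulate(num_wavefronts=1, instructions_per_wave=8, load_every=4, load_latency=10)
many = simulate(num_wavefronts=10, instructions_per_wave=8, load_every=4, load_latency=10)

print(f"1 wavefront  : {single[0]} cycles, utilization {single[1]:.2f}")
print(f"10 wavefronts: {many[0]} cycles, utilization {many[1]:.2f}")
```

With a single wavefront the SIMD sits idle during every load. With ten program counters to choose from, most of those gaps get filled, even though no wavefront ever executes its own instructions out of order.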



                  • #39
                    Originally posted by atomsymbol View Post
                    I don't understand the meaning of in-order and out-of-order in the context of a SIMD processor. Radeon GPUs cannot execute other instructions while waiting for data to arrive from memory for example?
                    Bah... forum software ate my post again (not moderated, just started processing the post, then stopped, redrew the screen again, and my post was gone forever).

                    So... GCN GPUs can not execute other instructions from the same instruction stream while waiting for data to arrive from memory. They can, however, switch to another instruction stream on the next clock cycle and continue executing seamlessly. I believe "block multiplexing" is the usual name for this.

                    Each SIMD has 10 program counters associated with it, for a total of 40 threads per CU.



                    • #40
                      Auggh, that's two posts eaten one after another. Each SIMD is associated with 10 program counters, so 40 threads per CU. When waiting for memory etc... the shader core switches to another instruction stream on the next clock cycle, allowing it to continue execution but not on the same instruction stream.
