Goodbye ATI

  • Measuring general FLOPS is wrong; the right metric is MAC-FLOPS. Example: MADD (x = a*b) is 1 general FLOP because it performs one * action or one + action, but 2 MAC-FLOPS because it acts on 2 operands, a and b. FMAC (x = a*b + c) is 2 general FLOPS because it performs 2 fused actions, a* and a+, but 3 MAC-FLOPS because it acts on 3 operands, a, b and c. So FMAC is 1.5-2 times faster than MADD. Examples:

    GTX 280: 240 MIMD cores (32-bit) × 1.4 GHz × 2-3 (MADD + MUL) = 1 MAC-TFLOP

    Radeon 6900: 384 VLIW4 cores × 4 32-bit executions × 900 MHz = 1.35 general TFLOPS (32-bit), × 2 (MADD) = 2.7 MAC-TFLOPS (32-bit)

    GTX 580: 512 MIMD (dual-issue) cores × 1 64-bit or 2 32-bit executions × 1.55 GHz × 2-3 (FMAC) = 1.6 general TFLOPS (64-bit), or 3.2 general TFLOPS (32-bit), or 4.7 MAC-TFLOPS (32-bit)

    Radeon 7000: 512 SIMD4 cores × 4 32-bit executions × 900 MHz × 2-3 (FMAC) = 3.8 general TFLOPS (32-bit) or 5.7 MAC-TFLOPS (32-bit)

    GTX 600: 2× the cores of the GTX 580 × 2× the bitrate (128-bit quad issue, at the same transistor count) = 6.5 general TFLOPS (64-bit), or 13 general TFLOPS (32-bit), or 20 MAC-TFLOPS (32-bit).

    While AMD gains 2× FLOPS/watt per generation, NVIDIA has gained 4× since the GTX 280. AMD also has mediocre OpenGL and bad D3D-to-OGL translation, regardless of generation. AMD is not even close to Fermi and Kepler in programmability: no full native integer, no full native 64-bit, no good VM like CUDA for broad multi-language support. AMD is not cheap either: I can buy a new GTX 460 for 100 bucks in my country, overclock it to 1.8-1.9 GHz, and get near-GTX 580 (or 75% of a Radeon 7000) performance, and I use wine-mediacoder_cuda for H.264 encoding.
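For comparison, the arithmetic above can be redone with the standard convention, where one MADD/FMA counts as two FLOPs (multiply + add); the "MAC-FLOPS" figure is the poster's own metric, not an industry one. A minimal sketch, using widely published reference clocks and ALU counts as assumptions:

```python
def peak_tflops(alus, flops_per_alu_per_cycle, clock_ghz):
    """Theoretical peak: ALUs x FLOPs/ALU/cycle x clock (GHz), in TFLOPS."""
    return alus * flops_per_alu_per_cycle * clock_ghz / 1000.0

# One MADD per ALU per cycle, counted as 2 FLOPs (mul + add):
hd6970 = peak_tflops(1536, 2, 0.880)  # Radeon HD 6970 (384 VLIW4 x 4 ALUs): ~2.70
gtx580 = peak_tflops(512, 2, 1.544)   # GTX 580 (512 cores, shader clock):   ~1.58
print(hd6970, gtx580)
```

These come out near the 2.7 and 1.6 TFLOPS figures quoted above without any special "MAC-FLOPS" accounting.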



    • Originally posted by karmakoma View Post
      I've also switched to nvidia (GTX 560). Solved problems (using closed nvidia driver):
      • games under wine work,
      • accelerated video decoding without problems (more formats supported),
      • hibernation/sleep problems solved (under fglrx it usually worked),
      • vsync works,
      • KDE works without screen problems (seems like uninitialized video memory)

      Before this I used an HD 4870; that card is great, but only under Windows (it is almost as powerful as my current NVIDIA card). Using fglrx wasn't so bad; the problem was the many small issues that made the overall experience quite hard.
      The vertical sync issue seems to be getting a bug fix within the next two releases (See the Unofficial bug report)



      • Originally posted by artivision View Post
        Measuring general FLOPS is wrong; the right metric is MAC-FLOPS. Example: MADD (x = a*b) is 1 general FLOP because it performs one * action or one + action, but 2 MAC-FLOPS because it acts on 2 operands, a and b. FMAC (x = a*b + c) is 2 general FLOPS because it performs 2 fused actions, a* and a+, but 3 MAC-FLOPS because it acts on 3 operands, a, b and c. So FMAC is 1.5-2 times faster than MADD. Examples:

        GTX 280: 240 MIMD cores (32-bit) × 1.4 GHz × 2-3 (MADD + MUL) = 1 MAC-TFLOP

        Radeon 6900: 384 VLIW4 cores × 4 32-bit executions × 900 MHz = 1.35 general TFLOPS (32-bit), × 2 (MADD) = 2.7 MAC-TFLOPS (32-bit)

        GTX 580: 512 MIMD (dual-issue) cores × 1 64-bit or 2 32-bit executions × 1.55 GHz × 2-3 (FMAC) = 1.6 general TFLOPS (64-bit), or 3.2 general TFLOPS (32-bit), or 4.7 MAC-TFLOPS (32-bit)

        Radeon 7000: 512 SIMD4 cores × 4 32-bit executions × 900 MHz × 2-3 (FMAC) = 3.8 general TFLOPS (32-bit) or 5.7 MAC-TFLOPS (32-bit)

        GTX 600: 2× the cores of the GTX 580 × 2× the bitrate (128-bit quad issue, at the same transistor count) = 6.5 general TFLOPS (64-bit), or 13 general TFLOPS (32-bit), or 20 MAC-TFLOPS (32-bit).

        While AMD gains 2× FLOPS/watt per generation, NVIDIA has gained 4× since the GTX 280. AMD also has mediocre OpenGL and bad D3D-to-OGL translation, regardless of generation. AMD is not even close to Fermi and Kepler in programmability: no full native integer, no full native 64-bit, no good VM like CUDA for broad multi-language support. AMD is not cheap either: I can buy a new GTX 460 for 100 bucks in my country, overclock it to 1.8-1.9 GHz, and get near-GTX 580 (or 75% of a Radeon 7000) performance, and I use wine-mediacoder_cuda for H.264 encoding.
        Old and wrong story again. This is only true if you need FMA. For general use you can check the speed with Bitcoin as a benchmark, and in fact the AMD cards rip the NVIDIA cards apart.

        But yes, you can explain to us why AMD cards are so good at Bitcoin.

        It's the same with the AMD FX-8150 vs. Intel and FMA4: most programs don't use FMA4, which is why Intel wins all the time.

        Same with the NVIDIA card: FMA is really specific, not "general".



        • FMA is not an instruction set, and it is general, though not guaranteed. It is just the two operations of MADD (*, +) fused. You must have well-written, clean code to achieve more than 95% fusion, which the Bitcoin kernels lack. Here are some benchmarks: http://www.xbitlabs.com/articles/gra...k_5.html#sect3 In Unigine with heavy tessellation, Fermi crushes Radeon by 1.5-2×! Anyway, I want to correct my previous post about the GTX 600. I have new information: the GTX 600 has only 3.5 billion transistors at 28nm against the GTX 580's 3B at 40nm, has 1024 64-bit cores (of the GTX 500 kind), and a 1.8 GHz clock with "turbo boost" near 2 GHz, at only 180W (30% less than the GTX 580). It offers 2.5× the GTX 500 performance and 3.4× the perf/watt (4 TFLOPS @ 64-bit, 8 TFLOPS @ 32-bit, 12 MAC-TFLOPS). It also offers 2.1× the Radeon 7970 performance at lower wattage, which means NVIDIA can ship a 1536-core card in the near future (3 months later). Anyway, this is Phoronix, Linux (libre); we don't discuss things that don't work on Linux. So why do some of you talk about Radeon cards here?
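What fusion buys, besides throughput, is a single rounding: a hardware FMA computes a*b + c exactly and rounds once, whereas a separate multiply then add rounds twice. A small Python sketch of the difference, emulating the single rounding with exact rational arithmetic (since `math.fma` only exists in very recent Python versions):

```python
from fractions import Fraction

a, b, c = 1e16, 1.0000000000000002, -1e16  # b is the double just above 1.0

# Unfused: round(a*b), then round(... + c) -- two roundings.
unfused = a * b + c

# Fused: round the exact value of a*b + c once, as a hardware FMA would.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused)  # the low bits of a*b are rounded away, leaving exactly 2.0
print(fused)    # ~2.22, because the single rounding preserves them
```

The gap between the two results is exactly the information a second rounding destroys, which is why FMA matters for numerics as well as for FLOPS counting.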



          • Originally posted by artivision View Post
            FMA is not an instruction set, and it is general, though not guaranteed. It is just the two operations of MADD (*, +) fused. You must have well-written, clean code to achieve more than 95% fusion, which the Bitcoin kernels lack. Here are some benchmarks: http://www.xbitlabs.com/articles/gra...k_5.html#sect3 In Unigine with heavy tessellation, Fermi crushes Radeon by 1.5-2×! Anyway, I want to correct my previous post about the GTX 600. I have new information: the GTX 600 has only 3.5 billion transistors at 28nm against the GTX 580's 3B at 40nm, has 1024 64-bit cores (of the GTX 500 kind), and a 1.8 GHz clock with "turbo boost" near 2 GHz, at only 180W (30% less than the GTX 580). It offers 2.5× the GTX 500 performance and 3.4× the perf/watt (4 TFLOPS @ 64-bit, 8 TFLOPS @ 32-bit, 12 MAC-TFLOPS). It also offers 2.1× the Radeon 7970 performance at lower wattage, which means NVIDIA can ship a 1536-core card in the near future (3 months later). Anyway, this is Phoronix, Linux (libre); we don't discuss things that don't work on Linux. So why do some of you talk about Radeon cards here?
            "In Unigine with heavy tessellation, Fermi crushes Radeon"
            You live in the past: the HD 7970 is faster at tessellation than your GTX 580.
            That's an old story.

            "FMA is not an instruction set, and it is general, though not guaranteed. It is just the two operations of MADD (*, +) fused. You must have well-written, clean code to achieve more than 95% fusion."

            Blah blah blah, and none of that counts if your software doesn't use FMA.

            Same with AMD and FMA4: if your software uses it, then the AMD is 56 times faster than the Intel.
            FMA is the only case where NVIDIA wins, and you generalize it just to score a point, but that is simply not true!

            "Anyway, I want to correct my previous post about the GTX 600."

            The HD 7970 runs at 1.4 GHz (base is 900 MHz), and the Guinness Book of Records lists the HD 7970 as the fastest card on earth, not an NVIDIA GTX 680.

            To beat a GTX 680 you only need high resolutions; then its 2 GB of VRAM becomes a bottleneck and the HD 7970 wins because of its 3 GB of VRAM.

            Also, NVIDIA is known to manipulate many benchmarks; PhysX, for one, works only on NVIDIA.

            Also, an HD 7970 with 6 GB of VRAM is coming (not a dual card, so all 6 GB are usable); only fools think the GTX 680's 2 GB of VRAM is better for GPU compute tasks than 6 GB.

            The GTX 680 is only for poor people.



            • The new Radeon 7970 has FMA, replacing the old vector VLIW5 with a new scalar architecture. The Radeon 7970 is only 20% faster than the GTX 580 and less programmable; what will it do against the GTX 600 (2.5× faster)? Tessellation is a control engine, not an execution unit; the main program runs on the shaders, and more FLOPS = more speed. Many GB of RAM has nothing to do with speed. And Radeon does not work on Linux.



              • And a Radeon 7970 has a maximum clock of 1.1 GHz, not 1.4 GHz. http://www.tomshardware.co.uk/AMD-Ra...ews-37352.html



                • Originally posted by artivision View Post
                  And a Radeon 7970 has a maximum clock of 1.1 GHz, not 1.4 GHz. http://www.tomshardware.co.uk/AMD-Ra...ews-37352.html
                  The Guinness Book of Records counts 1.4 GHz.



                  • Originally posted by artivision View Post
                    The new Radeon 7970 has FMA, replacing the old vector VLIW5 with a new scalar architecture. The Radeon 7970 is only 20% faster than the GTX 580 and less programmable; what will it do against the GTX 600 (2.5× faster)? Tessellation is a control engine, not an execution unit; the main program runs on the shaders, and more FLOPS = more speed. Many GB of RAM has nothing to do with speed. And Radeon does not work on Linux.
                    "Many GB of RAM has nothing to do with speed."

                    It's a bottleneck at high resolutions and with supersampling AA.

                    "And Radeon does not work on Linux."

                    Well, that depends on how you count; I only count open-source drivers.

                    "The Radeon 7970 is only 20% faster than the GTX 580"

                    If you overclock both the GTX 680 and the HD 7970, the Radeon wins.
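The high-resolution/supersampling point is easy to put numbers on: supersampling renders at a multiple of the output resolution, so buffer memory grows with the square of the SSAA factor. A rough sketch (the buffer count and 4 bytes/pixel are simplifying assumptions; real drivers allocate considerably more, and textures come on top):

```python
def framebuffer_mib(width, height, ssaa=1, bytes_per_pixel=4, buffers=3):
    """Approximate render-buffer footprint when rendering at
    ssaa x the output resolution per axis (e.g. color/depth/back buffer)."""
    pixels = (width * ssaa) * (height * ssaa)
    return pixels * bytes_per_pixel * buffers / (1024 ** 2)

# 2560x1600 output with 2x2 supersampling:
print(framebuffer_mib(2560, 1600, ssaa=2))  # 187.5 MiB for the buffers alone
```

With 2 GB vs 3 GB of VRAM, that quadratic growth is why the bottleneck shows up first at high resolutions with supersampling.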



                    • I don't count open-source drivers because they don't exist; even the Gallium r600 driver has closed firmware. About the 1.4 GHz Radeon: that was with ice cooling; the true maximum clock is 1.1 GHz: http://www.tomshardware.co.uk/AMD-Ra...ews-37352.html The Radeon 7970 is only 20% faster than the GTX 580: http://www.xbitlabs.com/articles/gra...c_6.html#sect3 The final GTX 600 specs are 1536 64-bit cores @ 1.8+ GHz, 28nm, 3.5B transistors, ~200W: http://www.techpowerup.com/162500/GK...Explained.html and http://www.techpowerup.com/162504/NV...-Detailed.html How will a card 3.6× faster than the GTX 580 and 3× faster than the Radeon 7970 lose to anyone? Even the 1/3-cores, 130-buck mini Kepler will be magnificent!



                      • Originally posted by artivision View Post
                        I don't count open-source drivers because they don't exist; even the Gallium r600 driver has closed firmware. About the 1.4 GHz Radeon: that was with ice cooling; the true maximum clock is 1.1 GHz: http://www.tomshardware.co.uk/AMD-Ra...ews-37352.html The Radeon 7970 is only 20% faster than the GTX 580: http://www.xbitlabs.com/articles/gra...c_6.html#sect3 The final GTX 600 specs are 1536 64-bit cores @ 1.8+ GHz, 28nm, 3.5B transistors, ~200W: http://www.techpowerup.com/162500/GK...Explained.html and http://www.techpowerup.com/162504/NV...-Detailed.html How will a card 3.6× faster than the GTX 580 and 3× faster than the Radeon 7970 lose to anyone? Even the 1/3-cores, 130-buck mini Kepler will be magnificent!
                        Nouveau has free and open source firmware. They asked Nvidia if they could distribute theirs and Nvidia said no, so the Nouveau project came up with their own.

                        I guess it's an odd example of a company being openly hostile to free and open source software where some good came out of it.

                        I doubt we'll ever be able to use most of our devices without blob firmware. Sometimes it's hard to convince a company to allow their blobs to be redistributed. So distributions like Debian and Fedora give up and try to rationalize blob firmware as being somehow different from proprietary software, so that they can look like they are not violating their own policies.



                        • Originally posted by artivision View Post
                          I don't count open-source drivers because they don't exist; even the Gallium r600 driver has closed firmware. About the 1.4 GHz Radeon: that was with ice cooling; the true maximum clock is 1.1 GHz: http://www.tomshardware.co.uk/AMD-Ra...ews-37352.html The Radeon 7970 is only 20% faster than the GTX 580: http://www.xbitlabs.com/articles/gra...c_6.html#sect3 The final GTX 600 specs are 1536 64-bit cores @ 1.8+ GHz, 28nm, 3.5B transistors, ~200W: http://www.techpowerup.com/162500/GK...Explained.html and http://www.techpowerup.com/162504/NV...-Detailed.html How will a card 3.6× faster than the GTX 580 and 3× faster than the Radeon 7970 lose to anyone? Even the 1/3-cores, 130-buck mini Kepler will be magnificent!
                          "About the 1.4 GHz Radeon: that was with ice cooling; the true maximum clock is 1.1 GHz"

                          LOL, if the card runs at 1.4 GHz on ice, then the true maximum clock is 1.4 GHz.

                          Anyone can buy a refrigerator for their card. A friend of mine uses a continuous beer cooler for his water-cooling system, and instead of water he runs glycol, so he runs at -27°C.

                          "will lose to anyone?"

                          Sure NVIDIA will lose, because: an HD 7990 with 12 GB of VRAM; then you buy a beer cooler for your water-cooling loop, and then you overclock it.



                          • Hardware is worth nothing if the drivers suck



                            • That goes both ways: drivers are worth nothing if the hardware sucks.



                              • Indeed. Good thing the hardware doesn't suck ;-)

