AMD's HD7970 is ready for action! The most efficient and fastest card on earth.


  • #61
    Source 1: http://en.wikipedia.org/wiki/Norther...s_(GPU_family) - Source 2: http://en.wikipedia.org/wiki/GeForce_400_Series - Source 3: http://en.wikipedia.org/wiki/GeForce_500_Series

    Look at the GTX 400 wiki under History: Fermi's double precision is half its single precision, while Radeon is 1/4 (VLIW4) or 1/5 (VLIW5).

    Look at the GTX 500 wiki: the GTX 580 is listed at 1.6 Tflops of "estimated 64bit-FMA2 power". But we want the final 32bit-FMA3 power, so we multiply by 3 (64bit dual-issue = 2 x 32bit, and FMA3 = 3 ops where FMA2 = 2). If you look at the GTX 400 wiki under Products, FMA3 has estimates of 2.5 and 2.7 ops while the table only counts the base 2 ops (FMA2). So the true GTX 580 power is 4.7 Tflops versus 2.7 Tflops for the Radeon 6970; that's 75% faster for the GTX 580. Proof: http://www.anandtech.com/bench/Product/305?vs=292 - see that in the new Unigine (Heavy Tessellation mode) the GTX 580 is 40% faster than the Radeon 6970, and in DX11 (high) it is 2x faster!!! And a real benchmark: h264 transcoding is 75% faster (so they say). Games are not a good benchmark, because multi-sampling is a cheat and because what counts is not only FPS but FPS x image quality, and the GTX 580 has better effects that apply even in 3DMark.

    So how can the Radeon 7000, with 35% more math (than the 6970), beat the GTX 580??? And how can it beat a GTX 600 that is 5 times faster than the GTX 580: http://en.wikipedia.org/wiki/Compari...rce_600_Series ???

    Finally, VLIW has nothing to do with multiply. VLIW executes operations in parallel, based on a fixed schedule determined when programs are compiled, by a more complex compiler. So VLIW4 means that a vector register is 128 bits and executes 4 x 32bit objects (AMD calls those objects stream processors).

    Also, AMD won't work for Linux: many games crash when 3D starts (like CSS), many have pure FPS due to pure Radeon D3D-to-OGL translations, techs like PhysX and CG won't work on the GPU (only on the CPU, so PCSX2 will run at 20 fps on a dual-core SSSE3 @ 3 GHz), video acceleration does not do anything important, while video transcoding practically doesn't exist (the expensive OpenCL transcoders have bad quality, while the MediaCoder CUDA one rocks). As a friend of mine puts it: "A bad h264 transcoder has less quality than a good MPEG-2."
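
    A minimal Python sketch of the arithmetic described above. The x2 ("64bit dual-issue to 32bit") and x1.5 ("FMA2 to FMA3") factors, i.e. the combined x3, are this post's own assumptions, not vendor-published figures:

    ```python
    # Reproduces the conversion claimed above; the x3 factor is the post's
    # assumption (x2 for "64bit -> 32bit", x1.5 for "FMA2 -> FMA3").
    gtx580_fma2_tflops = 1.581    # wiki "estimated 64bit-FMA2 power"
    radeon6970_tflops = 2.703     # wiki single-precision figure

    gtx580_claimed = gtx580_fma2_tflops * 3          # ~4.7 Tflops, as claimed
    advantage = gtx580_claimed / radeon6970_tflops   # ~1.75, i.e. "75% faster"

    print(f"claimed GTX 580 power: {gtx580_claimed:.1f} Tflops")
    print(f"claimed advantage: {(advantage - 1) * 100:.0f}%")
    ```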

    Comment


    • #62
      I mean poor, sorry!!!

      Comment


      • #63
        The GTX 460 SE is only 100 bucks new!!! Mine has 336 cores unlocked and runs at 1.9 GHz = 3.83 Tflops (32bit-FMA3)!!!

        Comment


        • #64
          Originally posted by efikkan View Post
          I'm still waiting for boards shipping with Coreboot, which my company would consider buying. Replacing the EFI is ok for home use, but not for commercial use.

          Keep in mind AMD is committed to EFI.
          Of course they must be committed to EFI; Windows 8 will not boot without EFI. I'm hoping they will go with coreboot with an EFI payload, so Linux users will be able to replace the payload with a more useful one. BTW, does your company want desktops or servers? For servers there is a higher chance of finding boards made specifically for Linux.

          Originally posted by efikkan View Post
          Then maybe you should check it out before dismissing the driver.
          1 Usually in the next or the following driver release, within a month.
          2 For Debian-based distributions you can just use the one in the ppa or add the custom ppa.
          3 I would disagree.
          1 That's exactly what I said. For my work computer it may be OK to wait up to a month before upgrading to the latest kernel (I do that anyway), but my home computer is as bleeding-edge as possible.
          2 I don't use a Debian-based distribution, and even if I did, I don't think it would solve my problem: I'm talking about building my own custom kernel (make oldconfig; make -j5; make install; make modules_install), not using the one provided by the distribution or some PPA.
          3 Did you do a `dmesg | grep -i taint` to check, or do you just like to disagree? The Debian wiki still mentions this as a problem: http://wiki.debian.org/NvidiaGraphic...ints_kernel.22 (see the sketch further down).

          BTW, all the problems I mentioned are design problems that apply to all binary drivers (or even to free drivers maintained outside the Linux kernel) and will most likely never be fixed.
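
          A minimal Python sketch of the taint check from point 3; instead of grepping dmesg it reads the kernel's taint bitmask directly (bit 0 is the 'P' flag, set when a proprietary module is loaded):

          ```python
          # Check whether the running kernel is tainted by a proprietary module.
          # Same information as `dmesg | grep -i taint`, read from the bitmask
          # in /proc/sys/kernel/tainted (bit 0 = 'P', proprietary module).
          with open("/proc/sys/kernel/tainted") as f:
              taint = int(f.read().strip())

          if taint == 0:
              print("kernel is not tainted")
          elif taint & 1:
              print(f"kernel tainted (mask {taint:#x}): proprietary module loaded")
          else:
              print(f"kernel tainted (mask {taint:#x}), but not by a proprietary module")
          ```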

          Originally posted by efikkan View Post
          You didn't get my point. Both alternatives have proprietary components, so by your logic both should be evil. There is no way to avoid everything proprietary in the real world.
          Keeping the kernel untainted is a big enough issue for me. I do use hardware that requires proprietary firmware, and some user-space blobs as well; I try to replace them with free alternatives whenever possible, but proprietary kernel drivers are too much.

          Originally posted by efikkan View Post
          If you do OpenGL development, you should target the OpenGL 3.2+ platform, which every following version is based on. All modern hardware since the GeForce 8000/Radeon HD 2000 series supports this hardware feature level (SM4), and SM4 differs from older versions. There is no sensible reason to optimize for OpenGL 2.1 today, when every performing piece of hardware is optimized for SM4, and SM3 (or older) will be a serious bottleneck.

          Utilities like Blender and GIMP use OpenGL and OpenCL, so proper support is actually important for a large group of people.
          With OpenGL 3.0 support almost ready and most features required for OpenGL 3.3 (except the newer GLSL) already implemented, the biggest issue still remaining is OpenCL.

          Comment


          • #65
            Originally posted by artivision View Post
            You don't have any clue about how to work with sources.
            You have to quote the part of the source that covers your claim.
            Do you really think people read the whole Bible just because you claim the Bible is the source?
            Also, your source about the Northern Islands GPU family is useless because there is no quote.

            Originally posted by artivision View Post
            And again, a source without a quote is useless.
            The same source claims completely different numbers than yours.

            This source points out that the GTX 480 only has 0.16 TFLOPS (DP), not your claimed 4 TFLOPS DP.


            Originally posted by artivision View Post
            - Source 3: http://en.wikipedia.org/wiki/GeForce_500_Series - Look at the GTX 400 wiki under History: Fermi's double precision is half its single precision.
            You don't get the point that the GTX 480 and GTX 580 have speed throttling in double precision; the throttling cuts the rate to 1/8 of the speed.
            Because of this, the Tesla has 0.666 Tflops DP without throttling, while the GTX 580 only has 0.18 Tflops DP.

            And even without throttling, the HD 7970 is faster.


            Originally posted by artivision View Post
            Radeon is 1/4 (VLIW4) or 1/5 (VLIW5). - Look at the GTX 500 wiki: the GTX 580 is listed at 1.6 Tflops of "estimated 64bit-FMA2 power". But we want the final 32bit-FMA3 power, so we multiply by 3 (64bit dual-issue = 2 x 32bit, and FMA3 = 3 ops where FMA2 = 2). If you look at the GTX 400 wiki under Products, FMA3 has estimates of 2.5 and 2.7 ops while the table only counts the base 2 ops (FMA2). So the true GTX 580 power is 4.7 Tflops versus 2.7 Tflops for the Radeon 6970; that's 75% faster for the GTX 580.
            And again, you mix up the instruction set with raw calculating power; FMA3 isn't raw calculating power.

            And you don't understand the VLIW architecture: it's not 1/4.

            Depending on the optimisations and the workload it runs at 100% of the speed, not 1/4.

            I'm 100% sure your fake numbers are just your own misunderstanding. "So true gtx580 power=4.7Tflops" - there is no "true power"; you are imagining a single workload with a specific instruction set, and without that kind of FAKE the GTX 580 is much slower.


            Originally posted by artivision View Post
            Your claimed proof proves you wrong:

            SmallLuxGPU 1.7beta - Luxball HDR
            Thousands of Rays per Second - Higher is Better

            GTX 580: 6750 vs HD 6970: 14600

            The AMD card is 2.16 times faster than the Nvidia.

            And surprise, surprise, these numbers match the official figures: 1.581 Tflops SP versus 2.703 Tflops SP on the AMD side.

            Hell, you are really stupid! I proved you wrong with your own SOURCE!

            Originally posted by artivision View Post
            - See that in the new Unigine (Heavy Tessellation mode) the GTX 580 is 40% faster than the Radeon 6970.
            Tessellation has nothing to do with raw calculating speed.

            Comment


            • #66
              Originally posted by artivision View Post
              The GTX 460 SE is only 100 bucks new!!! Mine has 336 cores unlocked and runs at 1.9 GHz = 3.83 Tflops (32bit-FMA3)!!!
              I also have a 460 SE. Can you tell me how you unlocked your card?

              Comment


              • #67
                1) FMA3 (3 ops), MADD+MUL (3 ops) and MADD (2 ops) are not only parts of the instruction set, but raw power as well.
                2) I said 4.7 Tflops SP at 32bit or 2.3 Tflops at 64bit, and 2.3 Tflops DP at 32bit or 1.1 Tflops DP at 64bit. Each VLIW core is one DP unit regardless of the stream processors inside. In the end, even if a card has no DP at all, you can do it in software, so it's not important.
                3) I am telling you exactly where to look: the GTX 400 wiki under the "Products" tab and under the "History" tab.
                4) Unigine is not just tessellation, it is a complete benchmark with heavy tessellation. And SmallLux is old; it represents nothing. The truth is in the h264 transcoding.
                5) A friend of mine gets 22 FPS with a Radeon 6870 @ 1 GHz with custom settings in a well-known benchmark. I get 26 FPS with a GTX 460 @ 1.9 GHz at the same settings in the same benchmark.
                6) Sorry for my tone, I mean no disrespect, but your knowledge is based on a very simplistic understanding of processors (not a scientific one).
                7) I don't want to continue this dialogue; it is not important from here on. The clues I give and the clues you give are enough for anyone to understand what is more profitable for them. I don't sell Nvidia cards anyway.

                Comment


                • #68
                  Originally posted by artivision View Post
                  1) FMA3 (3 ops), MADD+MUL (3 ops) and MADD (2 ops) are not only parts of the instruction set, but raw power as well.
                  In fact, you calculate the raw power without an FMA3 instruction set. Because of this, the Nvidia Fermi cards do not have 4 Tflops DP; Fermi has 0.666 Tflops DP.

                  Your FMA3 instruction-set claim only affects some calculations, not all.

                  Originally posted by artivision View Post
                  2) I said 4.7 Tflops SP at 32bit or 2.3 Tflops at 64bit, and 2.3 Tflops DP at 32bit or 1.1 Tflops DP at 64bit.
                  The official Nvidia number is 1.5811 Tflops SP, not 4.7 Tflops SP.

                  And again, you are counting the speed-up from an FMA3 instruction set as real Tflops, which is wrong.

                  Originally posted by artivision View Post
                  4) Unigine is not just tessellation, it is a complete benchmark with heavy tessellation.
                  In fact, the Unigine benchmark does not test calculation speed.

                  The GTX 580 is only faster than the HD 6970 in this benchmark because its tessellation unit is faster - and here is the kicker: the HD 7970 has a faster tessellation unit than the GTX 580.
                  Originally posted by artivision View Post
                  And SmallLux is old; it represents nothing.
                  In your source it represents ray-tracing speed, and ray-tracing is nothing more than pure calculation.
                  The SmallLux ray-tracing benchmark proves the HD 6970 is faster at calculating than the GTX 580.
                  Originally posted by artivision View Post
                  The truth is in the h264 transcoding. 5) A friend of mine gets 22 FPS with a Radeon 6870 @ 1 GHz with custom settings in a well-known benchmark. I get 26 FPS with a GTX 460 @ 1.9 GHz at the same settings in the same benchmark.
                  No, it is not the "truth". You said your card is modified by BIOS manipulation, which means that without the manipulation the Radeon card is faster.

                  Also, the h264 test does not prove raw calculating speed; it is just a speed-up from the FMA3 instruction set.

                  Originally posted by artivision View Post
                  3) I am telling you exactly where to look: the GTX 400 wiki under the "Products" tab and under the "History" tab.
                  OK, I will do your job for you:

                  "So the theoretical single precision peak performance, with shader count [n] and shader frequency [f, GHz], can be estimated by the following: FLOPSsp ≈ f × n × 2 (FMA). Total Processing Power: for GF100, FLOPSsp ≈ f × m × (32 SPs × 2 (FMA) + 4 × 4 SFUs)"

                  GF100 -> f (1401 MHz) × m (15 SMs on the GTX 480) × 32 SPs × 2 (FMA) ≈ 1.34496 TFLOPS SP (ignoring the SFU term)

                  There is no room for your fake numbers.

                  This just proves the Nvidia card is slower in raw calculating power.
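
                  The same formula in a small Python sketch, assuming the published shader counts and shader ("hot") clocks - 480 SPs @ 1401 MHz for the GTX 480, 512 SPs @ 1544 MHz for the GTX 580, 1536 SPs @ 880 MHz for the HD 6970 - and ignoring the SFU term, as the 1.34496 figure does:

                  ```python
                  # Theoretical SP peak: FLOPS_sp ~= f x n x 2 (one FMA = 2 ops per clock).
                  def peak_sp_tflops(shaders, shader_clock_ghz, ops_per_clock=2):
                      return shaders * shader_clock_ghz * ops_per_clock / 1000.0

                  print(f"GTX 480: {peak_sp_tflops(480, 1.401):.3f} Tflops")   # ~1.345
                  print(f"GTX 580: {peak_sp_tflops(512, 1.544):.3f} Tflops")   # ~1.581
                  print(f"HD 6970: {peak_sp_tflops(1536, 0.880):.3f} Tflops")  # ~2.703
                  ```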


                  Originally posted by artivision View Post
                  6) Sorry for my tone, I mean no disrespect, but your knowledge is based on a very simplistic understanding of processors (not a scientific one).
                  Even your source proves 1.34 TFLOPS SP for the GTX 480 and 1.6 TFLOPS SP for the GTX 580.

                  And the irony is that I use the same sources as you, but you claim that when you use them it is scientific and when I use them it is not.


                  Originally posted by artivision View Post
                  7) I don't want to continue this dialogue; it is not important from here on. The clues I give and the clues you give are enough for anyone to understand what is more profitable for them. I don't sell Nvidia cards anyway.
                  Sure - your own source "Wikipedia" and your benchmark source prove the Nvidia is slower in raw calculation.

                  Anyone can read this in YOUR sources, LOL.

                  Anyway, maybe raw speed isn't everything; maybe only the FMA3 instruction set matters?

                  But you can test that with AMD's FX-8150 CPU: no Intel customer cares about it.

                  FMA4 doesn't matter in the real world.

                  Comment


                  • #69
                    @Ansla

                    W8 boots fine without EFI, but you may need EFI in order to use OEM preactivation. For the W8 logo the system must support Secure Boot. Maybe you didn't notice that there is a 32-bit W8 preview; that will never require EFI, because that's something for 64-bit systems only.

                    Comment


                    • #70
                      @artivision I have another source that proves you completely wrong:
                      http://bitclockers.com/forums/index.php?topic=6.0

                      HD6970 = 402 Mhash/s
                      GTX580 = 125 Mhash/s

                      In fact, the AMD card is more than 3 times faster in this calculation than the Nvidia!

                      Comment


                      • #71
                        For the last time: "FLOPSsp ≈ f × n × 2 (FMA)" is wrong; it does not exist. The correct form is "FLOPSsp ≈ f × n × 2 (FMA2)", and for Fermi it is "FLOPSsp ≈ f × n × 3 (FMA3)". They don't write it that way because the third op is not certain, but with a good driver it is above 90% certain for games. They actually explained that in the past, and for the G80-GTX 200 models they did count it: "FLOPSsp ≈ f × n × 3 (MADD+MUL)". Second: the GTX 580's 1.6 Tflops SP is in 64bit dual-issue. Fermi cores are 64bit; you can't compare them with the 32bit cores of the Radeon 6970 at 2.7 Tflops. You must convert the Fermi Flops to 32bit if you want to be accurate, and that's 2x, tested. Anyway, Fermi is faster in 99% of all benchmarks; I don't think you have an objection there, do you? And in the end, a card with 3 billion transistors @ 1.5 GHz can't lose to a card of the same generation with 2 billion transistors @ 880 MHz. And Kepler, with 3 billion transistors @ 3 GHz and 640 cores at 128bit quad-issue (probably, not sure), will be 5 times faster than the GTX 580. Also don't forget all the Radeon problems on Linux.
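
                        For reference, a Python sketch of the counting conventions being argued about: the 2-ops-per-FMA figure used on the Fermi wiki pages, the older 3-ops (MADD+MUL) convention used for the G80/GT200 marketing numbers, and the x3 (plus x2) factors this post claims for Fermi - the last two lines are the post's claim, not published figures:

                        ```python
                        def tflops(shaders, clock_ghz, ops_per_clock):
                            return shaders * clock_ghz * ops_per_clock / 1000.0

                        # GT200 (GTX 280): 240 SPs @ 1.296 GHz, marketed as MADD+MUL = 3 ops/clock.
                        print(f"GTX 280, MADD+MUL x3: {tflops(240, 1.296, 3):.3f} Tflops")  # ~0.933

                        # Fermi (GTX 580): 512 SPs @ 1.544 GHz, wiki counts one FMA = 2 ops/clock.
                        print(f"GTX 580, FMA x2:      {tflops(512, 1.544, 2):.3f} Tflops")  # ~1.581

                        # This post's claim: x3 for "FMA3", then x2 for "64bit -> 32bit".
                        print(f"GTX 580, claimed x3:  {tflops(512, 1.544, 3):.3f} Tflops")  # ~2.372
                        print(f"GTX 580, claimed x6:  {tflops(512, 1.544, 6):.3f} Tflops")  # ~4.743
                        ```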

                        Comment


                        • #72
                          Originally posted by Kano View Post
                          @Ansla

                          W8 boots fine without EFI, but you may need EFI in order to use OEM preactivation. For the W8 logo the system must support Secure Boot. Maybe you didn't notice that there is a 32-bit W8 preview; that will never require EFI, because that's something for 64-bit systems only.
                          You don't really think some OEM will release a system without the Windows 8 logo long after its release? They don't target advanced users who would install Windows themselves, even if that were possible.

                          Comment


                          • #73
                            The most problematic part of EFI booting is Mac support. Usually you can add menu entries using efibootmgr, but on Macs this does not work; on standard UEFI systems it does. You just need to learn how to use it when you exchange the motherboard and your bootloader is not named "/efi/boot/bootx64.efi". You cannot chainload to EFI when you boot via MBR, so if you want grub2 to boot Windows you need to boot Linux via EFI too. os-prober does not find it, but grub2 itself can - a new 64-bit Kanotix Hellfire will feature a grub2 hybrid-mode ISO image so you can test it directly on MBR, EFI and even Mac. So one ISO for every system...
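
                            A hedged Python sketch of adding a boot entry with efibootmgr on a standard UEFI system (the disk, partition, label and loader path below are placeholders for your own setup; as noted above, this generally does not work on Macs):

                            ```python
                            # Register an EFI boot entry for a loader that is NOT at the default
                            # \EFI\BOOT\BOOTX64.EFI path. All values below are examples only.
                            import subprocess

                            subprocess.run([
                                "efibootmgr",
                                "-c",                    # create a new boot entry
                                "-d", "/dev/sda",        # disk that holds the EFI system partition
                                "-p", "1",               # partition number of the ESP
                                "-L", "grub2",           # label shown in the firmware boot menu
                                "-l", r"\EFI\grub\grubx64.efi",  # loader path on the ESP
                            ], check=True)

                            subprocess.run(["efibootmgr", "-v"], check=True)  # verify the entry list
                            ```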

                            Comment


                            • #74
                              Originally posted by artivision View Post
                              For the last time: "FLOPSsp ≈ f × n × 2 (FMA)" is wrong; it does not exist. The correct form is "FLOPSsp ≈ f × n × 2 (FMA2)", and for Fermi it is "FLOPSsp ≈ f × n × 3 (FMA3)". They don't write it that way because the third op is not certain, but with a good driver it is above 90% certain for games. They actually explained that in the past, and for the G80-GTX 200 models they did count it: "FLOPSsp ≈ f × n × 3 (MADD+MUL)". Second: the GTX 580's 1.6 Tflops SP is in 64bit dual-issue. Fermi cores are 64bit; you can't compare them with the 32bit cores of the Radeon 6970 at 2.7 Tflops. You must convert the Fermi Flops to 32bit if you want to be accurate, and that's 2x, tested. Anyway, Fermi is faster in 99% of all benchmarks; I don't think you have an objection there, do you? And in the end, a card with 3 billion transistors @ 1.5 GHz can't lose to a card of the same generation with 2 billion transistors @ 880 MHz. And Kepler, with 3 billion transistors @ 3 GHz and 640 cores at 128bit quad-issue (probably, not sure), will be 5 times faster than the GTX 580. Also don't forget all the Radeon problems on Linux.
                              The ray-tracing and bitcoin benchmarks prove you wrong.

                              The Nvidia card is not magically faster, and the HD 7970 is much faster than the old 6970.

                              Nvidia loses in nearly all "calculation only" benchmarks; they win in graphics benchmarks only because of their tessellation unit, and the tessellation unit of the HD 7970 is faster.

                              Your claim that 64bit means double the speed is proven wrong on the PC: if you benchmark an old Pentium 4 571 in 32bit and 64bit, the result is that 32bit is faster.
                              Only the AMD CPUs are faster in 64bit, because they have calculating units that are deactivated in 32bit mode and activated in 64bit mode.
                              In fact, in most cases 32bit is faster than 64bit if you use the same instruction set and the same set of calculating units.

                              Your claim that your Nvidia is double speed just because of 64bit is simply wrong; most of the time it is slower because of 64bit, since 64bit just eats your memory bandwidth.

                              Comment


                              • #75
                                Originally posted by Ansla View Post
                                Of course they must be committed to EFI; Windows 8 will not boot without EFI. I'm hoping they will go with coreboot with an EFI payload, so Linux users will be able to replace the payload with a more useful one. BTW, does your company want desktops or servers? For servers there is a higher chance of finding boards made specifically for Linux.
                                I'm thinking both servers and workstations. For servers there is a tiny hope something will show up, but we are not talking about large quantities of servers here, so custom ordering is out of the question. For workstations high performance per core is a requirement, so I believe Xeon E5(LGA2011) is the only choice here, and I don't know of any coreboot plans for these boards.

                                Originally posted by Ansla View Post
                                1 That's exactly what I said. For my work computer it may be OK to wait up to a month before upgrading to the latest kernel (I do that anyway), but my home computer is as bleeding-edge as possible.
                                2 I don't use a Debian-based distribution, and even if I did, I don't think it would solve my problem: I'm talking about building my own custom kernel (make oldconfig; make -j5; make install; make modules_install), not using the one provided by the distribution or some PPA.
                                3 Did you do a `dmesg | grep -i taint` to check, or do you just like to disagree? The Debian wiki still mentions this as a problem: http://wiki.debian.org/NvidiaGraphic...ints_kernel.22

                                BTW, all the problems I mentioned are design problems that apply to all binary drivers (or even to free drivers maintained outside the Linux kernel) and will most likely never be fixed.

                                Keeping the kernel untainted is a big enough issue for me. I do use hardware that requires proprietary firmware, and some user-space blobs as well; I try to replace them with free alternatives whenever possible, but proprietary kernel drivers are too much.
                                I don't think many consider waiting a few weeks for new drivers a problem. nVidia is very quick to release drivers when new hardware or new OpenGL specifications arrive, sometimes even before. Don't forget the beta drivers either.

                                Since both the official nVidia drivers and the open drivers have proprietary components, both should be considered a "degree of evil", if considered evil at all. It would be wrong to consider one "evil" while the other is "good". Having a buggy, unstable, low-performing, power-hungry driver would certainly taint my system.

                                BTW, are you expecting DeepColor (30-bit) through DisplayPort on the open drivers anytime soon? SLI support?


                                Originally posted by Ansla View Post
                                With OpenGL 3.0 support almost ready and most features required for OpenGL 3.3 (except the newer GLSL) already implemented, the biggest issue still remaining is OpenCL.
                                The essential GLX_ARB_create_context is still missing, and so is GLX_ARB_create_context_profile, which is used for context creation for every version since 3.2. This is quite annoying when handling context creation on the open drivers; you'll have to write a fallback.

                                While the open drivers are still playing catch-up, struggling with version 3.0, version 4.3 could be just around the corner; a new revision is expected when Kepler arrives. There is also a difference between proof-of-concept support and performant production support.
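
                                A small Python sketch (assuming the glxinfo tool from mesa-utils is installed) for checking what the driver in use actually reports - the OpenGL version string and whether the two GLX context-creation extensions mentioned above are advertised:

                                ```python
                                # Rough check of the current GL stack via glxinfo: version string plus
                                # the GLX extensions needed to create 3.2+ core-profile contexts.
                                # (Plain substring matching, so this is only a rough check.)
                                import subprocess

                                out = subprocess.run(["glxinfo"], capture_output=True, text=True).stdout

                                for line in out.splitlines():
                                    if line.strip().startswith("OpenGL version string"):
                                        print(line.strip())

                                for ext in ("GLX_ARB_create_context", "GLX_ARB_create_context_profile"):
                                    print(f"{ext}: {'yes' if ext in out else 'MISSING'}")
                                ```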

                                Comment
