GravityMark OpenGL/Vulkan Performance For NVIDIA RTX 30 vs. AMD Radeon RX 6000 Series


  • #11
    Found an error! Under the very first graph, you say that the RX 6800 XT is equal to the 3080 Ti, but you mean the 3070 Ti.



    • #12
      I also get an illegal instruction error.

      I am not much of a developer but I can follow the monkey steps for using GDB

      From GDB:

      >0x7fd939fb8e38 vxorps %xmm0,%xmm0,%xmm0

      I think this is an AVX instruction and it would make sense that I couldn't run it since this old workstation is a Xeon X5670.

      They might add that you need AVX for this to run.
      Last edited by cbdougla; 18 June 2021, 04:55 PM.



      • #13
        Michael, any chance you could try AMD's proprietary binary Vulkan driver with those cards? Possibly you hit a slow path in RADV.



        • #14
          here are my 1080p results with my 6900xt for those curious:

          vulkan: https://i.imgur.com/rn86Hwg.png
          opengl: https://i.imgur.com/93pZrqM.png

          specs:
          amd ryzen 5800x
          32gb ddr4 3200 cas 16
          amd reference 6900xt
          msi b550 tomahawk with latest agesa
          sam enabled
          arch linux - kernel 5.12.11-arch1-1
          mesa 21.1.2-1
          radv 21.1.2-1



          • #15
            Just if anyone was wondering about how RX Vega 56/64 is doing.

            99.3FPS OpenGL
            105.8FPS Vulkan (RADV)

            The benches ran with a bunch of instances of Firefox (and even more YouTube tabs) opened in the background, so there might be some improvements possible. GPU capped out in Vulkan and OpenGL at ~160W instead of the 220W+ I usually have in games, indicating a GPU internal bottleneck (geometry?). Performance in Vulkan seems to be two thirds of the RTX3060's, which should place it around RTX2060 levels of performance, so everything seems to be in order. In OpenGL, Vega and 3060 are within 10% of each other.

            RX Vega 56 (1610 MHz core/1145 MHz HBM), no power limit, SAM
            Ryzen 3900X stock, schedutil
            32GB 3733-CL17 1:1:1
            Gentoo GNU/Linux
            KDE Plasma Wayland 5.22.1, Xwayland 1.20.11
            Kernel 5.12.7, PDS-mq Scheduler
            Mesa 21.1.2, libdrm 2.4.106, glibc 2.33
            Global compiler options: "-O2 -pipe -march=znver2 -mtune=znver2 -ftree-vectorize -fomit-frame-pointer -fstack-protector-strong"
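
To put the figures above side by side, here is a small Python sketch of the arithmetic. The numbers are the ones reported in this post; the helper name `percent_change` is made up for illustration and is not part of any tool mentioned here.

```python
def percent_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Figures taken from the post above.
opengl_fps = 99.3     # RX Vega 56, OpenGL
vulkan_fps = 105.8    # RX Vega 56, Vulkan (RADV)
bench_watts = 160.0   # approximate GPU power during the benchmark
game_watts = 220.0    # lower bound of the usual in-game power draw

vulkan_gain = percent_change(vulkan_fps, opengl_fps)
power_fraction = bench_watts / game_watts * 100.0

print(f"Vulkan over OpenGL: {vulkan_gain:+.1f}%")
print(f"Benchmark power relative to games: {power_fraction:.0f}%")
```

Since the in-game figure is quoted as "220W+", the power fraction is an upper bound: the card draws at most roughly three quarters of its usual gaming power in this benchmark.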



            • #16
              Originally posted by kiffmet
              Just if anyone was wondering about how RX Vega 56/64 is doing.
              The benches ran with a bunch of instances of Firefox (and even more YouTube tabs) opened in the background, so there might be some improvements possible. GPU capped out in Vulkan and OpenGL at ~160W instead of the 220W+ I usually have in games, indicating a GPU internal bottleneck (geometry?). Performance in Vulkan seems to be two thirds of the RTX3060's, which should place it around RTX2060 levels of performance, so everything seems to be in order. In OpenGL, Vega and 3060 are within 10% of each other.
              There was a strange bottleneck on Vega 7 that limited DIP throughput to 36 million DIPs per second under every API.
              At the same time, Nvidia GPUs can draw 200M DIPs per second. Vega 56/64 likely has the same geometry throughput constraint.
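
A rough sketch of what those two throughput figures would mean as a frame-rate ceiling. The 36M and 200M DIP/s numbers are from this post. The number of DIPs per frame is purely an assumption for illustration (one DIP per asteroid at the 200,000-asteroid setting that appears in the benchmark log later in this thread); nothing in the thread confirms how many draws GravityMark actually issues per frame.

```python
def max_fps(dips_per_second: float, dips_per_frame: int) -> float:
    """Upper bound on frame rate if DIP throughput were the only limit."""
    return dips_per_second / dips_per_frame


VEGA_DIPS_PER_SECOND = 36e6      # from the post
NVIDIA_DIPS_PER_SECOND = 200e6   # from the post
ASSUMED_DIPS_PER_FRAME = 200_000  # ASSUMPTION: one DIP per asteroid

vega_cap = max_fps(VEGA_DIPS_PER_SECOND, ASSUMED_DIPS_PER_FRAME)
nvidia_cap = max_fps(NVIDIA_DIPS_PER_SECOND, ASSUMED_DIPS_PER_FRAME)
ratio = NVIDIA_DIPS_PER_SECOND / VEGA_DIPS_PER_SECOND

print(f"Vega ceiling under this assumption:   {vega_cap:.0f} FPS")
print(f"Nvidia ceiling under this assumption: {nvidia_cap:.0f} FPS")
print(f"Throughput ratio: {ratio:.1f}x")
```

Whatever the real per-frame count is, the ratio between the two ceilings stays the same, since it depends only on the two throughput figures.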



              • #17
                Originally posted by frustum

                There is no AVX support on i7-3635QM. It's not, actually, required for the benchmark. It looks like we missed it.

                There is a benchmark update at https://gravitymark.com where LD_LIBRARY_PATH modification is not required.
                Thank you for your feedback.
                Do you plan to release a non-AVX version eventually?

                Edit:
                From the back of my mind came the memory of the CPU having AVX and, behold, lscpu says:
                Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
                Last edited by reba; 19 June 2021, 02:09 AM.
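
Reading a flag list like the one above by eye is error-prone, so here is a minimal Python sketch that checks it mechanically. The function names and the `SIMD_LEVELS` list are made up for illustration; the sample string is a shortened excerpt of the lscpu output above.

```python
SIMD_LEVELS = ["sse4_1", "sse4_2", "avx", "avx2"]


def parse_flags(flags_line: str) -> set:
    """Split an lscpu 'Flags:' line (or a bare flag list) into a set."""
    # Drop everything up to and including the last colon, if there is one.
    _, _, rest = flags_line.rpartition(":")
    return set(rest.split())


def simd_support(flags_line: str) -> dict:
    """Map each SIMD level of interest to whether the flag is present."""
    flags = parse_flags(flags_line)
    return {level: level in flags for level in SIMD_LEVELS}


# Shortened excerpt of the flags posted above.
excerpt = "Flags: fpu vme sse sse2 ssse3 sse4_1 sse4_2 popcnt aes xsave avx f16c rdrand"
support = simd_support(excerpt)

for level, present in support.items():
    print(f"{level}: {'yes' if present else 'no'}")
```

Matching whole tokens matters here: a plain substring search for "avx" would also succeed on a CPU that only lists "avx2" or "avx512f". On Linux the same flag names appear on the `flags` line of /proc/cpuinfo.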



                • #18
                  Did a little digging with gdb.


                  Used command line:
                  Code:
                  user@~/GravityMark_1.1_linux/bin$ gdb ./GravityMark.x64

                  Triggering the bad instruction:
                  Code:
                  (gdb) run
                  Starting program: [...]/GravityMark_1.1_linux/bin/GravityMark.x64
                  [Thread debugging using libthread_db enabled]
                  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
                  M: 0 us: GravityMark Started
                  M: 12.85 ms: ../data.zip: 313 files
                  [New Thread 0x7fffe37ef700 (LWP 3111649)]
                  [New Thread 0x7fffe2fee700 (LWP 3111650)]
                  [New Thread 0x7fffe27ed700 (LWP 3111651)]
                  [New Thread 0x7fffe1fec700 (LWP 3111652)]
                  [New Thread 0x7fffe17eb700 (LWP 3111653)]
                  [New Thread 0x7fffe0fea700 (LWP 3111654)]
                  [New Thread 0x7fffcbfff700 (LWP 3111655)]
                  [New Thread 0x7fffcb7fe700 (LWP 3111656)]
                  [New Thread 0x7fffcaffd700 (LWP 3111657)]
                  [New Thread 0x7fffca7fc700 (LWP 3111658)]
                  [New Thread 0x7fffc9ffb700 (LWP 3111659)]
                  [New Thread 0x7fffc97fa700 (LWP 3111660)]
                  [New Thread 0x7fffc8ff9700 (LWP 3111661)]
                  [New Thread 0x7fffabfff700 (LWP 3111662)]
                  [New Thread 0x7fffab7fe700 (LWP 3111663)]
                  [New Thread 0x7fffaaffd700 (LWP 3111664)]
                  [New Thread 0x7fffaa7fc700 (LWP 3111665)]
                  [New Thread 0x7fffa9ffb700 (LWP 3111666)]
                  [New Thread 0x7fffa97fa700 (LWP 3111667)]
                  [New Thread 0x7fffa8ff9700 (LWP 3111668)]
                  [New Thread 0x7fff87fff700 (LWP 3111680)]
                  [New Thread 0x7fff877fe700 (LWP 3111681)]
                  [New Thread 0x7fff86ffd700 (LWP 3111682)]
                  [New Thread 0x7fff867fc700 (LWP 3111683)]
                  WARNING: radv is not a conformant vulkan implementation, testing use only.
                  MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete
                  [New Thread 0x7fff85ffb700 (LWP 3111684)]
                  [New Thread 0x7fff857fa700 (LWP 3111685)]
                  [New Thread 0x7fff84ff9700 (LWP 3111686)]
                  [New Thread 0x7fff67fff700 (LWP 3111687)]
                  M: 1.410 s: Build Date: Jun 18 2021
                  M: 1.410 s: Build Info: release; vk=1; gl=45; gles=32
                  [Detaching after vfork from child process 3111688]
                  [Detaching after vfork from child process 3111690]
                  [Detaching after vfork from child process 3111692]
                  [Detaching after vfork from child process 3111694]
                  M: 1.449 s: System: Linux 5.12.10-xanmod1 x86_64 GNU/Linux
                  M: 1.449 s: Kernel: #0~git20210610.d262d74 SMP PREEMPT Thu Jun 10 13:59:27 UTC 2021
                  M: 1.449 s: Memory: 15.52 GB
                  M: 1.449 s: Uptime: 4 days 12:54
                  M: 1.449 s: CPU: Intel(R) Core(TM) i7-3635QM CPU @ 2.40GHz
                  M: 1.449 s: Device: VEN_8086&DEV_0166&SUBSYS_C0E6144D
                  M: 1.449 s: GPU: [AMD/ATI] Venus XT [Radeon HD 8870M / R9 M270X/M370X]
                  M: 1.449 s: Device: VEN_1002&DEV_6821&SUBSYS_C0E6144D
                  M: 1.449 s: Memory: 2.00 GB
                  M: 1.454 s: Desktop: 1920x1080 1.0
                  M: 1.454 s: Screen 0: 1920x1080 0 0 eDP-1
                  M: 1.454 s: Creating 1600x900 Vulkan Window
                  M: 1.476 s: Using Fetch mode
                  
                  Thread 1 "GravityMark.x64" received signal SIGILL, Illegal instruction.
                  0x00007ffff77d5df1 in ?? () from ./libTellusim_x64.so

                  A bit of stacktrace:
                  Code:
                  (gdb) where
                  #0 0x00007ffff77d5df1 in ?? () from ./libTellusim_x64.so
                  #1 0x00007ffff78a8c2b in ?? () from ./libTellusim_x64.so
                  #2 0x00007ffff78a8a4e in ?? () from ./libTellusim_x64.so
                  #3 0x00007ffff77531a8 in Tellusim::ControlRoot::ControlRoot(Tellusim::Canvas&, bool) () from ./libTellusim_x64.so
                  #4 0x000000000040cf36 in ?? ()
                  #5 0x000000000040fff8 in ?? ()
                  #6 0x000000000040b31d in ?? ()
                  #7 0x00007ffff707fd0a in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
                  #8 0x000000000040b15a in ?? ()

                  The offending instruction's mnemonic:
                  Code:
                  (gdb) x/i 0x00007ffff77d5df1
                  => 0x7ffff77d5df1: vbroadcastss %xmm2,%xmm1

                  From https://www.felixcloutier.com/x86/vbroadcast we find:
                  Opcode | Instruction | Op/En | 64/32-bit Mode | CPUID Feature Flag | Description
                  VEX.128.66.0F38.W0 18 /r | VBROADCASTSS xmm1, m32 | A | V/V | AVX | Broadcast single-precision floating-point element in mem to four locations in xmm1.
                  VEX.256.66.0F38.W0 18 /r | VBROADCASTSS ymm1, m32 | A | V/V | AVX | Broadcast single-precision floating-point element in mem to eight locations in ymm1.
                  VEX.256.66.0F38.W0 19 /r | VBROADCASTSD ymm1, m64 | A | V/V | AVX | Broadcast double-precision floating-point element in mem to four locations in ymm1.
                  VEX.256.66.0F38.W0 1A /r | VBROADCASTF128 ymm1, m128 | A | V/V | AVX | Broadcast 128 bits of floating-point data in mem to low and high 128-bits in ymm1.
                  VEX.128.66.0F38.W0 18 /r | VBROADCASTSS xmm1, xmm2 | A | V/V | AVX2 | Broadcast the low single-precision floating-point element in the source operand to four locations in xmm1.
                  VEX.256.66.0F38.W0 18 /r | VBROADCASTSS ymm1, xmm2 | A | V/V | AVX2 | Broadcast low single-precision floating-point element in the source operand to eight locations in ymm1.
                  VEX.256.66.0F38.W0 19 /r | VBROADCASTSD ymm1, xmm2 | A | V/V | AVX2 | Broadcast low double-precision floating-point element in the source operand to four locations in ymm1.
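                  The faulting instruction takes a register as its source (gdb prints AT&T syntax, so %xmm2 is the source and %xmm1 the destination), which matches the AVX2 entries above rather than the AVX ones. So the offending instruction is actually an AVX2 instruction, which, indeed, the CPU does not support.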
                  Last edited by reba; 19 June 2021, 02:58 AM.
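
The table lookup above can be sketched as a small Python helper that takes a line of gdb `x/i` output and reports which feature flag that form of `vbroadcastss` needs. This is only an illustration of the reasoning, not a disassembler: the function name is made up, it assumes gdb's default AT&T syntax (source operand first), and it only covers the VEX-encoded xmm/ymm forms listed above, not the AVX-512 ones.

```python
import re


def vbroadcastss_feature(disasm_line: str) -> str:
    """Return 'AVX' or 'AVX2' for a VEX vbroadcastss line in AT&T syntax."""
    # Greedy (.+) captures the whole source operand, including any commas
    # inside a memory operand such as 0x10(%rax,%rbx,4).
    match = re.search(r"vbroadcastss\s+(.+),\s*(%[xy]mm\d+)\s*$", disasm_line)
    if match is None:
        raise ValueError("not a VEX vbroadcastss line in AT&T syntax")
    source = match.group(1).strip()
    # A register source is the AVX2 form; a memory source is plain AVX.
    if re.fullmatch(r"%xmm\d+", source):
        return "AVX2"
    return "AVX"


# The faulting instruction from the gdb session above.
faulting = "=> 0x7ffff77d5df1: vbroadcastss %xmm2,%xmm1"
print(vbroadcastss_feature(faulting))

# A memory-source form, for contrast (made-up operand).
print(vbroadcastss_feature("vbroadcastss 0x10(%rax,%rbx,4),%ymm0"))
```

The first call reports the register-source form from the backtrace; the second shows that the same mnemonic with a memory operand would only need AVX.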



                  • #19
                    Originally posted by reba
                    Thank you for your feedback.
                    Do you plan to release a non-AVX version eventually?
                    No problem. This is an SSE4 version (performance should be the same):



                    The AVX1 and AVX2 requirements will be removed in the next benchmark updates.



                    • #20

                      Code:
                      M: 0 us: GravityMark Started
                      M: 12.20 ms: ../data.zip: 313 files
                      M: 12.31 ms: Temporal antialiasing
                      WARNING: radv is not a conformant vulkan implementation, testing use only.
                      MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete
                      M: 1.071 s: Build Date: Jun 18 2021
                      M: 1.071 s: Build Info: release; vk=1; gl=45; gles=32
                      M: 1.108 s: System: Linux 5.12.10-xanmod1 x86_64 GNU/Linux
                      M: 1.108 s: Kernel: #0~git20210610.d262d74 SMP PREEMPT Thu Jun 10 13:59:27 UTC 2021
                      M: 1.108 s: Memory: 15.52 GB
                      M: 1.108 s: Uptime: 4 days 13:28
                      M: 1.108 s: CPU: Intel(R) Core(TM) i7-3635QM CPU @ 2.40GHz
                      M: 1.108 s: Device: VEN_8086&DEV_0166&SUBSYS_C0E6144D
                      M: 1.108 s: GPU: [AMD/ATI] Venus XT [Radeon HD 8870M / R9 M270X/M370X]
                      M: 1.109 s: Device: VEN_1002&DEV_6821&SUBSYS_C0E6144D
                      M: 1.109 s: Memory: 2.00 GB
                      M: 1.114 s: Desktop: 1920x1080 1.0
                      M: 1.114 s: Screen 0: 1920x1080 0 0 eDP-1
                      M: 1.114 s: Creating 1600x900 Vulkan Window
                      M: 1.126 s: Using Fetch mode
                      M: 1.222 s: Device: AMD RADV VERDE (ACO)
                      M: 1.222 s: MaxViewportCount: 16
                      M: 1.222 s: MaxUniformSize: 4.00 GB
                      M: 1.222 s: MaxStorageSize: 4.00 GB
                      M: 1.222 s: Creating SceneManager
                      M: 2.069 s: Creating RenderManager
                      M: 2.864 s: Creating Scene
                      M: 4.491 s: Creating 200,000 Asteroids
                      M: 4.672 s: Updating Scene
                      M: 5.497 s: GravityMark is Ready in 5.5 s
                      M: 5.497 s: Starting 1600x900 Vulkan Benchmark
                      M: 5.497 s: Count: 1
                      M: 5.497 s: Resizing 1600x900 frame
                      M: 2:53.355: Benchmark Finished
                      M: 2:53.355: AMD RADV VERDE (ACO)
                      M: 2:53.355: API: Vulkan
                      M: 2:53.355: System: Linux
                      M: 2:53.355: Resolution: 1600x900
                      M: 2:53.355: Antialiasing: Temporal
                      M: 2:53.355: Asteroids: 200,000
                      M: 2:53.355: Scores: 1963
                      M: 2:53.355: Time: 167.9 s
                      M: 2:53.355: FPS: 11.7
                      M: 3:36.634: Clearing Scene
                      M: 3:37.028: GravityMark Done
                      M: 3:37.058: GravityMark Finished
                      FPS: 11.7
                      Last edited by reba; 19 June 2021, 02:59 AM.
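
For anyone collecting several runs, here is a minimal Python sketch that pulls the summary fields out of a log like the one above. The `parse_log` helper is made up for illustration, and the line layout ("M: <timestamp>: <key>: <value>") is inferred from this one log rather than from any documented format.

```python
LOG = """\
M: 2:53.355: Benchmark Finished
M: 2:53.355: API: Vulkan
M: 2:53.355: Resolution: 1600x900
M: 2:53.355: Asteroids: 200,000
M: 2:53.355: Scores: 1963
M: 2:53.355: Time: 167.9 s
M: 2:53.355: FPS: 11.7
"""


def parse_log(text: str) -> dict:
    """Collect 'key: value' pairs from 'M: <timestamp>: <key>: <value>' lines."""
    fields = {}
    for line in text.splitlines():
        # Split on ': ' at most three times so values may contain ': ' too.
        parts = line.strip().split(": ", 3)
        if len(parts) == 4 and parts[0] == "M":
            fields[parts[2]] = parts[3]  # later lines overwrite earlier keys
    return fields


fields = parse_log(LOG)
score = int(fields["Scores"])
seconds = float(fields["Time"].split()[0])
reported_fps = float(fields["FPS"])
derived_fps = score / seconds

print(f"score={score} time={seconds}s reported={reported_fps} derived={derived_fps:.2f}")
```

In this run the score divided by the run time rounds to the reported FPS, which suggests (but does not prove) that the score is simply the number of frames rendered.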

