AMD GPU-PRO vs. NVIDIA Linux OpenCL Compute Performance
-
Michael Larabel
https://www.michaellarabel.com/
- Likes 1
-
Originally posted by Michael
Hmm, whoops, yes, okay, I see -s 1 set there. Pardon me, I overlooked it originally as I haven't touched that SHOC test in a while. I will do some testing on various GPUs this weekend and ensure it's safe to increase to -s 3 universally. If memory serves, the reason it was 1 before was that for the max SP FLOPS test, anything greater than 1 was taking 1+ hours or otherwise an extremely long time... But yeah, I will do some verification soon.
- Likes 1
Comment
-
More feedback from the HPC team... fresh GitHub pull of SHOC on Catalyst Hawaii, wallclock times for -s 4:
· FFT: 9s
· MD5Hash: 5s
· MaxFlops: 25s
· DeviceMemory: 35s
Slower GPUs would take longer, but if the run times become too long, the recommendation is to reduce the number of sequential passes via the -n parameter (default is 10) rather than reducing the size (the amount of parallel work).
AFAICS larger sizes (-s) are able to take better advantage of large GPUs, while running multiple passes (-n) is done primarily to average out the impact of startup overhead, e.g. power management realizing that the GPU is sufficiently busy that clocks should be cranked up to maximum. Ideally I think that means every test should run at -s 4, and any test that runs for too long should have the number of passes overridden with something like -n 3.
In a perfect world the number of passes would be reduced on slower GPUs to keep runtime constant but that gets complicated... it's hard to know how long a test is going to run before starting it.
I guess ideally these benchmarks would have a parameter like "execute enough stuff so you run for at least 30 seconds".
Last edited by bridgman; 28 March 2016, 01:36 PM.
Test signature
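To make the "run for at least 30 seconds" idea concrete, here is a minimal Python sketch of a time-budgeted pass loop. It is not part of SHOC and does not reflect how SHOC schedules its passes; `run_for_at_least`, `workload`, `min_passes` and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, List


def run_for_at_least(
    workload: Callable[[], None],
    min_seconds: float = 30.0,
    min_passes: int = 1,
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Repeat `workload` until both the time budget and the pass count are met.

    Hypothetical helper: it returns the wall-clock duration of every pass so
    the caller can discard warm-up passes or average the remainder.
    """
    if min_passes < 1:
        raise ValueError("min_passes must be at least 1")

    durations: List[float] = []
    start = clock()
    while True:
        t0 = clock()
        workload()  # one sequential pass at a fixed problem size
        t1 = clock()
        durations.append(t1 - t0)
        # Stop only when enough passes have run AND enough time has elapsed.
        if len(durations) >= min_passes and (t1 - start) >= min_seconds:
            return durations


if __name__ == "__main__":
    # Stand-in workload; a real harness would launch a benchmark kernel here.
    times = run_for_at_least(lambda: time.sleep(0.01), min_seconds=0.05)
    print(f"{len(times)} passes, mean {sum(times) / len(times):.4f}s per pass")
```

With this shape the problem size stays fixed and only the number of passes adapts, so a slower GPU simply runs fewer passes within the same budget.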
- Likes 2
Comment
-
Originally posted by faldzip
Has anyone gotten this AMD GPU-PRO driver to work with an R7 260X? It is GCN 1.1, so it should work as well as the R9 290 does, but I've installed it on Ubuntu 15.10 (4.2 kernel) and can't set any resolution other than 1024x768 or 800x600 (on both my FHD monitor and my TV). I then ran The Talos Principle's built-in benchmark at 1024x768 (Ultra-High settings) with OpenGL and Vulkan and got ~7 fps in both (while on Win10 I get 32 and 36 respectively, and at 1080p!). I've managed to add the 1080p mode to xrandr, but after enabling it everything on the screen looks really shitty (like a much lower resolution upscaled, so the fonts are barely readable). Has anyone else faced such issues with AMD GPU-PRO?
Xorg.0.log says:
Code:
[ 29.800] (EE) AMDGPU(0): Unknown EDID version 0
[ 30.213] (EE) AMDGPU(0): Unknown EDID version 0
Last edited by faldzip; 25 March 2016, 04:02 PM.
Comment
-
Xorg.0.log says:
Code:
[ 18.307] (II) AMDGPU(0): glamor detected, initialising EGL layer.
[ 18.308] (II) AMDGPU(0): KMS Pageflipping: enabled
[ 18.308] (II) AMDGPU(0): Output DisplayPort-0 using monitor section Monitor0
[ 18.308] (II) AMDGPU(0): Output HDMI-A-0 has no monitor section
[ 18.308] (II) AMDGPU(0): Output DVI-D-0 has no monitor section
[ 18.308] (II) AMDGPU(0): Output DVI-D-1 has no monitor section
[ 18.308] (II) AMDGPU(0): EDID for output DisplayPort-0
[ 18.308] (EE) AMDGPU(0): Unknown EDID version 0
[ 18.308] (II) AMDGPU(0): EDID for output HDMI-A-0
[ 18.308] (II) AMDGPU(0): Printing probed modes for output HDMI-A-0
[ 18.308] (II) AMDGPU(0): Modeline "1024x768"x60.0 65.00 1024 1048 1184 1344 768 771 777 806 -hsync -vsync (48.4 kHz e)
[ 18.308] (II) AMDGPU(0): Modeline "800x600"x60.3 40.00 800 840 968 1056 600 601 605 628 +hsync +vsync (37.9 kHz e)
[ 18.308] (II) AMDGPU(0): Modeline "800x600"x56.2 36.00 800 824 896 1024 600 601 603 625 +hsync +vsync (35.2 kHz e)
[ 18.308] (II) AMDGPU(0): Modeline "848x480"x60.0 33.75 848 864 976 1088 480 486 494 517 +hsync +vsync (31.0 kHz e)
[ 18.308] (II) AMDGPU(0): Modeline "640x480"x59.9 25.18 640 656 752 800 480 490 492 525 -hsync -vsync (31.5 kHz e)
[ 18.308] (II) AMDGPU(0): EDID for output DVI-D-0
[ 18.308] (II) AMDGPU(0): EDID for output DVI-D-1
[ 18.308] (II) AMDGPU(0): Output DisplayPort-0 disconnected
[ 18.308] (II) AMDGPU(0): Output HDMI-A-0 connected
[ 18.308] (II) AMDGPU(0): Output DVI-D-0 disconnected
[ 18.308] (II) AMDGPU(0): Output DVI-D-1 disconnected
[ 18.309] (II) AMDGPU(0): Using exact sizes for initial modes
[ 18.309] (II) AMDGPU(0): Output HDMI-A-0 using initial mode 1024x768
Regarding OpenCL: I just tried to check OpenCL performance by rendering BMW27.blend with Blender 2.77, but Blender crashes before I even get to the Devices section in the User Preferences. The segfault happens inside the clGetPlatformIDs call:
Code:
Program received signal SIGSEGV, Segmentation fault.
0x00007fffe89f9d32 in amdgpu_query_gpu_info () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libdrm_amdgpu.so.1
(gdb) bt
#0 0x00007fffe89f9d32 in amdgpu_query_gpu_info () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libdrm_amdgpu.so.1
#1 0x00007fffd9d4ffee in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#2 0x00007fffd9d5065b in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#3 0x00007fffd9d52c2b in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#4 0x00007fffd9d42648 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#5 0x00007fffd9a38d53 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#6 0x00007fffd99badf9 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#7 0x00007fffd99bae57 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#8 0x00007fffd99bbba9 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#9 0x00007fffd9981f54 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#10 0x00007fffd99833e7 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#11 0x00007fffd9983576 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#12 0x00007fffd9942ca0 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#13 0x00007fffd995ceb7 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#14 0x00007fffd992c493 in clIcdGetPlatformIDsKHR () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libamdocl64.so
#15 0x00007fffdfd9876e in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libOpenCL.so
#16 0x00007fffdfd9a647 in ?? () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libOpenCL.so
#17 0x00007ffff771ea90 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103
#18 0x00007fffdfd98d31 in clGetPlatformIDs () from /usr/lib/x86_64-linux-gnu/amdgpu-pro/libOpenCL.so
#19 0x0000000001f938c4 in ?? ()
#20 0x0000000001f963ac in ccl::device_opencl_info(ccl::vector<ccl::DeviceInfo, ccl::GuardedAllocator<ccl::DeviceInfo> >&) ()
#21 0x0000000001f7fbf9 in ccl::Device::available_devices() ()
#22 0x0000000001ed3cf5 in ?? ()
#23 0x00000000019ff338 in ?? ()
#24 0x00000000018eef79 in RNA_property_enum_items_ex ()
#25 0x00000000018eefa5 in RNA_property_enum_items ()
#26 0x00000000018ef526 in RNA_property_enum_identifier ()
#27 0x000000000151277b in ?? ()
#28 0x0000000001519f20 in pyrna_prop_to_py ()
#29 0x000000000151a270 in ?? ()
#30 0x00000000029d54e8 in PyEval_EvalFrameEx ()
#31 0x00000000029d9fa1 in PyEval_EvalFrameEx ()
#32 0x00000000029dba82 in ?? ()
#33 0x00000000029dbb88 in PyEval_EvalCodeEx ()
#34 0x000000000294844f in ?? ()
---Type <return> to continue, or q <return> to quit---
#35 0x000000000291f00a in PyObject_Call ()
#36 0x0000000001519584 in ?? ()
#37 0x00000000019ea50c in ?? ()
#38 0x0000000001404529 in ED_region_panels ()
#39 0x000000000116cca8 in ?? ()
#40 0x0000000001403746 in ED_region_do_draw ()
#41 0x0000000001147557 in wm_draw_update ()
#42 0x0000000001142d58 in WM_main ()
#43 0x00000000010ea92a in main ()
Comment
-
Originally posted by SystemCrasher
Which is strange. What about this brand-new HBM on Fury? It's supposed to kick ass, isn't it?
Michael is going to look into running with "large" settings (-s 3 or better yet -s 4) when time permits.
Last edited by bridgman; 25 March 2016, 06:43 PM.
Test signature
- Likes 2
Comment
-
Originally posted by bridgman
Have you read posts 17-24 yet?
AFAICS there isn't enough work in the "small" benchmark option (-s 1) to occupy all the shaders on a large GPU or allow much latency hiding.
Michael is going to look into running with "large" settings (-s 3 or better yet -s 4) when time permits.
Last edited by SystemCrasher; 25 March 2016, 06:26 PM.
Comment