AMD Radeon HD 6000 Gallium3D Attempts To Compete With Catalyst
-
I have downclocked the processor from 2.8 GHz to 2.2 GHz.
There is no change in fps: 131-132 fps on every demo.
I then also downclocked the GPU through the nvidia driver (prefer maximum performance, GPU 585->300, MEM 999->600) and retested.
Again no change: 131-132 fps.
I guess I should recompile the kernel, include the powersave daemon and force it (2.8->0.8).
But I don't think the CPU is the barrier anymore.
It seems that this game has some limit that caps the maximum fps at a certain level. That much is certain, and it is independent of the CPU or GPU clock rate.
-
Power profiles
Originally posted by Drago
I think radeon profiles are video BIOS configured profiles. If his card has a broken BIOS, then PM will not work the way it should.
I have written a nearly complete patch for debugging and manually fixing these problems. If there is any demand I might finish it. The problem is that it already works well enough for me, so I have almost stopped development.
Basically the patch allows the user to see AND modify the power states manually. It can be used for underclocking, overclocking and some other related tasks. It can be used like this:
Dump the table:
# cat /sys/class/drm/card0/device/power_table > power_table
Fix any problems:
# vim power_table
Commit changes (this could be added to boot scripts for a more permanent fix):
# cat power_table > /sys/class/drm/card0/device/power_table
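For the boot-script idea, a minimal sketch could look like the following. The function name `restore_power_table` and the optional second argument are made up for illustration. The sysfs file only exists with the patch applied, so the sketch checks for it before writing instead of assuming it is there.

```shell
#!/bin/sh
# Hypothetical helper: write a saved (edited) table back to the driver.
# $1 = saved table file
# $2 = target file (defaults to the sysfs path used in this post)
restore_power_table() {
    saved="$1"
    target="${2:-/sys/class/drm/card0/device/power_table}"

    if [ ! -r "$saved" ]; then
        echo "saved table not readable: $saved" >&2
        return 1
    fi
    # Without the patch (or without root) the target is missing or read-only.
    if [ ! -w "$target" ]; then
        echo "power_table not writable: $target" >&2
        return 1
    fi
    cat "$saved" > "$target"
}
```

From a boot script this would be called as root with the saved file as its only argument, for example `restore_power_table /etc/power_table` (that location is just an example).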
Currently on my box the file looks like this (there is perhaps too much information):
Code:
# power_state,clock_mode dynpm_skip(boolean) engine(kHz) mem(kHz) vddc(mV) vddci(?) [extra info]
0,0 1 100000 157000 900 0 [Boot]
0,1 0 100000 157000 900 0 [Boot]
0,2 0 100000 157000 900 0[*] [Boot]
1,0 1 100000 150000 900 0 [HIGH_SH_DPMS] [Performance] [Single display only]
1,1 0 450000 800000 950 0 [Performance] [Single display only]
1,2 0 550000 800000 1000 0 [HIGH_SH] [Performance] [Single display only]
2,0 1 550000 800000 1000 0
2,1 0 550000 800000 1000 0
2,2 0 550000 800000 1000 0
3,0 1 400000 800000 950 0 [HIGH_MH_DPMS] [Performance]
3,1 0 450000 800000 950 0 [Performance]
3,2 0 550000 800000 1000 0 [HIGH_MH] [Performance]
4,0 1 100000 150000 900 0 [LOW_SH] [LOW_SH_DPMS] [MID_SH_DPMS] [Battery] [Single display only]
4,1 0 100000 150000 900 0 [MID_SH] [Battery] [Single display only]
4,2 0 300000 300000 900 0 [Battery] [Single display only]
5,0 1 100000 150000 900 0 [LOW_MH] [LOW_MH_DPMS] [MID_MH_DPMS] [Battery]
5,1 0 100000 150000 900 0 [MID_MH] [Battery]
5,2 0 300000 300000 900 0 [Battery]
6,0 1 300000 300000 900 0 [Battery]
6,1 0 300000 300000 900 0 [Battery]
6,2 0 300000 300000 900 0 [Battery]
7,0 1 375000 400000 900 0 [Battery]
7,1 0 375000 400000 900 0 [Battery]
7,2 0 375000 400000 900 0 [Battery]
8,0 1 105000 150000 900 0
8,1 0 105000 150000 900 0
8,2 0 300000 300000 900 0
*_MH = Multi Head modes
*_DPMS = Modes used when all monitors are off in powersaving
I have only one monitor, so in my case the important lines are the ones flagged with LOW_SH, MID_SH and HIGH_SH. As can be seen from the list, the power profiles "low" (4,0) and "mid" (4,1) are actually identical on my system.
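Picking out the flagged lines by eye gets tedious, so here is a rough sketch of how a saved dump could be filtered. The helper name `clocks_for_flag` is made up, and the column positions are taken from the header comment of the dump above. The flag is matched as the literal token `[FLAG]`, so `MID_SH` does not also match `MID_SH_DPMS`.

```shell
#!/bin/sh
# Hypothetical helper: read a power_table dump on stdin and print
# "state,mode engine_kHz mem_kHz" for every line carrying the given flag.
clocks_for_flag() {
    awk -v flag="[$1]" 'index($0, flag) > 0 { print $1, $3, $4 }'
}

# Usage with the two lines from my table that matter here:
clocks_for_flag LOW_SH <<'EOF'
4,0 1 100000 150000 900 0 [LOW_SH] [LOW_SH_DPMS] [MID_SH_DPMS] [Battery] [Single display only]
4,1 0 100000 150000 900 0 [MID_SH] [Battery] [Single display only]
EOF
# prints: 4,0 100000 150000
```

Running it once with LOW_SH and once with MID_SH on a saved dump makes it easy to see whether two profiles ended up with the same clocks.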
Let's say I want to fix the identical "low" and "mid" profiles, so that the GPU and memory run at 300 MHz in "mid" mode. This one command does that:
# echo "4,1 0 300000 300000 900 0" > /sys/class/drm/card0/device/power_table
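Since the table takes kHz while most people think in MHz, a small wrapper can build the line and catch typos before anything reaches the driver. This is only a sketch: `make_entry` is a made-up name, and the field order is assumed from the header comment in my dump.

```shell
#!/bin/sh
# Hypothetical helper: build one power_table line from MHz clocks.
# $1 = "power_state,clock_mode"  $2 = dynpm_skip (0/1)
# $3 = engine MHz  $4 = memory MHz  $5 = vddc mV  $6 = vddci
make_entry() {
    for v in "$3" "$4" "$5"; do
        case "$v" in
            ''|*[!0-9]*)
                echo "clocks and voltage must be plain integers" >&2
                return 1 ;;
        esac
    done
    # The table stores clocks in kHz, so convert from MHz.
    echo "$1 $2 $(( $3 * 1000 )) $(( $4 * 1000 )) $5 $6"
}

make_entry 4,1 0 300 300 900 0
# prints: 4,1 0 300000 300000 900 0
```

The output can be checked by eye first and then redirected to the power_table file exactly like the echo command above.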
Now that was only one simple example. I actually use this to fix the modes dynpm uses, but this post will get too long if I start explaining that.
-
Originally posted by smitty3268
The CPU bottleneck is less about code running in the driver and more about making unnecessary kernel calls that result in slow context switches, unnecessary flushes of data between the GPU and CPU, passing around huge data structures that aren't very cache friendly, etc.
None of those things are likely to be affected very much by a change of compiler; they need to be fixed algorithmically.
Oh, and what patents are you referring to? The only two I know about are the floating-textures one (which is a feature, not anything performance related) and the S3TC one (again, just a new feature, and not anything that would impact performance one way or the other).
-
Originally posted by V!NCENT
Yeah, but the problem is that they are everywhere in the pipe, and there is latency in sending things back and forth between the CPU and the GPU. So the ideal solution would be doing all the patented stuff beforehand on the CPU and then sending all the non-patented algorithms/operations to the GPU. But that takes a horrible amount of extra CPU work. It might help to speed that up, since the CPU is serial and the vast number of operations that could run in parallel are rather simple in themselves. So a better algorithm might not improve things that much (if one is even possible), but raw speed would.
-
Originally posted by RealNC
+2
I think the demand is actually much higher, mate. If it works, people tend to ignore it and just use it; if it does not work, people complain and then ignore it and use something else (hardware).