**LeJimster** · 13 July 2016, 03:30 PM

As far as I'm concerned a shader cache is needed for games like The Talos Principle. Running around a map for several minutes for the stutters to go away drives me bonkers and I got fed up with it after a while, so I'm waiting for Vulkan support on my R9 270 so I can finally finish the game (even if I have to go with the "pro" driver stack for a while).
Also the new Unreal Tournament suffers badly too. I'm all for other improvements, but there is a reason why Nvidia and AMD use shader caches in their proprietary drivers...

As smitty says, I was under the impression one was in the works for radeonsi, or at least in the thoughts of the developers. Possibly it's just not taking priority compared to the remaining OpenGL work.

**marek** · 13 July 2016, 06:51 PM

Originally posted by LeJimster

As far as I'm concerned a shader cache is needed for games like The Talos Principle. Running around a map for several minutes for the stutters to go away drives me bonkers and I got fed up with it after a while, so I'm waiting for Vulkan support on my R9 270 so I can finally finish the game (even if I have to go with the "pro" driver stack for a while).
Also the new Unreal Tournament suffers badly too. I'm all for other improvements, but there is a reason why Nvidia and AMD use shader caches in their proprietary drivers...

As smitty says, I was under the impression one was in the works for radeonsi, or at least in the thoughts of the developers. Possibly it's just not taking priority compared to the remaining OpenGL work.

Did you try the latest radeonsi driver? It doesn't stutter as much as it used to. Mesa 11.2 + LLVM 3.8 completely remove compiler-caused stuttering for good apps and reduce it for bad apps. Mesa 11.2 also skips compilation of duplicated shaders, which reduces the compiler workload by 30% for some games. On top of that, Mesa 12.1-dev has multithreaded compilation, which should make loading times 4x faster (the compiler can use up to 4 cores now), but that doesn't affect the stuttering much. I just tested Talos Principle today. There is one small stutter at the beginning and one more later, but it's barely noticeable. I didn't notice it until I saw the benchmark, where it's more obvious.

EDIT: The improvements above only apply to radeonsi.
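
As a rough illustration of the two ideas above (skipping duplicated shaders and compiling on up to 4 threads), here is a minimal Python sketch. It is not radeonsi code: `fake_compile` and `compile_all` are hypothetical names, and the string-based "compiler" only stands in for the real one.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_compile(source: str) -> str:
    # Hypothetical stand-in for the real (expensive) shader compiler.
    return "compiled(" + source + ")"


def compile_all(sources, max_workers=4):
    # Drop duplicated shaders while keeping first-seen order,
    # so each distinct source is compiled only once.
    unique = list(dict.fromkeys(sources))

    # Compile the distinct shaders on a small pool of worker threads
    # (4 here, mirroring the "up to 4 cores" mentioned above).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = dict(zip(unique, pool.map(fake_compile, unique)))

    # Hand every original request its result, duplicates included.
    return [results[s] for s in sources], len(unique)


if __name__ == "__main__":
    outputs, compiled_count = compile_all(["a", "b", "a", "c", "b"])
    print(compiled_count, "compilations for", len(outputs), "requests")
```

This only shows the structure. In Python the threads would not speed up CPU-bound work, and a real driver would compare the shaders' contents or hashes rather than source strings held in a list.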

**SaucyJack** · 13 July 2016, 06:56 PM

"There have also been conflicting views about how many games would actually benefit from a persistent shader cache (ie how many games compile shaders for the first time during the middle of gameplay)... there are still discussions going on about that."

A lot of games exhibit that behavior. But there really isn't a discussion as all games benefit in loading times if nothing else.

**geearf** · 13 July 2016, 07:57 PM

removed

**tarceri** · 13 July 2016, 09:41 PM

Originally posted by tpruzina

The only real difference is doing a sha1 hash prior to parsing the shader file (negligible performance hit) and a one-time load of the shader cache lookup table from disk (negligible on an SSD; on an HDD a matter of reading a few disk blocks at most). If you compare loading a shader from the IR cache (generally a few disk blocks) with the operation of shader parsing and compilation, then the shader cache wins hands down every time (except for maybe extreme cases of heavy IO load and an extremely simple shader).
One difference of course is the state after a driver update (one that changed the intermediate representation, thus invalidating the shader cache), when it might be slightly slower.

Many games actually try to prevent the "stutter effect" of dynamic shader loading by compiling "everything" every time (new game instance); an example of this is Dota 2. The first game load takes _a_lot_ longer than consecutive loads and much of this time is spent on compiling GLSL shaders (also mmapping big .vpk files containing in-game stuff).

Basically, in the worst case stuff might take a few milliseconds longer; in the best case you can see load times improve by seconds (compared to static shader compilation on each run). Even trivial GLSL shaders undergo 2-pass parsing.

Anyways, I looked at V1 of these patches some time ago and while the code itself wasn't all that clean, it was fairly straightforward and trivial.
The problem with it is that the IR differs somewhat across Mesa-supported hardware and they all have their quirks (or at least the V1 patches were riddled with workarounds).

This is a pretty good summary, thanks. The bit I would disagree with is your comments about different IR; this is not really a problem at all and I'm not sure what workarounds you are talking about.

To keep things simple we don't actually cache the IR. We just check the sha1 and either say yes, we have seen this before, and skip the creation of the IR, or no, we haven't, and compile as normal. At the end of the line the driver backend also needs to cache/load its own binary format, but that's pretty straightforward. The ugly part is that, because we keep things simple by not storing the IR, we have to do extra checks in case we need to fall back to a full recompile: rather than just using a cached IR we need to recompile from source. This should be rare, and again once it's cached it won't happen again until Mesa is upgraded (in which case the cache object is removed and we fall back to a full recompile).
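
The hit/miss flow described above can be sketched roughly as follows. This is a toy Python model, not the Mesa implementation: `ShaderCache`, `get_binary` and the in-memory dict (standing in for files on disk) are assumptions made for illustration.

```python
import hashlib


class ShaderCache:
    """Toy model of a cache keyed by the sha1 of the shader source (hypothetical)."""

    def __init__(self):
        self._binaries = {}  # sha1 hex digest -> backend binary (stand-in for disk)
        self.compiles = 0    # how many times the slow path actually ran

    def _compile(self, source: str) -> bytes:
        # Placeholder for the expensive path: parse, build IR, backend codegen.
        self.compiles += 1
        return b"BIN:" + source.encode("utf-8")

    def get_binary(self, source: str) -> bytes:
        key = hashlib.sha1(source.encode("utf-8")).hexdigest()
        cached = self._binaries.get(key)
        if cached is not None:
            # Seen before: skip IR creation and reuse the stored binary.
            return cached
        # Not seen before: compile as normal and remember the result.
        binary = self._compile(source)
        self._binaries[key] = binary
        return binary


if __name__ == "__main__":
    cache = ShaderCache()
    src = "void main() {}"
    cache.get_binary(src)   # first request runs the slow path
    cache.get_binary(src)   # second request is served from the cache
    print("compiles:", cache.compiles)
```

Note that only the final binary is stored, never the IR, so any situation where the binary cannot be used means going all the way back to the source.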

**LeJimster** · 13 July 2016, 10:29 PM

Originally posted by marek

Did you try the latest radeonsi driver? It doesn't stutter as much as it used to. Mesa 11.2 + LLVM 3.8 completely remove compiler-caused stuttering for good apps and reduce it for bad apps. Mesa 11.2 also skips compilation of duplicated shaders, which reduces the compiler workload by 30% for some games. On top of that, Mesa 12.1-dev has multithreaded compilation, which should make loading times 4x faster (the compiler can use up to 4 cores now), but that doesn't affect the stuttering much. I just tested Talos Principle today. There is one small stutter at the beginning and one more later, but it's barely noticeable. I didn't notice it until I saw the benchmark, where it's more obvious.

EDIT: The improvements above only apply to radeonsi.

I'm running a recent Mesa 12.1-devel + LLVM 3.9 from lordheavy's repo on Arch.

I didn't realise how long it has been since I loaded Talos up (6-9 months). Anyway, it does appear to be much, much better. I just tested it and got 3 short stalls on one level, went through a portal and got 2 or 3 stalls in the next section, then moved through another portal and got 1 or 2 stalls, or hiccups should I say. The rest of the time it is pretty smooth now. So that's really great... But wouldn't a shader disk cache eliminate the stalls completely after the first run? Or would there always be stalls on Talos just from loading the shaders from disk?

I guess I'm interested in the reason, if any, against a proper shader cache?

Thanks for the work anyway, I might actually go back and finish Talos now.

**tarceri** · 14 July 2016, 12:18 AM

Originally posted by LeJimster

I'm running a recent Mesa 12.1-devel + LLVM 3.9 from lordheavy's repo on Arch.

I didn't realise how long it has been since I loaded Talos up (6-9 months). Anyway, it does appear to be much, much better. I just tested it and got 3 short stalls on one level, went through a portal and got 2 or 3 stalls in the next section, then moved through another portal and got 1 or 2 stalls, or hiccups should I say. The rest of the time it is pretty smooth now. So that's really great... But wouldn't a shader disk cache eliminate the stalls completely after the first run? Or would there always be stalls on Talos just from loading the shaders from disk?

I guess I'm interested in the reason, if any, against a proper shader cache?

Thanks for the work anyway, I might actually go back and finish Talos now.

With a shader cache you would get the stalls once, the first time you ran the game/level/etc. as you do now, then the next time they would be gone. Note that with the current implementation upgrading Mesa would cause shader cache objects to be deemed incompatible and recompiled, so it might not be great if you are updating Mesa daily or something like that, although this could be worked around if it was a big issue.

The cache resolves any noticeable stalls/hiccups on i965 for me.

The time it takes to load from disk is negligible. I've seen compile times in the 10s of milliseconds and link times in the 100s of milliseconds. When loading precompiled shaders from disk the times I've seen are in microseconds. Also notable is that this is on an HDD, not an SSD.

**geearf** · 14 July 2016, 12:45 AM

Originally posted by tarceri

Note that with the current implementation upgrading Mesa would cause shader cache objects to be deemed incompatible and recompiled, so it might not be great if you are updating Mesa daily or something like that, although this could be worked around if it was a big issue.

Oh that's interesting.
Would it then be a negative in this case (assuming you only start the game once a day)?

**tarceri** · 14 July 2016, 01:10 AM

Originally posted by geearf

Oh that's interesting.
Would it then be a negative in this case (assuming you only start the game once a day)?

You would likely not notice any difference, but you could also disable the cache if you wanted. Also, it's most likely that the cache is still compatible, so it's possible that we or downstream packages could choose a more manual way to check different versions; we currently just use the SHA of the latest commit or, in a release, the release version.
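
One way to picture the version check is a stamp stored next to each cache entry. The sketch below is only an assumption about how such a check could look; `VersionedCache` and the version strings are made up for illustration and are not Mesa's actual code.

```python
class VersionedCache:
    """Toy cache whose entries are only valid for one driver version (hypothetical)."""

    def __init__(self, driver_version: str):
        # Per the post above, this would be the latest commit SHA,
        # or the release version in a release build.
        self.driver_version = driver_version
        self._entries = {}  # key -> (version stamp, binary)

    def store(self, key: str, binary: bytes) -> None:
        self._entries[key] = (self.driver_version, binary)

    def load(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None             # never cached: caller compiles
        version, binary = entry
        if version != self.driver_version:
            del self._entries[key]  # stale after an upgrade: drop the object
            return None             # caller falls back to a full recompile
        return binary


if __name__ == "__main__":
    cache = VersionedCache("version-A")
    cache.store("shader-key", b"binary")
    print(cache.load("shader-key") is not None)  # usable while the version matches
    cache.driver_version = "version-B"           # simulate a driver upgrade
    print(cache.load("shader-key") is None)      # entry is now rejected
```

With a strict equality check like this, every upgrade costs one round of recompilation even when the cached binaries would still have worked, which is the trade-off discussed above.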

**geearf** · 14 July 2016, 03:51 AM

Good to know, thank you!

Intel's Mesa On-Disk Shader Cache Maturing, Radeon Devs Not Yet Convinced
