Mesa's Disk Cache Code Now Better Caters To 4+ Core Systems
Originally posted by x_wing View Post
multiprocessing is a Python package that is shipped by default with the interpreter (AFAIK). As we're talking about C code, I think the only option here is to parse /proc/cpuinfo, but that is probably not portable.
glibc provides an API just as simple and easy to use as the Python one:
Code:
#include <stdio.h>
#include <sys/sysinfo.h>

int main(int argc, char *argv[])
{
    printf("This system has %d processors configured and "
           "%d processors available.\n",
           get_nprocs_conf(), get_nprocs());
    return 0;
}
Sure, it's glibc-specific, but it's just an extra four or five lines to #ifdef your way to falling back to the current behaviour, and "do it properly on the UNIX-like with the largest market share after OSX, fall back to the current behaviour elsewhere" is better than "use the current behaviour everywhere".
Something like this:
Code:
int threads = 4;
#ifdef __GLIBC__
if (get_nprocs() < 4)
    threads = 1;
#endif
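For what it's worth, a slightly more portable variant of the same idea could use sysconf(_SC_NPROCESSORS_ONLN), which glibc, musl, and the BSDs all support; since _SC_NPROCESSORS_ONLN is a common extension rather than strict POSIX, the #ifdef guard is still prudent. A sketch (pick_thread_count is a hypothetical name, not anything in Mesa):

```c
#include <unistd.h>

/* Sketch only, not Mesa's actual code: default to the current
 * fixed behaviour, then drop to one thread on small systems
 * if the online CPU count can be queried. */
static int pick_thread_count(void)
{
    int threads = 4;
#ifdef _SC_NPROCESSORS_ONLN
    long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    if (ncpus > 0 && ncpus < 4)
        threads = 1;
#endif
    return threads;
}
```

sysconf() returns -1 when the name isn't supported at runtime, which is why the result is checked for being positive before it is trusted.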
Last edited by ssokolow; 19 September 2019, 11:16 AM.
But by making its behaviour more dependent on the system config, you risk making it more fragile... You multiply the number of possible states once more, and make possible bugs harder to track and reproduce.
A while back, I'd have agreed that making it dynamic was a good idea. I'm not so sure nowadays. This is at worst a very minor performance hit, on systems that are not that performant to begin with, so a small absolute loss. Yes, one can hypothesize that many such design decisions accumulate, but that only reinforces my feeling that robustness and predictability need to be privileged over raw performance numbers, to a certain extent... And there are diminishing returns everywhere, so I'd happily trade 1% perf for 400% stability.
That said, this whole discussion really is nitpicking/bikeshedding in the literal sense: people spending a lot of time on the only topic in the discussion they can understand. If everyone gets that part, everyone wants to share their opinion on it.
And this is counterproductive. I'll plead guilty.
- Likes 2
I suppose it doesn't really matter if more threads than available cores are used, as it's just a tiny copy operation, and rendering is put on hold anyway until the binaries are in VRAM.
When you let x264 or 7-zip run on 1-2 more cores than are available, it doesn't really impact performance either.
Originally posted by ssokolow View Post
To be honest, that sounds like an extreme case of premature optimization and being penny-wise and pound-foolish, given that it's apparently as simple as adding #include <sys/sysinfo.h> followed by calling get_nprocs() while, even at lowest priority, spawning more threads still has potential to cause unexpected effects when dealing with a CPU scheduler operating on the system as a whole.
Regardless, if it matters so much to you, feel free to submit a patch to mesa updating it to check how many CPUs there are. That's part of the beauty of open source.
Just be prepared to show some evidence of a case where this actually helps performance, as that will probably be required to justify adding new code and making it more complicated.
- Likes 1
Originally posted by smitty3268 View Post
I think you have your understanding of premature optimization reversed here.
When such a check is so short and simple, the only other interpretation that readily comes to mind doesn't lend itself to a very favourable impression of how the developer goes about their craft.
Originally posted by smitty3268 View Post
Regardless, if it matters so much to you feel free to submit a patch to mesa updating it to check how many cpus there are. That's part of the beauty of open source.
Just be prepared to show some evidence of a case where this actually helps performance, as that will probably be required to justify adding new code and making it more complicated.
- Likes 2
Originally posted by cb88 View Post
The problem is stupid code like this gets written every 3 months on Mesa... they should just write it once and be done, instead of reoptimising it every time the typical number of CPUs changes. Obviously it shouldn't bother spawning more than whatever get_nprocs() returns.
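The write-it-once clamp described above could look something like this sketch, using the glibc get_nprocs() call discussed earlier in the thread (MAX_CACHE_THREADS is a made-up name standing in for whatever cap Mesa actually uses):

```c
#include <sys/sysinfo.h>      /* glibc-specific get_nprocs() */

#define MAX_CACHE_THREADS 4   /* hypothetical existing cap */

/* Never spawn more worker threads than there are online CPUs,
 * but keep the existing upper bound regardless of core count. */
static int clamp_thread_count(void)
{
    int ncpus = get_nprocs();
    return (ncpus < MAX_CACHE_THREADS) ? ncpus : MAX_CACHE_THREADS;
}
```

Because the cap is computed rather than hard-coded, the same code keeps working as typical core counts drift, which is the point being made here.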
Originally posted by schmidtbag View Post
On the surface, I would normally totally agree with you. But, contrary to what a lot of people believe, more cores don't always yield better performance. In some cases, you might actually hurt performance. Some tasks are better off having a cap on how many threads you use. Seeing as the disk cache is probably going to get bottlenecked by disk write speeds, I'm sure 4 threads is more than enough, even for high-end SSDs.