Mesa's Disk Cache Code Now Better Caters To 4+ Core Systems

  • Mesa's Disk Cache Code Now Better Caters To 4+ Core Systems

    Phoronix: Mesa's Disk Cache Code Now Better Caters To 4+ Core Systems

    Most Linux gamers these days should be running at least quad-core systems so Mesa 19.3 has been updated to reflect that reality with the number of CPU threads used by their disk cache...

    http://www.phoronix.com/scan.php?pag...ache-Quad-Core

  • cb88
    replied
    Originally posted by schmidtbag
    On the surface, I would normally totally agree with you. But, contrary to what a lot of people believe, more cores don't always yield better performance. In some cases, they might actually hurt performance. Some tasks are better off with a cap on how many threads they use. Since the disk cache is probably going to be bottlenecked by disk write speeds, I'm sure 4 threads is more than enough, even for high-end SSDs.
    I agree. That said, the pattern over the past few years has been to bump these values up incrementally instead of finding a more intelligent way of doing it.


  • schmidtbag
    replied
    Originally posted by cb88
    The problem is that stupid code like this gets written every three months in Mesa... they should just write it once and be done, instead of reoptimising it every time the typical number of CPUs changes. Obviously it shouldn't bother spawning more threads than whatever get_nprocs() returns.
    On the surface, I would normally totally agree with you. But, contrary to what a lot of people believe, more cores don't always yield better performance. In some cases, they might actually hurt performance. Some tasks are better off with a cap on how many threads they use. Since the disk cache is probably going to be bottlenecked by disk write speeds, I'm sure 4 threads is more than enough, even for high-end SSDs.


  • ssokolow
    replied
    Originally posted by smitty3268
    I think you have your understanding of premature optimization reversed here.
    Normally, you'd be correct, but I'm thinking of it as a premature optimization in the domain of simplicity of testing and maintenance, not performance.

    When such a check is so short and simple, the only other interpretation that readily comes to mind doesn't lend itself to a very favourable impression of how the developer goes about their craft.

    Originally posted by smitty3268
    Regardless, if it matters so much to you, feel free to submit a patch to Mesa updating it to check how many CPUs there are. That's part of the beauty of open source.

    Just be prepared to show some evidence of a case where this actually helps performance, as that will probably be required to justify adding new code and making it more complicated.
    I don't have time to familiarize myself with a new codebase and, even if I did, I can't test it. I haven't bought a new GPU in long enough that all my GPUs are still nVidia ones from the days when AMD wasn't a viable option for my needs.


  • smitty3268
    replied
    Originally posted by ssokolow

    To be honest, that sounds like an extreme case of premature optimization and of being penny-wise and pound-foolish. It's apparently as simple as adding #include <sys/sysinfo.h> and calling get_nprocs(), whereas spawning more threads, even at lowest priority, can still cause unexpected effects when the CPU scheduler is operating on the system as a whole.
    I think you have your understanding of premature optimization reversed here.

    Regardless, if it matters so much to you, feel free to submit a patch to Mesa updating it to check how many CPUs there are. That's part of the beauty of open source.

    Just be prepared to show some evidence of a case where this actually helps performance, as that will probably be required to justify adding new code and making it more complicated.


  • ssokolow
    replied
    Originally posted by PuckPoltergeist

    Build-time check for a runtime option?
    If built against glibc, call get_nprocs() at runtime.
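
    A minimal sketch of what that could look like, to show that the two checks happen at different times: the #ifdef decides at build time whether get_nprocs() is available at all, while the CPU count is only queried when the function runs. The names pick_cache_threads and DEFAULT_CACHE_THREADS are made up for illustration and are not taken from Mesa.

```c
#include <stdio.h>        /* any libc header; on glibc this also defines __GLIBC__ */
#ifdef __GLIBC__
#include <sys/sysinfo.h>  /* get_nprocs() is a glibc extension */
#endif

#define DEFAULT_CACHE_THREADS 4  /* hypothetical name for the fixed default */

/* Hypothetical helper, not Mesa code. The #ifdef is resolved at build time,
 * but the CPU count is read every time this function is called. */
static int pick_cache_threads(void)
{
    int threads = DEFAULT_CACHE_THREADS;
#ifdef __GLIBC__
    /* Fewer CPUs than the default: fall back to a single thread. */
    if (get_nprocs() < DEFAULT_CACHE_THREADS)
        threads = 1;
#endif
    return threads;
}
```

    Built against another libc, the guarded block simply disappears and the function keeps returning the fixed default, so the behaviour there is unchanged.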


  • PuckPoltergeist
    replied
    Originally posted by ssokolow
    Something like this:
    Code:
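    /* Needs <sys/sysinfo.h> for get_nprocs() when building against glibc. */
    int threads = 4;
    #ifdef __GLIBC__
    /* Fewer than 4 CPUs available: fall back to a single thread. */
    if (get_nprocs() < 4)
        threads = 1;
    #endif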
    Build-time check for a runtime option?


  • aufkrawall
    replied
    I suppose it doesn't really matter if more threads are used than there are cores available, as it's just a tiny copy operation and rendering is halted anyway until the binaries are in VRAM.
    When you let x264 or 7-Zip run with 1-2 more threads than there are cores available, it doesn't really impact performance either.


  • M@yeulC
    replied
    But by making its behaviour more dependent on the system config, you risk making it more fragile... You multiply the number of possible states once more, and make possible bugs harder to track down and reproduce.

    A while back, I'd have agreed that making it dynamic was a good idea. I'm not so sure nowadays. This is at worst a very minor performance hit, on systems that are not that performant to begin with, so a small absolute loss. Yes, one can hypothesize that many such design decisions add up, but that only reinforces my feeling that robustness and predictability need to be prioritized over raw performance numbers, to a certain extent... And there are diminishing returns everywhere, so I'd happily trade 1% perf for 400% stability.

    That said, this whole discussion really is nitpicking/bikeshedding, in the literal sense: people spending a lot of time on the only topic in the discussion that they can understand. If everyone gets that part, everyone wants to share their opinion on it, and this is counterproductive. I'll plead guilty.


  • atomsymbol
    replied
    Originally posted by tarceri
    Anything more than 4 threads is unlikely to help much and is just overkill.
    Some notes:
    • The expression "min(4, numCPUs)" is most likely a better choice than the expression "4".
    • <del>Is 1 of the 4 threads started with normal priority? If not, I think it definitely should be.</del>
    Last edited by atomsymbol; 09-19-2019, 11:19 AM. Reason: Disable the 2nd note
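
    For illustration, the "min(4, numCPUs)" idea from the first note could be sketched roughly as follows. This is only a sketch under assumptions: the helper names and MAX_CACHE_THREADS are invented here, and the CPU count is read with sysconf(_SC_NPROCESSORS_ONLN), which is a widely available extension rather than something the thread or Mesa prescribes.

```c
#include <unistd.h>  /* sysconf() */

#define MAX_CACHE_THREADS 4  /* hypothetical name for the cap of 4 */

/* Hypothetical helper: clamp a reported CPU count to the range [1, cap]. */
static int clamp_thread_count(long num_cpus, int cap)
{
    if (num_cpus < 1)
        return 1;            /* query failed or reported nothing usable */
    if (num_cpus > cap)
        return cap;
    return (int)num_cpus;
}

/* Hypothetical helper: min(4, numCPUs), never going below one thread. */
static int cache_thread_count(void)
{
    /* _SC_NPROCESSORS_ONLN is common (Linux, BSDs, macOS) but not required
     * by strict POSIX; sysconf() returns -1 when the query fails. */
    long online = sysconf(_SC_NPROCESSORS_ONLN);
    return clamp_thread_count(online, MAX_CACHE_THREADS);
}
```

    Splitting the clamp from the query keeps the arithmetic testable without depending on the machine it runs on: clamp_thread_count(2, 4) gives 2, clamp_thread_count(8, 4) gives 4, and a failed query (-1) falls back to 1.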
