A 20 Year Old Chipset Workaround Has Been Hurting Modern AMD Linux Systems

  • yump
    replied
    Originally posted by ryao View Post

    As someone who does that, I can tell you that this does not appear in it. Profiling the kernel is typically done to see what is done when the CPU is busy, not how much it is in power states. You would not see this in a typical perf profile like the kind most kernel developers use.
    Yeah, I would really like to see details of how this was found.


  • yump
    replied
    Originally posted by kozman View Post

    To me it didn't come off sounding like dumb luck to have found it. Are there even tools for profiling all nooks and crannies of the code to find these kinds of things?
    Perf can collect events system-wide with -a, but it sounds like this was discovered with something called "instruction-based sampling", which sounds pretty cool in the whitepaper. IDK if profiling with usual cycles or instructions PMCs would've found it.

    There is apparently support for it in perf-events, but it may not be wired up to be as user friendly as PMC sampling. Unfortunately I don't have any recent enough AMD hardware to play with it. There may be something useful in this recent patch or the discussion around it, or more likely this thread from when the feature was first added. The perf tools seem to change a lot though, and the documentation around how they should be used is not great. I didn't even know about event modifiers or "precise levels" until just now.

    ...actually, it looks like you can just

    perf top -a -e cycles:P

    and get full-system profiling with the maximum precision your hardware is capable of. Nice! Most userspace programs don't have debug symbols, though, and I've not had great luck getting perf to use debuginfod.
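
A minimal sketch of choosing between the two approaches mentioned above, IBS sampling and plain `cycles:P`. It assumes the kernel exposes IBS op sampling as a PMU named `ibs_op` under `/sys/bus/event_source/devices`. That path, the `ibs_op//` event spelling, and the `pick_event` helper are assumptions for illustration, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: pick a perf event depending on whether an IBS op PMU
# appears in sysfs. The default path below is an assumption.
pick_event() {
    pmu_dir="${1:-/sys/bus/event_source/devices/ibs_op}"
    if [ -d "$pmu_dir" ]; then
        # Assumed spelling of the AMD IBS op event for perf.
        echo "ibs_op//"
    else
        # Fall back to the maximum-precision cycles event from the post above.
        echo "cycles:P"
    fi
}

# Print the system-wide command instead of running it, since -a usually
# needs elevated privileges.
echo "perf record -a -e $(pick_event) -- sleep 10"
```

Running the printed command and then `perf report` would show where samples landed, though whether idle-path code shows up depends on the hardware and the event used.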


  • Jahimself
    replied
    I find it fascinating that someone discovered this after so many years and managed to fix it!


  • ryao
    replied
    Originally posted by carewolf View Post

    Yeah, you would need to run a kernel profiler, and who does that?
    As someone who does that, I can tell you that this does not appear in it. Profiling the kernel is typically done to see what is done when the CPU is busy, not how much it is in power states. You would not see this in a typical perf profile like the kind most kernel developers use.


  • filbo
    replied
    Originally posted by bezirg View Post
    Great find by the AMD engineer, I hope this lands in 6.0!
    Interestingly, the patch which is actually enqueued for inclusion is by an Intel engineer (link near the bottom of Michael's article). That patch's writeup characterizes it as an old Intel bug which is being cleaned up to the benefit of AMD.

    So, kudos to Dave Hansen -- and to Intel for apparently fostering a work environment where Dave feels empowered to issue patches aimed squarely at improving Enemy Number One's product line...


  • Radtraveller
    replied
    Originally posted by coder View Post
    This was obviously found by someone doing workload profiling and then going on a hunt for the offending "dummy reads", as indicated in the patch:

    "Sampling certain workloads with IBS on AMD Zen3 system shows that a significant amount of time is spent in the dummy op"


    The way you phrased it makes it sound like the issue was discovered through code inspection.
    linux suffers irritable bowel syndrome on some versions of bowels…(?)

    I am not the coder type, so I am not sure why a “modern” distro that limits itself to hardware no older than, say, 2 years wouldn’t be popular enough to garner devs and users.
    I am still attempting to build a kernel without a bunch of stuff that doesn’t look needed... until something fails. Oh well. I know I am an amateur, but it is fun and a learning experience. Lots to read and comprehend.


  • coder
    replied
    Originally posted by Mahboi View Post
    I wonder how much code clarity, speed and maintenance efficiency we'd have if we canned Linux and rebuilt a new OS from scratch with the experience Linux has given us.
    There are plenty of cases where you have a mostly compute-bound workload and a different OS wouldn't make much difference. There are others where you could probably get a multiple of the performance, particularly if you radically redesigned the security architecture to reduce the number of syscalls or certain overheads associated with paging.

    Linux probably has a fair amount of gas left in the tank, but I think the profusion of cores & multithreaded workloads represents a new challenge that it hasn't fully taken onboard. In short, I think Linux relies too heavily on userspace threading for utilization of multiple cores, but that's not the only option.

    Anyway, it's not that people aren't trying. How long has Google been working on Fuchsia? But Linux is going to be extremely difficult to displace, due to the amount of industry support behind it.


  • Mahboi
    replied
    Horrid thought because of the necessary workload, but I wonder how much code clarity, speed and maintenance efficiency we'd have if we canned Linux and rebuilt a new OS from scratch with the experience Linux has given us. OpenVPN vs Wireguard style.

    Yes yes, I'm just dreaming.


  • coder
    replied
    Originally posted by Volta View Post
    It seems Linux competition is so slow AMD didn't even notice such suboptimal performance.
    It's not due to lack of competition. It's because AMD was probably in fire-fighting mode for a long time, as its software team was playing catch-up and doing all the server support & enablement work, and only recently started getting into more expansive performance tuning.


  • NateHubbard
    replied
    Originally posted by Vistaus View Post

    I would love better performance on my AMD system, but I don't want to make the "don't-remove-old-stuff-from-the-kernel" gang mad.
    It's fine, I make them mad every month or so.
