SUSE Linux Enterprise / openSUSE Leap Pursuing x86_64-v2 Optimized Libraries

  • coder
    replied
    Originally posted by Slartifartblast View Post
    I believe V3 brings a much more marked improvement than V2.
    Agreed.

    I was a little disappointed that there's no step for AVX1, but that would make for a pretty uneven jump in terms of the years between steps. V1 maps to about 2005, V2 -> 2009, Sandy Bridge launched in 2011, V3 -> 2013, and V4 started in 2016 but didn't hit mainstream desktop until 2021. So, with an extra AVX1 step the intervals would go 4, 2, 2, 3+ years, or 6, 2, 3+ if you simply moved V2 from Nehalem to Sandy Bridge. Instead, they're 4, 4, 3+ years.

    Anyway, since Sandy Bridge is now a decade old, I guess it shouldn't need special treatment.
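
The interval arithmetic in this post can be checked with a few lines of Python. This is only a sketch: the years are the approximate ones quoted in the post, and the `YEARS` table and `gaps` helper are names made up for illustration.

```python
# Approximate years as quoted in the post above (not authoritative dates).
YEARS = {"v1": 2005, "v2": 2009, "Sandy Bridge": 2011, "v3": 2013, "v4": 2016}

def gaps(steps):
    """Return the number of years between each pair of consecutive steps."""
    return [YEARS[b] - YEARS[a] for a, b in zip(steps, steps[1:])]

# Levels as actually defined.
print(gaps(["v1", "v2", "v3", "v4"]))                  # [4, 4, 3]
# With an extra AVX1 step at Sandy Bridge.
print(gaps(["v1", "v2", "Sandy Bridge", "v3", "v4"]))  # [4, 2, 2, 3]
# With v2 moved from Nehalem to Sandy Bridge.
print(gaps(["v1", "Sandy Bridge", "v3", "v4"]))        # [6, 2, 3]
```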


  • Slartifartblast
    replied
    Originally posted by coder View Post
    Michael sometimes benchmarks -march=native, in addition to other compiler options. The delta between that and the default (which is currently equivalent to v1 or baseline x86-64) should give you an upper-bound on the benefit.

    Here's a recent example:


    It should be noted that some of those benchmarks use hand-optimized architecture-specific paths, based on runtime CPU-detection, meaning they show less difference than expected. However, it does show that where the biggest wins exist for the greatest number of users, we already tend to be harnessing the additional capabilities of our CPUs.

    All of that is to say that I don't expect a very notable improvement from most software, or for most users. On the flip side, this could motivate more packages to build with more aggressive auto-vectorization options, providing a little more upside than I'm expecting. Whatever the case, I expect Michael will publish benchmarks putting this new capability in the best light possible*.

    * People forget that he cherry-picks which benchmarks to publish, and I think he tends to select ones that show the greatest impact from the independent variable.
    Thanks, from reading I believe V3 brings a much more marked improvement than V2.


  • coder
    replied
    Originally posted by Slartifartblast View Post
    Does anyone have any idea of the expected performance gains?
    Michael sometimes benchmarks -march=native, in addition to other compiler options. The delta between that and the default (which is currently equivalent to v1 or baseline x86-64) should give you an upper-bound on the benefit.

    Here's a recent example:


    It should be noted that some of those benchmarks use hand-optimized architecture-specific paths, based on runtime CPU-detection, meaning they show less difference than expected. However, it does show that where the biggest wins exist for the greatest number of users, we already tend to be harnessing the additional capabilities of our CPUs.

    All of that is to say that I don't expect a very notable improvement from most software, or for most users. On the flip side, this could motivate more packages to build with more aggressive auto-vectorization options, providing a little more upside than I'm expecting. Whatever the case, I expect Michael will publish benchmarks putting this new capability in the best light possible*.

    * People forget that he cherry-picks which benchmarks to publish, and I think he tends to select ones that show the greatest impact from the independent variable.
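
To illustrate the point about runtime CPU detection, here is a minimal sketch of the dispatch pattern. It is not taken from any real library: the function names are hypothetical, the "wide" path is only a stand-in for hand-tuned AVX2 code, and the flag set is assumed to come from something like the `flags` line of `/proc/cpuinfo`.

```python
def sum_squares_baseline(values):
    """Portable path; stands in for code built for baseline x86-64."""
    total = 0
    for v in values:
        total += v * v
    return total

def sum_squares_wide(values):
    """Stand-in for a hand-optimized AVX2 path; same result, different code."""
    return sum(v * v for v in values)

def pick_sum_squares(cpu_flags):
    """Choose an implementation once, based on the CPU flags seen at run time."""
    if "avx2" in cpu_flags:
        return sum_squares_wide
    return sum_squares_baseline

# The choice depends on the machine the program runs on,
# not on the baseline the package was compiled for.
impl = pick_sum_squares({"sse2", "sse4_2", "avx", "avx2"})
print(impl.__name__)  # sum_squares_wide
```

Software structured this way already takes its fast path on capable CPUs, which is why rebuilding it for a higher baseline changes little.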


  • Slartifartblast
    replied
    Does anyone have any idea of the expected performance gains?


  • coder
    replied
    Originally posted by Aryma View Post
    Does it really make a big difference whether a library or program is compiled with or without AVX2?
    The main beneficiaries are the usual suspects:
    • software A/V codecs
    • HPC
    • machine learning
    • some graphics & imaging code


  • Aryma
    replied
    Does it really make a big difference whether a library or program is compiled with or without AVX2?


  • skeevy420
    replied
    Originally posted by smitty3268 View Post

    v2 doesn't require AVX, only SSE4.2.
    Offhand and in a hurry, I always think v2 does. I remember that's where the cutoff line is, only I go down instead of up.

    Then the big.LITTLE x86 with v1 or v2 would work, really. I have to imagine that Zen 3 & Somelake would be even faster and more efficient if they had fewer instructions to consider. I can even see product lines like v3.v2 for home/power-efficient/workstation and v4.v2 for workstation/server/high-end. Not like most home users need the full AVXBBQ.

    AMD filed big.LITTLE x86 patents the other day so it's worth speculating especially in conjunction with HWCAPS.


  • smitty3268
    replied
    Originally posted by skeevy420 View Post

    While I can't disagree with that assessment, it wouldn't at all surprise me to see new low-end or power-saving CPU models come without AVX between now and then. v1 seems to be a perfectly fine spec to turn into a strictly low power x86 CPU core.

    Hmm...v1 little and v3/v4 big cores...not like the power efficient purposed cores need all those instructions...I'd design it like that.
    v2 doesn't require AVX, only SSE4.2.
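
For reference, here is a rough sketch of how the levels stack up when expressed as CPU flags. The flag spellings are assumed to match what Linux prints in `/proc/cpuinfo` (for example `pni` for SSE3 and `abm` for LZCNT), and `LEVEL_FLAGS`, `x86_64_level` and `read_cpu_flags` are names invented for this example. Treat it as an approximation of the published level definitions, not a definitive check.

```python
# Flags each level adds on top of the previous one (Linux cpuinfo spellings, assumed).
LEVEL_FLAGS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]

def x86_64_level(flags):
    """Return the highest level for which this and every lower level is satisfied."""
    flags = set(flags)
    level = "x86-64"  # v1 / baseline, assumed for any 64-bit x86 CPU
    for name, required in LEVEL_FLAGS:
        if not required <= flags:
            break  # levels are cumulative, so stop at the first gap
        level = name
    return level

def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag set of the first CPU entry (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()
```

On a Linux machine, `x86_64_level(read_cpu_flags())` would report the level of the local CPU. Note that no AVX flag appears in the v2 set; AVX only enters at v3.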


  • skeevy420
    replied
    Originally posted by smitty3268 View Post

    From what I can tell, the last Atom systems without support were from prior to 2016, and the last server Atoms were 3 years earlier than that.

    If RHEL9 doesn't release until 2023, that's at a minimum 8 year old Atom CPUs.

    I highly doubt they're going to lose much business on that. Any company cheap enough to still be using those machines isn't going to pay for new software licenses either.
    While I can't disagree with that assessment, it wouldn't at all surprise me to see new low-end or power-saving CPU models come without AVX between now and then. v1 seems to be a perfectly fine spec to turn into a strictly low power x86 CPU core.

    Hmm...v1 little and v3/v4 big cores...not like the power efficient purposed cores need all those instructions...I'd design it like that.


  • smitty3268
    replied
    Originally posted by skeevy420 View Post
    If RHEL wants to tell paying customers that new, low power Intel Atom systems are unsupported then that's on them.
    From what I can tell, the last Atom systems without support were from prior to 2016, and the last server Atoms were 3 years earlier than that.

    If RHEL9 doesn't release until 2023, that's at a minimum 8 year old Atom CPUs.

    I highly doubt they're going to lose much business on that. Any company cheap enough to still be using those machines isn't going to pay for new software licenses either.
