
Linux Kernel Developers Discuss Dropping x32 Support


  • #71
    Originally posted by ssokolow View Post

    The Windows ecosystem decided to use "x32" and "x64" as needless contractions of "x86_32" and "x86_64", so they'll find it alright... it just won't mean the same thing.
    Which is exactly the point: he is talking about things without understanding their meaning. The topic here is x32 on Linux, which is not x86, and why it is being dropped is the whole topic: only a minority uses it while it still needs to be maintained. So why are people talking bullshit when it was explained again exactly two posts below?

    Comment


    • #72
      Originally posted by hreindl View Post

      Which is exactly the point: he is talking about things without understanding their meaning. The topic here is x32 on Linux, which is not x86, and why it is being dropped is the whole topic: only a minority uses it while it still needs to be maintained. So why are people talking bullshit when it was explained again exactly two posts below?
      Hey, I agree. I just think it's more helpful to post a response which manages to re-state the relevant information in some form or other.

      Comment


      • #73
        There are many programs in a conventional Linux distro that might reasonably be built using x32. I'd guess that it would be the vast majority.
        It would be interesting to build a distro with x32 as the default and only selected programs as full x86_64.

        What would benefit from the full pointer width? Programs that might do massive buffering (eg. database programs, Firefox, graphics editors, VM providers). Any others?

        The first fallout would be the discovery of stupid portability bugs. So many C programs erroneously assume that pointers can fit in ints. For no serious benefit. This would be all to the good, but the work would fall on the wrong people. A cheap version of this experiment would be to build most things as x86_32 instead of x32. After all, most programs have already been tested in this architecture. But x32 ought to be more performant.

        The second result would probably be a modest improvement in disk space for programs. My wild guess: 10%.

        I would not be confident that there would be much of a performance improvement since it might turn out that most memory and processor is used by those programs that were left as x86_64. On my desktop, most of the resources are consumed by Firefox most of the time.

        Comment


        • #74
          Originally posted by Hugh View Post
          I would not be confident that there would be much of a performance improvement since it might turn out that most memory and processor is used by those programs that were left as x86_64. On my desktop, most of the resources are consumed by Firefox most of the time.
          How large is your L1 cache? Or perhaps total CPU cache?

          Think about that for a second before you say "memory savings are minuscule": uncached memory access is usually the largest bottleneck in normal desktop apps (which are not computationally expensive). This applies even to games. I recall a very interesting article about Doom 3's linked lists: performance was massively boosted by re-laying them out to be more compact and cache friendly (a normal linked list wastes a lot of memory on pointers and suffers from fragmentation).

          Seriously guys.

          Comment


          • #75
            Originally posted by Weasel View Post
            How large is your L1 cache? Or perhaps total CPU cache?

            Think about that for a second before you say "memory savings are minuscule": uncached memory access is usually the largest bottleneck in normal desktop apps (which are not computationally expensive). This applies even to games. I recall a very interesting article about Doom 3's linked lists: performance was massively boosted by re-laying them out to be more compact and cache friendly (a normal linked list wastes a lot of memory on pointers and suffers from fragmentation).

            Seriously guys.
            Linked lists are quite the worst case scenario for cache efficiency, though...

            Comment


            • #76
              Originally posted by AsuMagic View Post
              Linked lists are quite the worst case scenario for cache efficiency, though...
              Indeed, the point was to show that cache efficiency is very important even in games (where a lot of the work happens on the GPU too) and a real bottleneck.

              In general, fast memory is not cheap. RAM is slow, very slow. It may seem fast to people, but it's almost two orders of magnitude slower than L1 cache. And all the caches of the CPU are already about half of the entire CPU die. It's not like they can easily "just increase the cache, man".

              Comment


              • #77
                Originally posted by Weasel View Post
                Indeed, the point was to show that cache efficiency is very important even in games (where a lot of the work happens on the GPU too) and a real bottleneck.

                In general, fast memory is not cheap. RAM is slow, very slow. It may seem fast to people, but it's almost two orders of magnitude slower than L1 cache. And all the caches of the CPU are already about half of the entire CPU die. It's not like they can easily "just increase the cache, man".
                Boosting cache efficiency is only important if the app you are using doesn't fit in it, though. That is to say, once you bust the cache you fall off a performance cliff, but if your app fits in less than half the L1 in its busy loops then it doesn't really matter how much more you shrink it. You either have enough or you don't, and obviously compilers and cpu designers try to ensure most apps do fit already in x64 mode.

                Anyway, the end result is that using x32 to reduce cache usage can massively benefit a small number of apps, but for the most part you'll only see very small benefits.
                Last edited by smitty3268; 01-11-2019, 12:04 AM.

                Comment


                • #78
                  Originally posted by smitty3268 View Post
                  You either have enough or you don't, and obviously compilers and cpu designers try to ensure most apps do fit already in x64 mode.
                  Compilers do absolutely nothing about re-ordering data, except on the stack, and even that is very small data (it's per-function, and the compiler doesn't reorder across most function calls either). They aren't even allowed to in the first place.

                  Sorry to burst that bubble, but yes, you need actual programming skill to reorder your data yourself; there is no magic pill or compiler switch, contrary to what most people oblivious to low-level details (i.e. crappy programmers) seem to think.

                  This is not just for cache use btw. Even for auto-vectorization. If your data is ordered wrong, the compiler won't be able to properly auto vectorize it at all. It's not because it's not smart enough, it's because the language forces it to use your stupid layout. You told it to use that data layout, that's what it will use. By design.

                  e.g. if you perform the same operation on one member of the struct across many elements, then pull that member out into its own array or basic type. Make a separate array per member (a structure of arrays) instead of an array of structs. Too bad if this "uglifies" your "pure code", but it's what you have to do if you want proper auto-vectorization.

                  Sorry, no magic switch.
                  Last edited by Weasel; 01-11-2019, 01:07 PM.

                  Comment


                  • #79
                    Originally posted by Weasel View Post
                    Compilers do absolutely nothing about re-ordering data, except on the stack, and even that is very small data (it's per-function, and the compiler doesn't reorder across most function calls either). They aren't even allowed to in the first place.

                    Sorry to burst that bubble, but yes, you need actual programming skill to reorder your data yourself; there is no magic pill or compiler switch, contrary to what most people oblivious to low-level details (i.e. crappy programmers) seem to think.

                    This is not just for cache use btw. Even for auto-vectorization. If your data is ordered wrong, the compiler won't be able to properly auto vectorize it at all. It's not because it's not smart enough, it's because the language forces it to use your stupid layout. You told it to use that data layout, that's what it will use. By design.

                    e.g. if you perform the same operation on one member of the struct across many elements, then pull that member out into its own array or basic type. Make a separate array per member (a structure of arrays) instead of an array of structs. Too bad if this "uglifies" your "pure code", but it's what you have to do if you want proper auto-vectorization.

                    Sorry, no magic switch.
                    I wasn't talking about re-ordering data, but thanks for that...

                    Comment


                    • #80
                      Originally posted by smitty3268 View Post
                      I wasn't talking about re-ordering data, but thanks for that...
                      Yeah, you were talking about some magical method that utilizes cache more efficiently "automatically", which is pure fantasy.

                      Comment
