RISC-V With Linux 5.19 Preps "COMPAT" Mode For 32-bit Apps On 64-bit Kernels & More


  • RISC-V With Linux 5.19 Preps "COMPAT" Mode For 32-bit Apps On 64-bit Kernels & More

    Phoronix: RISC-V With Linux 5.19 Preps "COMPAT" Mode For 32-bit Apps On 64-bit Kernels & More

    With Linux 5.18 expected to be released as stable tomorrow, opening up the Linux 5.19 merge window, feature work aimed at this next kernel should be largely wrapped up. Within the RISC-V architecture's "for-next" branch are several interesting additions...


  • #2
    32 bit apps on 64 bit kernels? I wonder who wants that?

    RISC-V went straight to 64 bit on Linux, so there are no legacy 32 bit apps to run. 32 bit is more for microcontrollers. While the RISC-V spec allows a core to support both 64 bit and 32 bit code with a standard way to check if it is supported and to switch, I don't know of anyone who actually makes such a core.

    Comment


    • #3
      Originally posted by brucehoult View Post
      32 bit apps on 64 bit kernels? I wonder who wants that?

      RISC-V went straight to 64 bit on Linux, so there are no legacy 32 bit apps to run. 32 bit is more for microcontrollers. While the RISC-V spec allows a core to support both 64 bit and 32 bit code with a standard way to check if it is supported and to switch, I don't know of anyone who actually makes such a core.
      If your application uses less than 4GB of memory, then using 32-bit addressing mode can save a fair amount of memory if your program has lots of pointers, while still being able to use the instructions available in 64-bit mode.
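      As a rough illustration (hypothetical node type; the exact figures depend on alignment and padding, not on anything RISC-V specific):

      Code:
      /* With 8-byte pointers this node is typically 24 bytes, with 4-byte
       * pointers only 12, so pointer-heavy structures roughly halve in size. */
      #include <stdio.h>

      struct node {
          struct node *left;
          struct node *right;
          int value;
      };

      int main(void) {
          printf("sizeof(struct node) = %zu bytes\n", sizeof(struct node));
          return 0;
      }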

      Comment


      • #4
        Originally posted by NobodyXu View Post

        If your application uses less than 4GB of memory, then using 32-bit addressing mode can save a fair amount of memory if your program has lots of pointers, while still being able to use the instructions available in 64-bit mode.
        You seem to be confusing RISC-V and x86. On RISC-V the 32 bit and 64 bit instruction sets are essentially identical. The only difference is RV64 adds a handful of instructions that explicitly work on 32 bit values instead of full registers, to make programs using "int" more efficient.
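        A quick sketch of that: the same C source builds for rv32 and rv64, and the comments show the -O2 output I'd expect, though exact codegen varies by compiler and options.

        Code:
        /* rv64 emits the extra "W" forms for 32-bit int arithmetic;
         * otherwise the instruction streams look the same. */
        int  add_int (int a, int b)   { return a + b; }  /* rv64: addw a0,a0,a1   rv32: add a0,a0,a1 */
        long add_long(long a, long b) { return a + b; }  /* rv64: add  a0,a0,a1   rv32: add (long is 32-bit there) */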

        Also, there are very few programs in which pointers make up a significant proportion of user data in memory, not least because the kind of data structures that are like that (binary trees, linked lists with small data items) cause a lot of cache misses and TLB misses and so perform very poorly on any CPU made since maybe 1995, so they are strongly avoided outside of student assignments.
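        Toy example of the locality point (a sketch, not a rigorous benchmark): the array walk is sequential and prefetch-friendly, while the list walk is a chain of dependent loads that starts missing in cache and TLB once the data outgrows the caches.

        Code:
        #include <stddef.h>

        struct lnode { long v; struct lnode *next; };

        long sum_array(const long *a, size_t n) {
            long s = 0;
            for (size_t i = 0; i < n; i++) s += a[i];   /* contiguous accesses */
            return s;
        }

        long sum_list(const struct lnode *p) {
            long s = 0;
            for (; p; p = p->next) s += p->v;           /* pointer chasing */
            return s;
        }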

        Comment


        • #5
          Originally posted by brucehoult View Post

          You seem to be confusing RISC-V and x86. On RISC-V the 32 bit and 64 bit instruction sets are essentially identical. The only difference is RV64 adds a handful of instructions that explicitly work on 32 bit values instead of full registers, to make programs using "int" more efficient.

          Also, there are very few programs in which pointers make up a significant proportion of user data in memory, not least because the kind of data structures that are like that (binary trees, linked lists with small data items) cause a lot of cache misses and TLB misses and so perform very poorly on any CPU made since maybe 1995, so they are strongly avoided outside of student assignments.
          There's a huge amount of software for interpreted, dynamically typed, and object-oriented languages which uses pointers extensively. Java is a very visible example, but also JavaScript, and anything using arrays of structs, arrays of objects, or arrays of short strings, i.e. compilers and interpreters.

          Some languages implement iterable object types as doubly-linked lists, so an array of dynamically-sized elements needs a minimum of 3 pointers and a size per element (prev, next, address, and an int size). That's an extra 16 bytes *per element* for 64-bit vs 32-bit. If it also has garbage collection, add an int reference count and/or a global pointer, bringing it to an extra 20 bytes per element. Minimum. Now make it a dynamic array of objects, where you may have multiple pointers for the array bucket plus for the object type/method table, and you can easily end up with an extra 48 bytes per element, as you do with PHP 7, though they did manage to reduce it by a few bytes for 7.1.

          Oh, and let's not forget xmalloc accounting overhead bytes, which again are doubled for 64-bit vs 32-bit, and the pointers inherent in stack return address pushes for languages where recursion is popular. All of that also eats the instruction and data caches ($I and $D), so it can be harmful where certain functions are struggling to remain resident in cache. Running 32-bit Java on a 64-bit kernel is popular.
          Last edited by linuxgeex; 22 May 2022, 04:53 AM.

          Comment


          • #6
            Originally posted by brucehoult View Post

            You seem to be confusing RISC-V and x86. On RISC-V the 32 bit and 64 bit instruction sets are essentially identical. The only difference is RV64 adds a handful of instructions that explicitly work on 32 bit values instead of full registers, to make programs using "int" more efficient.

            Also, there are very few programs in which pointers make up a significant proportion of user data in memory, not least because the kind of data structures that are like that (binary trees, linked lists with small data items) cause a lot of cache misses and TLB misses and so perform very poorly on any CPU made since maybe 1995, so they are strongly avoided outside of student assignments.
            Thanks for the clarification on the RISC-V 32-bit vs 64-bit instruction part.

            However, I disagree that the size of pointers is insignificant.

            The Python interpreter is an example where pointers are used frequently, since all Python objects are allocated on the heap and their fields (except primitive types) are also heap-allocated and referred to via pointers.
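            A rough sketch of the kind of layout I mean (a simplified stand-in, not CPython's actual headers): every field is itself a pointer to another heap object, so the per-field cost is one pointer, 8 bytes on a 64-bit build vs 4 on a 32-bit one.

            Code:
            #include <stdio.h>

            struct my_type;                    /* type object, opaque here */

            struct my_obj {
                long refcnt;                   /* reference count */
                struct my_type *type;          /* pointer to the type object */
                struct my_obj *fields[3];      /* e.g. three attributes, each a pointer */
            };

            int main(void) {
                /* typically 40 bytes with 8-byte pointers, 20 with 4-byte ones */
                printf("per-object size: %zu bytes\n", sizeof(struct my_obj));
                return 0;
            }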

            For data structures, linked lists are indeed uncommon and avoided, but trees are still used, albeit less frequently.

            Not every data type is hashable, and sometimes order is also very important, e.g., finding all elements larger than/smaller than/between certain values.

            An array can do the job for static data, but it doesn't work for data generated at runtime.

            Comment


            • #7
              Originally posted by linuxgeex View Post

              There's a huge amount of software for interpreted, dynamically typed, and object-oriented languages which use pointers extensively. Java being a very visible example, but also javascript, and anything using arrays of struct, arrays of object, or arrays of short string, ie compilers and interpreters. Some languages implement iterable object types as doubly-linked lists, so then an array of a dynamic-sized element must have a minimum of 3 pointers and a size (prev, next, address, and int size.) So an extra 16 bytes *per element* for 64-bit vs 32-bit. But if it also has garbage collection then also add int reference count and/or a global pointer bringing it to an extra 20 bytes per element. Minimum. Now make it a dynamic array of objects, and you may have multiple pointers for the array bucket plus for the object type/method table and you can easily end up with an extra 48 bytes per element as you do with PHP 7, though they did manage to reduce it a few bytes for 7.1. Oh, and let's not forget xmalloc accounting overhead bytes which again are doubled for 64 vs 32-bit, and the pointers inherent to stack return address pushes, for languages where recursion is popular. And that also eats $I and $D, so it can be harmful where certain functions are struggling to remain resident in cache. Running 32-bit Java on 64-bit kernel is popular.
              I'm aware of all that. Working on compilers, interpreters, and garbage collection is my speciality.

              There is a difference, but it's not significant for the vast majority of programs -- 5% or less.

              The x32 ABI was developed in 2011 and incorporated into the Linux kernel in version 3.4 in 2012. It's seen very little adoption and has been on the verge of being deprecated since 2018.
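              For anyone curious, the x86 version of the idea looks like this: the same source built two ways, where running the -mx32 binary needs x32 multilib support and a kernel with the x32 ABI enabled.

              Code:
              /* gcc -m64  test.c  ->  void*: 8, long: 8
               * gcc -mx32 test.c  ->  void*: 4, long: 4, while still using
               * the full 64-bit register set */
              #include <stdio.h>

              int main(void) {
                  printf("void*: %zu, long: %zu\n", sizeof(void *), sizeof(long));
                  return 0;
              }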

              Comment


              • #8
                Originally posted by NobodyXu View Post

                However, I disagree that the size of pointers is insignificant.
                32-bit address spaces go into the "in theory, not in practice" bucket. Most programs never instantiate more than a few thousand objects. Even if there are 10 pointers to each of them, using 32-bit pointers will only save 200 KB or so.

                If you are working on something where pointers are using a substantial amount of the 4 GB address space, you'll have long since needed to switch to a 64-bit address space. In the same way... if a program fits into 4 GB anyway, it isn't likely to be taxing for modern processor memory architectures.


                32-bit on 64-bit RISC-V is being done because Linux already has support for that pattern, so it presumably isn't a massive job. It'll be used by people debugging microcontroller programs... and that's about it.

                Comment


                • #9
                  Originally posted by brucehoult View Post
                  32 bit apps on 64 bit kernels? I wonder who wants that?

                  RISC-V went straight to 64 bit on Linux, so there are no legacy 32 bit apps to run. 32 bit is more for microcontrollers. While the RISC-V spec allows a core to support both 64 bit and 32 bit code with a standard way to check if it is supported and to switch, I don't know of anyone who actually makes such a core.
                  I believe Guo Ren worked on that, as the Allwinner F133 packages 64MB of RAM with the D1. The stats in the patches seemed to show a benefit for an extremely memory-constrained environment like that.

                  Comment


                  • #10
                    Originally posted by fustini View Post

                    I believe Guo Ren worked on that, as the Allwinner F133 packages 64MB of RAM with the D1. The stats in the patches seemed to show a benefit for an extremely memory-constrained environment like that.
                    Hi Drew. First post! Yeah, those chips are very constrained -- almost "embedded". I've tried running modern Linux in such an environment (e.g. adjusting parameters in TinyEmu), and you can just barely run TWM with two xterms and compile helloworld.c without running out of RAM. Just.

                    Comment
