Raspberry Pi OS 32-bit vs. 64-bit Performance


  • #51
    Originally posted by svenh View Post
    Oh yes. I am sorry. I confused ilp32 and armhf. The really interesting candidate for new comparisons would be ilp32, of course, but which distros support it?
    Arm64 ilp32 uses aarch64 instructions with 32-bit pointers, so it is still the ARMv8 64-bit instruction set. armhf is arm32 with hardware float, normally ARMv6 or ARMv7.

    Going back to 32-bit pointers, with the 4 GB address-space limit, results in aarch64 and arm32 binaries taking basically the same amount of RAM. And yes, the Linux kernel running in full 64-bit mode mostly ends up with less complex memory operations for paging in and out; this is one of those funny things.

    The Linux kernel running in 64-bit mode normally has a smaller memory footprint than one running in 32-bit mode with PAE or the ARM equivalent. And yes, user-space applications normally end up with a smaller memory footprint when they use 32-bit pointers instead of 64-bit ones. arm64 ilp32 and the x86 x32 ABI are both after maximum memory efficiency. The need for 32-bit pointers with 64-bit instructions on x86 starts going away once you have multiple gigabytes of RAM.

    Debian is at this point the major supporter of arm64 ilp32. But if boards keep increasing their RAM, it could go the way of the x86 version, where it's not worth the bother.
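
To make the pointer-size argument concrete, here is a minimal C sketch. The `struct node` layout and the helper names are made up for illustration, and the second struct only models an ILP32 layout by swapping pointers for 32-bit handles; it is not how an ilp32 toolchain is actually invoked.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pointer-heavy node, as found in lists and trees. */
struct node {
    struct node *next;   /* 8 bytes on LP64, 4 bytes on ILP32 */
    struct node *prev;
    int32_t      value;
};

/* Same fields with 32-bit handles standing in for ILP32 pointers. */
struct node_ilp32_model {
    uint32_t next;
    uint32_t prev;
    int32_t  value;
};

/* Size of the node as built for the current target. */
size_t node_size(void)
{
    return sizeof(struct node);
}

/* Size of the node if pointers were 32 bits wide. */
size_t node_ilp32_model_size(void)
{
    return sizeof(struct node_ilp32_model);
}
```

On a typical LP64 target `node_size()` comes out at twice `node_ilp32_model_size()`, because the two pointers double in width and the struct is padded to 8-byte alignment. That is the kind of saving ilp32 and x32 are chasing.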



    • #52
      Originally posted by Raka555 View Post
      64-bit ARM requires ARMv8 as a minimum, while 32-bit Raspbian is ARMv6.
      Well, it uses the hard float ABI and at least some things are built with vfp (Vector Floating Point - not sure if it's enabled by default).
      Last edited by coder; 08 February 2022, 04:51 AM.



      • #53
        Originally posted by paulpach View Post
        By removing the predicates, they save a few bits in the instruction code, which were used to double the number of registers an instruction can address.
        Right. To keep the instruction word length at 32-bit, they had to scrounge extra bits from somewhere. And predication probably tends to be used on a small minority of instructions, making it an easy target.

        Originally posted by paulpach View Post
        Predicated instructions also complicate speculative execution, since you can't really tell if an instruction will be executed or not until the predicate is available.
        Nah, it's just a branch by another name. There's really no reason you couldn't handle predicated instructions in the same way, if you wanted to.
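
A rough C sketch of the "branch by another name" point, with hypothetical function names. Both functions return the same value. The first expresses the choice as control flow, the second as a data dependency on a mask, which is roughly what a predicated or conditional-select instruction gives the hardware.

```c
#include <stdint.h>

/* Choice expressed as control flow: the CPU may predict the branch. */
uint32_t select_branch(int cond, uint32_t if_true, uint32_t if_false)
{
    if (cond)
        return if_true;
    return if_false;
}

/* Choice expressed as data flow: the result depends on a mask value. */
uint32_t select_masked(int cond, uint32_t if_true, uint32_t if_false)
{
    /* mask is all ones when cond is nonzero, otherwise all zeros */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (if_true & mask) | (if_false & ~mask);
}
```

A compiler is free to lower either form to a conditional branch, to a conditional select such as AArch64's CSEL, or to a predicated instruction on AArch32, so the source form does not dictate which one you get.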



        • #54
          Originally posted by GI_Jack View Post
          I am going to guess, that they simply aren't just doing 64-bit registers, but rather targeting the newer Arm v8 arch which just happens to be 64-bit, and getting all of those newer features, as opposed to just Arm v5 or Arm v6 of what they are targeting with 32-bit.
          No, it has to run on the A53 cores, which are baseline ARMv8.0-A. Even the A72 cores are still v8.0. To get beyond that, you have to go all the way to A75, which is v8.2.



          • #55
            Originally posted by atomsymbol
            Given the particular set of benchmarks in this Phoronix article: The performance difference isn't caused by the ISA being 64-bit and not being 32-bit - but caused by the fact that the 64-bit AArch64 ISA happens to be a redesigned ISA. If AArch64 was ported to 32 bits then, obviously, the port would outperform AArch64 on a Raspberry Pi 4GB.
            Interestingly, ARM's v8 realtime & microcontroller ISAs (ARMv8-R and ARMv8-M) are still 32-bit. I'm reading ARMv8-R later added AArch64 as an optional extension.

            Originally posted by atomsymbol
            Performance advantage of 64-bit integers over 32-bit integers can indeed be demonstrated, but only using a different set of benchmarks than the set used in this Phoronix article.
            The main use case that comes to mind for 64-bit integer arithmetic is file I/O using 64-bit offsets. Maybe some databases and other I/O intensive apps gain a real advantage from it.
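
As a small illustration of the file-offset case, here is a C sketch with made-up helper names. It computes the byte offset of a fixed-size record, once in 64 bits and once truncated to 32 bits.

```c
#include <stdint.h>

/* Byte offset of record `index` in a file of fixed-size records. */
uint64_t record_offset64(uint32_t index, uint32_t record_size)
{
    /* Widen before multiplying so the product is computed in 64 bits. */
    return (uint64_t)index * record_size;
}

/* The same computation in 32 bits silently wraps past 4 GiB. */
uint32_t record_offset32(uint32_t index, uint32_t record_size)
{
    return index * record_size;
}
```

With 4096-byte records, record 3000000 sits at byte 12288000000, which only the 64-bit version can represent. A 32-bit target can still do this arithmetic, but 64-bit additions and comparisons typically need instruction pairs there, whereas a 64-bit target handles them in single instructions.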

            By and large, I tend to agree that most apps which don't need more than 4 GB of virtual memory don't really gain from just the transition from 32-bit to 64-bit, itself. (Edit: removed incorrect statement. See jabl's reply for correction.)

            It'd be telling if someone did an analysis of 64-bit x86-64 binaries to see just what proportion of register accesses and memory operands were 64-bit and what they're used for. I'm betting it'd be almost all pointers, and that's just by necessity.
            Last edited by coder; 09 February 2022, 03:22 AM.



            • #56
              Please add 64-bit Manjaro and other 64-bit ARM distributions or spins (the DE matters) to the benchmark.



              • #57
                Originally posted by discordian View Post
                MIPS/RISC-V is nice as there are no separate instruction sets; 32-bit is just a subset of instructions. Not 2 (or 3 with Thumb-2) instruction sets and decoders like ARM has to do.
                Not all ARM cores have to support all ISA versions. For instance I think Fujitsu's A64FX cores are AArch64-only. And most ARMv9 cores drop AArch32 support, except for the A710.
                Last edited by coder; 08 February 2022, 04:53 AM.



                • #58
                  Originally posted by willmore View Post
                  Clearly there are other aspects involved. You did mention a few of them, but let's enumerate a nice list:
                  1. Double the operand size
                  2. Better designed ISA
                  3. Double the registers
                  4. Improved calling convention
                  5. Newer instructions
                  But there's more to it than that.
                  You forgot the vector improvements. Vectors doubled in size, from 64-bit to 128-bit, there are double the registers, and the vector instructions have other improvements. This is clearly a factor in many of the benchmarks. I'm not even sure if the Pi's toolchain enables vector instructions by default, in the legacy 32-bit target.
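
One way to check what a given toolchain enables by default is to look at the compiler's predefined macros. This is a minimal sketch with hypothetical function names. The macros themselves (`__aarch64__`, `__ARM_NEON`, `__ARM_PCS_VFP`) are what GCC and Clang define for ARM targets, but which ones are set depends on the toolchain's default flags, so I make no claim here about what the Pi's compiler reports.

```c
/* Returns 1 when compiled for the 64-bit AArch64 instruction set. */
int target_is_aarch64(void)
{
#if defined(__aarch64__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 when the compiler is allowed to emit NEON vector code. */
int neon_enabled(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 when a 32-bit ARM build uses the hard-float calling convention. */
int hard_float_abi(void)
{
#if defined(__ARM_PCS_VFP)
    return 1;
#else
    return 0;
#endif
}
```

Building this once with the 32-bit toolchain and once with the 64-bit one, each with default flags, would show whether NEON is on by default in each case. Running `gcc -dM -E - < /dev/null | grep ARM` lists the same macros without compiling anything.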



                  • #59
                    Originally posted by atomsymbol
                    An interesting problem to solve is: How to design a CPU where the physical size of the address space is invisible to the programmer? Such a CPU is possible (the Turing machine has an infinite tape).
                    It'd be annoying to program, since you couldn't use fixed offsets into structs that contain pointers. And any arrays of pointers or arrays of structs would likewise need runtime computation of offsets, even when the array element is known at compile-time.
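
A short C sketch of what the compiler gets for free today because the pointer width is fixed. The `struct record` layout and the helper names are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical record that embeds a pointer ahead of another field. */
struct record {
    void *owner;
    int   id;
};

/* Offset of `id`: a compile-time constant because sizeof(void *) is fixed. */
size_t id_offset(void)
{
    return offsetof(struct record, id);
}

/* Distance between consecutive elements in an array of records. */
size_t element_stride(void)
{
    return sizeof(struct record);
}
```

Both values fold to constants in the generated code, so `records[3].id` becomes a single load at a fixed displacement. If the pointer width were unknown until runtime, both would have to be computed on every access.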



                    • #60
                      Originally posted by loganj View Post
                      It works in browsers on Linux (including ARM SBCs), but I know the Pi has (or had) issues playing YouTube videos even at 1080p. I don't know about Netflix and other streaming services (1080p or 4K), though.
                      The Pi 4 plays 1080p videos fine with the official Chromium (Raspberry Pi OS based on Debian 10, MP4 video format).
                      Last edited by nist; 08 February 2022, 08:32 AM.

