Glibc 2.29 Released With getcpu() On Linux, New Optimizations

  • F.Ultra
    replied
    Originally posted by cybertraveler View Post

    Firstly: you should say 64-bit "can have", not "does have". One test of one app...

    Secondly: the memory footprint will vary depending on the app(s) you are running, so again, the general statement you made doesn't make sense.

    Thirdly and most importantly: I think you missed my point. Perhaps I didn't explain myself well enough. I will clarify:

    Yes... there may be 64-bit hardware that runs 64-bit apps quicker, and there may be apps that benefit from the increased size of types. However, if an app doesn't need or benefit from bigger types (i.e. doing arithmetic with bigger numbers) and it doesn't need to address more than 4 GB of memory, then a 32-bit CPU will always (as far as I know) be superior to a 64-bit CPU (all other things being equal, e.g. equivalent instruction sets), as you will need less CPU cache and less memory to do the same job.

    I expect that the vast majority of embedded systems and very low-power devices would be more efficient if they used a 32-bit CPU architecture. However, I did previously note reasons why embedded device manufacturers aren't doing this.
    Well, a wider data bus will allow faster loads and stores to cache/memory for applications that shuffle lots of memory around.



  • cbxbiker61
    replied
    Originally posted by cybertraveler View Post

    I expect that the vast majority of embedded systems and very low-power devices would be more efficient if they used a 32-bit CPU architecture. However, I did previously note reasons why embedded device manufacturers aren't doing this.
    The vast majority of embedded devices (micro-controllers) are using 32-bit. The vast majority of computers running a full OS are using 64-bit.

    I don't imagine that the Cortex-A7 will disappear anytime soon, so there certainly are 32-bit micro options available.
    Last edited by cbxbiker61; 01 February 2019, 05:49 PM.



  • cybertraveler
    replied
    Originally posted by cbxbiker61 View Post
    So yes, 64-bit does have a performance advantage… and the memory footprint is within the capabilities of the Pi 3A+.
    Firstly: you should say 64-bit "can have", not "does have". One test of one app...

    Secondly: the memory footprint will vary depending on the app(s) you are running, so again, the general statement you made doesn't make sense.

    Thirdly and most importantly: I think you missed my point. Perhaps I didn't explain myself well enough. I will clarify:

    Yes... there may be 64-bit hardware that runs 64-bit apps quicker, and there may be apps that benefit from the increased size of types. However, if an app doesn't need or benefit from bigger types (i.e. doing arithmetic with bigger numbers) and it doesn't need to address more than 4 GB of memory, then a 32-bit CPU will always (as far as I know) be superior to a 64-bit CPU (all other things being equal, e.g. equivalent instruction sets), as you will need less CPU cache and less memory to do the same job.

    I expect that the vast majority of embedded systems and very low-power devices would be more efficient if they used a 32-bit CPU architecture. However, I did previously note reasons why embedded device manufacturers aren't doing this.
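
    To make the cache/memory-footprint argument concrete, here is a minimal illustration (the struct layout below is purely hypothetical, not from any real application): on a typical ILP32 target pointers and longs are 4 bytes, on LP64 they are 8, so a pointer-heavy node roughly doubles in size.

    <pre>
    #include <stdio.h>

    /* A pointer-heavy node, typical of linked lists and trees. */
    struct node {
        struct node *next;
        struct node *prev;
        void        *payload;
        long         key;
    };

    int main(void)
    {
        /* Prints 16 bytes for the node on a typical ILP32 build and 32 on LP64,
         * so the same working set needs roughly twice the cache and RAM. */
        printf("sizeof(void *)      = %zu\n", sizeof(void *));
        printf("sizeof(long)        = %zu\n", sizeof(long));
        printf("sizeof(struct node) = %zu\n", sizeof(struct node));
        return 0;
    }
    </pre>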



  • cbxbiker61
    replied
    Originally posted by cybertraveler View Post
    The mention of the C-SKY 32-bit CPU got me wondering: why aren't more low-power CPUs intended for use in embedded systems, portable systems or SBCs (like the Raspberry Pi) 32-bit only? There are some interesting notes about how the Nintendo 64's 64-bit CPU was used here:

    https://en.wikipedia.org/wiki/Ninten...rocessing_unit

    The claim is, it was mostly used in 32 bit mode for the very reasons I would expect.
    The Nintendo 64 is from 1996, 23 years ago... what was true of hardware in 1996 is not true now. Compare average RAM capacity, for example.

    I have a real-world example comparing 64-bit operation to 32-bit operation running the MyCroft voice assistant.

    <pre>
    ***** 64bit pi 3b 1.2GHz
    ~16% cpu usage waiting for command
    344M used
    5 min load 0.60
    ***** 32bit pi 3b 1.2GHz
    ~23% cpu usage waiting for command
    267M used
    5 min load 0.82
    </pre>
    So yes, 64-bit does have a performance advantage… and the memory footprint is within the capabilities of the Pi 3A+.



  • arjan_intel
    replied
    Originally posted by pal666 View Post
    A few cache misses seem less expensive than a syscall.
    It'll be a vsyscall on many systems... and a syscall is maybe 100 to 200 cycles (depends on the CPU); a cache miss can be more.
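
    For reference, a minimal sketch of calling the new wrapper as documented for glibc 2.29 (declared in <sched.h> behind _GNU_SOURCE); on x86 the call is typically serviced without trapping into the kernel.

    <pre>
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int cpu, node;

        /* glibc 2.29 adds this wrapper around the getcpu system call. */
        if (getcpu(&cpu, &node) == -1) {
            perror("getcpu");
            return 1;
        }
        printf("running on cpu %u, numa node %u\n", cpu, node);
        return 0;
    }
    </pre>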



  • pal666
    replied
    Originally posted by arjan_intel View Post
    it's statistically not horrible, meaning if you want to use it as an index into, say, an array of "mostly per cpu" structures, you'll get 95% or even 99% or more the "right" answer, and if the cost of being wrong is just a few cache misses/etc (e.g. performance) then that can be a very valid use of this
    A few cache misses seem less expensive than a syscall.



  • milkylainen
    replied
    Originally posted by lkundrak View Post

    Sorry to say so, but you managed to embarrass yourself by forgetting to attach a patch to your strong words.
    If you knew how many times people have tried posting/discussing different optimizations, you wouldn't have opened the obviously overfed pie hole, Mr. Meat Sack.
    I made the comment for bloody obvious reasons.
    We can have the discussion about the typical PPC memory backend and how glibc under-uses it. Or you can just google it?
    I had to dump glibc memcpy/memmove several times on PPC so that performance-sensitive apps could run replacement asm functions.
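
    For readers wondering how one swaps in a replacement memcpy without rebuilding glibc, LD_PRELOAD interposition is one common route; this is only a sketch, and the naive byte loop merely stands in for the platform-tuned assembly being described.

    <pre>
    /* fast_memcpy.c - sketch of interposing glibc's memcpy via LD_PRELOAD.
     * Build: gcc -O2 -fPIC -fno-builtin -shared -o libfastmemcpy.so fast_memcpy.c
     * Run:   LD_PRELOAD=./libfastmemcpy.so ./your_app
     * (-fno-builtin keeps GCC from turning the loop back into a memcpy call.) */
    #include <stddef.h>

    void *memcpy(void *dst, const void *src, size_t n)
    {
        unsigned char *d = dst;
        const unsigned char *s = src;

        /* Placeholder byte copy; a real replacement would use tuned asm. */
        while (n--)
            *d++ = *s++;
        return dst;
    }
    </pre>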



  • gilboa
    replied
    Originally posted by xorbe View Post
    Couldn't the result of getcpu() change like the wind? What good is that info unless the thread is pinned? Random number generator?
    We use it for (non-atomic) statistics counting across multiple processes and threads.

    - Gilboa



  • Weasel
    replied
    Originally posted by cybertraveler View Post
    The main reasons I can think of for this drift towards low-power 64-bit CPUs are:
    • Effort has been put into enhancing the 64-bit architecture but not the 32-bit one. As such, people are just going with the flow to get the faster chips.
    • Memory mapping of files. Presumably a 32-bit system can only memory-map a file of up to 4 GB in size; a 64-bit CPU can map much larger files. This is a very convenient programming feature (a short mmap sketch follows after this post).
    • Many 64-bit CPUs can run in a 32-bit mode, so devs do have that option. Presumably this is not without cost, though, even if that cost is just complexity.
    • Marketing people
    You forgot "developer" laziness, which is probably the biggest one. Other than that, I agree with your post.

    I'm guessing we'll see a shift as we can't really squeeze much more out of transistors at this point, so we can't waste them on laziness as much (a good thing!).
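
    On the memory-mapping point in the quoted list, a minimal sketch (the file name big.dat is just a placeholder): the entire mapping has to fit in the process address space, which is what rules out files much beyond 4 GB in a 32-bit process.

    <pre>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("big.dat", O_RDONLY);   /* placeholder file name */
        if (fd == -1) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }

        /* A 32-bit process simply has no room to map a length above ~3 GB;
         * on a 64-bit build the same call is routine. */
        void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* ... the whole file is now addressable as ordinary memory ... */

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }
    </pre>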



  • arjan_intel
    replied
    Originally posted by xorbe View Post
    Couldn't the result of getcpu() change like the wind? What good is that info unless the thread is pinned? Random number generator?
    it's statistically not horrible, meaning if you want to use it as an index into, say, an array of "mostly per cpu" structures, you'll get 95% or even 99% or more the "right" answer, and if the cost of being wrong is just a few cache misses/etc (e.g. performance) then that can be a very valid use of this
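
    A minimal sketch of that pattern, assuming the glibc 2.29 getcpu() wrapper; the names (count_hit, MAX_CPUS) are illustrative, and the plain non-atomic increments are the kind mentioned elsewhere in the thread for statistics counting, where losing the odd update or touching another CPU's counter is accepted as cheap.

    <pre>
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    #define MAX_CPUS 256   /* illustrative upper bound */

    /* Pad each slot to a cache line so counters on different CPUs
     * do not false-share. */
    struct percpu_counter {
        unsigned long hits;
        char pad[64 - sizeof(unsigned long)];
    };

    static struct percpu_counter counters[MAX_CPUS];

    static void count_hit(void)
    {
        unsigned int cpu = 0;

        /* The answer may be stale by the time it is used; if the thread
         * has migrated we merely touch another CPU's cache line, which is
         * exactly the "cheap to be wrong" case described above. */
        if (getcpu(&cpu, NULL) != 0)
            cpu = 0;
        counters[cpu % MAX_CPUS].hits++;   /* plain, non-atomic increment */
    }

    static unsigned long total_hits(void)
    {
        unsigned long sum = 0;
        for (int i = 0; i < MAX_CPUS; i++)
            sum += counters[i].hits;
        return sum;
    }

    int main(void)
    {
        for (int i = 0; i < 1000; i++)
            count_hit();
        printf("total: %lu\n", total_hits());
        return 0;
    }
    </pre>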

