READFILE System Call Rebased For More Efficient Reading Of Small Files

  • 60Hz
    replied
    Originally posted by rene
    It is probably time to ignore this News plumbing website, ...
    It's still above your level though, rene. You're a midwit, at best.


  • oiaohm
    replied
    There is something in play here that most people would not even consider.

    The Amdahl's law problem with atomic operations has nightmarish effects on performance when you hit it.

    So what is the problem with open/read/close on really small sysfs/procfs files? Even with the syscall overhead, there is almost zero time between the open and the close. As a result, there is not enough time for the ulimit accounting or other atomic operations to propagate between the MMUs and CPU caches on larger core-count systems.

    Issuing atomic operations faster than they can propagate forces execution to become single-threaded, which is the worst case of Amdahl's law, and it is worse than most people would think. People do not think of the MMU as a core, so they miss that the MMU is progressing while the CPU's instruction-executing cores are basically halted.

    Yes, atomic operations that let locking and so on be offloaded to the MMU generally do give performance gains, because the CPU can run other threads while propagation is under way. The problem is that there are a few corner cases where, instead of providing an advantage, this gives you major performance headaches. Remember that the MMUs and cache systems also have an upper limit on how many atomic operations they can handle in a given time frame before they, again, stall.

    Yes, simpler CPUs with 1 to 2 threads don't have this problem in a big way, because propagating atomic state is simple. The more cores and threads, the easier it is to stall the system with excessive atomic operations. And yes, a stall means the system effectively performs like a 0 Hz processor for stretches of time. It can be bad enough that a 4 MHz 8088 performing the same task will win against a 128-core EPYC, because the 8088 is a single core and does not get stuck on the atomic-operation sync problem.

    The readfile syscall is about two things: 1. reducing syscalls; 2. reducing or removing atomic operations that can be issued faster than the MMU's replication rate, which can stall the CPUs in large core-count systems at effectively 0 Hz processing speed. The second part is the big thing.
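
    For reference, a minimal Python sketch of the conventional three-step path that readfile is meant to collapse into one call. It uses only the standard `os` wrappers. The helper name `read_small_file`, the 4096-byte buffer, and the temporary file standing in for a sysfs/procfs entry are illustrative assumptions; the proposed readfile syscall itself is not called, since this sketch assumes it is unavailable.

```python
import os
import tempfile


def read_small_file(path: str, bufsize: int = 4096) -> bytes:
    """Read up to bufsize bytes using the classic open/read/close sequence."""
    fd = os.open(path, os.O_RDONLY)   # syscall 1: allocates a file descriptor
    try:
        return os.read(fd, bufsize)   # syscall 2: copies the data out
    finally:
        os.close(fd)                  # syscall 3: releases the descriptor again


if __name__ == "__main__":
    # Stand-in for a small sysfs/procfs file so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"42\n")
    try:
        print(read_small_file(tmp.name))
    finally:
        os.unlink(tmp.name)
```

    The descriptor exists only between the first and third call, which is the short-lived bookkeeping the post argues a single readfile call could avoid.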

    The worst kind is the double-operation atomic: you have changed atomic value X, and you now have to wait for the sync before you can change it again. That is the problem with open/read/close being really fast. The ulimit change in open may go fine, but if you get to close before the atomic ulimit change from open has propagated, you are running into trouble; out-of-order execution can cover for this only to a limited extent. The best way to fix this is to remove the open and close operations, and with them the atomic value change, because it is an add and a subtract of 1 that results in zero net change in the end.

    The most likely times to outrun the ability to propagate atomic values fast enough are system startup and system shutdown. Most of an application's file opens and closes also happen during its startup and shutdown, and most changes to atomically protected values for process tracking and so on happen then as well.

    Yes, this problem does affect how fast CPU cores can clock up during the boot process. These issues grow with increasing use of multi-core systems. Amdahl's law gives us the maths to calculate the problems of multi-core systems, but the maths has to be applied correctly to understand how the problems will show themselves. Locating the single-threaded bottlenecks can be quite tricky, particularly since advanced CPU/MMU designs have hidden single-threaded logic that does come back to bite.


  • oiaohm
    replied
    Originally posted by rene
    Also repeating false claims all day long does not make them more true. I guess there is a reason Linus Torvalds comments on real world tech and not Phoronix. It is probably time to ignore this News plumbing website, ...
    No, Linus Torvalds is smarter than you on these topics. The big thing is that he does not comment on the problems of massively multiprocessing systems.

    https://lore.kernel.org/lkml/[email protected]/
    reading from one process per core (96 total) was 491x slower.

    There are benchmarks from back in 2016 showing exactly what I described: atomic operations with sysfs/procfs on systems with large core and thread counts could start causing massive slowdowns.

    This is the problem that readfile is attempting to address. False claims, my ass. The Amdahl's law problem that comes from atomic operations can be demonstrated with benchmarks. We are not talking about little bits of slowdown here: pushed the right way, a brand-new 4 GHz 128-core EPYC system can make an XT computer appear fast.

    How is your vectored syscall idea going to deal with the fact that you have to reduce atomic operations? That's right, it's not going to.


  • coder
    replied
    Originally posted by rene
    I guess there is a reason Linus Torvalds comments on real world tech and not Phoronix. It is probably time to ignore this News plumbing website, ...
    Your words betray your own feelings. If you truly held us in such low regard, why even bother to insult us? I think you're really just trying to convince yourself that we're wrong.


  • rene
    replied
    Originally posted by oiaohm

    "Operating system design is not left to the Leyen." This is being nice on the surface while insulting in a way most people would not notice.
    https://en.wikipedia.org/wiki/House_of_Leyen

    Yes, most English-speaking readers would not know that Leyen is a particular German noble family. To be fair, one reading of this could be "a person with absolute authority"; remember that projects like the Linux kernel have what are called dictators for life, people with absolute authority. So, like it or not, a lot of operating system design is left to what could be called a noble class.

    The less insulting and more correct version is:
    Operating system design is not left to the layman. Or, in German: Betriebssystem-Design nicht den Laien überlassen.
    Your correction puts an interpretation into my mouth that I did not intend. I wrote Leyen for a good reason, lol.

    Also repeating false claims all day long does not make them more true. I guess there is a reason Linus Torvalds comments on real world tech and not Phoronix. It is probably time to ignore this News plumbing website, ...
    Last edited by rene; 07 April 2021, 03:13 AM.


  • oiaohm
    replied
    Originally posted by rene
    Betriebssystem Design nicht den Leyen überlassen ;-) [English: "Don't leave operating system design to the Leyen"]
    "Operating system design is not left to the Leyen." This is being nice on the surface while insulting in a way most people would not notice.
    https://en.wikipedia.org/wiki/House_of_Leyen

    Yes, most English-speaking readers would not know that Leyen is a particular German noble family. To be fair, one reading of this could be "a person with absolute authority"; remember that projects like the Linux kernel have what are called dictators for life, people with absolute authority. So, like it or not, a lot of operating system design is left to what could be called a noble class.

    The less insulting and more correct version is:
    Operating system design is not left to the layman. Or, in German: Betriebssystem-Design nicht den Laien überlassen.
    Even these are not 100 percent correct either. When you play with unikernels you learn very quickly that some are made by laymen who have no clue what they are really doing, yet they are used in production.

    Yes, that is layman in the sense of "a person without professional or specialized knowledge in a particular subject".

    Really, rene, you need to take a close look at yourself: you could not even write a single line of German here that was correctly aligned with reality.

    Yes, I do read German, and I could have written an answer in German as well, but this is an English forum. Show some respect and don't attempt to hide an insult in German; some of us here will spot it, rene, and know exactly what it is. I first ran into how expensive atomic operations can be when I was running a 4096-core system almost two decades ago. Like it or not, I am not a layman on why readfile makes sense and vectored syscalls don't in many of the cases where readfile will be used. Rene, you have been lacking the specialised knowledge. Remember that a lot of the Linux kernel developers are like me: they have worked with systems of 4096 cores or more, so they have a different point of view.

    A readfile syscall is very different from a vectored syscall once you get into implementing it and comparing the results on system operations. There have been calls for something like a readfile syscall in Linux since as early as the year 2000, from groups running supercomputers with massive numbers of cores, due to the cost of atomic operations.

    https://en.wikipedia.org/wiki/Parall...afson's_la w

    Atomic operations are covered under Amdahl's law.
    https://en.wikipedia.org/wiki/File:AmdahlsLaw.svg

    The reality is that atomic operations are not 100 percent parallel. Small increases in how much can be done in parallel have very big effects on performance as your thread/core count goes up. Consider that dual-socket EPYC servers these days have 256 threads. On a CPU with 1 or 2 threads, the difference in how parallel things are is almost not measurable. Yet at 256 threads, even a 0.1 percent improvement is quite a bit of gain.
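
    As a worked illustration of the point above, here is a small sketch of Amdahl's law, speedup = 1 / ((1 - p) + p / n). The parallel fractions of 99.0% and 99.1% are assumptions picked to model "a 0.1 percent improvement"; they are not figures from this thread.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    for threads in (2, 256):
        before = amdahl_speedup(0.990, threads)  # assumed baseline
        after = amdahl_speedup(0.991, threads)   # 0.1 point more parallel
        gain = (after / before - 1.0) * 100.0
        print(f"{threads:>3} threads: {before:.2f}x -> {after:.2f}x ({gain:.2f}% faster)")
```

    Under these assumed fractions the gain at 2 threads is a small fraction of a percent, while at 256 threads it is several percent.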

    Yes, Amdahl's law also explains why a syscall like readfile did not make sense historically: the performance gain was not on the table, because systems were not parallel enough to have an issue with atomic operations. Yes, back on a single-core or dual-core system, your argument for vectored syscalls over readfile makes some sense, rene. The problem is that the systems we use today keep increasing in core and thread count, so how parallel things are is becoming a larger and larger factor. On 32/64 core/thread systems there is more performance to gain from the 2 atomic operations readfile avoids than from the 2 syscalls it saves.

    Readfile is not just saving syscalls; readfile is saving atomic operations. Going forward, OS design will have to give more consideration to how to have less non-parallel work, because of increasing core and thread counts.

    OS design methods have had to change with the hardware. Like it or not, rene, the way you were pushing vectored syscalls did not take the modern need for parallel operation into account, or how that makes reducing things like atomic operations important.

    Rene, error handling is a problem in the vectored designs put forward so far. The lack of savings in atomic operations and other features that are not fully parallel is another problem with vectored syscalls so far. Remember, when reading the contents of sysfs or procfs on a 4096-thread system, if the operations you use don't use atomics, then in theory all 4096 threads could be reading all those files at the same time, truly 100 percent parallel. The current syscalls and your proposed vectored syscalls cannot achieve this, but the proposed readfile syscall can.


  • 60Hz
    replied
    Originally posted by rene

    I can't follow, let alone read, all the random stuff discussed on each mailing list. However, all of my proposals would just work with your examples. Of course they would return which syscalls were run and their respective error codes. So, depending on which proposal you look at, the user-space app would either learn that read() failed and close() was not executed, and thus do it itself if needed, or, for my other proposal, the app would mark open to abort on fail, and read and close not. In either case the user-space app would get a vector of return values. As simple as that, and certainly more universal than the bloody readfile() nobody really needs nor asked for, and the "open with fixed fd" proposal for io_uring. I wonder how that should scale and work flawlessly in a sophisticated application.

    It is not rocket science, and certainly all far more generally usable than readfile(). Sigh.
    Low IQ rene strikes again. Dunning-Kruger is a hell of a drug.
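
    To make the quoted "vector of return values" idea concrete, here is a user-space Python model of one possible reading of it. Everything in it is hypothetical: `run_batch`, `open_read_close`, the `NOT_RUN` marker, and the per-step abort flag are names invented for this sketch, not rene's actual interface, and each step is still an ordinary separate syscall. It only models the return semantics described in the quote.

```python
import os
import tempfile
from typing import Callable, List, Tuple

NOT_RUN = object()  # hypothetical marker for steps skipped after an abort

# A step is (operation, abort_on_fail); the operation sees earlier results.
Step = Tuple[Callable[[List[object]], object], bool]


def run_batch(steps: List[Step]) -> List[object]:
    """Run steps in order and return one result (or OSError) per step."""
    results: List[object] = []
    aborted = False
    for operation, abort_on_fail in steps:
        if aborted:
            results.append(NOT_RUN)
            continue
        try:
            results.append(operation(results))
        except OSError as exc:
            results.append(exc)          # the error takes the result slot
            aborted = abort_on_fail      # later steps are skipped if flagged
    return results


def open_read_close(path: str, size: int = 4096) -> List[Step]:
    """open aborts the batch on failure; read and close do not."""
    return [
        (lambda r: os.open(path, os.O_RDONLY), True),
        (lambda r: os.read(r[0], size), False),
        (lambda r: os.close(r[0]) or 0, False),  # report 0 like a successful close()
    ]


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"42\n")
    print(run_batch(open_read_close(tmp.name)))
    os.unlink(tmp.name)
    print(run_batch(open_read_close(tmp.name)))  # open fails, rest not run
```

    Note that the descriptor is still allocated and released in this model, which is the part of the cost the readfile side of the argument is concerned with.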


  • coder
    replied
    Originally posted by rene
    Betriebssystem Design nicht den Leyen überlassen ;-) [English: "Don't leave operating system design to the Leyen"]
    I've had about enough of your patronizing attitude. If you really think so much less of us, then why bother trying to keep cramming your idea down our throats? You have a convenient explanation for us rejecting it, so your ego should remain intact... unless maybe the one you're really trying to convince that you're right is yourself.

    Either way, this is going nowhere. We understand your idea and we don't accept it. But, we're just inconsequential commenters on a news site, remember? So, let's live to see another day... when we can fight over something different. Who knows, maybe we'll even agree? I'm sure there must be a lot we can all agree on, too.


  • coder
    replied
    Originally posted by rene
    I have news for you: readfile() will not affect the speed at which shell scripts and especially non-builtin commands execute at all.
    Not overnight, of course. Stuff has to start using it, first.


  • rene
    replied
    Originally posted by oiaohm

    Really, with this answer I can understand why your idea is being rejected by kernel developers. The overhead of atomic operations increases as your core count goes up. An i486, which is a single core, or an SGI Octane, which at most has only 2 cores, is not going to have much atomic-operation overhead.

    Yes, you got it wrong: it is 2 atomic operations removed, 1 add for the open and 1 subtract for the close. I did write both. So readfile takes away 2 syscalls and 2 atomic operations.

    With the atomic memory operations this has very little overhead. It is changing the value that causes overhead with atomic operations. So with a ulimit of 256 and an attempt to open a 257th file, this could get interesting: readfile could technically be allowed to proceed, because it does not increase the number of actively open files. This kind of allowance is not possible with the vectored syscall approach.

    Think about it: for reading system setting values from sysfs or procfs, does it really make sense to be increasing the ulimit count if you are not keeping those files open for some reason?
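
    The open-file accounting referred to above can be observed from user space. The following sketch assumes a Unix-like system with Python's `resource` module and a readable `/dev/null`; the helper name `open_until_limit` and the soft limit of 64 (chosen to keep the loop short, instead of the 256 in the quote) are illustrative assumptions. It shows only the conventional behaviour, where every open() counts against the limit; the exemption suggested for readfile cannot be demonstrated here.

```python
import errno
import os
import resource


def open_until_limit(soft_limit: int, path: str = "/dev/null"):
    """Lower RLIMIT_NOFILE, open files until open() fails, then restore."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    fds = []
    err = 0
    try:
        while True:
            fds.append(os.open(path, os.O_RDONLY))  # each open uses one slot
    except OSError as exc:
        err = exc.errno                             # expected: EMFILE
    finally:
        for fd in fds:
            os.close(fd)                            # each close frees the slot
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(fds), err


if __name__ == "__main__":
    opened, err = open_until_limit(64)
    print(f"opened {opened} extra descriptors before {errno.errorcode.get(err)}")
```

    The number of extra descriptors depends on how many the process already holds, such as stdin, stdout, and stderr.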



    Absolutely not as much as a proper writefile command. This is because you have ignored the overhead of atomic operations. Yes, your response shows you did not understand that the cost of an atomic operation on a modern 64-core, 128-thread CPU is way worse than on an i486 that is single-threaded or an SGI Octane that at best has 2 threads.

    rene, you are being absolutely clueless. The vectored syscall idea has major limitations. Direct syscalls for readfile and writefile make more sense today than they did in the past, because the increasing number of cores raises the cost of locking and atomic operations: there are more cores/threads to keep on the same page.

    Think about it: you have 128 threads in a process all doing a readfile operation. Removing the ulimit alteration means those operations can be processed inside each CPU thread without having to check with any other core whether doing so is fine.

    Writefile does get tricky again, because you do need an atomic operation for altering a value in procfs and sysfs, but this goes from 3 atomic operations to 1.
    Betriebssystem Design nicht den Leyen überlassen ;-) [English: "Don't leave operating system design to the Leyen"]
