Is Assembly Still Relevant To Most Linux Software?


  • #21
    Glib sux

    Originally posted by gens View Post
    no, I made it in FASM
    my version was faster because the glibc one was copying 8 bytes at a time, even though it said it was an SSE2 version (SSE can copy 16 bytes at a time)
    still, my version would be faster even if glibc were using its proper SSE2 version, because I used simpler logic
    as for the SSSE3 version that beats mine: it is faster only in a few cases, when source and destination are 1 byte unaligned (and with blocks far bigger than the CPU cache, which I could optimize fairly easily, but I'm lazy)

    then there is Agner Fog's version, which I don't quite understand
    from what I've seen of it, a compiler can't generate anything like that, at least not without heavy hand-holding from the programmer

    btw, string operations are another case where assembly can make a big difference
    Thanks for the info.
    When it comes to string operations, I avoid the glibc string headers like the plague and instead write my own string functions, which are almost always 1.5-2x faster than their glibc counterparts (mostly because you can design your functions for your current needs and strip out a lot of generality).
    Suckless.org has a good reason to list glibc as a library considered harmful (http://suckless.org/sucks). We should all stay away from it where possible and make up our own minds about how to deal with these things effectively, without being infected by C++ STL cruft. (Call me a C++ hater, that's what I am.)
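    The "design your functions for your current needs" point can be sketched in C. The helper below is hypothetical (the name and contract are invented for illustration): when the caller already knows the string's length, the copy can drop the scanning and truncation logic a generic strncpy has to carry.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: the caller guarantees src holds exactly len
       bytes and dst has room for len + 1. Knowing the length up front
       turns the copy into one bulk memcpy with no per-byte NUL test,
       which is exactly the generality a libc strncpy cannot shed. */
    static void copy_known_len(char *dst, const char *src, size_t len)
    {
        memcpy(dst, src, len);
        dst[len] = '\0';
    }
    ```

    Whether something like this actually beats glibc depends on your call patterns and typical sizes; measure before claiming a speedup.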



    • #22
      CPU technology is moving too fast to justify assembly for anything other than specialized, platform-specific libraries (e.g. media codecs). Compilers are good enough, and RAM is now stupidly cheap.

      As for being a C++ STL hater: you must not do much work at all at the high level. I'm more irritated by the non-UTF-8 cruft out there causing problems. String stuff? Not a big deal when the real win is shaving an iteration off a PDE solve, or finding a better way to define a PDE interpolation field (to allow sparser expensive evaluations).

      That said, I *have* written my own little toy string class set, but I haven't performance-checked it recently. For what I do, performance seems to depend far more on limiting memory allocations (fun with massively parallel code) than on saving a few instructions. Modern Intel CPUs seem to stall waiting for data more than they stall executing instructions.
      Last edited by bnolsen; 02 April 2013, 12:01 PM.
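      The "limit memory allocations" point above looks like this in practice: a grow-only scratch buffer reused across iterations, instead of a malloc/free pair per call. A minimal sketch; the struct and function names are invented for illustration.

      ```c
      #include <stdlib.h>

      /* Grow-only scratch buffer: realloc fires only when a request
         outgrows the current capacity, so a hot loop that needs
         similar-sized temporaries stops hitting the allocator after
         the first few iterations. */
      struct scratch { char *buf; size_t cap; };

      static void *scratch_get(struct scratch *s, size_t need)
      {
          if (need > s->cap) {
              char *p = realloc(s->buf, need);
              if (!p) return NULL;
              s->buf = p;
              s->cap = need;
          }
          return s->buf;
      }
      ```

      In massively parallel code the payoff is larger still, since allocator locks and page faults are shared costs; one scratch per thread sidesteps both.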



      • #23
        C++ STL still sux

        Originally posted by bnolsen View Post
        CPU technology is moving too fast, making it hard to justify using assembly for anything other than specialized platform specific libraries (ie: media codecs, etc). [...]
        C++ STL hater here: if I were to completely abandon all of Glib's high-level string functions, why should I even use Glib, rather than rewrite it cleanly in good old C? A discussion about this topic is rather pointless, though, because it comes down to taste.



        • #24
          Originally posted by gens View Post
          btw, string operations are another case where assembly can make a big difference
          Ubuntu/Linaro folks, this is your monthly reminder that cortex-strings is still not upstreamed.



          • #25
            Originally posted by bnolsen View Post
            CPU technology is moving too fast, making it hard to justify using assembly for anything other than specialized platform specific libraries (ie: media codecs, etc). [...]
            assembly is also better with atomics and all that modern stuff
            memory is the bottleneck on modern hardware, so you have to care about the cache, alignment and more
            read some of Agner Fog's publications on that
            there is also a blog by an x264 dev where you can read why x264 is far faster than other encoders (hint: SSE/AVX assembly and cache optimizations)

            anyway, what I wanted to say:

            "Linaro is a not-for-profit engineering organization consolidating and optimizing open source Linux software and tools for the ARM architecture."

            there you have it
            ARM is a simple instruction set, and is easy for a compiler (and yet hand-written ARM asm can still be faster/smaller)
            there is not much to optimize on ARM; it is a simple instruction set
            x86 (and the like) are complex instruction sets, and in my opinion far more advanced than ARM
            the only way to bring ARM on par with x86 is to add things to the architecture (like "vector operations", that is, SSE/AVX-style SIMD)

            if any Linaro guys are reading:
            go get musl, get clang, get other key programs that don't need assembly
            but don't force your opinion on others, at least not without proof (for others reading: there is no proof that a compiler can beat a human at anything but speed, as in speed of writing a program)

            PS: C has been out for decades and there are still leaps being made in compilers that optimize it
            what about higher-level languages? centuries?

            PPS: UTF-8 crap? do you know what Unicode is for? if you don't and you are a programmer, then you are American and narrow-minded
            Last edited by gens; 02 April 2013, 12:25 PM.
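            The cache-and-alignment point above can be illustrated with standard C11, no assembly required: padding atomics out to separate cache lines avoids false sharing between threads. The 64-byte line size is an assumption (typical for current x86); real code should query it rather than hard-code it.

            ```c
            #include <stdalign.h>
            #include <stdatomic.h>
            #include <stddef.h>

            /* Two counters updated by different threads. alignas(64) puts
               each on its own (assumed) 64-byte cache line, so threads
               bumping different counters do not bounce one line between
               cores -- the false-sharing stall the post is alluding to. */
            struct counters {
                alignas(64) atomic_long a;
                alignas(64) atomic_long b;
            };
            ```

            Without the alignment, `a` and `b` would usually share a line and every `atomic_fetch_add` on one would invalidate the other core's copy.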



            • #26
              Originally posted by gens View Post
              PPS: UTF-8 crap? do you know what Unicode is for? if you don't and you are a programmer, then you are American and narrow-minded
              The whole mess around UTF-8, UTF-16 (which variant?), and UTF-32 (which variant?) is a bigger can of worms than the implementation issues in some string libraries. I'm very painfully aware of internationalization.



              • #27
                Originally posted by bnolsen View Post
                The whole mess around UTF-8, UTF-16 (which variant?), and UTF-32 (which variant?) is a bigger can of worms than the implementation issues in some string libraries. I'm very painfully aware of internationalization.
                UTF-8 should be the standard
                the Plan 9 guys, who also created Unix, made Plan 9 use Unicode universally

                everything that manipulates text should be made UTF-8 compatible

                actually, UTF-8 is not that complicated
                it is a superset of ASCII; the high bit of each byte marks whether it belongs to a multi-byte sequence, and the encoding is simple and CPU-friendly (details on the wiki)
                so extending libraries should be fairly simple, if you have their source

                PS: as I understand it, the next big step for CPU architectures is to add more cores
                because there is (probably) not much left to improve in existing cores
                Last edited by gens; 02 April 2013, 01:34 PM.
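                A minimal sketch of why the encoding is "simple and CPU-friendly": encoding one code point is just shifts and masks, and any byte with the high bit clear is plain ASCII. The function name is invented, and range/surrogate validation is omitted for brevity.

                ```c
                /* Encode one Unicode code point (up to U+10FFFF) as UTF-8.
                   Writes 1-4 bytes to out and returns the byte count.
                   ASCII stays a single unchanged byte; longer sequences
                   start with a length-marking lead byte followed by
                   0b10xxxxxx continuation bytes. */
                static int utf8_encode(unsigned long cp, unsigned char *out)
                {
                    if (cp < 0x80) {
                        out[0] = (unsigned char)cp;
                        return 1;
                    }
                    if (cp < 0x800) {
                        out[0] = (unsigned char)(0xC0 | (cp >> 6));
                        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
                        return 2;
                    }
                    if (cp < 0x10000) {
                        out[0] = (unsigned char)(0xE0 | (cp >> 12));
                        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
                        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
                        return 3;
                    }
                    out[0] = (unsigned char)(0xF0 | (cp >> 18));
                    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
                    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
                    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
                    return 4;
                }
                ```

                Decoding is the mirror image, which is why retrofitting UTF-8 onto byte-oriented string code is usually tractable when you have the source.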



                • #28
                  Talking about Unicode ...

                  Originally posted by bnolsen View Post
                  The whole mess around UTF-8, UTF-16 (which variant?), and UTF-32 (which variant?) is a bigger can of worms than the implementation issues in some string libraries. I'm very painfully aware of internationalization.
                  Yes, that is a valid concern. But unless you depend on, say, the Cyrillic alphabet, ASCII is not much of a problem for most users (and is the easiest to handle in terms of programming).
                  Still, Unicode itself is well designed but lacks a proper implementation. The locale system seems half-baked and is too complex for a dedicated, out-of-the-box user experience.



                  • #29
                    You don't have to use pure assembler to use SIMD. GCC, MSVC's cl, etc. expose intrinsics; that way you decide exactly which SIMD instructions to use while letting the compiler manage instruction and register scheduling. IMHO it is less error-prone and much simpler (especially for maintenance). You can always read the asm after compilation to check that it is reasonable. I had a few cases where I had to use -O1 with GCC and do a bit of the instruction scheduling myself (sometimes GCC's register and instruction schedulers interfere with each other on odd architectures), but it was still a win over fully manual asm.

                    If you do stuff like image processing it can be quite worthwhile.
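                    A related middle ground to the intrinsics described above is GCC/Clang's vector extensions: you write ordinary arithmetic on fixed-width vector types, and the compiler handles instruction selection, registers, and scheduling. This sketch assumes GCC or Clang; MSVC would need actual intrinsics instead.

                    ```c
                    /* GCC/Clang vector extension: a 16-byte vector of four
                       ints. Plain operators on it compile to SIMD
                       instructions where the target has them (SSE2 on x86,
                       NEON on ARM), with a scalar fallback otherwise. */
                    typedef int v4si __attribute__((vector_size(16)));

                    static v4si add4(v4si a, v4si b)
                    {
                        return a + b;
                    }
                    ```

                    Unlike raw intrinsics this version is not tied to one ISA, which matters for exactly the x86-to-ARM porting question the article raises.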



                    • #30
                      Originally posted by Qaz` View Post
                      You don't have to use pure assembler to use simd. Gcc/cl etc exposes intrinsics, that way you can decide exactly what simd instructions to use but letting the compiler manage instruction and register scheduling. [...]

                      If you do stuff like image processing it can be quite worthwhile.
                      Sorry, but given that intrinsics are CPU/ISA-specific, you'd still have to port your code, and that is the point of the article: what x86 code exists that needs ARM64 porting?

