Is Assembly Still Relevant To Most Linux Software?


  • You failed

    Originally posted by oliver View Post
And this doesn't happen in proprietary software? Now you're really just talking horseshit. Really bad horseshit.

I have worked with proprietary (embedded) software. I have seen leaked (later GPLed) source and still work with it. There is so much crap produced because of the 'fast, fast, deadlines, nobody sees it anyway' mentality that it would make you want to stab your eyes out.

Yes, there is still a lot of bad code EVEN in open source out there. But you know what? It is waiting to be found, ready to be optimized. This specific radeon example was a hack job to get something working fast. It was probably overlooked since then (well, someone found it now). It just needs someone to step up and submit a patch. It is not impossible to fix; nobody _needs_ to suffer. The radeon driver (while production ready, I guess) is still under heavy development (or slightly abandoned). There simply aren't enough man-hours to spend on these performance optimizations. Once radeon (and the others) are feature complete and stable, I'm sure more interest will be put into performance optimizations, which should be spotted with profiling. It's only a matter of time, and this was a piss-poor example just to spread crap.
    Trolls keep trolling. I am also not new to the business, so don't claim anything here you can't prove.

You misunderstood me: I wasn't bashing free software. I am one of the strongest supporters of efficient code, and this means leaving out big data structures (--> C++) and bloated libraries (--> Glibc).
The suffering of the end users is caused by the inefficient code to be found in many places. This starts with udev, which is a mess. This is the reason I joined the eudev project to clean it up.
I am not an expert when it comes to Radeon drivers (I wasn't even talking about them), so I don't know the reason for this childish rage on your side.

BTW, the examples were not even taken from the Radeon stack, but you are not worthy to know where they are from. What's the saying? Don't feed the trolls!

    Comment


    • Originally posted by frign View Post
I don't say in any way that you are wrong in this area. You prove to be an expert on microcontrollers. What you must realize is that today's CPUs have sadly shifted into being very efficient in themselves, but very inefficient when it comes to memory I/O. Keeping your variables small reduces it, unless you can show me that small datatypes are in fact stored in memory at the native address length or more.

As always, if you see a misconception on my side, please point it out accordingly.
Why, thank you, but no, I'm far from it. I've only dabbled with microcontrollers.

But I fully agree that today's CPUs are very efficient and inefficient in different scenarios. The internals of what happens when, and why, are probably a mystery to us mere mortals. I do hope compiler writers know these secrets and know what is best optimized when. You tell the compiler what platform you work with (-march) and based on that it can make certain decisions about how best to optimize.

      From http://pubs.opengroup.org/onlinepubs.../stdint.h.html I get that
      The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.
      Diving directly into the header file and seeing what gcc does:
      Code:
      /* Fast types.  */
      
      /* Signed.  */
      typedef signed char int_fast8_t;
      #if __WORDSIZE == 64
      typedef long int int_fast16_t;
      typedef long int int_fast32_t;
      typedef long int int_fast64_t;
      #else
      typedef int int_fast16_t;
      typedef int int_fast32_t;
      __extension__
      typedef long long int int_fast64_t;
      #endif
      
      /* Unsigned.  */
      typedef unsigned char uint_fast8_t;
      #if __WORDSIZE == 64
      typedef unsigned long int uint_fast16_t;
      typedef unsigned long int uint_fast32_t;
      typedef unsigned long int uint_fast64_t;
      #else
      typedef unsigned int uint_fast16_t;
      typedef unsigned int uint_fast32_t;
      __extension__
      typedef unsigned long long int uint_fast64_t;
      #endif
gcc does 'like' those new sizes, but see what it does? It simply maps them to whatever your word size is. I don't think this code is generated depending on my architecture. So uint_fast, in gcc at this moment, really doesn't do anything useful or smart. It just says 'use 64-bit because that is the native bit width', ignoring the memory issues and focusing only on ALU width. I still think there can be an advantage here: certain calculations take quite a few cycles (multiplications, for example) and _may_ be faster when using an aligned bit width. But as we both know, modern processors are highly complex, and this isn't easily found anywhere. Maybe there's some programming manual.

So, as I said, the compiler should be smart and should know what is happening in the CPU at any given time (except for context switches, of course). So if your program is 200 instructions that need to be executed (in your allotted scheduling slice), the compiler should be able to optimize this not only based on ALU size but ALSO on memory bandwidth. And I think here we both meet and agree: this is more something for the compiler to optimize than a developer OR a typedef could ever do (bar writing it in asm).
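A minimal sketch (not in the original post) that makes the mapping visible from the user side; on an x86-64 glibc system it should print 8 bytes for the fast types but 2 for the exact-width type:
Code:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* On x86-64 glibc the fast types collapse onto the 8-byte word size,
       while the exact-width type stays at 2 bytes. */
    printf("uint_fast16_t: %zu bytes\n", sizeof(uint_fast16_t));
    printf("uint_fast32_t: %zu bytes\n", sizeof(uint_fast32_t));
    printf("uint16_t:      %zu bytes\n", sizeof(uint16_t));
    return 0;
}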

      Comment


      • Originally posted by ciplogic View Post
So in short: which of you who consider yourselves "assembly friendly guys" contributed to GLibC, GNU as, Fasm or whatever? Could you say where you contributed (in brief)? I mean, at least we would all benefit from your work in one form or another, right?
Misquote? Or right quote but wrong accusation?

I try to do my fair share of patches to various open-source projects. Granted, I have never done any assembler patches, but that's simply because I have only ever read assembly, never written it; I trust my compiler to optimize as much as possible for me.

Having said that, I do still believe that (with profiling) certain functions CAN be optimized further and can be important TO optimize. I think Linaro showed this was true for 30% of 'some metric'?

        Comment


        • Originally posted by frign View Post
          Trolls keep trolling. I am also not new to the business, so don't claim anything here you can't prove.

You misunderstood me: I wasn't bashing free software. I am one of the strongest supporters of efficient code, and this means leaving out big data structures (--> C++) and bloated libraries (--> Glibc).
The suffering of the end users is caused by the inefficient code to be found in many places. This starts with udev, which is a mess. This is the reason I joined the eudev project to clean it up.
I am not an expert when it comes to Radeon drivers (I wasn't even talking about them), so I don't know the reason for this childish rage on your side.

BTW, the examples were not even taken from the Radeon stack, but you are not worthy to know where they are from. What's the saying? Don't feed the trolls!
I do apologize for misunderstanding you. It 'felt' like bashing, but as always, in textual form, this is hard to tell.

          Comment


          • Originally posted by oliver View Post
Misquote? Or right quote but wrong accusation?

I try to do my fair share of patches to various open-source projects. Granted, I have never done any assembler patches, but that's simply because I have only ever read assembly, never written it; I trust my compiler to optimize as much as possible for me.

Having said that, I do still believe that (with profiling) certain functions CAN be optimized further and can be important TO optimize. I think Linaro showed this was true for 30% of 'some metric'?
Thank you for clarifying. It was not aimed at you (or anyone in particular), but at the idea that many here (at least the other guy who appeared here) basically queue up a long list of things that "others should know". I mean, we discuss things like not using the proper int type, or the latencies.

Lastly, the 30% is 30% of assembly-based projects. Most of them use assembly for atomics, accessing ports or calling OS routines, so the number is more like 0.3% or at most 1% of all projects. And these projects are not assembly-only but assembly-based, so the figures look even better.

            Comment


            • In the end it's just a typedef

              Originally posted by oliver View Post
Why, thank you, but no, I'm far from it. I've only dabbled with microcontrollers.

But I fully agree that today's CPUs are very efficient and inefficient in different scenarios. The internals of what happens when, and why, are probably a mystery to us mere mortals. I do hope compiler writers know these secrets and know what is best optimized when. You tell the compiler what platform you work with (-march) and based on that it can make certain decisions about how best to optimize.

              From http://pubs.opengroup.org/onlinepubs.../stdint.h.html I get that
The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.

              Diving directly into the header file and seeing what gcc does:
              Code:
              /* Fast types.  */
              
              /* Signed.  */
              typedef signed char int_fast8_t;
              #if __WORDSIZE == 64
              typedef long int int_fast16_t;
              typedef long int int_fast32_t;
              typedef long int int_fast64_t;
              #else
              typedef int int_fast16_t;
              typedef int int_fast32_t;
              __extension__
              typedef long long int int_fast64_t;
              #endif
              
              /* Unsigned.  */
              typedef unsigned char uint_fast8_t;
              #if __WORDSIZE == 64
              typedef unsigned long int uint_fast16_t;
              typedef unsigned long int uint_fast32_t;
              typedef unsigned long int uint_fast64_t;
              #else
              typedef unsigned int uint_fast16_t;
              typedef unsigned int uint_fast32_t;
              __extension__
              typedef unsigned long long int uint_fast64_t;
              #endif
gcc does 'like' those new sizes, but see what it does? It simply maps them to whatever your word size is. I don't think this code is generated depending on my architecture. So uint_fast, in gcc at this moment, really doesn't do anything useful or smart. It just says 'use 64-bit because that is the native bit width', ignoring the memory issues and focusing only on ALU width. I still think there can be an advantage here: certain calculations take quite a few cycles (multiplications, for example) and _may_ be faster when using an aligned bit width. But as we both know, modern processors are highly complex, and this isn't easily found anywhere. Maybe there's some programming manual.

So, as I said, the compiler should be smart and should know what is happening in the CPU at any given time (except for context switches, of course). So if your program is 200 instructions that need to be executed (in your allotted scheduling slice), the compiler should be able to optimize this not only based on ALU size but ALSO on memory bandwidth. And I think here we both meet and agree: this is more something for the compiler to optimize than a developer OR a typedef could ever do (bar writing it in asm).
I definitely agree on the last point, but nothing stops the compiler from optimizing accordingly. I knew uint_fast8_t just maps the type to an unsigned char, which makes the code more readable and leaves every distributor of the GNU operating system free to set those typedefs in one location to fix certain architectural quirks.

Sadly, today's languages encourage piling up a lot of stuff in memory. C++ classes are one example of an insufficient concept. Even though the compiler does a great job of keeping the code good enough, doing the same in C might have its advantages (also with regard to inline assembly). Without doubt, this is a religious question.

              So, it was nice discussing with you! Please let me know which language you prefer.

              Comment


              • No problem

                Originally posted by oliver View Post
I do apologize for misunderstanding you. It 'felt' like bashing, but as always, in textual form, this is hard to tell.
                Hey, no problem.
Every once in a while I also face those situations, and considering that more than 90% of communication is non-verbal, such losses are very likely.

                Comment


                • Originally posted by frign View Post
Sadly, today's languages encourage piling up a lot of stuff in memory. C++ classes are one example of an insufficient concept. Even though the compiler does a great job of keeping the code good enough, doing the same in C might have its advantages (also with regard to inline assembly). Without doubt, this is a religious question.

                  So, it was nice discussing with you! Please let me know which language you prefer.
I thought that a C++ class was the same as a C struct, and as such didn't use more memory than C, and probably not much extra memory anyway (just a one-time definition in the binary)?
Recently, there was a test comparing GCC compiled as C and GCC compiled as C++, measuring the speed of the two versions (using the compile time of the Linux kernel), with no observable difference.
This proved that C++ in itself is not slower than C. Now, you can argue that some C++-specific constructs can be "slower" (virtual functions and exceptions are often named), but implementing these by hand in C would be at least as slow (and probably much slower). If you need fast polymorphism, you can use templates, which are 100x better than macros or copy-paste.

I personally like C++ a lot as a low- but not-too-low-level language, but that's also mostly because I'm familiar with it.
I'm never opposed to using whatever language is best for the task at hand (except VBA. VBA sucks.). If I were tasked with RT routines for embedded systems, I'd use C plus assembly, but when writing GUI glue, I love using JS/QML for Qt Quick. Languages are tools, not religion (plus, they won't blame you when you flirt with some other ones).
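A minimal sketch of that "fast polymorphism via templates" point (the types are hypothetical, not from the post): the virtual call below goes through the vtable at runtime, while the template version knows the concrete type at compile time and can be fully inlined:
Code:
#include <cstdio>

// Runtime polymorphism: calls go through the vtable.
struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() {}
};
struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

double total_virtual(const Shape* const* shapes, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += shapes[i]->area();  // indirect call, hard to inline
    return sum;
}

// Compile-time polymorphism: the concrete type is a template
// parameter, so the call can be devirtualized and inlined.
template <typename T>
double total_template(const T* shapes, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += shapes[i].area();   // direct, inlinable call
    return sum;
}

int main()
{
    Square s[2] = { Square(2.0), Square(3.0) };
    const Shape* p[2] = { &s[0], &s[1] };
    std::printf("%.1f %.1f\n", total_virtual(p, 2), total_template(s, 2));
    return 0;
}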

                  Comment


                  • Thanks

                    Originally posted by erendorn View Post
I thought that a C++ class was the same as a C struct, and as such didn't use more memory than C, and probably not much extra memory anyway (just a one-time definition in the binary)?
Recently, there was a test comparing GCC compiled as C and GCC compiled as C++, measuring the speed of the two versions (using the compile time of the Linux kernel), with no observable difference.
This proved that C++ in itself is not slower than C. Now, you can argue that some C++-specific constructs can be "slower" (virtual functions and exceptions are often named), but implementing these by hand in C would be at least as slow (and probably much slower). If you need fast polymorphism, you can use templates, which are 100x better than macros or copy-paste.

I personally like C++ a lot as a low- but not-too-low-level language, but that's also mostly because I'm familiar with it.
I'm never opposed to using whatever language is best for the task at hand (except VBA. VBA sucks.). If I were tasked with RT routines for embedded systems, I'd use C plus assembly, but when writing GUI glue, I love using JS/QML for Qt Quick. Languages are tools, not religion (plus, they won't blame you when you flirt with some other ones).
                    Thanks! Nice to hear other opinions.

The test you wrote about doesn't make sense, though, because GCC handles C as C++; when you compare two identical programs, you will get identical results.
The differences are very small, but we can't talk of C++ classes as being the same as structs in C. There are definitely big differences between them.

                    Comment


                    • Originally posted by frign View Post
                      Thanks! Nice to hear other opinions.

The test you wrote about doesn't make sense, though, because GCC handles C as C++; when you compare two identical programs, you will get identical results.
The differences are very small, but we can't talk of C++ classes as being the same as structs in C. There are definitely big differences between them.
You're fully wrong! If you don't write virtual, you have the same memory footprint. In fact C++ is stricter than C, so it is less error-prone. Like:
int* array = malloc(sizeof(char)*length); // will give an error in C++ but not in C, so it is less likely you will make mistakes in C++.
Also, as you can override the new operator, you can use the same C routines (or a custom memory allocator) if you want the same perf characteristics.

In the past C++ would do a lot of "invisible copying" (like users copying std::vectors everywhere), and people not trained in C++, or people who would not use ref and const ref, would get worse performance than with C; but that is not because C++ was badly designed, it was badly used.

Still, C++ really is the king performance-wise, as it extensively uses inlining in templates; the "invisible copying" is solved most of the time by just using constant references, and C++11 added move semantics, so the compiler will remove some copying by itself (with no extra assembly).

Lastly, I want to congratulate him on using Qt; it really is a great framework, also on the Qml/JS side.
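A minimal sketch of the copying point (my example, not from the post): pass-by-value duplicates the whole vector, a const reference reads it in place, and C++11 move semantics transfer the buffer instead of copying it:
Code:
#include <utility>
#include <vector>

// Copies the entire vector on every call -- the "invisible copying".
long sum_by_value(std::vector<long> v)
{
    long s = 0;
    for (long x : v) s += x;
    return s;
}

// No copy: the caller's buffer is read in place.
long sum_by_const_ref(const std::vector<long>& v)
{
    long s = 0;
    for (long x : v) s += x;
    return s;
}

int main()
{
    std::vector<long> big(1000000, 1);

    sum_by_value(big);        // duplicates a million elements
    sum_by_const_ref(big);    // touches no extra memory

    // C++11 move: the buffer changes owner, no element is copied.
    std::vector<long> stolen = std::move(big);
    return 0;
}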

                      Comment


                      • Nope

                        Originally posted by ciplogic View Post
You're fully wrong! If you don't write virtual, you have the same memory footprint. In fact C++ is stricter than C, so it is less error-prone. Like:
int* array = malloc(sizeof(char)*length); // will give an error in C++ but not in C, so it is less likely you will make mistakes in C++.
Also, as you can override the new operator, you can use the same C routines (or a custom memory allocator) if you want the same perf characteristics.

In the past C++ would do a lot of "invisible copying" (like users copying std::vectors everywhere), and people not trained in C++, or people who would not use ref and const ref, would get worse performance than with C; but that is not because C++ was badly designed, it was badly used.

Still, C++ really is the king performance-wise, as it extensively uses inlining in templates; the "invisible copying" is solved most of the time by just using constant references, and C++11 added move semantics, so the compiler will remove some copying by itself (with no extra assembly).

Lastly, I want to congratulate him on using Qt; it really is a great framework, also on the Qml/JS side.
If I was fully wrong and C++ is the king of performance, why is it that most developers still use C instead of C++?

Btw: I didn't write anything about C++ being better or worse. I just said there were big differences between classes and structs, which in my opinion lead to worse memory management. Don't smoke so much crack, you are really hot-tempered!

And: I don't see any problem with the example you gave.
EDIT: Normally you would use n times the size of an integer, not of a char. But this is a design issue and might also be intentional.
                        Last edited by frign; 04-10-2013, 01:05 PM.

                        Comment


                        • Originally posted by frign View Post
Btw: I didn't write anything about C++ being better or worse. I just said there were big differences between classes and structs, which in my opinion lead to worse memory management. Don't smoke so much crack, you are really hot-tempered!
In the C++ specification, a struct is a class whose member access is public by default!
                          Code:
                          struct MyStruct{} ;
is the same as:
                          Code:
                          class MyStruct{ public: } ;
If you use classes/structs with virtual functions, you will get a virtual table pointer, but no one requires you to do so.
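A minimal sketch (hypothetical type names) demonstrating both claims, with sizes as on a typical 64-bit system:
Code:
#include <cstdio>

struct PlainStruct { int a; int b; };           // C-style struct
class  PlainClass  { public: int a; int b; };   // identical layout

class WithVirtual {                             // gains a vtable pointer
public:
    int a; int b;
    virtual ~WithVirtual() {}
};

int main()
{
    // Typically prints "8 8 16": the first two are identical and the
    // third grows by one pointer only because of 'virtual'.
    std::printf("%zu %zu %zu\n", sizeof(PlainStruct),
                sizeof(PlainClass), sizeof(WithVirtual));
    return 0;
}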

If you talk about things that are more verbose in C++, there is the name mangling used to resolve conflicts (because of function overloading), but that has no runtime memory cost as far as I'm aware. If you don't want it, you can write all the code in C++ and export functions with
                          Code:
                          extern "C" {
                          (...)
                          };
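A minimal sketch of that export pattern (the function names are mine, not from the post): the implementation is free to use C++ internally while presenting an unmangled C symbol:
Code:
#include <vector>

// Internal C++ implementation; free to use any C++ feature.
static long sum_impl(const std::vector<long>& v)
{
    long s = 0;
    for (long x : v) s += x;
    return s;
}

// Exported with C linkage: no name mangling, callable from plain C.
extern "C" long sum_array(const long* data, unsigned long n)
{
    return sum_impl(std::vector<long>(data, data + n));
}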
                          Originally posted by frign View Post
If I was fully wrong and C++ is the king of performance, why is it that most developers still use C instead of C++?
This is a non sequitur (http://en.wikipedia.org/wiki/Non_sequitur_%28logic%29): most developers may use C for a lot of reasons, and maybe none of them is about performance. (The mistake is explained in detail here: http://en.wikipedia.org/wiki/Affirming_the_consequent)

A link stating that "C is faster than C++" is a myth: http://discuss.fogcreek.com/joelonso...ow&ixPost=7461

In fact, even people who use assembly don't always use it for performance; just 30% of them do (I'm using your numbers). The GCC rewrite in C++ proved this to some extent, as I've never heard that compile times grew now that GCC is C++-based. And LLVM (which was C++-based from the ground up) counts faster compilation than GCC among its strengths. Cairo (a C vector graphics library) was, at least some years ago, slower than its C++ counterpart in Qt: http://zrusin.blogspot.com/2006/10/benchmarks.html

Where did you get the figure "most developers"? TIOBE maybe? (http://www.tiobe.com/index.php/conte...pci/index.html) If so, why do the second-most developers pick Java? Is that also about performance? If you look at the statistics, dynamically typed languages are at 29.2% (more than C users, who are around 18%), so more people are using a dynamic language (which has some performance limitations/implications).

The biggest reason I think C is still powerful is interoperability: C is the target language that any other language interoperates with. Java, C#, JavaScript and every other language in the TIOBE top 20 (including assembly), with the notable exceptions of Transact-SQL and PL/SQL, are C-aware (sometimes, as in Lua, you have to set some hook methods, but that is easy to do). Combined with its long history and the fact that the language is fairly simple to start with (which in turn means it is taught in high schools and universities), I think that is a better explanation of why people are using C.

There are transitions from C to C++ that turned out slower: for example Quake 3 was fairly light while Doom 3 was heavy on resources, but that was mostly because it ran an interpreter in the back; it is not the fault of C++ (I'm not making excuses, though). The Rage engine was light (compared with its competition of late 2011, like Crysis 2) and both were written in C++. Many of the fast parts of Windows (DirectX comes first to mind) are C++.
                          Last edited by ciplogic; 04-10-2013, 01:46 PM.

                          Comment


Here is a quick little asm hack I made to the radeon r600_shader.c file in Mesa. Note it is only two instructions. It sure beats doing a loop with a test in the middle. Let's see if you can get a C compiler to beat this.

Code:
static int tgsi_last_instruction(unsigned writemask)
{
#if 1
    /* AND the input with 0xF to confine the search to the four least
       significant bits, then use BSR (Bit Scan Reverse) to find the
       most significant set bit, position 0 through 3.
       Note: with the source tied to the destination, this returns zero
       when the input is zero. BSR's output is officially documented as
       undefined for a zero input, but in practice the instruction just
       leaves the destination register unchanged in that case. */

    __asm__ (
        "and $0x0F, %0\n\t"
        "bsr %0, %0\n\t"
        : "+r" (writemask)   /* read-write operand, idiomatic form */
        :
        : "cc");

    return writemask;
#else
    int i, lasti = 0;

    for (i = 0; i < 4; i++) {
        if (writemask & (1 << i)) {
            lasti = i;
        }
    }

    return lasti;
#endif
}
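For comparison, a portable sketch of mine (assuming GCC or Clang): the __builtin_clz intrinsic lets the compiler emit BSR (or LZCNT) itself, with the zero-input case handled explicitly since __builtin_clz(0) is undefined:
Code:
/* Same result as the asm above: position of the most significant set
   bit among the four least significant bits, or 0 if none are set. */
static int tgsi_last_instruction_portable(unsigned writemask)
{
    writemask &= 0xF;                  /* confine to the low four bits */
    if (writemask == 0)
        return 0;                      /* match the loop's default */
    return 31 - __builtin_clz(writemask);
}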

                            Comment


                            • Religion

                              Originally posted by ciplogic View Post
In the C++ specification, a struct is a class whose member access is public by default!
                              Code:
                              struct MyStruct{} ;
is the same as:
                              Code:
                              class MyStruct{ public: } ;
If you use classes/structs with virtual functions, you will get a virtual table pointer, but no one requires you to do so.

If you talk about things that are more verbose in C++, there is the name mangling used to resolve conflicts (because of function overloading), but that has no runtime memory cost as far as I'm aware. If you don't want it, you can write all the code in C++ and export functions with
                              Code:
                              extern "C" {
                              (...)
                              };

This is a non sequitur (http://en.wikipedia.org/wiki/Non_sequitur_%28logic%29): most developers may use C for a lot of reasons, and maybe none of them is about performance. (The mistake is explained in detail here: http://en.wikipedia.org/wiki/Affirming_the_consequent)

A link stating that "C is faster than C++" is a myth: http://discuss.fogcreek.com/joelonso...ow&ixPost=7461

In fact, even people who use assembly don't always use it for performance; just 30% of them do (I'm using your numbers). The GCC rewrite in C++ proved this to some extent, as I've never heard that compile times grew now that GCC is C++-based. And LLVM (which was C++-based from the ground up) counts faster compilation than GCC among its strengths. Cairo (a C vector graphics library) was, at least some years ago, slower than its C++ counterpart in Qt: http://zrusin.blogspot.com/2006/10/benchmarks.html

Where did you get the figure "most developers"? TIOBE maybe? (http://www.tiobe.com/index.php/conte...pci/index.html) If so, why do the second-most developers pick Java? Is that also about performance? If you look at the statistics, dynamically typed languages are at 29.2% (more than C users, who are around 18%), so more people are using a dynamic language (which has some performance limitations/implications).

The biggest reason I think C is still powerful is interoperability: C is the target language that any other language interoperates with. Java, C#, JavaScript and every other language in the TIOBE top 20 (including assembly), with the notable exceptions of Transact-SQL and PL/SQL, are C-aware (sometimes, as in Lua, you have to set some hook methods, but that is easy to do). Combined with its long history and the fact that the language is fairly simple to start with (which in turn means it is taught in high schools and universities), I think that is a better explanation of why people are using C.

There are transitions from C to C++ that turned out slower: for example Quake 3 was fairly light while Doom 3 was heavy on resources, but that was mostly because it ran an interpreter in the back; it is not the fault of C++ (I'm not making excuses, though). The Rage engine was light (compared with its competition of late 2011, like Crysis 2) and both were written in C++. Many of the fast parts of Windows (DirectX comes first to mind) are C++.
If you look at it that way, most developers program in C#. It's not about the technical perspective of why C programs are faster, it's about how you design your software.
I see you like Wikipedia very much, but I know what non sequitur means (it is not spelled non-sequitor).

So, it's still a religious question, and I guess we'll never reach a consensus.

                              Comment


                              • Originally posted by frign View Post
If you look at it that way, most developers program in C#. It's not about the technical perspective of why C programs are faster, it's about how you design your software.
I see you like Wikipedia very much, but I know what non sequitur means (it is not spelled non-sequitor).

So, it's still a religious question, and I guess we'll never reach a consensus.
Thanks for fixing my typo, I will double-check in the future. I use Wikipedia because it states things properly most of the time.

C++ is at least as fast as C, since you can use the C subset and get the same baseline performance, while also having other constructs available (I'm talking here about templates):
A Google engineer's perspective: http://lingpipe-blog.com/2011/07/01/why-is-c-so-fast/
This guy also seems to agree that C++ is faster because it can do better inlining: http://radiospiel.org/sorting-in-c-3...faster-than-c/

I'm still curious about the differences you claim exist between C++ classes and C structs (a statement you made).
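The classic illustration of that inlining claim (my sketch, not from the linked posts) is qsort versus std::sort: C's comparator is an opaque function pointer called for every comparison, while the C++ comparator is part of the template instantiation and can be inlined into the sorting loop:
Code:
#include <algorithm>
#include <cstdlib>

// C style: qsort calls back through a function pointer, which the
// compiler generally cannot inline.
static int cmp_int(const void* a, const void* b)
{
    int x = *(const int*)a;
    int y = *(const int*)b;
    return (x > y) - (x < y);   // overflow-safe three-way compare
}

int main()
{
    int a[5] = { 4, 1, 5, 2, 3 };
    int b[5] = { 4, 1, 5, 2, 3 };

    std::qsort(a, 5, sizeof(int), cmp_int);

    // C++ style: the lambda is a template argument of std::sort,
    // so its body can be inlined into the sorting loop.
    std::sort(b, b + 5, [](int x, int y) { return x < y; });
    return 0;
}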

                                Comment
