
Thread: Is Assembly Still Relevant To Most Linux Software?

  1. #121
    Join Date
    Oct 2012
    Location
    Cologne, Germany
    Posts
    308

    Cool You failed

    Quote Originally Posted by oliver View Post
    And this doesn't happen in proprietary software? Now you're really just talking horseshit. Really bad horseshit.

    I have worked with proprietary (embedded) software. I have seen leaked (later GPLed) source and still work with it. There is so much crap produced because of 'fast fast deadlines nobody sees anyway' mentality it would make you want to stab your eyes out.

    Yes, there is still a lot of bad code EVEN in open source. But you know what? It is waiting to be found, ready to be optimized. This specific radeon example was a hack job to get something working fast. It was probably overlooked since then (well, someone found it now). It just needs someone to step up and submit a patch. It is not impossible to fix; nobody _needs_ to suffer. The radeon driver (while production ready, I guess) is still under heavy development (or slightly abandoned). There simply aren't enough man-hours to spend on these performance optimizations. Once radeon (and the others) are feature complete and stable, I'm sure more interest will be put into performance optimization, which with profiling should be easy to spot. It's only a matter of time, and this was a piss-poor example just to spread crap.
    Trolls keep trolling. I am also not new to the business, so don't claim anything here you can't prove.

    You misunderstood me: I wasn't bashing free software. I am one of the strongest supporters of efficient code, and this means leaving out big data structures (--> C++) and bloated libraries (--> Glibc).
    The suffering of the end users stems from inefficient code, which is found in many places. This starts with udev, which is a mess, and is the reason I joined the eudev project to clean it up.
    I am not an expert when it comes to Radeon-drivers (I wasn't even talking about them), so I don't know the reason for this childish rage on your side.

    BTW, the example was not even taken from the Radeon-stack, but you are not worthy to know where they are from. What's the saying? Don't feed the trolls!

  2. #122
    Join Date
    Jan 2007
    Posts
    418

    Default

    Quote Originally Posted by frign View Post
    I don't say you are wrong in this area in any way. You prove to be an expert on microcontrollers. What you must realize is that today's CPUs have sadly shifted towards being very efficient internally, but very inefficient when it comes to memory I/O. Keeping your variables small reduces that cost, unless you can show me that small datatypes are in fact stored in memory at the native address length or more.

    As always, if you see a misconception on my side, please show me accordingly,
    Why, thank you, but no, I'm far from it. I've only dabbled with microcontrollers.

    But I fully agree that today's CPUs are very efficient in some scenarios and very inefficient in others. The internals of what happens when and why are probably a mystery to us mere mortals. I do hope compiler writers know these secrets and know what is best optimized. You tell the compiler what platform you work with (-march) and based on that it can make certain decisions about how to optimize best.

    From http://pubs.opengroup.org/onlinepubs.../stdint.h.html I get that
    The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.
    Diving directly into the header file and seeing what gcc does:
    Code:
    /* Fast types.  */
    
    /* Signed.  */
    typedef signed char int_fast8_t;
    #if __WORDSIZE == 64
    typedef long int int_fast16_t;
    typedef long int int_fast32_t;
    typedef long int int_fast64_t;
    #else
    typedef int int_fast16_t;
    typedef int int_fast32_t;
    __extension__
    typedef long long int int_fast64_t;
    #endif
    
    /* Unsigned.  */
    typedef unsigned char uint_fast8_t;
    #if __WORDSIZE == 64
    typedef unsigned long int uint_fast16_t;
    typedef unsigned long int uint_fast32_t;
    typedef unsigned long int uint_fast64_t;
    #else
    typedef unsigned int uint_fast16_t;
    typedef unsigned int uint_fast32_t;
    __extension__
    typedef unsigned long long int uint_fast64_t;
    #endif
    gcc does 'like' those new sizes, but see what it does? It simply maps them to whatever your word size is. I don't think this code is generated based on my architecture. So uint_fast, in gcc at this moment, really doesn't do anything useful or smart. It just says 'use 64 bit because that is the native bit width', ignoring the memory issues and focusing only on ALU width. I still think there can be an advantage here: certain calculations take quite some cycles (multiplications, for example) and _may_ be faster when using an aligned bit width. But as we both know, modern processors are highly complex and this isn't easily found anywhere. Maybe there's some programming manual.

    So, as I said, the compiler should be smart and should know what is happening in the CPU at any given time (except for context switches, of course). So if your program is 200 instructions that need to be executed (in your allotted scheduling slice), the compiler should be able to optimize this not only based on ALU size but ALSO on memory bandwidth. And I think here we meet in the middle and agree: this is more something for the compiler to optimize than a developer OR a typedef could ever do (bar writing it in asm).
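    For what it's worth, rather than reading the header by hand, it is easy to check what these typedefs resolve to on any given toolchain. A minimal sketch (the printed widths depend on the platform; the comments assume a 64-bit glibc system like the header quoted above):

    ```cpp
    #include <cstdint>
    #include <cstdio>

    // Print the widths the implementation actually chose for the "fast" types.
    // On 64-bit glibc, everything from int_fast16_t upward maps to long int
    // (8 bytes), while int_fast8_t stays a signed char (1 byte).
    int main() {
        std::printf("int_fast8_t  : %zu bytes\n", sizeof(int_fast8_t));
        std::printf("int_fast16_t : %zu bytes\n", sizeof(int_fast16_t));
        std::printf("int_fast32_t : %zu bytes\n", sizeof(int_fast32_t));
        std::printf("int_fast64_t : %zu bytes\n", sizeof(int_fast64_t));
        // The exact-width types stay fixed regardless of the word size.
        std::printf("int32_t      : %zu bytes\n", sizeof(int32_t));
        return 0;
    }
    ```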

  3. #123
    Join Date
    Jan 2007
    Posts
    418

    Default

    Quote Originally Posted by ciplogic View Post
    So in short, which of you that you consider yourself "an assembly friendly guy" did contribute to GLibC, Gnu As, Fasm or whatever. May you say where did you contribute (in brief)? I mean at least we would all benefit from your work in a form or the other, right?
    Misquote? Or the right quote but the wrong accusation?

    I try to do my fair share of patches to various open source projects. Granted, I have never done any assembler patches, but that's simply because I have only read assembly, never written it; I trust my compiler to optimize for me as much as possible.

    Having said that, I do still believe that (with profiling) certain functions CAN be optimized further and can be important TO optimize. I think Linaro showed this was true for 30% of 'some metric'?

  4. #124
    Join Date
    Jan 2007
    Posts
    418

    Default

    Quote Originally Posted by frign View Post
    Trolls keep trolling. I am also not new to the business, so don't claim anything here you can't prove.

    You misunderstood me: I wasn't bashing free software. I am one of the strongest supporters of efficient code, and this means leaving out big data structures (--> C++) and bloated libraries (--> Glibc).
    The suffering of the end users stems from inefficient code, which is found in many places. This starts with udev, which is a mess, and is the reason I joined the eudev project to clean it up.
    I am not an expert when it comes to Radeon-drivers (I wasn't even talking about them), so I don't know the reason for this childish rage on your side.

    BTW, the example was not even taken from the Radeon-stack, but you are not worthy to know where they are from. What's the saying? Don't feed the trolls!
    I do apologize for misunderstanding you. It 'felt' like bashing but, as always, that is hard to tell in textual form.

  5. #125
    Join Date
    Nov 2009
    Location
    Madrid, Spain
    Posts
    399

    Default

    Quote Originally Posted by oliver View Post
    Misquote? Or the right quote but the wrong accusation?

    I try to do my fair share of patches to various open source projects. Granted, I have never done any assembler patches, but that's simply because I have only read assembly, never written it; I trust my compiler to optimize for me as much as possible.

    Having said that, I do still believe that (with profiling) certain functions CAN be optimized further and can be important TO optimize. I think Linaro showed this was true for 30% of 'some metric'?
    Thank you for clarifying. It was not intended to target you (or anyone in particular), but rather the attitude of some here (at least one other guy in this thread) of piling up a long queue of things that "others should know". I mean, we discuss things like not using the proper int type, or the latencies.

    Lastly, that 30% is 30% of assembly-based projects. Most of those use assembly only for atomics, accessing ports or calling OS routines, so the real number is more like 0.3% or at most 1% of projects. And these are not assembly-only projects but merely assembly-based ones, so the figures are better than they sound.

  6. #126
    Join Date
    Oct 2012
    Location
    Cologne, Germany
    Posts
    308

    Red face In the end it's just a typedef

    Quote Originally Posted by oliver View Post
    Why, thank you, but no, I'm far from it. I've only dabbled with microcontrollers.

    But I fully agree that today's CPUs are very efficient in some scenarios and very inefficient in others. The internals of what happens when and why are probably a mystery to us mere mortals. I do hope compiler writers know these secrets and know what is best optimized. You tell the compiler what platform you work with (-march) and based on that it can make certain decisions about how to optimize best.

    From http://pubs.opengroup.org/onlinepubs.../stdint.h.html I get that
    The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.
    Diving directly into the header file and seeing what gcc does:
    Code:
    /* Fast types.  */
    
    /* Signed.  */
    typedef signed char int_fast8_t;
    #if __WORDSIZE == 64
    typedef long int int_fast16_t;
    typedef long int int_fast32_t;
    typedef long int int_fast64_t;
    #else
    typedef int int_fast16_t;
    typedef int int_fast32_t;
    __extension__
    typedef long long int int_fast64_t;
    #endif
    
    /* Unsigned.  */
    typedef unsigned char uint_fast8_t;
    #if __WORDSIZE == 64
    typedef unsigned long int uint_fast16_t;
    typedef unsigned long int uint_fast32_t;
    typedef unsigned long int uint_fast64_t;
    #else
    typedef unsigned int uint_fast16_t;
    typedef unsigned int uint_fast32_t;
    __extension__
    typedef unsigned long long int uint_fast64_t;
    #endif
    gcc does 'like' those new sizes, but see what it does? It simply maps them to whatever your word size is. I don't think this code is generated based on my architecture. So uint_fast, in gcc at this moment, really doesn't do anything useful or smart. It just says 'use 64 bit because that is the native bit width', ignoring the memory issues and focusing only on ALU width. I still think there can be an advantage here: certain calculations take quite some cycles (multiplications, for example) and _may_ be faster when using an aligned bit width. But as we both know, modern processors are highly complex and this isn't easily found anywhere. Maybe there's some programming manual.

    So, as I said, the compiler should be smart and should know what is happening in the CPU at any given time (except for context switches, of course). So if your program is 200 instructions that need to be executed (in your allotted scheduling slice), the compiler should be able to optimize this not only based on ALU size but ALSO on memory bandwidth. And I think here we meet in the middle and agree: this is more something for the compiler to optimize than a developer OR a typedef could ever do (bar writing it in asm).
    I definitely agree on the last point, but nothing stops the compiler from optimizing it accordingly. I already knew that uint_fast8_t just maps the type to an unsigned char; it makes the code more readable and leaves every distributor of the GNU operating system free to set those typedefs in one location to work around certain architectural quirks.

    Sadly, today's languages encourage piling up a lot of stuff in memory. C++ classes are one example of an insufficient concept. Even though the compiler does a great job of keeping the code good enough, doing the same in C might have its advantages (also in regard to inline assembly). Without doubt, this is a religious question.

    So, it was nice discussing with you! Please let me know which language you prefer.

  7. #127
    Join Date
    Oct 2012
    Location
    Cologne, Germany
    Posts
    308

    Cool No problem

    Quote Originally Posted by oliver View Post
    I do apologize for wrongfully understanding you. It 'felt' as bashing, but as always, in textual form, this is hard to say.
    Hey, no problem.
    I face those situations once in a while too, and considering that more than 90% of communication is non-verbal, such losses of meaning are very likely.

  8. #128
    Join Date
    Sep 2012
    Posts
    792

    Default

    Quote Originally Posted by frign View Post
    Sadly, today's languages encourage piling up a lot of stuff in memory. C++ classes are one example of an insufficient concept. Even though the compiler does a great job of keeping the code good enough, doing the same in C might have its advantages (also in regard to inline assembly). Without doubt, this is a religious question.

    So, it was nice discussing with you! Please let me know which language you prefer.
    I thought that a C++ class was the same as a C struct, and as such didn't use more memory than C, and probably not much extra memory anyway (just a one-time definition in the binary)?
    Recently, there was a test using GCC compiled as C and GCC compiled as C++, comparing the speed of the two versions (via the compile time of the Linux kernel), with no observable difference.
    This proved that C++ in itself is not slower than C. Now, you can argue that some C++-specific constructs can be "slower" (virtual functions and exceptions are often named), but implementing these by hand in C would be at least as slow (and probably much slower). If you need fast polymorphism, you can use templates, which are 100x better than macros or copy-paste.

    I personally like C++ a lot as a low-level but not-too-low-level language, but that's also mostly because I'm familiar with it.
    I'm never opposed to using whatever language is best for the task at hand (except VBA. VBA sucks.). If I were tasked with RT routines for embedded systems, I'd use C plus assembly, but when writing GUI glue, I love using JS/QML for Qt Quick. Languages are tools, not religion (plus, they won't blame you when you flirt with some other ones).
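    The point that a class without virtual functions costs no more memory per object than the equivalent C struct is easy to check with sizeof. A minimal sketch (the types and names are mine, purely for illustration):

    ```cpp
    #include <cstdio>

    // A plain C-style struct for comparison.
    struct CPoint {
        int x;
        int y;
    };

    // A C++ class with a member function: no per-object overhead,
    // since non-virtual member functions live in the code segment.
    class CppPoint {
    public:
        int x, y;
        int sum() const { return x + y; }
    };

    // Declaring anything virtual adds one hidden vtable pointer per object.
    class VirtPoint {
    public:
        int x, y;
        virtual int sum() const { return x + y; }
    };

    int main() {
        std::printf("C struct : %zu bytes\n", sizeof(CPoint));
        std::printf("class    : %zu bytes\n", sizeof(CppPoint));
        std::printf("virtual  : %zu bytes\n", sizeof(VirtPoint));
        return 0;
    }
    ```

    On a typical 64-bit system the first two print the same size, while the virtual one grows by a pointer.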

  9. #129
    Join Date
    Oct 2012
    Location
    Cologne, Germany
    Posts
    308

    Cool Thanks

    Quote Originally Posted by erendorn View Post
    I thought that a C++ class was the same as a C struct, and as such didn't use more memory than C, and probably not much extra memory anyway (just a one-time definition in the binary)?
    Recently, there was a test using GCC compiled as C and GCC compiled as C++, comparing the speed of the two versions (via the compile time of the Linux kernel), with no observable difference.
    This proved that C++ in itself is not slower than C. Now, you can argue that some C++-specific constructs can be "slower" (virtual functions and exceptions are often named), but implementing these by hand in C would be at least as slow (and probably much slower). If you need fast polymorphism, you can use templates, which are 100x better than macros or copy-paste.

    I personally like C++ a lot as a low-level but not-too-low-level language, but that's also mostly because I'm familiar with it.
    I'm never opposed to using whatever language is best for the task at hand (except VBA. VBA sucks.). If I were tasked with RT routines for embedded systems, I'd use C plus assembly, but when writing GUI glue, I love using JS/QML for Qt Quick. Languages are tools, not religion (plus, they won't blame you when you flirt with some other ones).
    Thanks! Nice to hear other opinions.

    The test you wrote about doesn't make much sense, though, because GCC handles C as C++, so when you compare two identical programs you will get identical results.
    The differences are very small, but we can't speak of C++ classes as being the same as structs in C. There are definitely big differences between them.

  10. #130
    Join Date
    Nov 2009
    Location
    Madrid, Spain
    Posts
    399

    Default

    Quote Originally Posted by frign View Post
    Thanks! Nice to hear other opinions.

    The test you wrote about doesn't make much sense, though, because GCC handles C as C++, so when you compare two identical programs you will get identical results.
    The differences are very small, but we can't speak of C++ classes as being the same as structs in C. There are definitely big differences between them.
    You're completely wrong! If you don't write virtual, you have the same memory footprint. In fact, C++ is stricter than C, so it is less error-prone. For example:
    int *array = malloc(sizeof(char) * length); /* compiles in C but is an error in C++ (no implicit conversion from void *), so you are less likely to make this kind of mistake in C++ */
    Also, since you can override the new operator, you can use the same C routines (or a custom memory allocator) if you want the same performance characteristics.

    In the past C++ would do a lot of "invisible copying" (like users copying std::vectors everywhere), and people not trained in C++, or people who would not use references and const references, would get worse performance than with C. But that is not because C++ was badly designed; it was badly used.

    Still, C++ is really the king performance-wise, as it extensively uses inlining in templates. The "invisible copying" is often solved just by using constant references, and C++11 added move semantics, so your compiler will remove some copying by itself (with no extra assembly).

    Lastly, I want to congratulate him on using Qt; it really is a great framework, also on the QML/JS side.
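    The "invisible copying" point can be shown concretely: passing a container by value copies every element, while a const reference copies nothing, and C++11's std::move transfers the buffer outright. A minimal sketch assuming C++11 (the function names are mine):

    ```cpp
    #include <cassert>
    #include <utility>
    #include <vector>

    // Pass-by-value duplicates every element of the vector on each call.
    static long sum_by_value(std::vector<long> v) {
        long s = 0;
        for (long x : v) s += x;
        return s;
    }

    // A const reference reads the caller's buffer directly: no copy at all.
    static long sum_by_ref(const std::vector<long>& v) {
        long s = 0;
        for (long x : v) s += x;
        return s;
    }

    int main() {
        std::vector<long> data(1000, 2);
        // Same result; only the by-value version paid for a 1000-element copy.
        assert(sum_by_value(data) == 2000);
        assert(sum_by_ref(data) == 2000);

        // C++11 move semantics: transfer the buffer instead of copying it.
        std::vector<long> moved = std::move(data);
        assert(moved.size() == 1000);
        return 0;
    }
    ```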
