Is Assembly Still Relevant To Most Linux Software?


  • #91
    No problem

    Originally posted by oliver View Post
    Yes, commit those changes. Sometimes things get written in a sloppy way and overlooked for years. It's always nice having stuff cleaned up.

    However, I'm not so sure about (u)int_fast* usage in the kernel. If it really is faster/better, it could be a mission to replace it all. Unfortunately, right now the kernel (3.7.10) has only ONE reference in the entire tree, and I don't know if it will even stay there; it's been there for a while, I believe.

    Code:
    grep int_fast * -R
    drivers/staging/tidspbridge/dynload/cload.c:    uint_fast32_t sum, temp;
    Your concerns are right!
    You normally have to include stdint.h, so it might not be handy. For kernel purposes (which is not the case here, because it's Mesa), you could just replace them with chars.
    It boils down to the fact that stdint.h just typedefs uint_fast8_t to unsigned char on x86_64; it is mostly relevant to other architectures, where a different type might be faster for 8-bit data.

    It's always better to know about these types, because they can really make a difference! People should be encouraged to think about their variables' ranges and use the smaller integer types accordingly, instead of just declaring everything an int.
    This approach comes close to Ada, one of the safest languages in existence, where the range of each variable can be strictly limited.
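
    To see what your own stdint.h picked, here is a minimal sketch (the printed sizes are whatever the libc chose; 1, 8 and 8 bytes are typical on glibc/x86_64):

    Code:
    #include <stdint.h>
    #include <stdio.h>

    /* Print what the "fast" types map to on this machine. */
    int main(void)
    {
            printf("uint_fast8_t:  %zu byte(s)\n", sizeof(uint_fast8_t));
            printf("uint_fast16_t: %zu byte(s)\n", sizeof(uint_fast16_t));
            printf("uint_fast32_t: %zu byte(s)\n", sizeof(uint_fast32_t));
            return 0;
    }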
    Last edited by frign; 09 April 2013, 08:57 AM.



    • #92
      Originally posted by frign View Post
      It's always better to know about these types, because they can really make a difference! People should be encouraged to think about their variables' ranges and use the smaller integer types accordingly, instead of just declaring everything an int.
      Again, do you have any numbers to support these claims? (Real-world, if possible.)



      • #93
        Originally posted by frign View Post
        Your concerns are right!
        You normally have to include stdint.h, so it might not be handy. For kernel purposes (which is not the case here, because it's Mesa), you could just replace them with chars.
        It boils down to the fact that stdint.h just typedefs uint_fast8_t to unsigned char on x86_64; it is mostly relevant to other architectures, where a different type might be faster for 8-bit data.

        It's always better to know about these types, because they can really make a difference! People should be encouraged to think about their variables' ranges and use the smaller integer types accordingly, instead of just declaring everything an int.
        This approach comes close to Ada, one of the safest languages in existence, where the range of each variable can be strictly limited.
        Let's say that my platform has 64 bits as its native 'fastest' integer. The compiler should know this (by an option, -march=native or the like) and thus should be able to optimize anything smaller, up to an int. I guess the exception is where we explicitly work with an overflow (bad design?) of, let's say, a char:
        Code:
        char i;
        
        for (i = 1; i; ++i) {
                stuff();
        }
        which may not be the greatest code, but it's an example; work with me here. Now in this case, the compiler would have to notice this explicit behavior. But otherwise, it should be able to scale everything up to a uint64. -O3 really should do this automatically, making the whole int_fast business moot. It keeps code cleaner, with fewer useless definitions. If -O3 (or an -O4 specifically for this) breaks your program ... well, don't depend on stupid design and fix the code.



        • #94
          You should reconsider that!

          Originally posted by oliver View Post
          Let's say that my platform has 64 bits as its native 'fastest' integer. The compiler should know this (by an option, -march=native or the like) and thus should be able to optimize anything smaller, up to an int. I guess the exception is where we explicitly work with an overflow (bad design?) of, let's say, a char:
          Code:
          char i;
          
          for (i = 1; i; ++i) {
                  stuff();
          }
          which may not be the greatest code, but it's an example; work with me here. Now in this case, the compiler would have to notice this explicit behavior. But otherwise, it should be able to scale everything up to a uint64. -O3 really should do this automatically, making the whole int_fast business moot. It keeps code cleaner, with fewer useless definitions. If -O3 (or an -O4 specifically for this) breaks your program ... well, don't depend on stupid design and fix the code.
          First off, -O4 doesn't even exist.

          Moreover, your example is insufficient, because it is an endless loop no matter whether you overflow a char after ~255 cycles or a 64-bit integer after ~18446744073709551616 of them.
          The program is bloody broken.

          Again, don't try to trick the compiler, because you won't do it right anyway. Try to write code which makes _sense_. If you expect fewer than 255 iterations of the loop, then why not use an 8-bit integer for the counting variable?
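
          A minimal sketch of what I mean, reusing stuff() as a stand-in for the loop body from your example:

          Code:
          #include <stdint.h>

          extern void stuff(void);

          /* The counter's range is known up front, so the type says so. */
          void repeat(uint8_t times)
          {
                  for (uint8_t i = 0; i < times; i++)
                          stuff();
          }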

          And, more importantly, you don't know whether the compiler would really substitute the char with a 64-bit integer.



          • #95
            There you go

            Originally posted by erendorn View Post
            Again, do you have any numbers to support these claims? (Real-world, if possible.)
            There is an excellent article by the embedded-systems specialist Nigel Jones, where he points out the importance of varying integer sizes.

            Apart from that, you can't expect the compiler to know what ranges your variables have. It is your job to know them, and you should pass that knowledge on to the compiler so it can generate even better output.

            A real-world example by the same author can be found here, where he employs the fast data types to optimise a given program. It is definitely a great read!
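
            The flavour of change he makes is roughly the following (my own illustration, not code from the article): the counter only needs to cover the buffer length, so a "fast" type lets each platform pick the width it handles best, 16 bits on a small micro, a full register on x86_64.

            Code:
            #include <stdint.h>

            /* Sum len bytes; the ABI decides how wide the counter really is. */
            uint32_t sum_bytes(const uint8_t *buf, uint_fast16_t len)
            {
                    uint32_t sum = 0;
                    uint_fast16_t i;

                    for (i = 0; i < len; i++)
                            sum += buf[i];
                    return sum;
            }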



            • #96
              Originally posted by frign View Post
              There is an excellent article by the embedded-systems specialist Nigel Jones, where he points out the importance of varying integer sizes.

              Apart from that, you can't expect the compiler to know what ranges your variables have. It is your job to know them, and you should pass that knowledge on to the compiler so it can generate even better output.

              A real-world example by the same author can be found here, where he employs the fast data types to optimise a given program. It is definitely a great read!
              Indeed, it makes sense for processors narrower than 32 bits (well, you get 25% more speed, which is good considering the readability cost).
              The existence of int_fast and int_least is puzzling, though, as I would find it cleaner to let the compiler choose between fast or small (within the int-size constraints) based on what I tell it to optimize for.



              • #97
                Originally posted by frign View Post
                First off, -O4 doesn't even exist.
                I never said it does; I only hinted that if the compiler doesn't do this now, it should, even if only at '-O4'.
                Moreover, your example is insufficient, because it is an endless loop no matter whether you overflow a char after ~255 cycles or a 64-bit integer after ~18446744073709551616 of them.
                How is it an endless loop? After 255 comes 0, no? So then i is no longer true and the loop aborts. It's quite a common trick on FPGAs, where you define your variable in exact bits and count until it reaches 0 again, since on FPGAs you don't have 'int' but only bit-wide variables. Granted, having to count to 256 or the like is very uncommon, I hope, but it would be a situation of concern.
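
                Spelled out (note that plain char may be signed, where the overflow would be undefined behaviour, so unsigned char is the safe way to write the trick):

                Code:
                #include <stdio.h>

                int main(void)
                {
                        unsigned long calls = 0;

                        /* i runs 1..255, then wraps to 0, ending the loop. */
                        for (unsigned char i = 1; i; ++i)
                                calls++; /* stand-in for stuff() */

                        printf("%lu iterations\n", calls); /* prints 255 */
                        return 0;
                }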
                The program is bloody broken.

                Again, don't try to trick the compiler, because you won't do it right anyway. Try to write code which makes _sense_. If you expect fewer than 255 iterations of the loop, then why not use an 8-bit integer for the counting variable?

                And, more importantly, you don't know whether the compiler would really substitute the char with a 64-bit integer.
                Who is tricking the compiler? It is a valid expression.

                Anyway, if you 'know' you want to do 15 to 20 iterations in a loop, and the compiler can't ever know what the maximum is, you put it in a char (8 bits). All normal and sensible. Now your compiler also knows your arch uses 64 bits natively and that those are the fastest for it to handle. A smart compiler would use a uint64 anyway, because a) it fits and b) it's faster. Yes, it uses more memory (but it's always either higher memory usage or faster execution; you normally can't have it both ways).

                So again, using uint8_t, uint_least8_t or uint_fast8_t shouldn't make any difference whatsoever and really is kinda silly to worry about. The compiler knows best. And I don't think that in today's code 90% of the ints need to be as small as possible to save on memory footprint while 10% need to be the fast kind, so having explicit control over them is pointless. gcc test.c -o test --fastint (or --leastfit) should be the only tunable.



                • #98
                  Not really

                  Originally posted by erendorn View Post
                  Indeed, it makes sense for processors narrower than 32 bits (well, you get 25% more speed, which is good considering the readability cost).
                  The existence of int_fast and int_least is puzzling, though, as I would find it cleaner to let the compiler choose between fast or small (within the int-size constraints) based on what I tell it to optimize for.
                  Again, you can't expect the compiler to do that, because C/C++ has no range-bounding mechanism that would let the compiler know which range an integer has.
                  If, for instance, you handled stdin for a color-processing program, you would know that the RGB values can be stored in unsigned 8-bit integers, because the values of R, G and B do not go beyond 255.
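
                  In code, that knowledge looks like this (a trivial sketch):

                  Code:
                  #include <stdint.h>

                  /* The programmer knows R, G and B never exceed 255; a
                   * compiler parsing arbitrary stdin could never prove it. */
                  struct rgb {
                          uint8_t r, g, b;
                  };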
                  The example Nigel Jones gave was of course based on an optimising compiler. If the compiler were smart enough to actually know the right integer size, how come he gets different results?

                  I cannot say this often enough: don't put too much trust in the compiler. Do more on your side to make clear what you want, instead of leaning on the compiler's quirks to this sick extent.



                  • #99
                    Originally posted by frign View Post
                    I cannot say this often enough: don't put too much trust in the compiler. Do more on your side to make clear what you want, instead of leaning on the compiler's quirks to this sick extent.
                    We already do that: u8, u16, u32 and u64 (u128 at some point, I'm sure). Let the compiler then decide whether anything smaller than u64 should be scaled up to a u64 to be faster. If the compiler can somehow 'combine' two u32s to get a u64 (assuming it would all be valid), then it could scale up two u16s to u32s in this example.

                    Yes, you should absolutely help the compiler a little, but the compiler should also be smart enough to do certain things on its own (if allowed to do so).



                    • You misunderstood it

                      Originally posted by oliver View Post
                      We already do that: u8, u16, u32 and u64 (u128 at some point, I'm sure). Let the compiler then decide whether anything smaller than u64 should be scaled up to a u64 to be faster. If the compiler can somehow 'combine' two u32s to get a u64 (assuming it would all be valid), then it could scale up two u16s to u32s in this example.

                      Yes, you should absolutely help the compiler a little, but the compiler should also be smart enough to do certain things on its own (if allowed to do so).
                      I think there is a strong misconception on your side here: just because an integer has the same width as the addresses of the operating system in use (namely, 64 bits) does not mean that datatype is faster than smaller ones (8, 16, 32).
                      It is the other way around: the smaller the datatypes, the fewer resources are needed to handle them. There are exceptions for 16-bit integers on some old architectures, but overall we see a consistent speedup wherever the integer size has been limited.
                      Even in the case of the slower 16-bit integers, you can employ the fast integer types and benefit from well-thought-out typedefs for specific architectures, ruling out potential slowdowns.

                      The compiler is a smart guy, but it is not a magician: it will never risk anything, and it is no artificial intelligence. There may be constant improvements in this area, but limiting the integer size requires knowing that the integer will _never_ overflow.
                      How would a compiler predict that when it has to optimise a stdin parser?
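
                      To illustrate with a contrived sketch: the range of n below depends entirely on the input, so only the programmer can decide whether a narrower counter would be safe.

                      Code:
                      #include <stdint.h>
                      #include <stdio.h>

                      /* Count digits on stdin; no compiler can bound n at build time. */
                      uint_fast32_t count_digits(void)
                      {
                              uint_fast32_t n = 0;
                              int c;

                              while ((c = getchar()) != EOF)
                                      if (c >= '0' && c <= '9')
                                              n++;
                              return n;
                      }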

                      PS: I don't think we will see 128 bit soon, because 64-bit addresses can span virtual memory of at most 16 exbibytes, which is about 17 billion gibibytes.
                      But I may only sound like Bill Gates, who allegedly stated in 1981:
                      640K ought to be enough for anybody.
                      Last edited by frign; 09 April 2013, 01:17 PM.

