Why thank you, but no, I'm far from it. I've only dabbled with microcontrollers.
Originally Posted by frign
But I fully agree that today's CPUs are very efficient in some scenarios and inefficient in others. The internals as to what, when and why are probably a mystery to us mere mortals. I do hope compiler writers know these secrets and know what is best optimized when. You tell the compiler what platform you work with (-march) and based on that it can make certain decisions as to how to optimize best.
From http://pubs.opengroup.org/onlinepubs.../stdint.h.html I get that:
The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.
Diving directly into the header file and seeing what gcc does:
/* Fast types. */
/* Signed. */
typedef signed char int_fast8_t;
#if __WORDSIZE == 64
typedef long int int_fast16_t;
typedef long int int_fast32_t;
typedef long int int_fast64_t;
#else
typedef int int_fast16_t;
typedef int int_fast32_t;
typedef long long int int_fast64_t;
#endif
/* Unsigned. */
typedef unsigned char uint_fast8_t;
#if __WORDSIZE == 64
typedef unsigned long int uint_fast16_t;
typedef unsigned long int uint_fast32_t;
typedef unsigned long int uint_fast64_t;
#else
typedef unsigned int uint_fast16_t;
typedef unsigned int uint_fast32_t;
typedef unsigned long long int uint_fast64_t;
#endif
gcc does 'like' those new sizes, but see what it does? Apart from the 8-bit types, which stay char, it simply maps them to whatever your word size is. I don't think this code is generated depending on my architecture. So uint_fast in gcc at this moment really doesn't do anything useful or smart. It just says 'use 64 bit because that is the native bit width', ignoring the memory issues and focusing only on ALU width. I still think there can be an advantage here: certain calculations take quite some cycles (multiplications, for example) and _may_ be faster when using an aligned bit width. But as we both know, modern processors are highly complex, and this isn't easily found anywhere. Maybe there's some programming manual.
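If you want to see what your own toolchain picks for the fast types without reading the header, something like the following should do. It is only a sketch: the helper names fast_width_bits and print_fast_widths are mine, not anything from the header, and it needs a C11 compiler for static_assert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>
#include <stdio.h>

/* The only hard guarantee: each fast type is at least as wide as its name says. */
static_assert(sizeof(uint_fast8_t)  * CHAR_BIT >= 8,  "uint_fast8_t too narrow");
static_assert(sizeof(uint_fast16_t) * CHAR_BIT >= 16, "uint_fast16_t too narrow");
static_assert(sizeof(uint_fast32_t) * CHAR_BIT >= 32, "uint_fast32_t too narrow");
static_assert(sizeof(uint_fast64_t) * CHAR_BIT >= 64, "uint_fast64_t too narrow");

/* Hypothetical helper: storage width in bits that this implementation
   chose for uint_fastN_t, or 0 if N is not one of 8, 16, 32, 64. */
static unsigned fast_width_bits(unsigned n)
{
    switch (n) {
    case 8:  return (unsigned)(sizeof(uint_fast8_t)  * CHAR_BIT);
    case 16: return (unsigned)(sizeof(uint_fast16_t) * CHAR_BIT);
    case 32: return (unsigned)(sizeof(uint_fast32_t) * CHAR_BIT);
    case 64: return (unsigned)(sizeof(uint_fast64_t) * CHAR_BIT);
    default: return 0;
    }
}

/* Hypothetical helper: print the chosen widths so they can be compared
   with the typedefs in the header. */
static void print_fast_widths(void)
{
    static const unsigned requested[] = { 8, 16, 32, 64 };
    for (size_t i = 0; i < sizeof requested / sizeof requested[0]; ++i)
        printf("uint_fast%u_t -> %u bits\n",
               requested[i], fast_width_bits(requested[i]));
}
```

Call print_fast_widths() from main() and compare the output with the header. Going by the typedefs quoted here, I would expect everything except the 8-bit type to come out at the word size on a 64-bit build, but other platforms and C libraries are free to choose differently.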
So, as I said, the compiler should be smart and should know what is happening in the CPU at any given time (except for context switches, of course). So if your program is 200 instructions that need to be executed (in your allotted scheduling slice), the compiler should be able to optimize this based not only on ALU size but ALSO on memory bandwidth. And I think here we both meet and agree: this is more something for the compiler to optimize than something a developer OR a typedef could ever do (bar writing it in asm).
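To put a number on the memory side of this, here is a rough sketch that compares how many bytes an array takes with the exact-width type versus the fast type. The function names are made up for illustration, and it assumes your platform provides uint32_t (the standard makes it optional, though pretty much every common platform has it). It says nothing about actual speed, only about footprint.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: bytes needed for `count` elements of the
   exact-width 32-bit type. */
static size_t bytes_exact32(size_t count)
{
    return count * sizeof(uint32_t);
}

/* Hypothetical helper: bytes needed for `count` elements of the
   "fast" 32-bit type, whatever width the implementation picked. */
static size_t bytes_fast32(size_t count)
{
    return count * sizeof(uint_fast32_t);
}

/* Hypothetical helper: print both footprints side by side. */
static void print_footprint(size_t count)
{
    printf("%zu elements: uint32_t = %zu bytes, uint_fast32_t = %zu bytes\n",
           count, bytes_exact32(count), bytes_fast32(count));
}
```

Calling print_footprint(1000000) from main() shows the difference for a million elements. Wherever uint_fast32_t is mapped to a 64-bit type, the array is twice as big, which means twice the cache and memory traffic for the same data, and that is exactly the part the typedef cannot know about.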