Originally posted by directhex
If your application is performance sensitive, you can start assuming all kinds of things. For heavy floating-point applications, I ALWAYS optimize all the way up to SSE3 for my pre-compiled binaries. I've found 2x speedups in real-world use, and about a 2x speedup from various other CFLAGS like -floop-optimize, etc. People without SSE3-capable machines have no business running such applications anyway. I'd rather leave them with nothing at all than an awful experience from my "slow" application on their slow PC, and I'd rather not screw the 95% of my users who came to the party with a proper computer that's less than 6 years old.
Then there's the alternative strategy: compile multiple binaries and have /usr/bin/your_application_name select one by querying the processor's capabilities on the current machine, which pretty much destroys your argument. However, 99.999% of applications don't need such elaborate measures, because they'll never run even a single modern CPU core at 100%, so it's still a straw-man argument.
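For what it's worth, here is a rough sketch of what such a launcher could look like, assuming a Linux box where /proc/cpuinfo is readable. The install paths and variant names below are made up for illustration, not anything a real package ships. The one detail worth knowing is that Linux reports SSE3 under the flag name "pni", not "sse3".

```python
import os
import sys

# Hypothetical install layout: one pre-compiled build per instruction-set level,
# listed from most to least optimized. These paths are assumptions.
VARIANTS = [
    ("pni", "/usr/lib/your_application_name/bin-sse3"),   # Linux lists SSE3 as "pni"
    ("sse2", "/usr/lib/your_application_name/bin-sse2"),
]
FALLBACK = "/usr/lib/your_application_name/bin-generic"


def cpu_flags(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found (e.g. non-x86 or unexpected format)


def select_binary(flags):
    """Pick the most optimized build whose required flag is present."""
    for required, path in VARIANTS:
        if required in flags:
            return path
    return FALLBACK


def main():
    """Entry point for the launcher installed as /usr/bin/your_application_name."""
    try:
        with open("/proc/cpuinfo") as f:
            flags = cpu_flags(f.read())
    except OSError:
        flags = set()  # unreadable or not Linux: fall back to the generic build
    target = select_binary(flags)
    # Replace the launcher process with the chosen binary, passing arguments through.
    os.execv(target, [target] + sys.argv[1:])
```

The installed script would end with a call to main(). Since the check runs once at startup and then exec's the real binary, the cost is negligible, and users without SSE3 still get a working (if slower) build.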