Originally posted by directhex
Take that code, unroll it, add SSE/AVX, check the types, properly use templates and atomics (C++11), optimize the memory handling, build it with ICC or GCC using at least -O2 -march=native and -msse2, and then compare it to Java again.
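To make a couple of those steps concrete, here is a rough sketch of what "unroll it" and "add SSE" could look like. The benchmark code itself isn't shown in this thread, so a plain array sum stands in for it; the names `sum_unrolled` and `sum_vectorized` are made up for illustration. The SSE path is guarded by `__SSE2__` and falls back to the unrolled template on other targets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE/SSE2 intrinsics
#endif

// Hypothetical stand-in kernel: sum an array.
// Manually unrolled by 4 with independent accumulators, so the
// additions do not all wait on a single dependency chain.
template <typename T>
T sum_unrolled(const T* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    T a0{}, a1{}, a2{}, a3{};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += data[i];
        a1 += data[i + 1];
        a2 += data[i + 2];
        a3 += data[i + 3];
    }
    T total = (a0 + a1) + (a2 + a3);
    for (; i < n; ++i) total += data[i];  // leftover elements
    return total;
}

// Explicit SSE version for float: four lanes per 128-bit register.
// Without SSE2 support this simply forwards to the unrolled template.
inline float sum_vectorized(const float* data, std::size_t n) {
#if defined(__SSE2__)
    assert(data != nullptr || n == 0);
    __m128 acc = _mm_setzero_ps();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(data + i));  // unaligned load
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    for (; i < n; ++i) total += data[i];  // leftover elements
    return total;
#else
    return sum_unrolled(data, n);
#endif
}
```

Something like `g++ -O2 -march=native file.cpp` should build it. Note that reordering floating-point additions this way can change the rounding of the result slightly compared to a plain left-to-right loop, so it's worth checking the output still matches before comparing timings.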