Originally posted by oiaohm
This test you did is like what that game developer did with spinlocks: making a point while missing the real issue and ending up basically wrong. Div is executed far less often than a function call on average, and the function call path is one of the big changes in x86-64. Something to remember: you can still do 32 bit div in 64 bit mode.
I am not saying developers should forget about the differences between 32 bit and 64 bit; there is performance to be gained that way. But building your program as a complete 32 bit binary hurts you. Really, the next person has missed the same problem: there is a set of major changes when you build for 64 bit that swing performance strongly in 64 bit mode's favour.
Sorry, this sounds like an answer, but it misses something so critical that it is in fact wrong.
The calling convention changes when you move from 32 bit x86 userspace to 64 bit x86 userspace. Most applications do a lot of calls. The old 32 bit calling convention has you pushing and popping arguments on and off the stack, which beats the living heck out of the cache. The 64 bit calling convention, because 64 bit mode has more registers, avoids this far more often.
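To make that concrete, here is a minimal sketch (assuming GCC or Clang on Linux; add4 is just a name invented for the example):

/* add4.c - hypothetical four-argument function, used only to compare
 * how arguments travel in the two ABIs.
 *
 * i386 cdecl (gcc -m32 -O2 -S add4.c): the caller pushes all four
 * arguments onto the stack and the callee loads them back from the
 * stack, so every call generates memory traffic.
 *
 * x86-64 System V (gcc -m64 -O2 -S add4.c): the first six integer
 * arguments arrive in registers (rdi, rsi, rdx, rcx, r8, r9), so this
 * call needs no stack traffic for its arguments at all.
 */
long add4(long a, long b, long c, long d)
{
    return a + b + c + d;
}

Compare the two -S outputs side by side and the cache point is obvious: the 32 bit version has to touch the stack before it can even add.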
There is another problem here. The kernel deals in 64 bit pointers, so a syscall from 64 bit userspace into the 64 bit Linux kernel needs less kernel-side processing than one from 32 bit userspace, be it native 32 bit x86 or x32 in 64 bit mode. The extra translation code that has to run also worsens your cache utilisation. This is why, in particular benchmarks, 64 bit x86 under Linux can be 20 to 40 percent faster than x32.
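As a rough sketch of the two userspace entry paths being compared (Linux only, GNU C inline asm, write() picked purely as an example, not production code):

#ifdef __x86_64__
/* Native 64 bit path: SYS_write is 1 in the 64 bit syscall table and
 * the arguments already sit in the registers the kernel expects, so
 * no argument translation is needed on the kernel side. */
static long write64(int fd, const void *buf, unsigned long len)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;
}
#endif

#ifdef __i386__
/* Legacy 32 bit path (build with -m32): SYS_write is 4 in the i386
 * table and int $0x80 enters the kernel; on a 64 bit kernel the
 * compat layer then has to remap the call number and widen/check the
 * 32 bit arguments before the real handler runs. */
static long write32(int fd, const void *buf, unsigned long len)
{
    long ret;
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "a"(4L), "b"((long)fd), "c"(buf), "d"(len)
                      : "memory");
    return ret;
}
#endif

int main(void)
{
#ifdef __x86_64__
    write64(1, "64 bit syscall path\n", 20);
#endif
#ifdef __i386__
    write32(1, "32 bit int 0x80 path\n", 21);
#endif
    return 0;
}

The compat path is not free even before you count the extra kernel code it drags through the instruction cache.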
If this is an issue under Linux you should be using the x32 ABI, which is a 64 bit mode. The reality is that building a 32 bit x86 application gives horrible cache pressure due to the way the calling convention works. This even applies to Windows and macOS.
Please note I am not saying that you should not optimise your program to use 32 bit operations and data in 64 bit mode, because that does give a performance uplift. Stupid as it sounds, the saving in cache pressure from the calling convention change going to 64 bit mode pretty much offsets all of the 64 bit mode performance cost.
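A small illustration of what "using 32 bit stuff in 64 bit mode" can mean in practice (just a sketch; the struct layouts and the scale() function are invented for this example):

#include <stdint.h>

/* Pointer-based node on x86-64: the link alone costs 8 bytes. */
struct node_ptr {
    struct node_ptr *next;   /* 8 bytes */
    uint64_t         value;  /* 8 bytes */
};

/* Index-based node: the link is a 32 bit index into a flat array,
 * halving the node size and so halving the hot data the cache holds. */
struct node_idx {
    uint32_t next;           /* 4 bytes */
    uint32_t value;          /* 4 bytes, if the values fit in 32 bits */
};

/* 32 bit division in a 64 bit build: because both operands are
 * uint32_t, the compiler emits the 32 bit divl instruction, which is
 * noticeably cheaper than the 64 bit divq on many x86-64 cores. */
uint32_t scale(uint32_t total, uint32_t parts)
{
    return parts ? total / parts : 0;
}

You keep the 64 bit registers and calling convention, and still get most of the footprint savings 32 bit data gives you.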
https://stackoverflow.com/questions/...x32-vs-x64-jvm
Before either of you gets the idea of arguing that this calling convention hurt is not huge: people working around Java see it a lot, with 32 bit x86 applications running 2 to 5 times slower. The calling convention issue makes most other problems look minor.
duby229, if your idea were correct, x32 in 64 bit mode on Linux would be 40% faster than normal 64 bit mode for an application. But there is a problem: the old 32 bit mode, thanks to the calling convention pain, will be about half the speed of the 64 bit mode program or slower. So if you are talking performance under Linux, it is an argument between x32 in 64 bit mode and 64 bit mode, not the old school 32 bit binaries.
In fact it is possible to claw back 80-95 percent of the advantage x32 in 64 bit mode has over plain 64 bit mode by refactoring and optimising the program, and that still keeps all the cases where 64 bit code simply outperforms 32 bit code, and it is still way ahead of the old school 32 bit x86 binary you started from. Most of what the Linux x32 ABI (32 bit pointers in 64 bit mode) does could be done by the compiler, so people would not have to completely recode their programs; the specialist syscall handling does help in places. The sketch below shows where the three builds actually differ.
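For anyone who has never compared the three ABIs side by side, this tiny program (assuming a GCC with the x32 multilib installed; the file name is just an example) shows the data model differences:

#include <stdio.h>

/* Build the same file three ways and compare the output:
 *   gcc -m32  sizes.c   i386:   32 bit pointers, 32 bit long, old stack-based calling convention
 *   gcc -mx32 sizes.c   x32:    32 bit pointers, 32 bit long, 64 bit registers and calling convention
 *   gcc -m64  sizes.c   x86-64: 64 bit pointers, 64 bit long
 */
int main(void)
{
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    printf("sizeof(long)   = %zu\n", sizeof(long));
    printf("sizeof(size_t) = %zu\n", sizeof(size_t));
    return 0;
}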
The only valid argument for 32 bit mode x86 has really been compatibility. The performance argument is basically bogus; the reality is that 32 bit x86 usermode does not perform. That fact totally explains Apple dropping it: better performing code means less power used by the CPU, so longer battery life while performing the same tasks at the same speed.
This is wrong. The core Linux kernel runs in 64 bit mode, the same as Windows. A 32 bit Linux application or a 32 bit Windows application on the 64 bit versions of Linux or Windows both run in amd64 compatibility mode on 64 bit x86 platforms. WoW64 is not using CPU emulation on 64 bit x86 platforms to run Win32 applications.
The reason why Linux still has 16 bit code support:
Mostly that Wine users/developers got really upset and Linus told the Intel developer: you cannot break userspace, fix it or risk being banned from submitting code until you do. It was not until 2014 that a fix for the 16 bit espfix issue was coded, and Linux was the first to get it. Microsoft, with the 64 bit versions of Windows, took the simple path: drop the 16 bit stuff and only implement the 32 bit version of espfix.
OS X dropping 32 bit support lets Apple drop workarounds like espfix from their kernel space as well.
CPU instruction set bugs are nothing new. Every time a 32 bit program makes a syscall under 64 bit Windows/Linux/BSD/OS X on x86, it has in fact been running extra code and taking a performance hit to cover up for this bug. The issue espfix corrects is a really old one.
There are multiple places where running a 32 bit program on a 64 bit x86 platform gets its performance kicked, like being kicked in your privates and then asked to walk in a straight line: of course you cannot do that at speed, and neither can a 32 bit x86 program. That is just the way it is.
There is so much wrong with what you said that it doesn't make any sense to try and correct you. You wouldn't listen any damn way. Yes, WoW64 -DOES- transform 32bit API calls into 64bit system calls. YES IT DOES!! Windows does -NOT- use compatibility mode -AT ALL-, and THAT is the -ONLY- reason why WoW64 -needs- to exist. EVERYTHING else you said was complete nonsense.