Assembly is fairly simple and it gives you ultimate power, but merely writing in it does not make your code any faster.
Also, inline assembly often produces a worse binary than what the compiler can generate on its own, since the compiler treats the block as opaque and doesn't know or care how it fits into the surrounding code.
The exception is GCC's inline assembly syntax, which is a little annoying at first, but it lets you tell the compiler what the inputs and outputs are and how it should handle them.
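To make the input/output point concrete, here is a minimal sketch of GCC extended asm. The function name add_ints is made up for illustration, and the block assumes GCC or Clang on x86 with the default AT&T operand order; on anything else it falls back to plain C.

```c
/* Hypothetical helper: adds two ints, using GCC extended asm where available. */
int add_ints(int a, int b) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int result = a;
    __asm__("addl %1, %0"      /* AT&T order: result += b */
            : "+r"(result)     /* output: read and written, any general register */
            : "r"(b)           /* input: any general register */
            : "cc");           /* clobber: add changes the flags */
    return result;
#else
    return a + b;              /* portable fallback for other compilers/targets */
#endif
}
```

Because the constraints describe what goes in and what comes out, the compiler is free to pick the registers itself and to schedule the surrounding code around the block.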
Similarly, the compiler can still optimize code written with SIMD intrinsics (SSE), but again, such code is a pain to write (despite advantages such as type checking).
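For a rough idea of what intrinsics code looks like, here is a small sketch that adds two float arrays four elements at a time. The name add_floats is hypothetical; the SSE path assumes a compiler that defines __SSE__ and ships xmmintrin.h (GCC and Clang on x86_64 do by default), otherwise only the scalar loop runs.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for count elements. */
void add_floats(float *dst, const float *a, const float *b, size_t count) {
    size_t i = 0;
#if defined(__SSE__)
    /* Vector part: four floats per iteration, unaligned loads/stores. */
    for (; i + 4 <= count; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail (or the whole array when SSE is not available). */
    for (; i < count; i++)
        dst[i] = a[i] + b[i];
}
```

The __m128 type is where the type control comes from: the compiler still allocates registers and schedules the instructions, but it will complain if you mix up vector types.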
Functions can also be written in assembly and then used from C/C++/other code with ease, but you need to know the calling convention you are using (the x86_64 ABI in particular can be challenging at times, though it's nothing a programmer couldn't get used to in a few hours or days).
Writing raw machine code is fairly silly nowadays, same with hexadecimal (it's possible, but most people don't care about such low-level stuff unless they are writing fancy things in shellcode, such as code injection).
At my college, assembly was taught in the first semester, just to give people a feel for how x86/ARM works.
Overall, I don't think anyone writes assembly nowadays without a good reason to do so (such as performance-critical code or low-level system/hardware work).
Is it foolish to currently develop in machine code, hexadecimal and assembly?
Guest replied
-
Originally posted by gens
i had this discussion on this very forum a long while ago
NOBODY could write a better matrix multiplication loop in C than the one i patched up in asm (and it was far from perfect)
i think one test came to ~70% of performance
But it is simply useless unless it has been proven that the loop, written in a higher level language, is a bottleneck, which is very seldom the case.
-
Originally posted by darkblu
Your loop has an off-by-one error with regard to the trip count - its branch condition reads 'jump if not greater', whereas you want a 'jump if less'.
jl or jnz both work in this case
that was a quick example so i didn't think about it at all
i'd also like to say that i wrote about how hard it is to write C vs asm
not how productive you will be
Last edited by gens; 30 October 2014, 06:37 PM.
-
Originally posted by darkblu
...
here's a summary anyway
cpu dispatching
optimized functions, not whole programs
and for the others,
cache thrashing is due to data being scattered in memory
for example in standard C++ (like you are taught) there is metadata associated with the data (linked list vs whatever it's called in C++) causing more, and sometimes irregular, memory accesses
you are gonna have to take my word for it as i can't find the sony paper on it
here, this explains a part of it
and that comes down to how you organize the data in memory
and a compiler will never tell you how to do it, as it will just do as you tell it
for further understanding on compilers i suggest finding the gcc LRA documentation and lots of thinking about cpu's
(x264 dev's blog also has some interesting insights, and ofc Agner Fog's site)
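The point above about organizing data in memory can be sketched in C. This is only an illustration, not something taken from the paper mentioned in the post: the struct and function names are made up, and the two functions compute the same sum while touching memory very differently.

```c
#include <stddef.h>

/* Hypothetical list node: every value drags a pointer (metadata) along with it. */
struct node {
    int value;
    struct node *next;
};

/* Pointer chasing: the address of the next load depends on the previous load,
   and the nodes may be anywhere in memory. */
long sum_list(const struct node *head) {
    long total = 0;
    for (const struct node *n = head; n != NULL; n = n->next)
        total += n->value;
    return total;
}

/* Contiguous layout: addresses are sequential, so whole cache lines get used. */
long sum_array(const int *values, size_t count) {
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

The compiler will happily optimize either function, but it will not turn the list into an array for you; that choice is made when you design the data.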
-
Originally posted by oleid
If it's possible, then it's probably mentioned here:
http://www.agner.org/optimize/optimizing_assembly.pdf
also note his cpu instruction timing tables
they explain some things about modern processors
-
Originally posted by gamerk2
There's these wonderful things called a CPU cache, register renaming, pipeline optimizations, and the like, that turn your hand-crafted assembly code into a really unoptimized mess. Unless you're writing very unoptimized code, compilers are always going to create a faster executable than handwritten assembly as a result.
Take my loop iteration example. There's a cost-benefit that goes into taking away a CPU register to constantly keep the loop iterator loaded versus the performance you lose due to losing access to that register. And for many years, back during the early days of C (when admittedly, the PDP-11 C compiler stank), code typically used the register keyword to tell the compiler to keep the iterator always loaded in a register, because it "avoided a costly memory read". At least, until people started to benchmark and found that freeing up that register and re-loading the iterator when needed often yielded more performance.
Compilers have generated faster code than handwritten assembly for at least 30 years now, and if a certain compiler doesn't, then it should be replaced with one that works better.
use perf if you want to see how much modern programs thrash the cache
register renaming helps both humans and compilers alike
that said, just shuffling the order of instructions around is no problem
"There's a cost-benefit that goes into taking away a CPU register to constantly keep the loop iterator loaded versus the performance you lose due to losing access to that register."
what ?
there is NO benefit in register spilling
if you run out of registers, then you look what to spill
compilers have generated...
i had this discussion on this very forum a long while ago
NOBODY could write a better matrix multiplication loop in C than the one i patched up in asm (and it was far from perfect)
i think one test came to ~70% of performance
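For readers who haven't seen one, this is roughly the kind of C loop such a comparison starts from. It is not the code from that old discussion (which isn't shown in this thread), just a plain sketch; the name matmul, the row-major layout and the square n x n shape are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical baseline: c = a * b for square n x n row-major float matrices. */
void matmul(float *c, const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        /* Clear row i of the result before accumulating into it. */
        for (size_t j = 0; j < n; j++)
            c[i * n + j] = 0.0f;
        /* i-k-j order: the inner loop walks b and c sequentially. */
        for (size_t k = 0; k < n; k++) {
            float aik = a[i * n + k];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

Whether a compiler or a hand-written asm version wins on a loop like this depends on the compiler, the flags and the cpu, so measure before believing anyone, this post included.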
-
Originally posted by gens
assembly is a fairly simple language
yes, sure, there is a fair bit to learn about how cpu's work before you can program in it, but after that the rest of the details are easy
x86 cpu's have not changed much since i686 with amd64 being the biggest change
so what you write would work on any x86 or amd64 cpu
assembly, like almost any other programming language, is "easy" to maintain if the code is well commented
an example of a for loop
Code:
for (int i = 0; i < 100; i++) {
    //code
}
some_label:
//code
inc ecx
cmp ecx, 100
jng some_label
that a compiler usually changes to:
(and it's easier for a human)
mov ecx, 100
some_label:
//code
dec ecx
jnz some_label
adding two numbers together is just "add register/memory, memory/register/immediate"
with the limitation being that you can not have both values in memory (but you can do like "add [memory_address], immediate_value")
and details like that
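The two loop forms in the post above, and the trip-count issue pointed out elsewhere in the thread, can be checked with a small C model. The function names are made up, and the sketch assumes ecx starts at 0 in the count-up version (the snippet never initializes it) and that the limit is at least 1.

```c
/* Count-up loop ending in "inc; cmp; jl": repeats while i < limit. */
int trips_count_up_jl(int limit) {
    int trips = 0;
    int i = 0;                 /* assumed starting value of ecx */
    do {
        trips++;               /* stands in for the loop body */
        i++;                   /* inc ecx */
    } while (i < limit);       /* cmp ecx, limit / jl */
    return trips;
}

/* Same loop ending in "jng": repeats while i <= limit, one trip too many. */
int trips_count_up_jng(int limit) {
    int trips = 0;
    int i = 0;
    do {
        trips++;
        i++;
    } while (i <= limit);      /* cmp ecx, limit / jng */
    return trips;
}

/* Count-down loop: dec sets the zero flag itself, so no cmp is needed. */
int trips_count_down_jnz(int limit) {
    int trips = 0;
    int i = limit;             /* mov ecx, limit */
    do {
        trips++;
        i--;                   /* dec ecx */
    } while (i != 0);          /* jnz */
    return trips;
}
```

With a limit of 100, the jl and count-down versions run the body 100 times, while the jng version runs it 101 times.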
-
Previously, it was the best option among badly optimizing compilers and interpreters. I started with Z80, then MC68000, Intel, Atmel and ARM. But that was a looooong time ago. Today it is useless unless you are a kernel hacker or into one of the few other corner cases mentioned above. NEVER use it for general apps, games or utilities; it is just a stupid idea, like creating an OS kernel in Visual Basic 6. However, understanding assembly language (and the CPU cache) is important IMHO for properly figuring out how a computer actually operates.
-
Assembly is a programming language like C/C++/PHP/Java/C# and so forth. You select the correct language for the job at hand.
Places where I have seen Assembly used correctly include:
- Discrete Cosine Transforms in video and audio decompression
- SHA512/RSA cryptographic algorithms on a high performance accelerator
- Key graphics processing loops in games
Disadvantages of assembly:
- It has a very, very high developer workload weighting (especially if you need to do better than a modern C compiler)
So, if you want to code everything in Assembly - guess what: It's Turing complete, so you can! If you write it in C, you'd probably be writing 10 times as much functionality in a day. If you wrote it in Java/C# you'd be writing 50 times as much functionality in a day.
So ultimately, the question is: Who's paying you for your time, and do they think that they are getting value for money?