Originally posted by jacob
I'm not entirely convinced, though. The trend in mainstream CPU design has certainly been to increase the complexity of the on-die, run-time management of program execution while simplifying the execution units themselves and adding more of them. This creates a bottleneck: instructions can only be dispatched as fast as the front end can fetch, decode, and schedule/reorder them, even with those stages parallelised as much as possible, and that limits how many execution units can usefully be integrated to improve performance. Simplifying CPU design by performing operations like instruction scheduling at compile time reduces this flow-management overhead, so that ideally each execution unit can eventually act as a node in a self-organising network, like the neurons and synapses of a biological brain. This network-on-a-chip concept is actually what's used in the Sunway TaihuLight supercomputer.*
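To make the compile-time-scheduling point concrete, here is a minimal sketch (in Python, with made-up instruction names and dependencies) of greedy list scheduling, the kind of analysis a VLIW-style compiler performs ahead of time: independent instructions are packed into fixed-width issue bundles, so the hardware never has to discover that parallelism at run time.

```python
def schedule(instrs, deps, slots_per_cycle=2):
    """Greedy list scheduling: pack independent instructions into
    fixed-width issue bundles, resolving ordering at compile time
    instead of in a hardware scheduler.

    instrs: list of instruction names, in program order
    deps:   dict mapping an instruction to the set of instructions
            it depends on (assumed acyclic)
    """
    done = set()       # instructions already placed in earlier bundles
    bundles = []
    remaining = list(instrs)
    while remaining:
        bundle = []
        for ins in list(remaining):
            # Ready once every dependency completed in an EARLIER bundle
            # and the current bundle still has a free issue slot.
            if len(bundle) < slots_per_cycle and deps.get(ins, set()) <= done:
                bundle.append(ins)
                remaining.remove(ins)
        done |= set(bundle)
        bundles.append(bundle)
    return bundles

# Hypothetical example: a and b are independent; c needs a; d needs b and c.
deps = {"c": {"a"}, "d": {"b", "c"}}
print(schedule(["a", "b", "c", "d"], deps))
# → [['a', 'b'], ['c'], ['d']]
```

The dependency analysis happens entirely before execution, which is exactly the overhead an out-of-order dispatcher otherwise pays for on every cycle.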
* This seems to have been revised out of the historical record; there's certainly no mention of it on the relevant Wikipedia entry. But I'm certain it was the case, and there are still Google hits to back it up.