Originally posted by F.Ultra
And, just maybe, because register and ALU width ain't free. Worse, if you look at how ALU operations are implemented, the critical path grows at least as log2(n) in the word length n, and that's the best case (e.g. the carry network of a parallel-prefix adder). So, a wider chip will not only be hotter and bigger (and thus more expensive), but also slower.
Compare that with vector arithmetic: element-wise operations on a k-element vector only take k times the area of the logic and datapath for operating on a single one of those elements, and since the lanes are independent, the critical path doesn't get any longer.
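
To put rough numbers on that, here's a minimal Python sketch that models a Kogge-Stone parallel-prefix adder and counts its prefix levels. Kogge-Stone is just one adder design I picked for illustration; the function names are mine, and the level count is only a stand-in for critical path length (it ignores wire delay and fan-out entirely).

```python
def kogge_stone_add(a, b, n):
    """Add two n-bit unsigned ints with a modeled Kogge-Stone prefix network.

    Returns (sum mod 2**n, number of prefix levels). The level count is an
    illustrative proxy for logic depth, not a timing model.
    """
    # Per-bit generate and propagate signals.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]
    p0 = p[:]  # original propagate bits, needed for the final sum

    levels = 0
    d = 1
    while d < n:
        new_g, new_p = g[:], p[:]
        for i in range(d, n):
            # Combine the group ending at bit i with the group ending at i - d.
            new_g[i] = g[i] | (p[i] & g[i - d])
            new_p[i] = p[i] & p[i - d]
        g, p = new_g, new_p
        d *= 2
        levels += 1

    # After the last level, g[i] is the carry out of bits 0..i.
    total = 0
    for i in range(n):
        carry_in = g[i - 1] if i > 0 else 0
        total |= (p0[i] ^ carry_in) << i
    return total, levels


def vector_add(xs, ys, n):
    """Element-wise add of k independent n-bit lanes.

    Area scales with the number of lanes, but the depth is that of a
    single lane because no signal crosses between lanes.
    """
    results = [kogge_stone_add(x, y, n) for x, y in zip(xs, ys)]
    sums = [s for s, _ in results]
    depth = max((lv for _, lv in results), default=0)
    return sums, depth


if __name__ == "__main__":
    for width in (8, 16, 32, 64, 128):
        _, levels = kogge_stone_add(1, 1, width)
        print(f"{width:4d}-bit scalar add: {levels} prefix levels")
    _, depth = vector_add([1, 2, 3, 4], [5, 6, 7, 8], 32)
    print(f"4 x 32-bit vector add: {depth} prefix levels")
```

Doubling the scalar width adds a level every time, while going from one 32-bit lane to four leaves the depth where it was.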