**vladpetric** · 18 January 2021, 02:42 PM

Originally posted by WorBlux

RISC-V does follow with predicated instructions, and even x86 got CMOV that can let the compiler eliminate some branches in code.

This is more of a sideline argument (I agree with the main gist of the argument), but I don't think RISC-V does either predication or conditional moves.

Personally, I think CMOVs are super-helpful for doing min/max (or similar situations with a ~50/50 unpredictable branch), but not much more than that.
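For anyone following along, here is a minimal sketch in C of the min/max case. The function names (`min_branchy`, `min_select`, `min_mask`, `max_mask`) are made up for illustration. Whether the first two compile to a branch or to a conditional move (cmov on x86, csel on AArch64) is up to the compiler and its flags, so treat the comments as the likely outcome rather than a guarantee. The mask versions show the same select written with plain arithmetic, which is roughly what you fall back on when no conditional move is available.

```c
#include <stdint.h>

/* Branchy form: typically a compare plus a conditional jump, unless the
   compiler chooses to if-convert it. With ~50/50 unpredictable inputs the
   jump mispredicts about half the time. */
int32_t min_branchy(int32_t a, int32_t b) {
    if (a < b) {
        return a;
    }
    return b;
}

/* Select form: both operands are already computed, so the compiler is free
   to emit a conditional move here (not guaranteed). */
int32_t min_select(int32_t a, int32_t b) {
    return (a < b) ? a : b;
}

/* Mask form: no branch and no conditional move needed.
   mask is all ones when a < b, all zeros otherwise. The final cast back to
   int32_t assumes the usual two's complement conversion. */
int32_t min_mask(int32_t a, int32_t b) {
    uint32_t mask = (uint32_t)0 - (uint32_t)(a < b);
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}

/* Same trick for max: pick a when a > b, otherwise b. */
int32_t max_mask(int32_t a, int32_t b) {
    uint32_t mask = (uint32_t)0 - (uint32_t)(a > b);
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}
```

All four are drop-in equivalents for their branchy counterpart. The difference only shows up in the generated code and in how it behaves when the comparison is hard to predict.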

**WorBlux** · 18 January 2021, 03:10 PM

Originally posted by vladpetric

This is more of a sideline argument (I agree with the main gist of the argument), but I don't think RISC-V does either predication or conditional moves.

Sorry, my mistake, I got confused by what libreSoC was trying to do with it.

**jabl** · 18 January 2021, 03:22 PM

IIRC 32-bit ARM had lots of support for predication, but they got rid of almost all of it for AArch64, leaving only csel, roughly similar to cmov on x86.

So while it has its uses for poorly predictable branches, "the more the merrier" probably isn't the answer either.

**vladpetric** · 18 January 2021, 03:22 PM

Originally posted by WorBlux

Sorry, my mistake, I got confused by what libreSoC was trying to do with it.

I think it's a good idea to add cmovs at least. Hope they succeed.

**vladpetric** · 18 January 2021, 04:17 PM

Originally posted by jabl

IIRC 32-bit ARM had lots of support for predication, but they got rid of almost all of it for AArch64, leaving only csel, roughly similar to cmov on x86.

So while it has its uses for poorly predictable branches, "the more the merrier" probably isn't the answer either.

They're great for min/max. Beyond that ...

**Space Heater** · 18 January 2021, 04:30 PM

Originally posted by WorBlux

RISC-V does follow with predicated instructions, and even x86 got CMOV that can let the compiler eliminate some branches in code.

Since when does RISC-V have any form of predicated move/select? They seem to be religiously against predication and claim that branch prediction is always better.

**Rallos Zek** · 18 January 2021, 04:59 PM

Originally posted by zexelon

Taken as a whole, I would say Itanium was a very impressive engineering undertaking and in some ways it was quite successful. It was a market failure, yes, but so are many architectures. I would not fully discount the tech in the Itanium. I am sure that in the probably not too distant future, pieces of it will be resurrected. Tech developed in GPUs is solving a lot of the issues with Itanium. The one issue that cannot be fixed is backwards compatibility.

Probably pieces of Itanium will eventually be unknowingly included in Xeon Phi or some future Intel "accessory" processor or GPU.

How so? I'll disagree just because all major GPU architectures have moved away from being VLIW-based to being SIMD/RISC-based architectures because they saw that VLIW sucks for GPU compute (i.e. CUDA/OpenCL). And it's 2021; making a compiler work for VLIW/EPIC is still as shit as it was 20 years ago.

**cb88** · 18 January 2021, 05:19 PM

Originally posted by jabl

This. A VLIW-style architecture might work well for a DSP where you can carefully tune the code for the exact workload, but for a general-purpose architecture it's a massive failure. Compilers were never able to efficiently pack instructions into bundles for general-purpose code, leading to lots of NOPs and thus wasted instruction bandwidth.

And once you go to OoO HW, that instruction encoding with bundles etc. is just a waste.

Not entirely true: counterexamples are Transmeta (good transistor density to power and performance) and Nvidia's VLIW ARM CPUs, which are quite fast. In fact I won't be the least bit surprised to see VLIW with runtime embedded optimization resurface once the patents expire (imminently).

Transmeta's last CPU was roughly a competitor to a P3 and even had SSE3 (TM88xx chips), so in theory it could run up to Windows 10 if you had enough RAM, which granted is unlikely.

The kicker with Transmeta's code morphing was that it could optimize the code as it was running, similar to how a Java virtual machine does, except it could do it for any code.

**cb88** · 18 January 2021, 06:15 PM

Originally posted by Rallos Zek

How so? I'll disagree just because all major GPU architectures have moved away from being VLIW-based to being SIMD/RISC-based architectures because they saw that VLIW sucks for GPU compute (i.e. CUDA/OpenCL). And it's 2021; making a compiler work for VLIW/EPIC is still as shit as it was 20 years ago.

Actually RDNA1/2 has brought back some of the VLIW design, and it's likely that the same is true of CDNA. They are calling it Super SIMD.

**vladpetric** · 18 January 2021, 06:21 PM

Originally posted by cb88

Not entirely true: counterexamples are Transmeta (good transistor density to power and performance) and Nvidia's VLIW ARM CPUs, which are quite fast. In fact I won't be the least bit surprised to see VLIW with runtime embedded optimization resurface once the patents expire (imminently).

Transmeta's last CPU was roughly a competitor to a P3 and even had SSE3 (TM88xx chips), so in theory it could run up to Windows 10 if you had enough RAM, which granted is unlikely.

The kicker with Transmeta's code morphing was that it could optimize the code as it was running, similar to how a Java virtual machine does, except it could do it for any code.

Transmeta's performance was lackluster (actually its competitor was the P4, a mediocre core, and it didn't manage to make inroads against that). Code morphing: kinda cool, though in the end not that helpful (20 years later, we have open, as in free, instruction sets anyway ...). VLIW: bad.

Itanium IA-64 Was Busted In The Upstream, Default Linux Kernel Build The Past Month
