AVX-512 is the first x86 instruction set extension specifically targeted at the SPMD programming model that GPUs also use; for example, it supports predication through dedicated mask registers. A 512-bit register holds 16 32-bit elements, so to execute 16 loop iterations in parallel you need a compiler that vectorizes your code in SPMD fashion. That is what LLVM is working on, so you don't want to miss out on it.
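To make the predication point concrete, here is a minimal sketch in portable C++ with no intrinsics. The function names and the 16-lane constant are my own illustrative choices, not part of any library. The first function is the ordinary loop with a per-iteration branch. The second emulates what an SPMD-style vectorizer would conceptually do with it: evaluate the condition for 16 iterations at once into a 16-bit mask, then apply the operation only to lanes whose mask bit is set. Real AVX-512 code would do each of the two inner loops in a single instruction; this only models the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed lane count: 512 bits / 32-bit float = 16 lanes.
constexpr std::size_t kLanes = 16;

// Ordinary scalar loop: the branch is taken or not per iteration.
void scale_positive_scalar(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (in[i] > 0.0f) {
            out[i] = in[i] * 2.0f;
        }
    }
}

// Emulated predicated version: the branch becomes a per-lane mask bit.
void scale_positive_masked(const float* in, float* out, std::size_t n) {
    for (std::size_t base = 0; base < n; base += kLanes) {
        // Step 1: compare all lanes, producing one mask bit per lane.
        // Lanes past the end of the array stay masked off (tail handling).
        std::uint16_t mask = 0;
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            const std::size_t i = base + lane;
            if (i < n && in[i] > 0.0f) {
                mask = static_cast<std::uint16_t>(mask | (1u << lane));
            }
        }
        // Step 2: masked multiply-and-store; inactive lanes leave out[] untouched.
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            if ((mask >> lane) & 1u) {
                out[base + lane] = in[base + lane] * 2.0f;
            }
        }
    }
}
```

Both functions should produce identical output for any input. The mask also takes care of the loop tail when the length is not a multiple of 16, which is one reason mask registers make vectorizing arbitrary loops easier.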
That said, multi-threading is an equally important part of maximizing the CPU's performance. Fortunately, Intel recently added TSX (Transactional Synchronization Extensions), which is meant to make thread synchronization both easier and cheaper.