The SLP Vectorizer can vectorize memory accesses, arithmetic operations, comparison operations, and some other select operations. Back when it was readied for LLVM/Clang 3.3 I ran some early benchmarks and explained it in more detail. There's also the LLVM auto-vectorizer documentation.
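As a rough illustration of the kind of straight-line code the SLP Vectorizer goes after, here's a minimal sketch in C modeled on the example in the LLVM vectorizer documentation. The function name scale_add4 and its parameters are made up for illustration, and whether the compiler actually merges the four statements into vector operations depends on the target and its cost model.

```c
/* Hypothetical example (names are illustrative, not from any benchmark).
 * Four independent statements with the same shape: an add, a multiply,
 * and a store to consecutive array elements. There is no loop here, so
 * this is a candidate for the SLP Vectorizer rather than the Loop
 * Vectorizer. */
void scale_add4(int a1, int a2, int b1, int b2, int *out)
{
    out[0] = a1 * (a1 + b1);
    out[1] = a2 * (a2 + b2);
    out[2] = a1 * (a1 + b1);
    out[3] = a2 * (a2 + b2);
}
```

Building a file like this with clang at -O3 and toggling -fslp-vectorize / -fno-slp-vectorize, then comparing the generated assembly, is one way to see whether the vectorizer kicked in on a given machine.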
With LLVM/Clang 3.4 SVN it looks like the superword-level parallelism (SLP) vectorizer will be enabled at least for the -O3 optimization level, if not for other optimization levels too. Ahead of this change, I ran some benchmarks using the LLVM/Clang Subversion code as of this weekend, comparing results with and without the -fslp-vectorize compiler switch across a range of C/C++ benchmarks. The -O3 -march=native compiler switches were set the entire time.
These test results can be found on OpenBenchmarking.org in 1307291-SO-FSLPVECTO83.
For most of our real-world workload tests on Linux with LLVM/Clang 3.4 SVN, the basic SLP Vectorizer made little difference to performance. However, as this past weekend's benchmarks showed, certain operations and micro-benchmarks do see worthwhile improvements from this straight-line code vectorizer. There are at least no regressions, even though it isn't quite as useful as the Loop Vectorizer.