Debian Wheezy GNU/kFreeBSD: Slower Than Linux

  • phoronix
    started a topic Debian Wheezy GNU/kFreeBSD: Slower Than Linux

    Phoronix: Debian Wheezy GNU/kFreeBSD: Slower Than Linux

    With Debian 7.0 "Wheezy" set to be frozen soon, I took the opportunity to run some new benchmarks of Debian GNU/kFreeBSD, the Debian OS variant using the FreeBSD kernel rather than Linux, to compare it to Debian GNU/Linux as well as Ubuntu Linux and PC-BSD/FreeBSD 9.0.

    http://www.phoronix.com/vr.php?view=17524

  • gamerk2
    replied
    Proving the old adage: Your benchmark results are only as good as your benchmark.

  • stevenc
    replied
    How do I report a bug in the Phoronix Test Suite? Just keep bumping the forum thread?

    It seems that the automatic OS detection built into OpenSSL doesn't support GNU/kFreeBSD, but mis-detects it as GNU/Hurd x86 which means it still gets built, but without the ASM optimisations. Hence the slow OpenSSL benchmark result.

    The Debian packaged OpenSSL doesn't have this problem.

  • stevenc
    replied
    I've found the reason for low OpenSSL scores on GNU/kFreeBSD, in the way it is built by the PTS. The openssl binaries shipped by Debian don't have this problem:

    Code:
    CC= gcc
    CFLAG= -Wa,--noexecstack -O3
    That is the build configuration on GNU/kFreeBSD; assembler optimisations are only being enabled for GNU/Linux:

    Code:
    CC= gcc
    CFLAG= -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM
    This is also clear from the full output of './openssl_/bin/openssl speed':

    Code:
    OpenSSL 1.0.0e 6 Sep 2011
    built on: Sat Jul 14 13:28:40 BST 2012
    options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) 
    compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM
                      sign    verify    sign/s verify/s
    rsa 4096 bits 0.020554s 0.000307s     48.7   3254.5
    That was on GNU/Linux. Compare the following on a similar-ish machine running GNU/kFreeBSD:

    Code:
    built on: Sat Jul 14 13:42:24 BST 2012
    options:bn(64,32) rc4(ptr,int) des(idx,cisc,2,long) aes(partial) idea(int) blowfish(idx) 
    compiler: gcc -Wa,--noexecstack -O3
                      sign    verify    sign/s verify/s
    rsa 4096 bits 0.078346s 0.001050s     12.8    952.6
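
The two `compiler:` lines and RSA scores above can be compared mechanically. The following is only a rough sketch: the helper name `asm_defines` and the rule "any `-D` define containing `_ASM`" are assumptions made for illustration, not part of OpenSSL or the PTS. The strings and numbers are copied from the output quoted in this post.

```python
# Sketch: list the assembler-related defines in an 'openssl speed' compiler line.
# The "_ASM" matching rule is an illustrative assumption, not an OpenSSL convention
# that is guaranteed to cover every assembler option.

LINUX_COMPILER = (
    "gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H "
    "-Wa,--noexecstack -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int "
    "-DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM "
    "-DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM"
)
KFREEBSD_COMPILER = "gcc -Wa,--noexecstack -O3"


def asm_defines(compiler_line):
    """Return the set of -D defines whose name contains '_ASM'."""
    return {
        token[2:]
        for token in compiler_line.split()
        if token.startswith("-D") and "_ASM" in token
    }


# RSA 4096-bit sign/s and verify/s figures from the two runs quoted above.
linux_sign, linux_verify = 48.7, 3254.5
kfreebsd_sign, kfreebsd_verify = 12.8, 952.6

sign_slowdown = linux_sign / kfreebsd_sign
verify_slowdown = linux_verify / kfreebsd_verify

print("ASM defines on GNU/Linux:   ", sorted(asm_defines(LINUX_COMPILER)))
print("ASM defines on GNU/kFreeBSD:", sorted(asm_defines(KFREEBSD_COMPILER)))
print(f"sign slowdown:   {sign_slowdown:.2f}x")
print(f"verify slowdown: {verify_slowdown:.2f}x")
```

An empty set for the GNU/kFreeBSD line, together with a sign slowdown close to 4x, is consistent with the missing assembler optimisations being the main cause. The two runs were on different machines, so the ratio is only indicative.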

    Originally posted by nslay View Post
    ... FreeBSD is stuck at gcc 4.2 while the newer Linux distributions are using later versions of gcc.
    FWIW Debian GNU/kFreeBSD builds the kernel with gcc-4.6 and -O2, glibc with gcc-4.4 and -O2, and much of the userland with the new default compiler gcc-4.7.

    Originally posted by nslay View Post
    Shouldn't, for example, John the Ripper run about the same on FreeBSD? Why should the same blowfish implementation be slower on FreeBSD? That shouldn't involve FreeBSD at all.
    I agree, we may find more cases where the PTS differs from Debian's own package building, and this may account for some of the perceived slowness. With that fixed, we may then get a more interesting comparison of the actual kernels involved, their threads implementations (in applicable benchmarks), scheduling and hardware support.

    Kinda cool that GNU/kFreeBSD even managed to lead some of the benchmarks in http://www.phoronix.com/scan.php?pag...ubuntu12&num=1

  • nslay
    replied
    I think the results are unusual, especially the OpenSSL results. You should check for interrupt storms (vmstat -i?) and check that the CPU is really running at full steam. I've seen weird quirks where booting FreeBSD from battery causes the processor to be stuck in a power-saving state and running at reduced clock speed (even if you plug it back in).
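
One way to check the clock-speed suspicion is to compare the current CPU frequency with the highest level the driver advertises. This is a minimal sketch, assuming the FreeBSD cpufreq sysctls `dev.cpu.0.freq` and `dev.cpu.0.freq_levels` are available on the system under test (they may not be if no cpufreq driver attached). The helper names and the 5% tolerance are made up for illustration.

```python
import subprocess


def parse_freq_levels(levels):
    """Parse a freq_levels string such as '2200/110000 2000/95000' into MHz values."""
    return [int(item.split("/")[0]) for item in levels.split()]


def is_throttled(current_mhz, levels, tolerance=0.05):
    """True if the current frequency is clearly below the highest advertised level."""
    top = max(parse_freq_levels(levels))
    return current_mhz < top * (1 - tolerance)


def read_sysctl(name):
    """Read one sysctl value as text; assumes a 'sysctl' binary that accepts -n."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def cpu0_is_throttled():
    """Check CPU 0 only; assumes the dev.cpu.0.* sysctls exist on this machine."""
    current = int(read_sysctl("dev.cpu.0.freq"))
    levels = read_sysctl("dev.cpu.0.freq_levels")
    return is_throttled(current, levels)
```

Calling `cpu0_is_throttled()` just before and just after a benchmark run would show whether the processor was stuck at a reduced clock during the test.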

    Also suspect is SCHED_ULE ... this new scheduler was designed to ensure interactive application responsiveness under load. You could try changing it to SCHED_4BSD, however, that's not how GENERIC is shipped.

    EDIT: Check this out: SCHED_ULE should not be the default. However, I'm not convinced this should make such a big difference in these toy benchmarks.

    Also, FreeBSD is stuck at gcc 4.2 while the newer Linux distributions are using later versions of gcc. Maybe these later versions are better at optimization? Shouldn't, for example, John the Ripper run about the same on FreeBSD? Why should the same blowfish implementation be slower on FreeBSD? That shouldn't involve FreeBSD at all. You can compile all ports in FreeBSD with the same version of gcc by first building that version of gcc from ports, then exporting CC and CXX, and then building the benchmark applications from ports the usual way.

    Lastly, significance levels are missing from all benchmarks on Phoronix. Significance should be measured overall and pairwise. The former measures whether the overall timing difference is significant, while the latter measures whether the difference is consistent. For example, one test may not be significant overall but still be consistently better/faster/etc. to a significant degree. Tests should probably also include a "warm-up" period to discard fluke speed-ups or slow-downs (if they don't already).
    Last edited by nslay; 07-07-2012, 08:50 PM.
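
The significance idea above could be sketched as follows. This is one possible reading of the post, not what the PTS does: Welch's t statistic stands in for the "overall" comparison, an exact two-sided sign test stands in for the "pairwise" consistency check, and the first run is discarded as warm-up. All timings are hypothetical numbers invented for the example.

```python
from math import comb, sqrt
from statistics import mean, variance


def drop_warmup(samples, warmup=1):
    """Discard the first `warmup` runs to avoid fluke speed-ups or slow-downs."""
    return samples[warmup:]


def welch_t(a, b):
    """Welch's t statistic for the overall difference in means (no p-value here)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def sign_test_p(a, b):
    """Exact two-sided sign test on paired runs; ties are ignored."""
    wins = sum(x < y for x, y in zip(a, b))
    losses = sum(x > y for x, y in zip(a, b))
    n = wins + losses
    if n == 0:
        return 1.0
    k = min(wins, losses)
    tail = sum(comb(n, i) for i in range(k + 1))
    return min(1.0, 2 * tail / 2 ** n)


# Hypothetical run times in seconds (lower is better); the first run is a warm-up.
linux_runs = [12.9, 10.1, 10.0, 10.2, 10.1, 10.3, 10.0, 10.2, 10.1]
kfreebsd_runs = [13.5, 10.4, 10.3, 10.6, 10.2, 10.5, 10.4, 10.3, 10.5]

linux = drop_warmup(linux_runs)
kfreebsd = drop_warmup(kfreebsd_runs)

t_stat = welch_t(linux, kfreebsd)
p_pairwise = sign_test_p(linux, kfreebsd)
print(f"Welch t statistic: {t_stat:.2f}")
print(f"sign test p-value: {p_pairwise:.4f}")
```

The sign test only looks at which system won each paired run, so it can flag a small but consistent difference that an overall comparison of means might miss.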

  • curaga
    replied
    Originally posted by stevenc View Post
    I'm also curious how the Phoronix OpenSSL result compares with 'openssl speed', in case it is a problem in the test suite. Perhaps it used all 4 cores on Linux but on BSD may not be detecting the number of cores correctly (the numbers would make sense).
    "openssl speed rsa4096" is a single-threaded test, and ~40 is a completely normal result. For reference my phenom does ~58.

  • XorEaxEax
    replied
    Originally posted by stevenc View Post
    I'm aware that I used a much older Linux kernel for my quick test, but I don't expect that would make much difference. I don't imagine Linux 2.6.32 -> 3.2.x would make it 4x faster and thus bring it in line with the Phoronix result.
    Well, 2.6.32 was released in December 2009, so I'd say it's very old by kernel standards, and there's every reason to believe there have been a lot of optimizations since then. That said, I think a 4x difference in performance is suspiciously extreme (although it was only one test where it was this extreme, IIRC), and while I'm not surprised Linux performs better than FreeBSD/PC-BSD, I'd expect it to be much closer.

  • stevenc
    replied
    Originally posted by kraftman View Post
    The case is not hardware-specific, but BSD- and... date-specific. Aren't you aware that you tested a few-year-old Linux kernel, while a much newer one was used in the Phoronix comparison?
    I'm aware that I used a much older Linux kernel for my quick test, but I don't expect that would make much difference. I don't imagine Linux 2.6.32 -> 3.2.x would make it 4x faster and thus bring it in line with the Phoronix result.

    If I get the opportunity, I may try to do this properly (Linux 3.2.x vs. GNU/kFreeBSD 9.0, and use the same machine for both tests).

    I'm also curious how the Phoronix OpenSSL result compares with 'openssl speed', in case it is a problem in the test suite. Perhaps it used all 4 cores on Linux but on BSD may not be detecting the number of cores correctly (the numbers would make sense).

  • kraftman
    replied
    Originally posted by stevenc View Post
    The OpenSSL result is particularly interesting. But I couldn't reproduce it on my own (AMD Opteron) systems. Using latest Debian 1.0.1c-3 packages, I ran 'openssl speed' which includes an RSA 4096-bit signing benchmark. The other output from this command (compile options) is helpful to see if trying to compare benchmarks. A 2.6GHz Opteron 285 (twin, dual-core) system with Linux 2.6.32 scored 73.7 sign/s whereas a 2.2GHz Opteron 248 (single, dual-core) GNU/kFreeBSD system (9.0-1-amd64 kernel) scored 62.1 sign/s. I assume the benchmark runs on a single core and thus these results are almost precisely in-line with the relative clock speeds. Certainly no 4x performance reduction seen.

    I don't know how the benchmark for PTS works, but I'm curious why the test result panel says "OpenSSL 1.0.0e" which is not the (recently) packaged Debian version.

    Otherwise, I'd guess the cause of this is something hardware-specific, with this particular CPU not reaching full performance under this FreeBSD kernel version. I would check for anything related to CPU scaling, power draw (which is nice to see in many Phoronix benchmarks), or otherwise see if FreeBSD have already made any changes in HEAD / kfreebsd-10 relating to this hardware.
    The case is not hardware-specific, but BSD- and... date-specific. Aren't you aware that you tested a few-year-old Linux kernel, while a much newer one was used in the Phoronix comparison?
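
The clock-speed claim in the quoted post can be checked with simple arithmetic. This is a small sketch using only the figures given there (73.7 sign/s at 2.6GHz on Linux, 62.1 sign/s at 2.2GHz on GNU/kFreeBSD). The variable names are illustrative, and it assumes, as the quoted post does, that the benchmark runs on a single core.

```python
# Figures taken from the quoted post; variable names are illustrative only.
linux_sign_per_s, linux_ghz = 73.7, 2.6        # Opteron 285, Linux 2.6.32
kfreebsd_sign_per_s, kfreebsd_ghz = 62.1, 2.2  # Opteron 248, GNU/kFreeBSD 9.0-1-amd64

score_ratio = linux_sign_per_s / kfreebsd_sign_per_s
clock_ratio = linux_ghz / kfreebsd_ghz

# Signs per second per GHz: roughly equal values mean the gap is explained by clock speed.
linux_per_ghz = linux_sign_per_s / linux_ghz
kfreebsd_per_ghz = kfreebsd_sign_per_s / kfreebsd_ghz

print(f"score ratio: {score_ratio:.3f}")
print(f"clock ratio: {clock_ratio:.3f}")
print(f"sign/s per GHz: Linux {linux_per_ghz:.2f}, GNU/kFreeBSD {kfreebsd_per_ghz:.2f}")
```

The two ratios differ by well under 1%, which matches the quoted observation that these particular results track the relative clock speeds. It says nothing about the kernel-version question raised in the reply.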
