Phoronix IRC Log: 2009-07-01
Rugxulo: all this benchmarking, but the real question is ... which GCC is the best? :-/
Ivanovic: best for what?
Ivanovic: fast compile time?
Ivanovic: strictest for enforcing "clean" code?
Ivanovic: fastest resulting binaries since making good use of processor features?
Ivanovic: conforming best to the standards?
Ivanovic: (stuff like eg C99 and such)
Rugxulo: I'm well aware that they are all different in various regards
Rugxulo: however, "fastest compile speed and fastest output" would probably suffice ... (I'm assuming 3.4.6)
Ivanovic: oh, the *and* in there is very problematic
Ivanovic: optimizations take time!
Rugxulo: what kills me is that some people say, "I don't mind it taking longer as long as it works better", but then why is the latest GCC still so slow if it compiles itself??
Rugxulo: (yes, I know, it's not as bad as 4.2.x, but still ... worse than 3.4.x)
Ivanovic: and for some (more recent) cpus you won't get the optimizations with a 3.4.x gcc
Rugxulo: such as?
Rugxulo: doubts "core2" helps worth a damn
Ivanovic: all that multimedia stuff like sse4.2
Rugxulo: I know GCC tries to autovectorize a bit now, but I think that's quite difficult
Ivanovic: the profiles are only there to automatically set a bigger set of processor features
Ivanovic: of course it is difficult, and difficult stuff needs time
Rugxulo: I heard there is no P4 scheduler (or whatever)
Rugxulo: or something about the P4 that it doesn't take into account (sorry, don't know exactly)
Ivanovic: ehm, scheduler?
Ivanovic: i'd imagine that scheduler is something the kernel provides
Rugxulo: as you probably know, the P4 is kinda a bastard
Ivanovic: i don't know exactly either, am no gcc expert myself
Rugxulo: instruction scheduler, I presume
Ivanovic: oh, the P4 at that time was a great idea
Rugxulo: on paper ;-)
Ivanovic: the long pipeline was (in theory) a nice thing
Rugxulo: only good once ramped up clock speed, and that got pretty hot
Ivanovic: the problem being that it was basically impossible to really get it filled and thus you waste many instructions
Rugxulo: I'm not sure compilers (even GCC) have ever tapped into any chip's true potential
Ivanovic: (that is why they were able to introduce HT and make use of the units that often were not used)
Ivanovic: oh, this is too generic
Rugxulo: not really
Rugxulo: take 486 or 586 for example, what does GCC do differently than 386?
Ivanovic: especially the compilers for embedded stuff (does not have to be gcc!) are *really* good
Rugxulo: 486 = extra alignment, nothing else
Rugxulo: 586 = maybe nothing, I dunno
Rugxulo: reordered instructions, maybe
Rugxulo: but I'm weakly sure (guessing) that 386, 486, 586 don't have separate instruction schedulers anyways
Ivanovic: uhm, you know, if you set your system in the chost to i686 the resulting binary code won't run on a 386 or 486 or 596
Ivanovic: so something they *have* done
Rugxulo: I know it's moot these days, but it still seems odd ... plus no Core 1 (have to use "prescott"), etc.
Rugxulo: 596? never heard of it ;-)
Rugxulo: and yes, i686 uses CMOV.. a lot, that's about it :-P
Rugxulo: maybe less alignment too
Ivanovic: it is too damn early over here to really type correctly
Rugxulo: was just yankin' yer chain
Rugxulo: I'm just saying, I'm not sure GCC does anything significantly different between 386, 486, 586
Ivanovic: and does this really matter?
Rugxulo: well, yes, if you blindly expect it to know what it's doing!
Rugxulo: obviously, I know better, but still ... kinda a shame
Ivanovic: in general x86 is not a nice arch to write compilers
Rugxulo: no, it's definitely not :-(
Ivanovic: so i would *not* expect it to be optimized for every single cpu available
Ivanovic: but more like "good in general on a broader and more common group"
Rugxulo: would've hoped somebody would've manually fixed 'em to work better, but oh well
Ivanovic: that is a hell of a lot of work
Rugxulo: no more than the ton they already do
Ivanovic: though like i said: what you wrote about compilers not using the full potential of cpus is slightly wrong
Rugxulo: how so?
Ivanovic: since when you leave x86 you will find cases where the potential is really used
Ivanovic: just think about tiny embedded arm or mips systems and the like
Ivanovic: rather simple risc chipsets
Rugxulo: I heard AVR works better with GCC 3.x than 4.x
Ivanovic: compiling the program might take several days, but the result will most likely make "good use" of what is available
Rugxulo: several days, ugh
Ivanovic: i am *not* talking about using gcc since you said compilers in general
Rugxulo: well obviously gcc-llvm proves you can compile fairly quickly
Ivanovic: that is: today often gcc is used as basis and on top of it some "special modifications" are used
Ivanovic: eg to be able to specify a worst case execution time and to be able to measure it
Rugxulo: okay, dumb question, what's your main OS and compiler?
Rugxulo: in other words, what do you use?
Ivanovic: my main desktop os is gentoo linux (unstable series) amd64
Ivanovic: Sysinfo for 'rechner1': Linux 2.6.30 running KDE 4.2.4 (KDE 4.2.4), CPU: Intel(R) Core 2 Quad CPU Q9300 @ 2.50GHz at 2000 MHz (4999 bogomips), HD: 546/1162GB, RAM: 1273/3951MB, 165 proc's, 37.1min up
Ivanovic: and as compiler i use what gentoo currently ships in that very line
Rugxulo: which is ... ?
Rugxulo: gcc 4.3.2, I blindly assume
Ivanovic: $ gcc --version
Ivanovic: gcc (Gentoo 4.3.3-r2 p1.1, pie-10.1.5) 4.3.3
Rugxulo: it's just that every GCC gets slower and slower, even -O0 isn't as fast as -O2 used to be :-(
Rugxulo: I guess nobody uses -O0 so they never bothered to keep it lightning fast
Ivanovic: compile times do increase, sure
Rugxulo: but you gotta admit, why bother using -O0 when older -O2 was faster ;-)
Rugxulo: (unless intentional for debugging of course)
Ivanovic: ehm, you know what -O0 is meant for, right?
Ivanovic: that one is *meant* only for debugging, no normal user should rely on it
Rugxulo: obviously not for super-ultra-fast compilation :-P
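The -O0 versus -O2 compile-time claim above is easy to measure rather than argue about. A minimal sketch, assuming a `gcc` binary on the PATH and a source file of your own; the helper names `compile_command` and `time_command` are hypothetical, and a single run is only a rough indication (repeat and average for anything serious):

```python
import subprocess
import sys
import time


def compile_command(source, opt_level, compiler="gcc"):
    """Build a compile-only command line for one optimization level."""
    # -c stops after producing the object file, so link time is excluded.
    return [compiler, f"-{opt_level}", "-c", source, "-o", f"{source}.{opt_level}.o"]


def time_command(cmd):
    """Run cmd once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


# Sanity check of the timer itself, using the running Python interpreter
# instead of a compiler so the snippet works without gcc installed.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"timer check: {elapsed:.3f} s")

# With a real source file (example.c is a placeholder name):
#   for level in ("O0", "O2"):
#       print(level, time_command(compile_command("example.c", level)))
```

Pointing `compiler` at different installed GCC versions would give the cross-version comparison discussed here.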
Ivanovic: if compile time is everything for you, you should stay with an older version of gcc
Rugxulo: well, I use many depending on needs :-)
Ivanovic: though for me it is also important how fast the resulting binary is
Ivanovic: plus if the compiler does follow the iso standards
Rugxulo: why? "newer computers are fast enough" :-P~~
Rugxulo: (especially yours, heh)
Ivanovic: and because it is fast enough i can wait those few extra mins for compiling packages
Rugxulo: wonders how long it would take to compile Firefox or GCC on your behemoth of a machine ...
Rugxulo: for laughs, you could try building GCC 184.108.40.206, it would be VERY fast
Rugxulo: (to build or run)
Rugxulo: even only using one core, it'd be super quick
Ivanovic: compiling my whole system with gcc, glibc, kde4 with all the deps like xorg and such takes about 12h
Rugxulo: that's quite reasonable, I bet :-)
Ivanovic: that is: emerge will use as many cores as possible, some packages only allow one thread to be used, some use four
Ivanovic: (you know, the good old "make -jX" command where X in my case is set to 5)
Ivanovic: sadly it is not possible to parallelize gcc this way, but glibc is nicely parallelizable
Ivanovic: time for a shower, maybe i wake up this way...
Rugxulo: heh, try bootstrapping GNU Emacs sometime ... ;-)
evocallaghan: michaellarabel: Good day
michaellarabel: Hi evocallaghan
evocallaghan: michaellarabel: My patches for X _started_ to make their way back into X, http://lists.freedesktop.org/archives/mesa-commit/2009-June/010136.html , Thought you'd like to know about that stuff. You seem nuts about X -_- .
evocallaghan: michaellarabel: They take their time though !?
michaellarabel: I'll have to look at it later, technically I am not working this week.
Ivanovic: michaellarabel: so have you managed to visit some beer gardens now?
michaellarabel: Ivanovic: Yes, of course :D
evocallaghan: Where are you?
sandeen: michaellarabel, are you around?
Ivanovic: sandeen: by now (looking at the weather in germany where he currently is on holidays) he is most likely in a beer garden drinking his 2nd liter
sandeen: ah germany, oh well
sandeen: michaellarabel, ping?
michaellarabel: Hi sandeen, just got in a few minutes ago.
sandeen: saw you posted to the list so seized the opportunity ;)
sandeen: can I ask you a few questions about the recent fs test?
sandeen: if you have a moment
sandeen: I'm trying to figure out the variations in the iozone tests in particular; were all the tests run on the same partition of the same drive?
michaellarabel: For each testing, the filesystem was formatted to occupy the entire disk. That was then used for testing, sandeen.
sandeen: michaellarabel, ok, hrm
sandeen: for a single sata disk, all of these filesystems really should be going at approximately the disk speed
sandeen: and that disk speed is almost certainly -lower- than the results reported for ext4, so it's pretty odd
sandeen: FWIW, most recent iozone can actually be pointed at the block device itself
sandeen: it'd be awesome to add that to the iozone runs as a baseline of how fast the -disk- can go, then compare filesystems on top of that
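The baseline idea above can be sketched as a simple check: measure the raw block device once, then express each filesystem result relative to it and flag anything faster than the disk itself, which usually points at caching. The function names and all numbers below are illustrative placeholders, not results from the article:

```python
def relative_to_baseline(fs_results_kb_s, raw_device_kb_s):
    """Express each filesystem's throughput as a fraction of raw device speed."""
    return {name: rate / raw_device_kb_s for name, rate in fs_results_kb_s.items()}


def flag_cache_suspects(fs_results_kb_s, raw_device_kb_s):
    """Return the filesystems that report more throughput than the bare disk."""
    return sorted(
        name for name, rate in fs_results_kb_s.items() if rate > raw_device_kb_s
    )


# Placeholder values in KB/s, chosen only to show the shape of the check.
raw_disk = 100000
measured = {"fs_a": 90000, "fs_b": 250000}

print(relative_to_baseline(measured, raw_disk))
print(flag_cache_suspects(measured, raw_disk))
```

A result above 1.0 is not automatically wrong, but it says the test measured memory at least as much as the disk.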
michaellarabel: Do you know the options offhand to test purely the block device? What iozone revision was it introduced in?
sandeen: let me take a quick look
sandeen: while I do that, on a related note :)
sandeen: for filesystem tests in general, it is probably best to drop the fs caches before running the test; that gets you into a cold cache state and ensures you're testing the fs, not the buffer cache
sandeen: you can do this with echo 3 > /proc/sys/vm/drop_caches
sandeen: so ideally for the iozone read test you'd run with -i0 (to create the files), drop caches, then run with -i1 (to read them)
sandeen: but I can't tell if the tests can be split up & run this way or not
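The write, drop caches, read sequence could look roughly like the sketch below. Assumptions: it runs as root on Linux, iozone's `-w` (keep the temporary file) and `-f` (test file name) options are used so the second run finds the file from the first, and the test file path is a placeholder. Whether a given iozone build accepts `-i 1` on its own this way is something to verify, as noted above; the plan is only built here, not executed.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_caches(path=DROP_CACHES):
    """Flush dirty pages, then ask the kernel to drop page cache, dentries and inodes."""
    os.sync()  # write back dirty data first, otherwise it cannot be dropped
    with open(path, "w") as handle:
        handle.write("3\n")  # same effect as: echo 3 > /proc/sys/vm/drop_caches


def cold_read_plan(iozone="iozone", test_file="/mnt/test/iozone.tmp"):
    """Return the steps for a cold-cache read test, in order."""
    return [
        [iozone, "-i", "0", "-w", "-f", test_file],  # create the file, keep it
        "drop_caches",                               # call drop_caches() here
        [iozone, "-i", "1", "-w", "-f", test_file],  # read it back cold
    ]


for step in cold_read_plan():
    print(step)
```

Each list in the plan can be passed to `subprocess.run`, with `drop_caches()` called in between.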
sandeen: ok, the iozone you're grabbing can do this:
sandeen: ./installed-tests/iozone/iozone3_323/src/current/iozone -i 0 /dev/sdb3
sandeen: KB reclen write rewrite
sandeen: 512 4 225960 798972
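For comparing runs, the two-line table pasted above can be turned into named values. A minimal sketch, assuming only the header row and data rows are passed in (real iozone output has banner text around the table that would need stripping first); `parse_iozone_rows` is a hypothetical helper name:

```python
def parse_iozone_rows(text):
    """Parse a whitespace-separated header line plus integer data rows."""
    lines = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(int, row))) for row in rows]


# The header and row exactly as pasted in the log.
sample = """KB reclen write rewrite
512 4 225960 798972"""

results = parse_iozone_rows(sample)
print(results[0]["write"], results[0]["rewrite"])
```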
michaellarabel: Okay, thanks for the info sandeen.
sandeen: as to your question about how to get insight into the results w/ your limited resources ... I bet the linux fs community would be willing to review stuff before you publish, if you encounter some strange results
sandeen: to either say "yes that's a regression thanks" or "this happened for this reason, yes it's slower now but it's for data integrity" or whatnot
sandeen: one problem I have in investigating this stuff, though, is that the tests often aren't that transparent to me
sandeen: it takes a lot of digging to see what's even running
michaellarabel: well what would you like to make it more transparent?
sandeen: for example for the iozone tests, it'd be nice if in the report, it says "this is the iozone cmdline that ran: iozone -r 1 -i0 -i 1 and we are graphing the "read" value from the results"
sandeen: or whatnot
sandeen: for things like "blogbench" TBH I have no idea what it does, and it would take a lot of effort to find out ;) but that's not your problem ;)
sandeen: another thing I found surprising was that the httpd static serving tests don't actually test the OS's httpd
sandeen: I can see the value in that, but it was not at all what I expected when I saw graphs comparing httpd perf on one OS vs another OS
sandeen: I thought it was the OS's httpd, oddly enough :)
sandeen: anyway, just mentioning that in the writeups would clarify the results. "this is a reference httpd implementation, version BLAH, not the httpd shipped with the OS itself"
sandeen: I know, everyone's a critic, sorry ;)
sandeen: 2 more questions while I have you (maybe this should be on the list .....)
sandeen: would you like a patch to put bonnie++ back in, if I fix the parsing? :)
sandeen: and when you run the httpd static serving tests, do you notice lots of "permission denied" errors in the logs/error_log under the httpd root?
sandeen: in my testing, the only fs activity was error logging ;) maybe not the best fs test, in the end.
michaellarabel: there will be a new PTS site that clarifies everything a test does, hopefully with PTS 2.2.
michaellarabel: Sure, sandeen, if you want to fix bonnie++ I would be happy with reincluding it, I just haven't had the time.
michaellarabel: I hadn't encountered any permission denied errors at last check
sandeen: michaellarabel, they weren't obvious when the test ran
sandeen: I was using seekwatcher/blktrace to see what the fs access looked like while it was running, because I didn't expect any
sandeen: and traced it all to writing the error log ;)
michaellarabel: hmm odd okay
sandeen: I saw the same thing on fedora & rhel, with and without selinux; I don't have an ubuntu box to test
michaellarabel: Were the permission denied errors within ~/.phoronix-test-suite/installed-tests/apache/ or where?
sandeen: sec, I'll find it
michaellarabel: So is the error_log file missing or why is the permission being denied?
sandeen: no, the errors are -in- that file
sandeen: [Wed Jul 01 14:19:10 2009] [error] [client 127.0.0.1] (13)Permission denied: access to /test.html denied
sandeen: (and I can't figure out why, but my apache-fu sucks)
sandeen: but TBH the apache test is probably not all that relevant for filesystems; if I can get the errors to stop I bet the test reads the file once and that's it ...
sandeen: as it is you may be benchmarking how fast it can write to the error log, unless that's unique to my setup somehow :)
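One way to check whether a given setup has the same problem is to count these entries in error_log after a run. A rough sketch that matches only the message format shown in the line quoted above; `count_denied` is a hypothetical helper, and other Apache versions may word the error differently:

```python
import re
from collections import Counter

# Matches the "(13)Permission denied" entries in the format quoted in the log.
DENIED = re.compile(r"\(13\)Permission denied: access to (\S+) denied")


def count_denied(log_lines):
    """Count permission-denied errors per requested path."""
    hits = Counter()
    for line in log_lines:
        match = DENIED.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits


# The single log line pasted in the conversation.
sample = [
    "[Wed Jul 01 14:19:10 2009] [error] [client 127.0.0.1] "
    "(13)Permission denied: access to /test.html denied"
]

print(count_denied(sample))
```

For a real file, pass the open file object directly, for example `count_denied(open("error_log"))`. A count close to the number of requests issued would suggest the benchmark is mostly exercising the error path.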
michaellarabel: when I am back I can take a look to see if it occurs on Ubuntu
sandeen: sorry, btw, if I came off harsh in the comments, I really would like to help with some of this stuff if I can
sandeen: i'd be very happy to have repeatable relevant benchmarks run regularly & published :)
sandeen: do you have a bugtracker or anything? I could log some of this there rather than leaving it to irc
michaellarabel: Yes, it's no problem and I understand... Just thought when I am already working 80~90 hours per week easily, it's hard to dig deeper and to make every article perfect, etc. For now just drop a thread in the PTS section of forums, I usually try to address problems immediately with PTS (when I am not out of the office).
sandeen: I really do think you might be able to leverage the community a little to add some depth to your results
sandeen: but they'd need to understand what's being run w/o needing to invest a ton of time in it.... *shrug* just a thought
michaellarabel: There are some features coming to PTS in versions after 2.0 that should make that easier to contribute back, etc :)
sandeen: ok, well, enjoy your vacation ;)
michaellarabel: I don't deny that there are improvements to be made in some areas, etc. That will hopefully all be tackled soon.
michaellarabel: It's not exactly a vacation, but thanks.
sandeen: I'll see if I can send some patches, but I also have the 80-90 workweek problem sometimes
michaellarabel: Thanks :)
Ebdomos: Question - to replace my laptop screen would cost about 400 dollars [it's cracked]. Is it worth it [core 2 duo 2.4, nvidia 8400gs, 4g of ram], or should I look at just taking out the screen and buying a cheaper lcd and converting it into a desktop, or should I just donate it, and buy an ultramobile?