GCC To Receive Automatic Parallelization Support

  • #21
    Another Gentoo user here. I have very conservative CFLAGS so yeah, it's not just for ricers. This sounds yummy but I'm wondering whether it will really be possible to just enable it globally. It might open up a whole can of bugs on various programs?
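To make the "enable it globally" question concrete, here is a minimal sketch of the kind of loop an auto-parallelizer can split across threads and one it has to leave alone. The function names `scale` and `prefix_sum` are made up for illustration. GCC's existing switch for this is `-ftree-parallelize-loops=N`; whether a given GCC version actually parallelizes a particular loop depends on its own analysis, so treat this as an illustration rather than a guarantee.

```c
#include <stddef.h>

/* Each iteration touches only element i, so the iterations are independent
   and a parallelizer may split the index range across threads. */
void scale(double *restrict out, const double *restrict in, size_t n, double k) {
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * k;
}

/* Iteration i reads the result of iteration i-1 (a loop-carried dependence),
   so this loop has to run in order as written. */
void prefix_sum(double *v, size_t n) {
    for (size_t i = 1; i < n; i++)
        v[i] += v[i - 1];
}
```

Something like `gcc -O2 -ftree-parallelize-loops=4 file.c` would be the way to try it on one program before putting it into global CFLAGS.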


    • #22
      gentoo is where it's really at


      • #23
        Originally posted by alec View Post
        Premature optimization is the root of all evil.
        You gentoo users don't test whether it actually gives you any gain...
        Not true. I have a dual boot currently set up between Ubuntu 9.04 (from mid January) and an up-to-date -O3 -march=core2 compiled gentoo system built with a lot of attention to detail to USE flags.

        The difference in performance I experienced in the same version of Wine running Guild Wars was pretty substantial. I can't quote exact numbers since it's been a while since I've even booted into the Ubuntu system, but if anyone is dying of curiosity I'll do a check.


        • #24
          Yet another Gentoo user here. I'm wondering if the x264 encoder won't be able to take advantage of this as it doesn't fully utilize both cores even with two threads specified.

          CFLAGS="-O2 -march=native -pipe"


          • #25
            Originally posted by wswartzendruber View Post
            Yet another Gentoo user here. I'm wondering if the x264 encoder won't be able to take advantage of this as it doesn't fully utilize both cores even with two threads specified.
            Just renice it. By default, it doesn't use much (85% maybe) but if you renice it aggressively you'll get something closer to 97-99% (on 4 cores). Also consider running two or more encodes in parallel.


            • #26
              Originally posted by wswartzendruber View Post
              Yet another Gentoo user here. I'm wondering if the x264 encoder won't be able to take advantage of this as it doesn't fully utilize both cores even with two threads specified.
              I recently converted a DivX file into MPEG using ffmpeg (or mencoder; I don't remember exactly, sorry). I tried the -threads option of ffmpeg, hoping to use all the cores of my C2Q.
              Using "top", I saw that the 4 cores were fully used when the "threads" option was set to "16".
              However, with such a value, the resulting file was crappy: full of colored squares along the border of the screen.

              I think the multithreading option in ffmpeg needs more tuning. Meanwhile, you'll just have to use your quad core as if it were a single core.


              • #27
                From the FAQ...

                3.4 Why do I see a slight quality degradation with multithreaded MPEG* encoding?

                For multithreaded MPEG* encoding, the encoded slices must be independent, otherwise thread n would practically have to wait for n-1 to finish, so it's quite logical that there is a small reduction of quality. This is not a bug.
                It does sound like what you saw may be a bug though.
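A toy model of why independent slices cost something, under loud assumptions: this is not how ffmpeg or any MPEG encoder works internally, and `residual_cost` is a made-up name. It only shows that when each slice must ignore its neighbour's data, the predictor restarts at every slice boundary and the residuals to encode get bigger. At a fixed bitrate, bigger residuals mean less quality.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy cost model: sum of |residual| when each sample is predicted from the
   previous one. The predictor resets to 0 at the start of every slice,
   because a slice may not look at data owned by another thread.
   Assumes slices >= 1. */
long residual_cost(const int *samples, size_t n, size_t slices) {
    long cost = 0;
    size_t per_slice = (n + slices - 1) / slices;  /* ceiling division */
    for (size_t i = 0; i < n; i++) {
        int prev = (i % per_slice == 0) ? 0 : samples[i - 1];
        cost += labs((long)samples[i] - (long)prev);
    }
    return cost;
}
```

With a smooth ramp as input, one slice pays the "no predictor" penalty once, while four slices pay it four times.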


                • #28
                  Originally posted by Saist View Post
                  They switch to Debian
                  This

                  Definitely a nice feature to have in the compiler. I'd also hope there's a cpu detection routine that can detect your CPU and adjust the optimizations on the fly when a program compiled with this new GCC starts up. This will then mean a program will run optimally on whatever CPU it finds itself running on, without any recompiles.
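For what it's worth, a program can already do a rough version of this by hand. Below is a minimal sketch, assuming an x86 target and a GCC recent enough to provide the `__builtin_cpu_init` and `__builtin_cpu_supports` builtins; on anything else it just falls back to the generic path. The names `sum_generic`, `sum_unrolled` and `pick_sum` are hypothetical, and `sum_unrolled` is only a stand-in for a genuinely CPU-tuned variant.

```c
#include <stddef.h>

/* Portable fallback: plain scalar sum. */
long sum_generic(const int *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Stand-in for a tuned variant: 4-way unrolled, same result. */
long sum_unrolled(const int *v, size_t n) {
    long a = 0, b = 0, c = 0, d = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a += v[i];
        b += v[i + 1];
        c += v[i + 2];
        d += v[i + 3];
    }
    for (; i < n; i++)  /* leftover elements */
        a += v[i];
    return a + b + c + d;
}

typedef long (*sum_fn)(const int *, size_t);

/* Pick an implementation once at startup, based on the CPU we run on. */
sum_fn pick_sum(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_unrolled;
#endif
    return sum_generic;
}
```

A caller would do `sum_fn f = pick_sum();` once and then call `f(data, n)`. Note that every extra variant is compiled into the binary, so this trades file size for portability.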


                  • #29
                    I'd also hope there's a cpu detection routine that can detect your CPU and adjust the optimizations on the fly when a program compiled with this new GCC starts up. This will then mean a program will run optimally on whatever CPU it finds itself running on, without any recompiles.
                    It'd also mean that instead of a ~5 KB Hello World we'd have 20, 30, 40, or more.


                    • #30
                      The right answer is to have the specific CPU model encoded in the ELF program header and have the OS launch an object code recompiler the first time you try to execute a non-native binary. (Think about it: it's no harder than LLVM or a Java JIT, and it's easier because you can do whole-program analysis instead of trying to discover dependencies just in time.)
