ARM Cortex-A9 PandaBoard ES Benchmarks


  • #41
    Wow, nice! Much more in-line with what I was expecting. And a Pandaboard with hardfp should get in the same ballpark as that, right? Suddenly the power question is in much sharper relief.



    • #42
      Originally posted by ssvb
      BTW, here are my benchmark results from gentoo (hardfp) running on origenboard (dual-core ARM Cortex-A9 @1.2GHz): http://openbenchmarking.org/result/1...AR-1112277AR91
      Thanks for doing that! These results make me think the Pandaboard was misconfigured.



      • #43
        How much is from hardfp and how much from a tweaked arm core (exynos)? I recall samsung's pr saying the exynos outruns omap4.



        • #44
          Originally posted by Wyatt
          Wow, nice! Much more in-line with what I was expecting. And a Pandaboard with hardfp should get in the same ballpark as that, right? Suddenly the power question is in much sharper relief.
          Yes, Pandaboard ES and Origen board should have similar performance.

          Originally posted by curaga
          How much is from hardfp and how much from a tweaked arm core (exynos)?
          Using the hardfp API should help on some floating-point-heavy tests that make lots of function calls. Running at a real 1.2GHz clock speed (vs. allegedly 920MHz) should help everywhere. Also, if gcc in Ubuntu is configured to generate Thumb-2 code by default, that could affect performance too, depending on the quality of the compiler.

          I recall samsung's pr saying the exynos outruns omap4.
          Exynos4210 may have a considerably faster memory controller, especially compared to the older OMAP4430, which had some serious problems with memory performance. But OMAP4460 was supposed to resolve the issue.

          Originally posted by ldesnogu
          Thanks for doing that! These results make me think the Pandaboard was misconfigured.
          The question is how this could happen after all the supervision that Ubuntu got from Linaro, especially considering Linaro's focus on the kernel and toolchain areas.

          Also, a major problem for ARM when running tests like this is the lack of sane support for runtime CPU feature detection (support for the -march=native and -mtune=native options in gcc, reliable NEON detection and use in all the NEON-optimized libraries). ARM Ltd. has been aware of the problem for years, but has done nothing to address it.
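
          To illustrate the kind of ad hoc detection projects end up with, here is a minimal sketch that looks for "neon" in the Features line of /proc/cpuinfo. The helper name has_neon and its optional file argument are made up for illustration, and it assumes the ARM Linux cpuinfo layout; it is a workaround of the sort described above, not a proper solution.

          ```shell
          #!/bin/sh
          # Hypothetical helper: succeed if the given cpuinfo file lists "neon"
          # on a Features line. Defaults to /proc/cpuinfo (ARM Linux layout assumed).
          has_neon() {
              cpuinfo="${1:-/proc/cpuinfo}"
              # Keep only the Features line(s), then look for "neon" as a whole word.
              grep -E '^Features[[:space:]]*:' "$cpuinfo" 2>/dev/null |
                  grep -Eq '(^|[[:space:]])neon([[:space:]]|$)'
          }

          if has_neon; then
              echo "NEON reported by kernel"
          else
              echo "NEON not reported"
          fi
          ```

          Every build script that wants NEON has to carry something like this (or skip the check entirely), which is why results vary so much between distributions.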



          • #45
            Originally posted by ssvb
            hardfp API
            This was a typo and should be "hardfp ABI".



            • #46
              As for the benchmarks, "compress-7zip" test compiles the code with -O optimization, which is equivalent to -O1:
              Code:
              $ cat install.log 
              mkdir -p bin
              make -C CPP/7zip/Bundles/Alone all
              make[1]: Entering directory `/mnt/mmcblk0p2/.phoronix-test-suite/installed-tests/pts/compress-7zip-1.6.0/p7zip_9.20.1/CPP/7zip/Bundles/Alone'
              g++ -O -pipe -s -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DNDEBUG -D_REENTRANT -DENV_UNIX -D_7ZIP_LARGE_PAGES -DBREAK_HANDLER -DUNICODE -D_UNICODE -c -I. -I../../../myWindows -I../../../ -I../../../include_windows ../../../myWindows/myGetTickCount.cpp
              g++ -O -pipe -s -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DNDEBUG -D_REENTRANT -DENV_UNIX -D_7ZIP_LARGE_PAGES -DBREAK_HANDLER -DUNICODE -D_UNICODE -c -I. -I../../../myWindows -I../../../ -I../../../include_windows ../../../myWindows/wine_date_and_time.cpp
              ...
              This can be solved by setting the EXTRAOPTFLAGS environment variable to something more reasonable, at least "-O2".
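
              A quick way to check which level a test was really built with is to pull the last -O flag out of a compiler line from install.log (with GCC, the last -O option on the command line is the one that takes effect). A minimal sketch follows; the helper name last_opt_flag is made up for illustration, and it only recognizes the common GCC spellings.

              ```shell
              #!/bin/sh
              # Hypothetical helper: read a compiler command line on stdin and print
              # the last -O flag in it, or "none" if no such flag is present.
              last_opt_flag() {
                  tr ' \t' '\n\n' |                          # one token per line
                      grep -E '^-O[0-3sg]?$|^-Ofast$' |      # keep optimization flags only
                      tail -n 1 |                            # the last one wins
                      grep . || echo none
              }

              echo 'g++ -O -pipe -s -DNDEBUG -c foo.cpp' | last_opt_flag
              echo 'gcc -c foo.c' | last_opt_flag
              ```

              To use it on a real log, something like "grep '^g++' install.log | head -n 1 | last_opt_flag" should do.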

              The build system for libvpx clearly does not use NEON, which explains the poor results for the "VP8 libvpx Encoding" test:
              Code:
               # cat install.log 
              Configuring selected codecs
                enabling vp8_encoder
                enabling vp8_decoder
              Configuring for target 'generic-gnu'
                enabling generic
              Creating makefiles for generic-gnu libs
              Creating makefiles for generic-gnu examples
              Creating makefiles for generic-gnu docs
                  [DEP] vpx_config.c.d
                  [DEP] vp8/decoder/reconintra_mt.c.d
                  [DEP] vp8/decoder/idct_blk.c.d
                  [DEP] vp8/decoder/threading.c.d
                  [DEP] vp8/decoder/onyxd_if.c.d
              ...
              Trying to configure libvpx as "./configure --target=armv7-linux-gcc" spits out a funny error message: "Unable to invoke compiler: arm-none-linux-gnueabi-gcc -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64". Why would they expect the compiler to be named this way?

              Some other tests may show suboptimal results too, but I haven't looked there yet.



              • #47
                Originally posted by ssvb
                BTW, here are my benchmark results from gentoo (hardfp) running on origenboard (dual-core ARM Cortex-A9 @1.2GHz): http://openbenchmarking.org/result/1...AR-1112277AR91
                Great set of benchmarks... So if your benchmarks are accurate, it shows the Cortex A9 coming in faster than the Intel ATOM series, and even faster than the Pentium 4 in many tests!



                • #48
                  Enabled the use of NEON in the "VP8 libvpx Encoding" test by hacking the libvpx build scripts:
                  Code:
                  $ ./configure --target=armv7-linux-gcc
                  Configuring selected codecs
                    enabling vp8_encoder
                    enabling vp8_decoder
                  Configuring for target 'armv7-linux-gcc'
                    enabling armv7
                    enabling armv6
                    enabling armv5te
                    enabling fast_unaligned
                  Creating makefiles for armv7-linux-gcc libs
                  Creating makefiles for armv7-linux-gcc examples
                  Creating makefiles for armv7-linux-gcc docs
                  This improves the frames-per-second rating from 1.01 to 1.35 for Exynos4210, though this is still worse than the 1.55 shown by Intel Atom.



                  • #49
                    Originally posted by ssvb
                    As for the benchmarks, "compress-7zip" test compiles the code with -O optimization, which is equivalent to -O1
                    And it appears that this was not the most broken test in the set. The winner is SMALLPT:
                    Code:
                    $ cat install.sh
                    #!/bin/sh
                     
                    tar -zxvf smallpt-1.tar.gz
                    g++ -fopenmp smallpt.cpp -o smallpt-renderer
                    echo $? > ~/install-exit-status
                     
                    echo "#!/bin/sh
                    ./smallpt-renderer 100 > \$LOG_FILE 2>&1
                    echo \$? > ~/test-exit-status" > smallpt
                    chmod +x smallpt
                    This test program gets built without any optimizations at all! If we append the -O3 option to the existing -fopenmp, the result for Exynos 4210 improves from 2489 seconds to 557 seconds! This is very disturbing and shows that phoronix-test-suite needs some major fixes. And a lot of the data collected at openbenchmarking.org up to this moment is just useless garbage.
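
                    For anyone who wants to reproduce this, here is a minimal sketch of the change and of the speedup it implies. The helper name add_o3 is made up for illustration; it just rewrites the compile line from install.sh, and the timings are the two quoted above.

                    ```shell
                    #!/bin/sh
                    # Hypothetical helper: read install.sh lines on stdin and add -O3 right
                    # after the existing -fopenmp on the g++ line; other lines pass through.
                    add_o3() {
                        sed 's/^g++ -fopenmp /g++ -fopenmp -O3 /'
                    }

                    echo 'g++ -fopenmp smallpt.cpp -o smallpt-renderer' | add_o3

                    # Speedup implied by the two timings (2489 s before, 557 s after).
                    awk 'BEGIN { printf "%.2fx\n", 2489 / 557 }'   # prints 4.47x
                    ```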



                    • #50
                      While that does improve the Exynos results, it doesn't invalidate the results relative to each other. They all got the same optimization.
