Ubuntu 12.04 ARM Performance Becomes Very Compelling

  • Ubuntu 12.04 ARM Performance Becomes Very Compelling

    Phoronix: Ubuntu 12.04 ARM Performance Becomes Very Compelling

    Last week I delivered benchmarks showing how Ubuntu 12.04 is ARM-ing up for better performance with ARM-based hardware and detailed some of the plans Canonical has for this architecture going forward. While those benchmarks last week illustrated some significant performance improvements with the Ubuntu 12.04 stack -- in large part due to the switch to hard floating-point support -- the gains are not over. In fact, there are already some striking improvements if using the Texas Instruments OMAP4460 SoC as found on the PandaBoard ES.

    http://www.phoronix.com/vr.php?view=17032

  • #2
    Enabling the hotplug governor will allow cores to be completely powered off (if on separate power planes), and frequency scaling works a bit differently as well.


    • #3
      This is awesome. I'd like to see how Kubuntu and Ubuntu compare in terms of performance though.


      • #4
        I was wondering how the PandaBoard compares overall to a moderately modern laptop, in this case one with a first-generation i5 (480M):
        http://openbenchmarking.org/result/1...BY-1201286BY26

        Yes, I know that they are not meant to be in the same performance class. I wanted to see it anyway.
        Also, the benchmarks are not ideal since I was surfing / working while running them.
        Last edited by ChrisXY; 02-03-2012, 04:46 PM.


        • #5
          Is there also a Core 2 8200 quad and dual tested (with an old DDR2 motherboard?) on OpenBenchmarking? That might have been a little closer.

          It would also be better if someone did a full Tegra® 3 quad-core CPU test with both this Ubuntu 12.04 and the latest and greatest Linaro kernel tree (https://wiki.linaro.org/WorkingGroups/Kernel/KernelTree). Sure, it would still lose to the i5 and even the i3, if only because of its slower RAM and memory subsystem, all else being equal. But it is the best available until the A15, or at least a faster 1.6/1.8 GHz dual A9, comes along. The Tegra 3 Prime also has NEON on board, unlike the Tegra 2, and is a current key device for Linaro.

          I thought Michael was supposed to get a quad Tegra® 3 Prime in the post to test already. What happened there? Didn't you follow it up, Michael?
          Last edited by popper; 02-03-2012, 06:26 PM.


          • #6
            Well it's not about "losing".. It was obvious it would be much slower. But I wanted to see it in perspective.

            I think comparing it with a CULV would be maybe "fairer" on the x86 side...


            • #7
              Yes, it looks like Ubuntu performance is becoming less broken on ARM thanks to the latest updates. But there are still some tests where the Ubuntu-powered PandaBoard ES (OMAP4460, dual ARM Cortex-A9 1.2GHz) falls significantly behind the Gentoo-powered Origenboard (Exynos4210, dual ARM Cortex-A9 1.2GHz).

              This link contains the results from the initial OMAP4460 benchmark article combined with the current results and also with the results from Exynos4210 (which had been posted in the initial article discussion thread). The latest and greatest OMAP4460 results are highlighted in order to make the comparison easier.


              • #8
                I do find it a little odd, though, that the Exynos4210 didn't do even better, as apparently it has an LPDDR3 interface with up to 6.4 GB/s of memory bandwidth, whereas the OMAP4460 has the older LPDDR2 interface with less than that. It seems clear that slower RAM speed is what's really holding these initial SoCs back more than anything.

                Still, given that TI worked closely with ARM as the Advanced Lead Partner on the Cortex-A15's development, I guess their OMAP 5 A15 MPCore part, the OMAP5432 (2-channel DDR3 @ 532 MHz), is just around the corner, with a higher-bandwidth memory interface of up to 8.5 GB/s:
                http://www.ti.com/pdfs/wtbu/SWCT010.pdf
                Last edited by popper; 02-06-2012, 02:48 PM.
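
The 8.5 GB/s figure above can be reproduced with simple arithmetic. The sketch below is only an illustration and rests on assumptions that are not stated in the post: a 32-bit data bus per channel and two transfers per clock (double data rate). The function name `peak_bandwidth_gb_s` is made up for this example.

```python
def peak_bandwidth_gb_s(clock_mhz, channels, bus_width_bits=32, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits=32 and transfers_per_clock=2 are assumptions made for
    this illustration, not values taken from a datasheet.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return transfers_per_second * bytes_per_transfer * channels / 1e9


if __name__ == "__main__":
    # 2-channel DDR3 @ 532 MHz, as quoted for the OMAP5432
    print(f"{peak_bandwidth_gb_s(532, channels=2):.1f} GB/s")
```

Under the same assumed layout, a 400 MHz clock on two channels would give the 6.4 GB/s quoted for the Exynos4210. These are theoretical peaks, so sustained bandwidth in real benchmarks will be lower.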


                • #9
                  Ubuntu on Beaglebone

                  We've been using Ubuntu on the Beaglebone for our Ninja Block ( http://www.kickstarter.com/projects/...d-with-the-web).
                  We've found the performance of Ubuntu equal to, if not better than, Angstrom Linux on the same hardware. The board runs as fast as my 3-year-old laptop (insanely fast).
                  Ubuntu beats Angstrom hands down on the number of packages available for the distro.

                  Cheers,

                  Marcus


                  • #10
                    We added our board to this test... http://openbenchmarking.org/result/1...BY-1201286BY42. It was an original Panda (EA3) with a similar hardfp build from the first week of February. While some of the benchies are predictably 20% slower than the ES, it is interesting to note that some actually performed better. We will try and figure out why that is...

                    Looking at cpufreq-info, we were running at 1 GHz 99.84% of the time. It throttled down to 300 MHz for something, but omap4temp stayed well within range.
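
For anyone who wants to check the same residency numbers without cpufreq-info, here is a minimal sketch that computes them from the kernel's cpufreq statistics. It assumes the usual `time_in_state` text format (one "frequency_in_kHz ticks" pair per line). The sample data and the function name `residency_percent` are invented for illustration and are not output from the board in this post.

```python
def residency_percent(time_in_state_text):
    """Map each frequency (kHz) to the percentage of time spent at it."""
    ticks = {}
    for line in time_in_state_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        ticks[int(parts[0])] = int(parts[1])
    total = sum(ticks.values())
    if total == 0:
        return {freq: 0.0 for freq in ticks}
    return {freq: 100.0 * count / total for freq, count in ticks.items()}


# Made-up sample, shaped so that the top frequency dominates.
SAMPLE = """\
300000 16
600000 0
800000 0
1008000 9984
"""

if __name__ == "__main__":
    for freq_khz, pct in sorted(residency_percent(SAMPLE).items()):
        print(f"{freq_khz / 1000:.0f} MHz: {pct:.2f}%")
```

On a live system the text would typically come from /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state, which is only present when the kernel is built with cpufreq statistics enabled.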


                    • #11
                      pandaboard benchmarking

                      We have done some ad-hoc benchmarking on the PandaBoard here at the U of Arizona. We have noticed considerable improvement in some of our ad-hoc benchmarks running scientific apps when the file system is mounted from a SATA drive over a SATA-to-USB adapter. Have you tried this in any of your benchmarks at Phoronix?

                      drjo


                      • #12
                        We have done a lot of benchmarks that are more HPC-centric than the Phoronix suite; HPC Challenge and HPEC Challenge mostly... see http://meegs.mit.edu/HPEC11.pdf for some results...

                        This is pretty intriguing. We have ported and tested most of the scientific apps associated with the SC Student Cluster Competition, and we are using primarily an NFS mounted filesystem with netbooting across OTG (that paper should be at the CHiMIT site somewhere). I will see if I can recreate your setup and see if it gives us a performance bump.


                        • #13
                          Is the compiler only using the ARM cores?

                          Some of the numbers don't make any sense. For example, the OMAP4460 has hardware H.264 support capable of 1080p30. The GPU doesn't seem to be used by the graphics routines, and NEON or VFP doesn't appear to be used in some of the floating-point routines.


                          • #14
                            Haven't looked at the sources yet...

                            but the performance would lead me to believe NEON is in use... these are extraordinarily good numbers...

                            I just reran the suite ... the biggest change is that I used the latest benchmarks... PTS 3.8 .... most of the individual tests had the same version numbers as the previous PTS test so I will look at that as well... I am still segfaulting on 4 of the tests (hence the lack of data on the graphic-centric tests)...

                            http://openbenchmarking.org/result/1...BY-1201286BY00

                            I will try this one more time with gcc 4.7 and then I think we will have squeezed as much performance out of the ES as we can... I preset my clock with "cpufreq-set -g performance" ... not sure if it stayed there; I had a lot of omap overheating warnings during the ogg encode before it crashed....
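
One way to answer the "not sure if it stayed there" question is to read the active governor back from sysfs after the run. This is a rough sketch that assumes the standard Linux cpufreq layout under /sys/devices/system/cpu; the helper names `read_governors` and `all_on` are made up for this example.

```python
from pathlib import Path


def read_governors(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(sysfs_cpu_root).glob(pattern)):
        cpu_name = path.parent.parent.name  # e.g. "cpu0"
        governors[cpu_name] = path.read_text().strip()
    return governors


def all_on(governors, wanted="performance"):
    """True only if at least one CPU was found and all use `wanted`."""
    return bool(governors) and all(g == wanted for g in governors.values())


if __name__ == "__main__":
    found = read_governors()
    for cpu_name, governor in found.items():
        print(f"{cpu_name}: {governor}")
    print("all on performance:", all_on(found))
```

As far as I know, cpufreq-set only changes CPU 0 unless a CPU is named with -c, so checking every core is worthwhile on a dual-core part. Even with the performance governor active, thermal limits may still cap the clock, which would fit the overheating warnings mentioned above.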


                            • #15
                              That's the odd thing

                              Some of the performance needs NEON running to get those numbers, but other ones, it doesn't seem to be running. It just seemed odd that some of the things were better than others that should have been close. I'll have to go back over the article to pick up on which items. I'm busy tonight, but I think I can get at it tomorrow.

                              It looks like some external components have to be loaded into Linux to get the GPU to work. I found some PowerVR SDK material for the OMAP4 at

                              http://www.imgtec.com/powervr/inside...oads/index.asp
                              (you have to sign-up for it)

                              and this:
                              http://code.google.com/p/math-neon/

                              I haven't tried either yet. I don't have my PandaBoard ES(s) yet. I have some VAR-SOM-OM44 modules but have only played with them and not compiled anything for them. They have the 1.5 GHz OMAP4460. It'll be interesting to see if the benchmarks increase by 25% (1.5/1.2), since the GPU clock does not increase (that I know of).
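
The 25% figure is the ideal case where a benchmark scales linearly with CPU clock. A tiny sketch of that arithmetic follows; the function name is made up, and linear scaling is an assumption that memory-bound or GPU-bound tests will likely not meet.

```python
def ideal_speedup_percent(new_clock_ghz, old_clock_ghz):
    """Best-case gain in percent if performance scaled linearly with clock."""
    return (new_clock_ghz / old_clock_ghz - 1.0) * 100.0


if __name__ == "__main__":
    # 1.5 GHz OMAP4460 module versus the 1.2 GHz PandaBoard ES
    print(round(ideal_speedup_percent(1.5, 1.2), 1))
```

Any test that gains noticeably less than this on the 1.5 GHz part is probably limited by something other than the CPU clock, such as RAM speed or the GPU.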
