
Thread: ARM On Ubuntu 12.04 LTS Battling Intel x86?

  1. #31
    Join Date
    May 2009
    Location
    Finland
    Posts
    12

    The situation with Android is ridiculous. I have an HTC Desire, and this is my first and probably last Android phone (unless the situation changes). It's rooted and running the Oxygen ROM. I like it now, but I'm not going to buy a new one unless the manufacturer guarantees at least 2 years of support. These phones are so full of proprietary crap (graphics, radio, camera) that the ROM community has a very hard time dealing with these things.

  2. #32
    Join Date
    Feb 2012
    Posts
    8

  3. #33
    Join Date
    Jul 2009
    Posts
    31

    Quote Originally Posted by Milli View Post
    Very interesting. This basically nullifies this review.
    Does it? I re-read the review and it doesn't say NEON was used for the PandaBoard (I doubt Ubuntu for ARM enables it by default, since not every SoC implements NEON in hardware, e.g. Tegra 2), so things are pretty much equal, and we should expect equal speed-ups for everyone if SIMD was used.

  4. #34
    Join Date
    Feb 2012
    Posts
    8

    Quote Originally Posted by WillyThePimp View Post
    Does it? I re-read the review and it doesn't say NEON was used for the PandaBoard (I doubt Ubuntu for ARM enables it by default, since not every SoC implements NEON in hardware, e.g. Tegra 2), so things are pretty much equal, and we should expect equal speed-ups for everyone if SIMD was used.
    SSEx is the default FP instruction set for x64 operating systems. Also, scalar SSE is much faster than the outdated x87 and should be used over x87 when possible. ARM NEON cannot be compared to SSE at this point since it does not support double precision. Also, I'm not sure that NEON was not used in the benchmark.
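
    To make the double-precision point concrete, here is a rough sketch of what is lost when a value has to fit in a 32-bit float instead of a 64-bit double. It is plain Python, not NEON or SSE code: the helper name to_float32 is made up for this illustration, and it only emulates single-precision rounding through the standard struct module.

```python
import struct


def to_float32(x):
    """Round a Python float (an IEEE 754 double) to the nearest single-precision value."""
    # Packing as a 4-byte float and unpacking again forces single-precision rounding.
    return struct.unpack("<f", struct.pack("<f", x))[0]


if __name__ == "__main__":
    # 0.1 is not exactly representable in either format, but the single-precision
    # approximation is much further from the double-precision one.
    print("0.1 as double:", repr(0.1))
    print("0.1 as single:", repr(to_float32(0.1)))

    # A single has a 24-bit significand, so above 2**24 it can no longer
    # represent every integer; a double still can.
    big = float(2 ** 24)
    print("2**24 + 1 as double:", big + 1.0)
    print("2**24 + 1 as single:", to_float32(big + 1.0))
```

    Whether that matters depends on the workload: a benchmark that is mostly single-precision or integer work would not care, while double-precision numeric code could not be vectorized with single-precision-only SIMD at all.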

  5. #35
    Join Date
    Feb 2012
    Posts
    6

    Quote Originally Posted by WillyThePimp View Post
    Does it? I re-read the review and it doesn't say NEON was used for the PandaBoard (I doubt Ubuntu for ARM enables it by default, since not every SoC implements NEON in hardware, e.g. Tegra 2), so things are pretty much equal, and we should expect equal speed-ups for everyone if SIMD was used.
    Why do you say that these two vendors' different SIMD implementations result in the same performance gains? They don't, and that's why it's important that SIMD is used on both platforms. Intel's implementation is more powerful than ARM's. It's just in a different league. The Atom supports SSE, SSE2, SSE3 and SSSE3, so it's not just SSE.

  6. #36
    Join Date
    Feb 2012
    Posts
    6

    Quote Originally Posted by atom01 View Post
    I didn't expect such a big difference on x64. The Atom N450 is one generation newer but basically the same as the N270.
    I saw a gain of 2.5x(!) on one test, a 2x gain on another one but generally around 40% faster.

  7. #37
    Join Date
    Oct 2008
    Posts
    106

    Quote Originally Posted by Milli View Post
    I didn't expect such a big difference on x64. The Atom N450 is one generation newer but basically the same as the N270.
    I saw a gain of 2.5x(!) on one test, a 2x gain on another one but generally around 40% faster.
    Not sure it makes sense to compare against 64-bit Atom given that the low-power versions (Medfield included) don't have it. Also, it seems SSE is not that much faster than x87 according to Agner Fog's tables, though I agree it should be used.

  8. #38
    Join Date
    Feb 2012
    Posts
    8

    Quote Originally Posted by ldesnogu View Post
    Not sure it makes sense to compare against 64-bit Atom given that the low-power versions (Medfield included) don't have it. Also, it seems SSE is not that much faster than x87 according to Agner Fog's tables, though I agree it should be used.
    Instruction latency is only part of the story. Even if we put aside potential vectorization, using SSE over x87 should produce denser code because of the better register availability, which should result in less pressure on the narrow Atom decoder. But the real gain should come from using xmm registers for memory move/copy. And again, I'm not sure that NEON was not used in the PandaBoard benchmarks.

  9. #39
    Join Date
    Jul 2009
    Posts
    31

    Quote Originally Posted by Milli View Post
    Why do you say that these two vendors' different SIMD implementations result in the same performance gains? They don't, and that's why it's important that SIMD is used on both platforms. Intel's implementation is more powerful than ARM's. It's just in a different league. The Atom supports SSE, SSE2, SSE3 and SSSE3, so it's not just SSE.
    Let's go with the antithesis. Let's say we should expect no (significant) gains on either platform from compiling with non-vectorized SIMD instructions. Neither platform is using a 64-bit userspace, nor SIMD, anyway. This is for compatibility's sake, of course. If Canonical wants its OS on ARM devices, they have to target the most basic feature, an FP unit, because, as I said, even a leading SoC like Tegra 2 hasn't got NEON. That's why I'm sure no SIMD instructions were used on the ARM machine.

    Also, VFP and NEON are two very elegant SIMD instruction sets. While we cannot claim that NEON's implementation is superior to SSE(x), claiming the opposite is equally wrong; it's a lie. Anyway, SSE2, the default on x86_64, is the first SSE version that introduced, as far as I know, double-precision FP (and integer) operations, but VFP, on the other hand, is the baseline for every modern ARM core and supports double precision. Shall we compare SSE vs NEON on a 32-bit kernel? I'm pretty sure Atom is going to keep losing. And this is with much more mature compiler support for its architecture, better overall system specs and higher power consumption.

  10. #40
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    5,034

    Even Nvidia acknowledged that lacking NEON was a big issue; it's there in Tegra 3. So the easy solution would be to just set hardfp + NEON as the minimum baseline, and completely ignore inferior hardware such as Tegra 2.
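
    For anyone who wants to check a board against such a baseline, a minimal sketch follows. It assumes the usual Linux /proc/cpuinfo layout, where ARM kernels list CPU capabilities on a "Features" line and x86 kernels on a "flags" line. The function names, the required-flag tuple and the two sample texts are made up for illustration; the samples are abbreviated and are not dumps from real Tegra or OMAP hardware. Note that hardfp itself is an ABI choice made at build time, so the only thing the CPU can report is whether the VFP and NEON units it depends on are present.

```python
def parse_cpu_features(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # ARM kernels label the line "Features"; x86 kernels label it "flags".
        if sep and key.strip().lower() in ("features", "flags"):
            features.update(value.split())
    return features


def meets_baseline(features, required=("vfpv3", "neon")):
    """True if every flag of the (hypothetical) baseline is present."""
    return all(flag in features for flag in required)


# Made-up, abbreviated samples; a real /proc/cpuinfo has many more fields.
SAMPLE_WITH_NEON = (
    "Processor\t: ARMv7 Processor rev 2 (v7l)\n"
    "Features\t: swp half thumb fastmult vfp edsp neon vfpv3\n"
)
SAMPLE_WITHOUT_NEON = (
    "Processor\t: ARMv7 Processor rev 0 (v7l)\n"
    "Features\t: swp half thumb fastmult vfp edsp vfpv3 vfpv3d16\n"
)

if __name__ == "__main__":
    for name, text in (("with NEON", SAMPLE_WITH_NEON),
                       ("without NEON", SAMPLE_WITHOUT_NEON)):
        ok = meets_baseline(parse_cpu_features(text))
        print(name, "->", "meets baseline" if ok else "below baseline")
```

    On a real device the text would come from open("/proc/cpuinfo").read() instead of the sample strings. This only tells you what the hardware offers, not whether a given benchmark binary was actually built to use it.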
