MIPS Loongson 3A Benchmarks On Debian

  • Kano
    replied
    A very simple way to compare single-core speed. Your Opterons will be slow as hell. Intel i7-3770S with Dragonfire/wheezy:
    Code:
    35.11user 0.26system 0:35.40elapsed 99%CPU (0avgtext+0avgdata 1496maxresident)k
    Single-core speed matters for most apps; otherwise, what you want to compare are things that scale well with more cores, like compiling with more threads. You can be sure even web browsing is faster when you have 2 fast cores rather than 4-16 slow cores.


  • maldorordiscord
    replied
    Originally posted by Kano
    How about that real world "benchmark":
    Code:
    echo 'define f(x){r=1;while(x>1){r*=x;x-=1};return r};f(50000)'|time bc
    Where is the utility value? Do you need this every day, like browsing the Internet?

    Do you need this to watch porn?


  • Kano
    replied
    How about that real world "benchmark":
    Code:
    echo 'define f(x){r=1;while(x>1){r*=x;x-=1};return r};f(50000)'|time bc
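
For readers without bc at hand, the same loop can be sketched in Python. This is only a rough analogue under stated assumptions: the names `factorial` and `time_factorial` are made up for illustration, and Python's big-integer arithmetic is implemented differently from bc's, so timings from the two tools are not comparable with each other.

```python
import time


def factorial(n: int) -> int:
    """Iterative factorial, mirroring the while loop in the bc one-liner."""
    r = 1
    while n > 1:
        r *= n
        n -= 1
    return r


def time_factorial(n: int) -> float:
    """Return the CPU seconds spent computing factorial(n) on one thread."""
    start = time.process_time()
    factorial(n)
    return time.process_time() - start


if __name__ == "__main__":
    print(f"f(50000) took {time_factorial(50000):.2f} s of CPU time")
```

Like the bc version, this only exercises one core with arbitrary-precision multiplication, so it says nothing about workloads that scale across cores.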


  • maldorordiscord
    replied
    "Some reject a good idea for the sole reason that it is not theirs." Luis Buñuel Portolés


  • maldorordiscord
    replied
    Originally posted by ldesnogu
    You made my day

    I was being naive thinking you could learn something, I'll just ignore you. Plonk!
    I thought the same about you.
    My argument is technically based on Gustafson's law:
    http://en.wikipedia.org/wiki/Gustafson%27s_law
    "which says that computations involving arbitrarily large data sets can be efficiently parallelized."
    Your single-thread argument is bullshit on large data sets, and in reality you always have large data sets.
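
As a hedged illustration of the law cited above, the sketch below computes Gustafson's scaled speedup next to Amdahl's fixed-size speedup. The function names and the sample serial fraction of 0.1 are assumptions chosen for illustration, not figures from this thread.

```python
def gustafson_speedup(serial_fraction: float, cores: int) -> float:
    """Scaled speedup when the problem size grows with the core count."""
    return cores - serial_fraction * (cores - 1)


def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Speedup for a fixed problem size."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


if __name__ == "__main__":
    # 0.1 is an assumed serial fraction; 1.0 models a purely serial task.
    for s in (0.1, 1.0):
        print(s, gustafson_speedup(s, 4), amdahl_speedup(s, 4))
```

With a serial fraction of 1.0, which models a purely single-threaded task, both formulas give a speedup of 1 regardless of the number of cores.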

    And the Loongson CPU does have "out-of-order execution" precisely for this: to speed up single-thread tasks with multicore tricks.


  • ldesnogu
    replied
    Originally posted by maldorordiscord
    your babbling makes no sense!
    You made my day

    I was being naive thinking you could learn something, I'll just ignore you. Plonk!


  • maldorordiscord
    replied
    Originally posted by ldesnogu
    And Loongson doesn't have anything like that.
    Loongson does have 4 MB of cache instead of 1 MB, and it has a second HT link instead of one. This means it can communicate with a second core, and the second core can hold twice the RAM, so it is a NUMA system. The second core can assist with its cache, which means a dual-socket Loongson system has 8 MB of cache, twice the RAM speed of a single CPU, and half the RAM latency through parallel access.

    And you don't understand what thread-level speculation (TLS) is; it is not a CPU feature.
    You can do this on ALL CPUs with "out-of-order execution"!
    http://en.wikipedia.org/wiki/Speculative_multithreading

    Originally posted by ldesnogu
    Then your application is not single threaded any more.
    This is wrong, because all the other tasks only do assisting helper jobs.
    Technically the original task is still single-threaded.

    Originally posted by ldesnogu
    We were talking about very simple benchmarks.
    No, you dream about simple, useless benchmarks and I talk about real-world usage of the CPU. In reality the result is different from your dream world.

    Originally posted by ldesnogu
    Go re-read the article you linked yourself. And look up with Google what these benchmarks are. Instead of talking, read and learn.
    I already did. These benchmarks are useless synthetic benchmarks; no useful real-world application will be as stupid as these synthetic tests.

    Originally posted by ldesnogu
    I'm close to thinking you are just a troll: as soon as you can't answer something, you derail the discussion. That's a real pain and makes me sad for you because you could learn a lot of things by thinking and reading what others have to say instead of wanting to be right at any cost.
    You don't understand my arguments; your arguments do not match my proposals.
    Example: I talk about thread-level speculation/speculative multithreading and you say the Loongson does not have these features, but the Loongson does have the "out-of-order execution" feature.
    Wikipedia about TLS: "(TLS), is a dynamic parallelization technique that depends on out-of-order execution to achieve speedup on multiprocessor CPUs."
    Wikipedia about the Loongson feature list: "out-of-order execution"
    Your babbling makes no sense! You can write software for the Loongson that uses the "out-of-order execution" feature of the Loongson CPU to use the multicore CPU to speed up single-thread execution with speculative multithreading.

    This is the proof that I just know more than you about the topic: speeding up single-thread tasks on multicore systems. Because of this, my answers do not fit your expectations.

    My answer is technically correct and I can prove it!

    "out-of-order execution" + "speculative multithreading" break your neck!


  • maldorordiscord
    replied
    Originally posted by smitty3268
    Explain why my arguments are wrong. They aren't.

    You lose.
    What arguments? You have no arguments.


  • ldesnogu
    replied
    Originally posted by maldorordiscord
    Be sure I did not fall into his trap! I know many tests of single-core vs dual-core CPUs with only a single-thread workload.
    Example? With hard numbers, on a web site with a good reputation.

    Originally posted by maldorordiscord
    I gave him the right answer, not the answer he expected and not the answer he thinks is right.
    No, you gave a bogus answer.

    Originally posted by maldorordiscord
    Higher clock speed is an invalid argument for a CPU like the Loongson, because the CPU design does not allow you to clock higher. Back in the time of the Athlon 64 dual-cores vs single-cores, the CPU design also did not allow a higher clock speed: you can clock the dual-core versions as high as the single-core version if your cooling system is good, because it is the CPU design that limits your clock speed.
    If you have multiple cores, you have to take care of coherence and you have to share hardware resources (such as the external bus), so this might have an impact on the maximum clock because you might add delay on a critical path (look up that term with Google).

    Originally posted by maldorordiscord
    Your algorithms argument is also invalid, because you can speed up a single thread so much with multicore-assisted calculations that no single-core CPU can compete with that.
    You can do speculative calculations on the second core to speed things up.
    You can use the second core to assist with caching and pre-caching the hard drive.
    Then your application is not single threaded any more.
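
The point under dispute here can be made concrete with a small sketch. Assuming a hypothetical `prefetching_sum` helper (not taken from any post or library), a second thread stages data ahead of the consumer; the summation itself stays sequential, yet the process now runs on two threads. In CPython this shows the structure only, not a measured speedup.

```python
import queue
import threading


def prefetching_sum(chunks):
    """Sum all numbers while a helper thread fetches chunks ahead of the consumer.

    Returns the total and the number of distinct threads that took part.
    """
    q = queue.Queue(maxsize=2)  # the helper stays at most two chunks ahead
    done = object()  # sentinel marking the end of the input
    thread_ids = {threading.get_ident()}  # the consumer's own thread

    def prefetch():
        thread_ids.add(threading.get_ident())
        for chunk in chunks:
            q.put(list(chunk))  # stands in for a slow disk read
        q.put(done)

    helper = threading.Thread(target=prefetch)
    helper.start()

    total = 0
    while True:
        chunk = q.get()
        if chunk is done:
            break
        total += sum(chunk)  # the "single-threaded" work
    helper.join()
    return total, len(thread_ids)


if __name__ == "__main__":
    print(prefetching_sum([[1, 2, 3], [4, 5], [6]]))
```

Whether one still calls such a program single-threaded is exactly the disagreement in this exchange; the sketch only shows that the helper work needs a thread of its own.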

    We were talking about very simple benchmarks. Go re-read the article you linked yourself. And look up with Google what these benchmarks are. Instead of talking, read and learn.

    Originally posted by maldorordiscord
    You can also improve single-thread performance on multicore with thread-level speculation (TLS).
    See "Boosting Single-thread Performance in Multi-core Systems through Fine-Grain Multi-Threading".
    And Loongson doesn't have anything like that.

    I'm close to thinking you are just a troll: as soon as you can't answer something, you derail the discussion. That's a real pain and makes me sad for you because you could learn a lot of things by thinking and reading what others have to say instead of wanting to be right at any cost.


  • smitty3268
    replied
    Originally posted by maldorordiscord
    Just explain the other arguments to me, too:

    (1) Pre-caching with the second core.
    (2) All other OS tasks fit onto the second core.
    (3) On Linux, the hard-drive encryption fits onto the other cores.
    (4) NUMA effects: for single-thread apps on multi-socket systems, the L3 caches of all CPUs are used to cache the single-threaded process.
    (5) More RAM over NUMA: more RAM means more speed if the working set is bigger.
    (6) 4 MB L2 cache vs 1 MB L1 cache in the emulation.

    A single-socket system like the AMD 8150 has at most 32 GB of RAM; a dual-socket G34 system has 128 GB of RAM with the same 8 GB modules in every slot.
    It's 32 GB of RAM vs 128 GB of RAM, and you can use this in a single "thread".
    The same goes for the cache: a dual-socket Opteron 6300 system has 32 MB of L3 cache usable over the NUMA architecture in a single thread.
    A modern OS has many threads, which means the multi-core system can handle all the other threads, and this speeds up the main thread.
    Explain why my arguments are wrong. They aren't.

    You lose.
