Himeno Benchmark

  • Himeno Benchmark

    I saw the Himeno results in the article: Something seemed way off, but then I noticed that the Fortran version of Himeno doesn't behave the same way at all. It seems the performance dip has nothing to do with AVX2 capability; rather, the matrix indexing in the C version is very inefficient. All the scores improve greatly just from fixing the indexing. A blog post here mentions this: You can also make it multithreaded using OpenMP, as per the same blog post, but that's another discussion. I made a repo with the indexing fix so that the C version of Himeno performs closer to its original Fortran variant, and the performance anomaly on modern AMD CPUs goes away entirely. The repo is here for review
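
    For readers who want a concrete picture, here is a minimal sketch of the general kind of indexing change described above. It is not the code from the repo or from Himeno itself: the `Grid` struct, the `AT` macro, and both function names are hypothetical, and the stencil is simplified to a plain six-neighbour sum. The first variant recomputes the full flat offset on every access; the second hoists the row base pointers out of the innermost loop so the inner loop only does unit-stride accesses.

```c
#include <stddef.h>

/* Hypothetical 3D grid stored in one flat array; the "deps" index is fastest. */
typedef struct {
    float *m;
    int rows, cols, deps;
} Grid;

/* Assumed access style: the full flat offset is recomputed on every access. */
#define AT(g, r, c, d) \
    ((g)->m[((size_t)(r) * (g)->cols + (c)) * (g)->deps + (d)])

/* Variant A: six-neighbour sum over the interior, indexing through the macro. */
float sum_neighbours_macro(const Grid *g)
{
    float total = 0.0f;
    for (int i = 1; i < g->rows - 1; i++)
        for (int j = 1; j < g->cols - 1; j++)
            for (int k = 1; k < g->deps - 1; k++)
                total += AT(g, i - 1, j, k) + AT(g, i + 1, j, k)
                       + AT(g, i, j - 1, k) + AT(g, i, j + 1, k)
                       + AT(g, i, j, k - 1) + AT(g, i, j, k + 1);
    return total;
}

/* Variant B: same sum, but row base pointers are computed once per (i, j),
 * leaving only simple unit-stride loads in the innermost loop. */
float sum_neighbours_hoisted(const Grid *g)
{
    const size_t line  = (size_t)g->deps;          /* stride of one j step */
    const size_t plane = (size_t)g->cols * line;   /* stride of one i step */
    float total = 0.0f;
    for (int i = 1; i < g->rows - 1; i++)
        for (int j = 1; j < g->cols - 1; j++) {
            const float *centre = g->m + (size_t)i * plane + (size_t)j * line;
            const float *up     = centre - plane;  /* row (i-1, j) */
            const float *down   = centre + plane;  /* row (i+1, j) */
            const float *north  = centre - line;   /* row (i, j-1) */
            const float *south  = centre + line;   /* row (i, j+1) */
            for (int k = 1; k < g->deps - 1; k++)
                total += up[k] + down[k]
                       + north[k] + south[k]
                       + centre[k - 1] + centre[k + 1];
        }
    return total;
}
```

    Both functions add the same values in the same order, so they should return identical results; whether the hoisted form is actually faster depends on the compiler and flags, so it is worth timing both on your own hardware.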

  • #2
    Thanks! The PTS test profile for it is now updated.
    Michael Larabel