More Vulkan NCNN Inference Benchmarks On AMD Radeon vs. NVIDIA GeForce Under Linux


  • More Vulkan NCNN Inference Benchmarks On AMD Radeon vs. NVIDIA GeForce Under Linux

    Phoronix: More Vulkan NCNN Inference Benchmarks On AMD Radeon vs. NVIDIA GeForce Under Linux

    Given the interest from the RealSR-NCNN Vulkan benchmarks on various NVIDIA and AMD Radeon graphics cards looking at this neural network inference framework with the task of upscaling an image by 4x the resolution using RealSR, here are some more benchmarks of the NCNN framework accelerated by Vulkan on different GPUs under Ubuntu Linux...

    http://www.phoronix.com/scan.php?pag...ore-NVIDIA-AMD

  • #2
    Interesting results! Thanks for testing!
    Just wondering, are those networks all fp32 or are maybe a few of them fp16?

    Would it be possible to check the performance of RADV's LLVM backend for the networks where RADV didn't perform that well?


    • #3
      Hmm. How did the Radeon VII go from boss in the previous test to bitch in this one?
      Getting crushed by a 1650 seems rather strange. What am I missing?


      • #4
        Would be nice to see benchmarks with batching, especially on these smaller networks. That would be somewhat more realistic, and it would also overcome some of the overhead that might be holding back AMD cards in this benchmark.

        However, this isn't a complaint aimed at Michael. As you can see from the source code, the benchmark program doesn't expose a parameter for this. If their framework even supports it, I think it would have to be configured in the individual .param files.

        https://github.com/Tencent/ncnn/blob.../benchncnn.cpp


        • #5
          Originally posted by oleid
          Just wondering, are those networks all fp32 or are maybe a few of them fp16?
          I'm not sure how to tell. They have int8 versions of some, which should run well on Radeon VII and on Pascal and Turing NVIDIA GPUs, though it seems those were disabled due to a bug.

          https://github.com/Tencent/ncnn/tree/master/benchmark


          • #6
            Originally posted by dpanter
            Hmm. How did the Radeon VII go from boss in the previous test to bitch in this one?
            Getting crushed by a 1650 seems rather strange. What am I missing?
            Yeah, take a closer look at both articles. The first was super heavy-weight, with results taking on the order of tens of seconds. This article features results all in the single-digit milliseconds! So, we're looking at significantly different workloads.

            As for why, it could have something to do with driver overhead, or it could be primarily down to how efficiently the work is distributed to the GPUs' respective compute elements. Someone could probably gain useful insight into the dominant factor with a Vulkan performance analyzer tool.

            Or, as I previously suggested, a larger batch size would help amortize some of the overheads, while also giving us a more realistic picture of inferencing performance in non-realtime scenarios. It seems that burden is on the benchmark authors, if not the framework itself. I don't see any knob for PTS to twiddle.
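
            To make the amortization argument concrete, here is a toy model in Python. It is only a sketch: the function name and every number in it are hypothetical, not measurements from the article or anything taken from ncnn. It assumes each submission pays a fixed overhead once and then a per-image compute cost.

```python
def per_image_ms(fixed_overhead_ms, compute_ms, batch_size):
    """Average time per image when one submission handles batch_size images.

    Hypothetical cost model: the fixed overhead is paid once per submission,
    the compute cost is paid once per image.
    """
    return fixed_overhead_ms / batch_size + compute_ms


# Made-up cards: A computes faster but pays more overhead per submission.
gpu_a = {"fixed_overhead_ms": 4.0, "compute_ms": 1.0}
gpu_b = {"fixed_overhead_ms": 1.0, "compute_ms": 2.0}

for batch in (1, 4, 16):
    a = per_image_ms(batch_size=batch, **gpu_a)
    b = per_image_ms(batch_size=batch, **gpu_b)
    print(f"batch={batch:2d}  A={a:.2f} ms/image  B={b:.2f} ms/image")
```

            With these invented numbers, card B wins at batch size 1 and card A wins once the batch is large enough. That is the kind of reversal batching could reveal, if overhead really is the dominant factor here.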


            • #7
              More on the state of the Vulkan backend, here: https://github.com/Tencent/ncnn/blob...ncnn-vulkan.md

              It still seems fairly immature.


              • #8
                The only thing one can tell looking at how slower cards finish in front of faster cards on more than one occasion is that either the benchmark or the software itself is messed up.
