RadeonSI Change Allows For Balancing RDNA3 Video Transcoding Between Multiple Engines
A change merged today for the Mesa 23.2 graphics driver stack benefits video transcoding performance for new Radeon RX 7000 series "RDNA3" graphics cards.
The change merged to the RadeonSI Gallium3D driver benefits RDNA3 (GFX11) graphics processors that sport multiple Video Core Next (VCN) engines. By creating an additional context during video transcoding, the driver now allows the decode and encode work to be load balanced across multiple VCN engines.
AMD engineer Leo Liu explained with the RadeonSI patch:
"For CHIP_GFX1100, there are 2 VCN instances but using unified queue i.e. decode and encode will go to HW via same ring type. With AMDGPU kernel scheduler, since the trancode is sharing the same pipe context, so that the gpu scheduler assign the decode and encode into the same VCN engine. In order to use both engines with transcode case, the new pipe context will be created when the case being detected, with that the transcode can be load balanced with multiple VCN engines."
More details for those interested are available via this merge request, which is now in Mesa 23.2 for debuting next quarter. The request also raised the question of why two contexts are ultimately created rather than having the AMDGPU kernel scheduler better handle the situation, to which Leo Liu explained: "GPU scheduler is not aware whether the job is decode or encode with VCN4 unified queue(previous aka vcn_enc ring). Instead of 2 rings(vcn_dec and vcn_enc with legacy VCN3), for transcode case, there is only one context for scheduler, so all the jobs are scheduled to the same engine from this unified queue. It would be with big changes if getting this from kernel."
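To illustrate the reasoning in those explanations, here is a rough toy model in Python. It is not the RadeonSI or AMDGPU code: the `ToyScheduler` class, the `transcode` helper, the context names, and the "pin each context to the least loaded engine" rule are all assumptions made for illustration. It only shows why a scheduler that cannot tell decode from encode jobs keeps a single-context transcode on one engine, while a second context lets the work spread across two.

```python
class ToyScheduler:
    """Hypothetical scheduler: every context is pinned to a single engine."""

    def __init__(self, num_engines):
        self.load = [0] * num_engines   # jobs handled per engine
        self.engine_of = {}             # context id -> engine index

    def submit(self, context_id, job_kind):
        # job_kind is deliberately ignored: with a unified queue the
        # scheduler cannot distinguish decode jobs from encode jobs.
        if context_id not in self.engine_of:
            # Assumption: a new context goes to the least loaded engine.
            self.engine_of[context_id] = self.load.index(min(self.load))
        engine = self.engine_of[context_id]
        self.load[engine] += 1
        return engine


def transcode(frames, shared_context):
    """Submit one decode and one encode job per frame; return per-engine load."""
    sched = ToyScheduler(num_engines=2)
    dec_ctx = "ctx0"
    # With shared_context, encode reuses the decode context (the old behavior
    # described above); otherwise it gets its own context.
    enc_ctx = "ctx0" if shared_context else "ctx1"
    for _ in range(frames):
        sched.submit(dec_ctx, "decode")
        sched.submit(enc_ctx, "encode")
    return sched.load


if __name__ == "__main__":
    print(transcode(100, shared_context=True))    # [200, 0]
    print(transcode(100, shared_context=False))   # [100, 100]
```

In this simplified model the single shared context leaves the second engine idle, while giving the encode side its own context splits the jobs evenly. Real hardware scheduling is considerably more involved.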
When it comes to Video Core Next, the other exciting aspect of VCN 4.0 with RDNA3 GPUs is the addition of AV1 video encoding.