After upgrading my DDX, I finally got 2D tiling on my RV710. It was supposed to be the thing that increases fillrate on bandwidth-limited cards.
The mesa-demos fill bench had the exact same numbers with and without 2d tiling. Adding SB on top of 2d tiling improved some numbers, but that too had some curious results in the last test.
This card, according to specs, is capable of 2.3 gigapixels/sec. It has only reached about half of that on the open drivers for years. Tiling was supposed to improve it, but it didn't. Any ideas on why it made no difference are welcome.
Everything was measured on the default power profile, which equals high profile on this card.
Versions: kernel 3.7.10, Mesa 9.1.1, DDX 7.1.0, libdrm 2.4.44
The numbers, both with and without 2d tiling:
Simple fill: 1.3 billion pixels/second
Blended fill: 1.1 billion pixels/second
Textured fill: 1.1 billion pixels/second
Shader1 fill: 1.1 billion pixels/second
Shader2 fill: 543.8 million pixels/second
With SB:
Simple fill: 1.3 billion pixels/second
Blended fill: 1.1 billion pixels/second
Textured fill: 1.2 billion pixels/second
Shader1 fill: 1.2 billion pixels/second
Shader2 fill: 588.0 million pixels/second
SB gave some minor improvement. However, note the shader2 value: almost exactly half of shader1.
Shader2 consists of shader1 + many no-ops that should be optimized out. By printing the results with R600_DEBUG=sb,sbstat,ps I could see both shaders were optimized to the exact same instructions.
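For anyone who wants to repeat the shader dump, here is a minimal Python sketch of running a program with R600_DEBUG set and capturing its output. The helper names (`r600_debug_env`, `run_with_dump`) are made up for illustration, and the benchmark path in the usage note is an assumption that depends on where your mesa-demos build puts the binary. I am also assuming the dump goes to stderr, so the sketch merges stderr into stdout to catch it either way.

```python
import os
import subprocess
import sys


def r600_debug_env(flags, base_env=None):
    """Return a copy of the environment with R600_DEBUG set to the given flags.

    `base_env` defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["R600_DEBUG"] = ",".join(flags)
    return env


def run_with_dump(command, flags=("sb", "sbstat", "ps")):
    """Run `command` with R600_DEBUG set and return its combined output as text."""
    proc = subprocess.run(
        command,
        env=r600_debug_env(flags),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the dump is captured too
        text=True,
        check=False,
    )
    return proc.stdout


if __name__ == "__main__":
    # Pass the benchmark command on the command line; nothing is run otherwise.
    if len(sys.argv) > 1:
        print(run_with_dump(sys.argv[1:]))
```

Saved under a hypothetical name such as `dump_shaders.py`, it could be called as `python dump_shaders.py /path/to/fill > dump.txt`, and the dumps for the two shaders compared by eye or with diff.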
So, we have two curious things here:
- Why is the fillrate still only half of what the hardware is capable of?
- Why does the exact same shader run at half the speed, when only the pre-optimization shader differs?
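To put numbers on both questions, here is a rough Python sketch that parses the benchmark lines in the format quoted in this post and computes two ratios: simple fill against the 2.3 gigapixels/sec spec figure, and shader2 against shader1. The function names are made up for illustration, and since the bench only prints about two significant digits, the ratios are approximate.

```python
import re

# Scale words as they appear in the benchmark output quoted in this post.
SCALE = {"million": 1e6, "billion": 1e9}

LINE_RE = re.compile(
    r"^(?P<name>.+?) fill:\s+(?P<value>[\d.]+)\s+"
    r"(?P<scale>million|billion) pixels/second$"
)

SPEC_FILLRATE = 2.3e9  # pixels/second, the spec figure quoted in this post

WITHOUT_SB = """\
Simple fill: 1.3 billion pixels/second
Blended fill: 1.1 billion pixels/second
Textured fill: 1.1 billion pixels/second
Shader1 fill: 1.1 billion pixels/second
Shader2 fill: 543.8 million pixels/second
"""

WITH_SB = """\
Simple fill: 1.3 billion pixels/second
Blended fill: 1.1 billion pixels/second
Textured fill: 1.2 billion pixels/second
Shader1 fill: 1.2 billion pixels/second
Shader2 fill: 588.0 million pixels/second
"""


def parse_fill_output(text):
    """Return {test name: pixels per second} for lines in the quoted format."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            value = float(match.group("value")) * SCALE[match.group("scale")]
            results[match.group("name")] = value
    return results


def summarize(results):
    """Compute the two ratios behind the questions in this post."""
    return {
        "simple_vs_spec": results["Simple"] / SPEC_FILLRATE,
        "shader2_vs_shader1": results["Shader2"] / results["Shader1"],
    }


if __name__ == "__main__":
    for label, text in (("without SB", WITHOUT_SB), ("with SB", WITH_SB)):
        s = summarize(parse_fill_output(text))
        print(
            f"{label}: simple/spec = {s['simple_vs_spec']:.2f}, "
            f"shader2/shader1 = {s['shader2_vs_shader1']:.2f}"
        )
```

With the numbers above, simple fill comes out at roughly 57% of the spec figure in both runs, and shader2 at roughly 49% of shader1 in both runs.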