Yuki Abe, Hiroshi Sasaki, Martin Peres, Koji Inoue, Kazuaki Murakami, and Shinpei Kato wrote the paper, which covers the power and performance of GPU-accelerated systems. The two names you should recognize in particular are Martin Peres and Shinpei Kato.
Martin Peres is the one who has been working on power management and re-clocking support for the open-source Nouveau driver. He has also talked about open-source graphics security and other topics.
Shinpei Kato, meanwhile, has done work on PathScale's open-source NVIDIA compute driver, proposed GPU command scheduling, and most recently has been working on Gdev, an open-source NVIDIA CUDA run-time.
Here's the abstract of their USENIX HotPower paper:
Graphics processing units (GPUs) provide significant improvements in performance and performance-per-watt as compared to traditional multicore CPUs. This energy-efficiency of GPUs has facilitated use of GPUs in many application domains. Albeit energy efficient, GPUs still consume non-trivial power independently of CPUs. It is desired to analyze the power and performance characteristic of GPUs and their causal relation with CPUs. In this paper, we provide a power and performance analysis of GPU-accelerated systems for better understandings of these implications. Our analysis discloses that system energy could be reduced by about 28% retaining a decrease in performance within 1%. Specifically, we identify that energy saving is particularly significant when (i) reducing the GPU memory clock for compute-intensive workload and (ii) reducing the GPU core clock for memory-intensive workload. We also demonstrate that voltage and frequency scaling of CPUs is trivial and even should not be applied in GPU-accelerated systems. We believe that these findings are useful to develop dynamic voltage and frequency scaling (DVFS) algorithms for GPU-accelerated systems.
And their conclusion:
We have presented a power and performance analysis of GPU-accelerated systems based on the NVIDIA Fermi architecture. Our findings include that the CPU is a weak factor for energy savings of GPU-accelerated systems unless power gating is supported by the GPU. In contrast, voltage and frequency scaling of the GPU is significant to reduce energy consumption. Our experimental results demonstrated that system energy could be reduced by about 28% retaining a decrease in performance within 1%, if the performance level of the GPU can be scaled effectively.
In future work, we plan to develop DVFS algorithms for GPU-accelerated systems, using the characteristic identified in this paper. We basically consider such an approach that controls the GPU core clock for memory-intensive workload while controls the GPU memory clock for compute-intensive workload. To this end, we integrate PTX code analysis into DVFS algorithms so that energy optimization can be provided at runtime. We also consider a further dynamic scheme that scales the performance level of the GPU during the execution of GPU code, whereas we restricted a scaling point at the boundary of GPU code in this paper.
A PDF of their GPU power/performance paper, published last week, can be found at USENIX.org.
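For a rough feel of the clock-selection rule described in the abstract and conclusion, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' DVFS algorithm: the `classify` heuristic, its instruction counts and threshold, the `Clocks` type, and the clock values are all hypothetical placeholders, and nothing here talks to a real driver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clocks:
    """A GPU performance level as a core/memory clock pair, in MHz."""
    core_mhz: int
    memory_mhz: int


def classify(arith_ops: int, mem_ops: int, threshold: float = 4.0) -> str:
    """Label a kernel by its arithmetic-to-memory instruction ratio.

    Hypothetical heuristic: the counts stand in for what a PTX code
    analysis might report, and the threshold is an arbitrary placeholder.
    """
    if arith_ops < 0 or mem_ops < 0:
        raise ValueError("instruction counts must be non-negative")
    if mem_ops == 0 or arith_ops / mem_ops >= threshold:
        return "compute-intensive"
    return "memory-intensive"


def choose_clocks(workload: str, high: Clocks, low: Clocks) -> Clocks:
    """Lower the clock the workload depends on least; keep the other high."""
    if workload == "compute-intensive":
        # Compute-bound code: reduce the memory clock only.
        return Clocks(core_mhz=high.core_mhz, memory_mhz=low.memory_mhz)
    if workload == "memory-intensive":
        # Memory-bound code: reduce the core clock only.
        return Clocks(core_mhz=low.core_mhz, memory_mhz=high.memory_mhz)
    raise ValueError(f"unknown workload class: {workload!r}")


# Placeholder clock levels, not measurements from the paper.
HIGH = Clocks(core_mhz=1000, memory_mhz=2000)
LOW = Clocks(core_mhz=500, memory_mhz=1000)

if __name__ == "__main__":
    kind = classify(arith_ops=800, mem_ops=100)
    print(kind, choose_clocks(kind, HIGH, LOW))
```

The decision is made once per kernel, which matches the paper's restriction of scaling points to the boundary of GPU code. Scaling during execution, which the authors mention as future work, would need a different structure.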