M3 Architecture Research Group
Computer Systems Laboratory
361 Frank H.T. Rhodes Hall
Ithaca, NY 14853 USA
m3 at csl.cornell.edu
Power-aware Parallel Architectures
In recent years, power has become a top-priority concern for chip designers and manufacturers. The simultaneous drive toward multicore chips creates interesting opportunities and challenges for optimizing parallel processing on multiprocessors in general, and on multicore chips in particular, under varying power, performance, and application characteristics.
In [HPCA'04] we propose the Thrifty Barrier, which strives to counter the power waste resulting from inefficient synchronization while minimizing the impact on performance. We leverage low-power sleep or active states in existing processors, and use prediction mechanisms to force waiting processors into an appropriate low-power sleep mode and wake them up in time to resume computation with little impact on execution time.
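The idea of matching a sleep state to a predicted barrier stall can be illustrated with a minimal sketch. It is not the mechanism from the paper: the sleep states, their latencies and power fractions, and the last-value predictor are all hypothetical, chosen only to show the selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SleepState:
    name: str
    transition_us: float   # hypothetical round-trip (enter + exit) latency
    power_fraction: float  # hypothetical power relative to active spinning


# Hypothetical states, ordered from shallowest to deepest.
STATES = [
    SleepState("spin", 0.0, 1.0),
    SleepState("light_sleep", 10.0, 0.5),
    SleepState("deep_sleep", 100.0, 0.1),
]


def pick_state(predicted_stall_us, states=STATES):
    """Return the lowest-power state whose transition fits in the predicted stall."""
    stall = max(predicted_stall_us, 0.0)
    fitting = [s for s in states if s.transition_us <= stall]
    return min(fitting, key=lambda s: s.power_fraction)


class LastValuePredictor:
    """Predict the next stall at a barrier as the last stall observed there."""

    def __init__(self):
        self.last = {}

    def predict(self, barrier_id):
        # With no history, predict no stall, which selects plain spinning.
        return self.last.get(barrier_id, 0.0)

    def update(self, barrier_id, observed_stall_us):
        self.last[barrier_id] = observed_stall_us
```

A thread arriving early at a barrier would call `predict`, pass the result to `pick_state`, and call `update` with the measured stall once the barrier releases. A real design also has to handle mispredictions, for example by waking the processor early when the last thread arrives.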
In [ISPASS'05, TACO'05] we propose to characterize a program's parallel efficiency and combine it with the multicore chip's power budget to assign the optimum number of processors, together with the appropriate levels of voltage and frequency scaling. In that work we analyze the power-performance implications of questions such as: (1) For a particular performance target, what number of processors allows the chip to reach that target at minimum power consumption? (2) What configuration yields the maximum performance within a certain power/temperature budget?
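Both questions can be made concrete with a toy model. The sketch below is an illustration under loudly stated assumptions, not the model used in the papers: parallel efficiency follows Amdahl's law with a hypothetical serial fraction, voltage scales linearly with frequency so per-core dynamic power grows as f^3, and each core adds a fixed hypothetical static power. Temperature is not modeled.

```python
def efficiency(n, serial_fraction=0.05):
    """Parallel efficiency (speedup / n) under Amdahl's law (assumed model)."""
    speedup = 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)
    return speedup / n


def performance(n, f):
    """Performance relative to one core running at normalized frequency 1.0."""
    return n * efficiency(n) * f


def power(n, f, static=0.1):
    """Chip power: n cores, each with f^3 dynamic power plus assumed static power."""
    return n * (f ** 3 + static)


def min_power_config(target, core_counts, freqs):
    """Question (1): cheapest (power, n, f) that meets the performance target."""
    feasible = [(power(n, f), n, f)
                for n in core_counts for f in freqs
                if performance(n, f) >= target]
    return min(feasible) if feasible else None


def max_perf_config(budget, core_counts, freqs):
    """Question (2): fastest (performance, n, f) that stays within the power budget."""
    feasible = [(performance(n, f), n, f)
                for n in core_counts for f in freqs
                if power(n, f) <= budget]
    return max(feasible) if feasible else None
```

With core counts 1, 2, 4, 8 and normalized frequencies 0.5, 0.75, 1.0, this toy model meets a target of 1.0 most cheaply with four cores at the lowest frequency rather than one core at full speed. That is the kind of trade-off the questions above are about.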
In [HPCA'06] we leverage the insights from the earlier study to develop a mechanism that dynamically finds an optimum power-performance operating point for a given parallel region of a program, within the two-dimensional space of number of processors and voltage/frequency levels. We present ways to greatly speed up such a search by cutting down on the number of levels explored in each of these two dimensions. We show that, for the applications and system configuration studied, our proposed mechanism virtually converges to the global optimum.
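One generic way to explore fewer points than an exhaustive sweep is sketched below. It is not the search heuristic from the paper: it simply assumes that performance never decreases as frequency rises, so for each candidate processor count a binary search finds the lowest frequency level that meets the target. The callbacks `measure_perf` and `measure_power` are hypothetical stand-ins for runtime measurements.

```python
def lowest_feasible_level(levels, meets_target):
    """Binary search over ascending levels for the first one meeting the target.

    Assumes meets_target is monotonic: once true, it stays true for higher levels.
    """
    lo, hi = 0, len(levels)
    while lo < hi:
        mid = (lo + hi) // 2
        if meets_target(levels[mid]):
            hi = mid
        else:
            lo = mid + 1
    return levels[lo] if lo < len(levels) else None


def search(core_counts, levels, measure_perf, measure_power, target):
    """Return (best, probes): best is (power, n, f) or None; probes counts measurements."""
    best = None
    probes = 0
    for n in core_counts:
        def ok(f, n=n):
            nonlocal probes
            probes += 1
            return measure_perf(n, f) >= target

        f = lowest_feasible_level(levels, ok)
        if f is None:
            continue  # this processor count cannot meet the target at any level
        candidate = (measure_power(n, f), n, f)
        if best is None or candidate < best:
            best = candidate
    return best, probes
```

Sampling the processor-count dimension sparsely, for example only at powers of two, would shrink the search further, at the risk of missing an optimum between samples.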
One of the major hurdles to the scalability of multicore architectures is the power-performance scalability of their interconnect. In [MICRO'06] we conduct an initial study of the technological challenges, design trade-offs, and overall impact of utilizing photonics to implement a hybrid optical-electrical bus-based multicore processor. Our results highlight the favorable power-performance trade-off that photonic technology has the potential to offer. This work was selected for the 2007 IEEE Micro Top Picks in Computer Architecture.
Support
This work is supported in part by NSF CAREER award CCF-0545995 and by equipment donated by Intel.