OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Fighting Fire with Fire: Superlattice Cooling of Silicon Hotspots to Reduce Global Cooling Requirements

Abstract

The running costs of data centers are dominated by the need to dissipate heat generated by thousands of server machines. Higher temperatures are undesirable as they lead to premature silicon wear-out; in fact, mean time to failure has been shown to decrease exponentially with temperature (Black's law). Although other server components also generate heat, microprocessors still dominate in most server configurations and are also the most vulnerable to wear-out as the feature sizes shrink. Even as processor complexity and technology scaling have increased the average energy density inside a processor to maximally tolerable levels, modern microprocessors make extensive use of hardware structures such as the load-store queue and other CAM-based units, and the peak temperatures on chip can be much worse than even the average temperature of the chip. In recent studies, it has been shown that hot-spots inside a processor can generate approximately 800 W/cm² heat flux whereas the average heat flux is only 10-50 W/cm², and due to this disparity in heat generation, the temperature in hot spots may be up to 30 °C more than the average chip temperature. The key problem processor hot-spots create is that in order to prevent some critical hardware structures from wearing out faster, the air conditioners in a data center have to be provisioned for worst-case requirements. Worse yet, air conditioner efficiencies decrease exponentially as the desired ambient temperature decreases relative to the air outside. As a result, the global cooling costs in data centers, which nearly equal the IT equipment power consumption, are directly correlated with the maximum hot spot temperatures of processors, and there is a distinct requirement for a cooling technique to mitigate hot-spots selectively so that the global air conditioners can operate at higher, more efficient, temperatures. 
We observe that localized cooling via superlattice microrefrigeration presents exactly this opportunity whereby hot-spots can be cooled selectively and allow global coolers to operate at much more efficient temperatures. Recent advances in processor cooling technologies have demonstrated that thermoelectric coolers (TEC), which use a Peltier effect to form heat pumps, can be used to reduce the temperature of hot spots. By applying a thermoelectric cooler between the heat spreader and the processor die and applying current selectively at the hot spots, heat from the hot-spots can be spread much more efficiently. The ability to implement such thermoelectric coolers on a real silicon device has been demonstrated recently, albeit for small prototype chips. The key question then, that needs to be answered before such thermoelectric coolers can be integrated in commodity server processors, is 'What is the potential for superlattice microrefrigeration to reduce global cooling costs in data centers?'. In order to answer this question, we present a comprehensive analysis of the impact of thermoelectric coolers on global cooling costs. Our thermal analysis covers all aspects of cooling a server in a data center, and integrates on-chip dynamic and leakage power sources with a detailed heat diffusion model of a processor (that models the silicon to the thermoelectric cooler to the heat spreader and the heat sink) and finally the computer room air conditioner (CRAC) efficiency, as shown in Figure 1. In Section II, we present the components of the system model.
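
The exponential dependence of mean time to failure on temperature that the abstract attributes to Black's law can be illustrated with a minimal sketch. It uses only the temperature term of Black's equation, MTTF proportional to exp(Ea / (k T)), and assumes the current density and prefactor are the same at both temperatures so that they cancel. The function name `mttf_ratio`, the activation energy of 0.7 eV, and the 70 °C and 100 °C operating points are illustrative assumptions, not values taken from the paper, and this is not the authors' thermal model.

```python
import math

# Boltzmann constant in eV/K
BOLTZMANN_EV_PER_K = 8.617333262e-5


def mttf_ratio(t_hot_c, t_ref_c, activation_energy_ev=0.7):
    """Return MTTF(t_hot) / MTTF(t_ref) using the temperature term of Black's equation.

    Assumption: current density and the prefactor are identical at both
    temperatures, so only exp(Ea / (k * T)) differs. The default activation
    energy is a hypothetical, illustrative value.
    """
    t_hot_k = t_hot_c + 273.15  # convert Celsius to Kelvin
    t_ref_k = t_ref_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_hot_k - 1.0 / t_ref_k
    )
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical case: a hot spot 30 C above an assumed 70 C chip average.
    ratio = mttf_ratio(t_hot_c=100.0, t_ref_c=70.0)
    print(f"MTTF at hot spot relative to chip average: {ratio:.3f}")
```

Under these assumptions the ratio is well below 1 for a hot spot 30 °C above the average, which is consistent with the abstract's argument that cooling must be provisioned for the hottest structure rather than for the average chip temperature.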

Authors:
Biswas, S; Tiwari, M; Sherwood, T; Theogarajan, L; Chong, F T
Publication Date:
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
991816
Report Number(s):
LLNL-CONF-458952
TRN: US201021%%389
DOE Contract Number:  
W-7405-ENG-48
Resource Type:
Conference
Resource Relation:
Conference: Presented at: Graduate Student Workshop 2010, Santa Barbara, CA, United States, Oct 08 - Oct 08, 2010
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS; AIR CONDITIONERS; AMBIENT TEMPERATURE; COMPUTERS; DIFFUSION; EFFICIENCY; ENERGY DENSITY; HEAT EXCHANGERS; HEAT FLUX; HEAT PUMPS; HEAT SINKS; HOT SPOTS; MICROPROCESSORS; QUEUES; SILICON; SUPERLATTICES; THERMAL ANALYSIS; THERMOELECTRIC COOLERS

Citation Formats

Biswas, S, Tiwari, M, Sherwood, T, Theogarajan, L, and Chong, F T. Fighting Fire with Fire: Superlattice Cooling of Silicon Hotspots to Reduce Global Cooling Requirements. United States: N. p., 2010. Web.
Biswas, S, Tiwari, M, Sherwood, T, Theogarajan, L, & Chong, F T. Fighting Fire with Fire: Superlattice Cooling of Silicon Hotspots to Reduce Global Cooling Requirements. United States.
Biswas, S, Tiwari, M, Sherwood, T, Theogarajan, L, and Chong, F T. 2010. "Fighting Fire with Fire: Superlattice Cooling of Silicon Hotspots to Reduce Global Cooling Requirements". United States. https://www.osti.gov/servlets/purl/991816.
@article{osti_991816,
title = {Fighting Fire with Fire: Superlattice Cooling of Silicon Hotspots to Reduce Global Cooling Requirements},
author = {Biswas, S and Tiwari, M and Sherwood, T and Theogarajan, L and Chong, F T},
abstractNote = {The running costs of data centers are dominated by the need to dissipate heat generated by thousands of server machines. Higher temperatures are undesirable as they lead to premature silicon wear-out; in fact, mean time to failure has been shown to decrease exponentially with temperature (Black's law). Although other server components also generate heat, microprocessors still dominate in most server configurations and are also the most vulnerable to wear-out as the feature sizes shrink. Even as processor complexity and technology scaling have increased the average energy density inside a processor to maximally tolerable levels, modern microprocessors make extensive use of hardware structures such as the load-store queue and other CAM-based units, and the peak temperatures on chip can be much worse than even the average temperature of the chip. In recent studies, it has been shown that hot-spots inside a processor can generate approximately 800 W/cm$^2$ heat flux whereas the average heat flux is only 10-50 W/cm$^2$, and due to this disparity in heat generation, the temperature in hot spots may be up to 30 $^\circ$C more than the average chip temperature. The key problem processor hot-spots create is that in order to prevent some critical hardware structures from wearing out faster, the air conditioners in a data center have to be provisioned for worst-case requirements. Worse yet, air conditioner efficiencies decrease exponentially as the desired ambient temperature decreases relative to the air outside. As a result, the global cooling costs in data centers, which nearly equal the IT equipment power consumption, are directly correlated with the maximum hot spot temperatures of processors, and there is a distinct requirement for a cooling technique to mitigate hot-spots selectively so that the global air conditioners can operate at higher, more efficient, temperatures. 
We observe that localized cooling via superlattice microrefrigeration presents exactly this opportunity whereby hot-spots can be cooled selectively and allow global coolers to operate at much more efficient temperatures. Recent advances in processor cooling technologies have demonstrated that thermoelectric coolers (TEC), which use a Peltier effect to form heat pumps, can be used to reduce the temperature of hot spots. By applying a thermoelectric cooler between the heat spreader and the processor die and applying current selectively at the hot spots, heat from the hot-spots can be spread much more efficiently. The ability to implement such thermoelectric coolers on a real silicon device has been demonstrated recently, albeit for small prototype chips. The key question then, that needs to be answered before such thermoelectric coolers can be integrated in commodity server processors, is 'What is the potential for superlattice microrefrigeration to reduce global cooling costs in data centers?'. In order to answer this question, we present a comprehensive analysis of the impact of thermoelectric coolers on global cooling costs. Our thermal analysis covers all aspects of cooling a server in a data center, and integrates on-chip dynamic and leakage power sources with a detailed heat diffusion model of a processor (that models the silicon to the thermoelectric cooler to the heat spreader and the heat sink) and finally the computer room air conditioner (CRAC) efficiency, as shown in Figure 1. In Section II, we present the components of the system model.},
url = {https://www.osti.gov/servlets/purl/991816},
place = {United States},
year = {2010},
month = {10}
}

Conference:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.
