Ensemble-level Power Management for Dense Blade Servers
Partha Ranganathan, Phil Leech (Hewlett Packard); David Irwin, Jeff Chase (Duke University)

The Problem
- Power density is a key challenge in enterprise environments
- Blades are increasing power density; data centers are pushing back on cooling
- Increased thermal-related failures if not addressed
- Problems exacerbated with data center consolidation

Challenges with Traditional Solutions
- Pure infrastructure solutions are reaching their limits: forced-air cooling to liquid cooling? 60+ amps per rack?
- Large costs for power and cooling
  - Capital costs: e.g., for a 10MW data center, $2-$4 million for cooling equipment
  - Recurring costs: at the data center, 1W of cooling for every 1W of power; for a 10MW data center, $4-$8 million for cooling power
- Can we address this problem at the system design level?
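As a rough check on the recurring figure (a sketch assuming a commercial electricity rate of roughly $0.05-$0.09/kWh, which the slides do not state): 10MW of IT load implies roughly 10MW of cooling load, so

\[
10\,\mathrm{MW} \times 8760\,\mathrm{h/yr} = 87.6\,\mathrm{GWh/yr}, \qquad
87.6 \times 10^{6}\,\mathrm{kWh} \times (\$0.05\text{--}\$0.09/\mathrm{kWh}) \approx \$4.4\text{--}\$7.9\ \text{million/yr},
\]

consistent with the $4-$8 million range quoted above.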
This Talk: Contributions
- Address power density at the system level
- Ensemble-level architecture for power management: manage the power budget across collections of systems; recognize trends across multiple systems; address compounded overprovisioning inefficiencies
- Power trends from 130+ servers in real deployments: extract power efficiencies at larger scale
- Architecture and implementation: simple hardware/software support; preemptive and reactive policies
- Prototype and simulation at the blade enclosure level: significant power savings; no performance loss

Workload Behavior Trends
- Nominal power differs from peak (and nameplate)
- (Figure; data from ...)

Workload Behavior Trends (2)
- Sum-of-peaks ≠ peak-of-sums (system of systems)
- Non-synchronized burstiness across systems
- (Figure; data from ...)

Workload Behavior Trends (3)
- Similar trends on 132 servers in 9 different sites
- What does this mean? Compounded inefficiencies
- Managing the power budget for individual peaks: 20W blades, 500W enclosures, 10kW racks
- Managing the power budget for the ensemble typical case: 20W blades, 250W enclosures, 4kW racks
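The sum-of-peaks vs. peak-of-sums gap can be computed directly from per-server power traces. A minimal sketch, with hypothetical traces standing in for the measured site data:

```python
# Hypothetical per-server power traces (watts over five sample periods);
# the talk's real data covers 132 servers at 9 enterprise sites.
traces = {
    "web1":  [110, 180, 250, 140, 120],
    "web2":  [240, 130, 115, 150, 210],
    "batch": [200, 160, 140, 230, 125],
}

# Provisioning each server for its own peak vs. for the ensemble's peak.
sum_of_peaks = sum(max(t) for t in traces.values())
peak_of_sums = max(map(sum, zip(*traces.values())))

print(f"sum of peaks: {sum_of_peaks} W")                     # 720 W
print(f"peak of sums: {peak_of_sums} W")                     # 550 W
print(f"overprovisioning: {sum_of_peaks - peak_of_sums} W")  # 170 W
```

Because the bursts are not synchronized, the ensemble never needs the sum of the individual peaks; that slack is what the architecture reclaims.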
Functional Architecture
- Hardware-software coordination for power control
- Provision the system for a lower power budget
- Intelligent software agent (sketched below): monitors the power of individual blades; ensures that the total power of the enclosure stays below the threshold; uses the system's power-throttling hooks in the rare case of violations
- Application requirements
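A minimal sketch of that agent loop, with the power monitor and throttling hook stubbed out; the names, budget, and per-P-state power numbers are illustrative, not HP's firmware interface:

```python
import time

BUDGET_W = 120                       # illustrative enclosure power budget
P_STATE_W = [10, 13, 16, 20]         # illustrative per-blade draw at each P-state

pstates = {f"blade{i}": 3 for i in range(8)}   # all blades start unthrottled

def read_power(blade):
    """Stub for the per-blade power monitor in the enclosure controller."""
    return P_STATE_W[pstates[blade]]

def throttle(blade):
    """Stub for the P-state throttling hook in the blade firmware."""
    pstates[blade] = max(0, pstates[blade] - 1)

def control_step():
    """Keep total enclosure power under the budget; rare-case enforcement."""
    while sum(read_power(b) for b in pstates) > BUDGET_W:
        victim = max(pstates, key=read_power)   # highest-power blade first
        if pstates[victim] == 0:
            break                               # all blades fully throttled
        throttle(victim)

if __name__ == "__main__":
    for _ in range(3):                          # periodic monitoring loop
        control_step()
        time.sleep(1.0)
```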
Enclosure-level Implementation
- Initialization and setup
- Data gathering / heartbeat checking
- Event response

Implementation Choices
- Selection of the system power budget: what value? how strict is enforcement?
  - Thermal provisioning: relaxed
  - Power provisioning: strict
- Power monitoring and control: power or temperature? polling or interrupts? which components? which P-states?

Implementation Choices (2)
- Policies for power throttling
- Assigning power budgets: preemptive ("ask before you can use more power") vs. reactive ("use as much as you want until told you can't")
- Choice of servers to (un)throttle: round-robin, lowest-performance, highest-power, fair-share, ... (see the sketch below)
- Power level to (un)throttle: incremental, deep, ...
- Resource estimation and polling heuristics
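The server-selection policies above reduce to a pluggable ordering over the blades. A sketch (the Server fields are invented for illustration):

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class Server:
    name: str
    power_w: float      # current measured power
    perf_score: float   # e.g., recent utilization or MIPS

_cursor = count()       # round-robin state

def highest_power(servers):
    return sorted(servers, key=lambda s: s.power_w, reverse=True)

def lowest_performance(servers):
    return sorted(servers, key=lambda s: s.perf_score)

def round_robin(servers):
    k = next(_cursor) % len(servers)
    return servers[k:] + servers[:k]

# The throttling loop walks whichever ordering the policy selects:
#   for s in highest_power(servers):
#       throttle(s)
#       if under_budget():
#           break
```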
Outline
- Introduction
- Characterizing real-world power trends
- Architecture & implementation
- Evaluation
- Conclusions

Prototype Experiments
- Experimental test bed with 8 prototype blades: 1GHz TM8000, 256MB, 40GB, Windows
- P-states: 533MHz/0.8V, 600MHz/0.925V, 700MHz/1V, 833MHz/1.1V, 1000MHz/1.25V
- Prior blade design plus power monitoring support
- Firmware changes to the BIOS and the blade/enclosure controllers
- Benchmarks: VNCplay and batch simulations
- Measured power and performance
- Tradeoffs: (+) validates the implementation; (+) actual performance and power results; (-) hard to model real enterprise traces; (-) hard to do detailed design-space exploration

Simulator Experiments
- High-level model of a blade enclosure: input resource utilization traces; power/performance models; configurable architecture parameters; results validated on the prototype
- Benchmarks: 9 real enterprise site traces covering 132 servers; synthetic utilization traces of varying concurrency, load, ...
- Metrics: total workload performance and per-server performance; changes in utilization, frequency, and MIPS for peak/idle; usage of different P-states; impact of delays
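A toy version of such a trace-driven model, reusing the prototype's P-state table from the previous slide; the linear utilization-to-power mapping and the idle/peak bounds are assumptions for illustration (the real simulator's models were validated against the prototype):

```python
# Toy trace-driven enclosure model: per-blade power from utilization and P-state.
P_STATES = [(533, 0.80), (600, 0.925), (700, 1.0), (833, 1.1), (1000, 1.25)]  # MHz, V

IDLE_W, PEAK_W = 6.0, 20.0   # illustrative per-blade power bounds

def blade_power(util, pstate):
    """Scale the dynamic component by f*V^2 relative to the top P-state."""
    f, v = P_STATES[pstate]
    f_max, v_max = P_STATES[-1]
    scale = (f / f_max) * (v / v_max) ** 2
    return IDLE_W + util * (PEAK_W - IDLE_W) * scale

# Replay one timestep of an eight-blade utilization trace:
utils = [0.9, 0.1, 0.3, 0.7, 0.2, 0.5, 0.4, 0.6]
pstates = [4] * 8            # all blades at the top P-state
total = sum(blade_power(u, p) for u, p in zip(utils, pstates))
print(f"enclosure power this step: {total:.1f} W")
```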
Results
- Significant enclosure power budget reductions: 10-20% at the enclosure level, 25-50% at the processor level
- Higher savings possible with other P-state controls
- Marginal impact on performance (less than 5%)
- Preemptive is competitive with reactive

Interactive Applications
- Minimal impact on latency
- VNCplay interactive-latency CDFs are within measurement error

Sensitivity Experiments
- Other policy choices: no impact on the real workload traces
- Throttling a few servers at high P-states is preferable to throttling many servers at low P-states
- Sensitivity to workload characteristics

Other Benefits
- Beyond the enclosure: cascading benefits at the rack, data center, etc.
- "Soft" component power budgets for lower cost, e.g., high-volume high-power vs. high-cost low-power CPUs
- Adaptive power budget control
- Heterogeneous power supplies for low-cost redundancy
- Average power reduction, e.g., 90th-percentile enclosure vs. multiple 90th-percentile blades

Summary
- Critical power density problem in enterprises
- Ensemble-level architecture for power management: manage the power budget across collections of systems; recognize trends across multiple systems; address compounded overprovisioning inefficiencies
- Real-world power analysis (130+ servers in 9 sites): dramatic differences between the sum of peaks and the peak of sums
- Architecture and implementation: simple hardware/software support; preemptive and reactive policies
- Prototype and simulation at the blade enclosure level: significant power savings; no performance loss
- Other benefits in component flexibility, resiliency, ...

Questions?
Speaker contact: Partha.R

Backup Slides

(Figure legend: sap1, desktop1, ecomm2, ecomm1, desktop2, pharma, worldcup, sap2)
Backup on Simulation

Pre-emptive and Reactive Policies

Reactive (start with all servers unthrottled, per the reactive definition on the Implementation Choices slide):
- At each control period, or on an interrupt: compute total power consumption and check whether power is above the threshold
- If yes: prioritize which servers to throttle; throttle each server to the decided level; stop when the power budget is below the threshold
- If no: prioritize which servers to unthrottle; unthrottle each server to the decided level; stop if the power budget would likely be exceeded

Preemptive (start with all servers throttled):
- At each control period, or on an interrupt: compute total power consumption
- Identify servers with "low" utilization; prioritize which servers to throttle; throttle each server to the decided level
- Check whether there is room in the power budget
- If yes: identify servers with "high" utilization; prioritize which servers to unthrottle; unthrottle each server to the decided level; stop if the power budget would likely be exceeded
- If no: stop
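A runnable rendering of the two loops above. The budget, per-P-state power numbers, and utilization thresholds are illustrative, and the "prioritize" steps are filled in with the highest-power and utilization orderings from the Implementation Choices slide:

```python
BUDGET_W = 120
P_STATE_W = [10, 13, 16, 20]          # illustrative per-blade draw per P-state
TOP = len(P_STATE_W) - 1

def total_power(pstates):
    return sum(P_STATE_W[p] for p in pstates.values())

def step_cost(pstates, b):
    """Extra power if blade b is unthrottled by one P-state."""
    return P_STATE_W[min(pstates[b] + 1, TOP)] - P_STATE_W[pstates[b]]

def reactive_step(pstates, power_of, util_of):
    """Start unthrottled; throttle only when the budget is violated."""
    if total_power(pstates) > BUDGET_W:
        for b in sorted(pstates, key=power_of, reverse=True):
            pstates[b] = max(0, pstates[b] - 1)
            if total_power(pstates) <= BUDGET_W:
                break
    else:
        for b in sorted(pstates, key=util_of, reverse=True):
            if total_power(pstates) + step_cost(pstates, b) > BUDGET_W:
                break                 # unthrottling would likely exceed the budget
            pstates[b] = min(TOP, pstates[b] + 1)

def preemptive_step(pstates, util_of, low=0.2, high=0.8):
    """Start throttled; blades must show demand before receiving power."""
    for b in pstates:                 # reclaim power from "low"-utilization blades
        if util_of(b) < low:
            pstates[b] = max(0, pstates[b] - 1)
    for b in sorted(pstates, key=util_of, reverse=True):
        if util_of(b) < high:
            break                     # no more "high"-utilization requesters
        if total_power(pstates) + step_cost(pstates, b) > BUDGET_W:
            break                     # no room left in the budget
        pstates[b] = min(TOP, pstates[b] + 1)

# Usage: call one step per control period (or on a power interrupt), e.g.
#   pstates = {f"blade{i}": TOP for i in range(8)}       # reactive start
#   reactive_step(pstates, power_of=lambda b: P_STATE_W[pstates[b]],
#                 util_of=lambda b: 0.5)
```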
Related Work
- Single-server power capping: Brooks et al. (processor-level capping); Felter et al. (Power Shifting)
- Cluster-level power budgets: Femal et al. (throughput per budget, local control); IBM, Duke, and Rutgers work on average power
- Resource provisioning: Urgaonkar et al. (overbooking resources); Yuan et al. (OS-level CPU scheduling for batteries)
- Cooling: Moore et al. (temperature-aware workload placement); Patel et al. (Smart Cooling); Uptime recommendations, ...

Future Work
- More exploration, e.g., geographically distributed servers
- More policies: high-performance workloads
- Adaptive power budget variation
- Interfacing with other local and global control loops
A Growing Problem
- Server power densities are up 10x in the last 10 years
- Source: Datacom Equipment Power Trends and Cooling Applications, ASHRAE, 2005, http://www.ashrae.org

90th Percentile Utilization
- (Figure only)

Enterprise Power Challenges: Compute Equipment Consumes Power
- Electricity costs: for a large data center, recurring costs of $4-$8 million/yr
- "... energy costs for data center building $1.7 million last year." (Cincinnati Bell, 2003)
- "... electricity costs large fraction of data center operations." (Google, 2003)
- Environmental friendliness: compute equipment energy use of 22M GJ and 3.9M tons of CO2
- EnergyStar (US), TopRunner (Japan), FOE (Switzerland), ...
- "Goal to increase computer energy efficiency by 85% by 2005." (Japan's TopRunner energy program, 2002)

Scratch Slides
The Problem
- Power density is a key challenge in enterprise environments; blades increasing power density; data center pushback on cooling
- Increased thermal-related failures if not addressed: 50% server reliability degradation for 10°C over 20°C; 50% decrease in hard disk lifetime for a 15°C increase
- Problems exacerbated with data center consolidation

Costs of Addressing Power Density
- Cooling costs are a large fraction of TCO
  - Capital costs: for a 10MW data center, $2-$4 million for cooling equipment
  - Recurring costs: at the data center, 1W of cooling for every 1W of power; for a 10MW data center, $4-$8 million for cooling power
- Similar issues with power delivery: challenges with routing more than 60 amps per rack
- Problems exacerbated by consolidation and the growth of blades
- Need to go beyond traditional facilities-level solutions

Our Approach
- "Ensemble-level" architecture for power management
- Insight: systems are designed for the peak usage of an individual box, but end users care about the long-term usage of the entire solution
- Solution: manage the power budget across collections of systems; recognize trends across multiple systems; extract power efficiencies at larger scale
- Significant power budget savings

Significant Power Savings
- Processor power budget down from 100W to 15W (about 6x)
- System power budget down from 350W to 280W (20%)
- Additional benefits if corresponding hooks exist for memory, etc.
- What about performance?
- (Figure: original power budget 100W; new power budgets 22.5W and 15W)
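Both headline numbers check out arithmetically:

\[
\frac{100\,\mathrm{W}}{15\,\mathrm{W}} \approx 6.7 \quad (\text{the}\ \sim\!6\times\ \text{processor reduction}),
\qquad
\frac{350\,\mathrm{W} - 280\,\mathrm{W}}{350\,\mathrm{W}} = 20\%.
\]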
Simulator Demo of Operation
- Rich simulation infrastructure
- Facilitates more extensive design-space exploration