The Pattern of Overbuilding: Server Utilization and Energy Productivity for Green IT

The team at Microsoft has outlined their latest analysis of data center energy usage and opportunities for efficiencies in a recent white paper entitled The IT Energy Efficiency Imperative. We have partnered with Microsoft to run key findings from the white paper as a ten-part series. To read the full series, click here. The white paper can be downloaded here in PDF format.

Ensuring the reliability of mission-critical systems has been a top priority since the early days of IT. As a result, computer systems have routinely been overbuilt to reduce the likelihood of unplanned disruptions due to hardware or software failures or system slowdowns caused by unexpected user demand.

This pattern of overbuilding is a major cause of poor IT energy productivity.

For every 100 watts of power consumed in a typical data center, fewer than 3 watts are used to do actual computing. In other words, a typical data center consumes more than 30 times the energy that is required to perform computations. Most of the remaining energy is wasted due to server underutilization, with power and cooling inefficiencies accounting for the rest.

The potential energy efficiency gains from increased server utilization, as compared to traditional efficiency measures, are illustrated in the figure below:

Power Usage Effectiveness (PUE), shown in green, refers to the ratio of the total amount of power used by the data center facility to the power delivered to the IT equipment. Power Supply Efficiency (PSE), shown in red, refers to the efficiency of conversion from AC to DC power.

As you can see, increasing server utilization (blue) clearly offers greater potential to improve a data center’s overall IT energy productivity than PUE and PSE improvements do. This potential is far greater than conventional wisdom has recognized.
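To make the comparison concrete, here is a minimal Python sketch under a deliberately simple assumption: the power doing useful work equals facility power divided by PUE, multiplied by PSE, multiplied by average server utilization. The function name, the model, and every number below are illustrative assumptions, not figures from the white paper. The baseline is chosen only so that it lands just under the 3 watts per 100 mentioned above.

```python
def useful_watts(total_watts, pue, pse, utilization):
    """Watts doing computation under a simple multiplicative model (an assumption)."""
    it_watts = total_watts / pue       # power that reaches the IT equipment
    dc_watts = it_watts * pse          # power left after AC-to-DC conversion
    return dc_watts * utilization      # share attributed to useful work

# Hypothetical baseline and three single-lever improvements.
baseline = {"pue": 2.0, "pse": 0.75, "utilization": 0.07}
scenarios = {
    "baseline": baseline,
    "better PUE (2.0 -> 1.5)": {**baseline, "pue": 1.5},
    "better PSE (0.75 -> 0.90)": {**baseline, "pse": 0.90},
    "better utilization (7% -> 40%)": {**baseline, "utilization": 0.40},
}

results = {name: useful_watts(100.0, **params) for name, params in scenarios.items()}
for name, watts in results.items():
    print(f"{name:32s} {watts:5.2f} W useful per 100 W")
```

Under these assumed numbers, the utilization change moves the result far more than either the PUE or the PSE change, which is the pattern the figure describes.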

In fact, until average server utilization approaches 50%, server utilization improvements offer the biggest potential gains for data centers, because servers consume a significant amount of energy when idle: anywhere between 30% and 60% of full power.
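The effect of idle power can be sketched with a simple model. The Python below assumes that a server's power draw rises linearly from its idle level to full power as utilization increases. That linear shape is an assumption made for illustration, as are the function names. Only the 30% and 60% idle fractions come from the text above.

```python
def relative_power(utilization, idle_fraction):
    """Power draw as a fraction of full power, assuming a linear ramp from idle."""
    return idle_fraction + (1.0 - idle_fraction) * utilization

def energy_per_unit_work(utilization, idle_fraction):
    """Energy per unit of work relative to a fully loaded server (which scores 1.0)."""
    if utilization <= 0:
        return float("inf")  # an idle server draws power but does no work
    return relative_power(utilization, idle_fraction) / utilization

# Idle fractions at both ends of the 30%-60% range quoted above.
for idle_fraction in (0.30, 0.60):
    for utilization in (0.10, 0.25, 0.50, 1.00):
        cost = energy_per_unit_work(utilization, idle_fraction)
        print(f"idle={idle_fraction:.0%} utilization={utilization:.0%} "
              f"relative energy per unit of work={cost:.2f}")
```

In this model, a server running at 10% utilization uses between 3.7 and 6.4 times as much energy per unit of work as the same server fully loaded, depending on its idle draw.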

Measuring server utilization down to the subsystem level (CPU, memory, disk, and network) is complex and requires sophisticated management infrastructure. A good place to start saving without spending a lot of money is to identify “abandoned” or unused servers that might still be drawing power. This can be done simply by checking whether any user activity is going in or out of the server at all. At the other end of the spectrum, IT departments with sophisticated operational practices might measure utilization of individual servers and their subsystems in near-real-time to dynamically optimize workload placement.
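As a rough illustration of the "abandoned server" check, the Python sketch below flags servers whose total traffic over an observation window is at or below a threshold. The record type, the server names, and the sample numbers are all hypothetical. In practice the counters would come from whatever monitoring an organization already has, and background traffic such as health checks may call for a threshold above zero.

```python
from dataclasses import dataclass

@dataclass
class TrafficSample:
    """One hypothetical reading of a server's traffic counters for an interval."""
    server: str
    bytes_in: int
    bytes_out: int

def find_idle_servers(samples, threshold_bytes=0):
    """Return servers whose total traffic across all samples is at or below the threshold."""
    totals = {}
    for sample in samples:
        totals[sample.server] = (
            totals.get(sample.server, 0) + sample.bytes_in + sample.bytes_out
        )
    return sorted(name for name, total in totals.items() if total <= threshold_bytes)

# Made-up samples covering an observation window.
samples = [
    TrafficSample("web-01", 52_000, 910_000),
    TrafficSample("web-01", 48_000, 870_000),
    TrafficSample("legacy-07", 0, 0),
    TrafficSample("legacy-07", 0, 0),
    TrafficSample("batch-03", 1_200, 300),
]

candidates = find_idle_servers(samples)
print(candidates)
```

A server that appears in the result is only a candidate for decommissioning. Someone still needs to confirm that it has no owner or scheduled job before it is powered down.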

The cost of powering chronically underutilized IT equipment can be a significant percentage of an organization’s energy bill and a substantial contributor to data center capacity constraints. These constraints include limits on available utility power, limited power and cooling capacity within the building, and lack of physical space for computers.

The manufacturing of computers that will be underutilized also wastes a significant amount of energy, water, and raw materials (including so-called “conflict minerals”). Such equipment typically becomes e-waste within just a few years. In the European Union, for instance, no more than one-third of e-waste is responsibly recycled in a verifiable way. The remaining e-waste often ends up in landfills or is shipped to developing countries, where it is typically dismantled using methods that can contaminate the surrounding land, air, and water with toxic metals and chemical compounds, threatening the health of unprotected workers and others in the surrounding areas.

Furthermore, given current and projected electric power and resource constraints, failure to significantly improve the utilization of IT equipment will likely limit the ability of IT to help address the world’s pressing economic, societal, and environmental challenges.

We’ll be exploring these issues in depth in the coming weeks. Follow along here.