We have partnered with Microsoft to run key findings from a recent white paper entitled The IT Energy Efficiency Imperative as a ten-part series. To read the full series, click here. The white paper can be downloaded here in PDF format.
Most IT energy efficiency efforts have traditionally focused on physical infrastructure. Organizations deploy more energy-efficient computer hardware and cooling systems, use operating system power management features, and reduce the number of servers in data centers through hardware virtualization.
But a significant amount of IT energy inefficiency stems from how applications are designed and operated. Most applications are allocated far more IT resources (servers, CPU, memory, etc.) than they need, as a buffer to ensure acceptable performance and to protect against hardware failure. Most often, the actual needs of the application are simply never measured. This practice naturally results in poor hardware utilization across the data center.
While virtualization can help improve hardware utilization to a certain degree, many organizations find that utilization is lower now than before they started virtualizing. How is that possible? Because applications' use of IT resources is not scaled dynamically, applications running in virtual machines are often just as idle as they were on dedicated infrastructure. The result is more active virtual machines than necessary, each allocated more IT resources than it really needs, a phenomenon often referred to as virtual server sprawl. The problem is compounded by the fact that hardware performance generally grows faster than virtual server consolidation does, so with each new generation of hardware, utilization actually decreases.
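The last point can be illustrated with a little arithmetic. The sketch below is purely illustrative: the function name and all of the figures are assumptions chosen for the example, not measurements from the white paper.

```python
def host_utilization(vms_per_host: int, load_per_vm: float, host_capacity: float) -> float:
    """Fraction of a host's capacity that its virtual machines actually use.

    All quantities are in the same arbitrary "units of compute".
    """
    return vms_per_host * load_per_vm / host_capacity


# Hypothetical older generation: 10 VMs, each using 2 units, on a 100-unit host.
old_generation = host_utilization(vms_per_host=10, load_per_vm=2.0, host_capacity=100.0)

# Hypothetical newer generation: host capacity doubles (2x), but consolidation
# only grows to 15 VMs per host (1.5x), with the same load per VM.
new_generation = host_utilization(vms_per_host=15, load_per_vm=2.0, host_capacity=200.0)

print(f"old: {old_generation:.0%}, new: {new_generation:.0%}")
```

With these assumed numbers, utilization falls from 20 percent to 15 percent even though each host runs more virtual machines than before.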
Considering the enormous number of existing applications, virtualization management systems can help optimize the placement and configuration of virtual machines to improve utilization of the virtualized server infrastructure. These kinds of systems management solutions can also be used to report and even cap the electric power draw of the host infrastructure, effectively constraining an application’s use of IT resources.
But fundamental issues remain. Applications are generally not instrumented to expose metrics that could help IT management systems determine the type and quantity of IT resources required to meet certain performance criteria, and they do not provide mechanisms to externally control their use of resources.
Administrators must often guess the quantity and type of IT resources an application needs to operate with acceptable performance. This uncertainty is caused by lack of information about how an application consumes resources, the likely demand for the application over time, and the manner in which resource use scales with demand.
Developers can significantly reduce such uncertainty by instrumenting their applications to expose usage and performance metrics. The metrics can be used to dynamically assign and withdraw IT resources as needed. In recognition of this need, the Marlowe project at Microsoft Research is experimenting with a framework to optimize IT resource allocation for applications by using developer-instrumented usage and performance metrics, in addition to more abstract utilization metrics such as overall CPU utilization. When these metrics are correlated with the underlying hardware performance, it is possible to dynamically assign the appropriate quantity and type of IT resources to operate the application with desired performance characteristics (as specified by a service level agreement) at any given scale of use. Past usage data can also be analyzed to provide near-real-time usage forecasts to assist with IT resource provisioning for the application. Excess capacity can be temporarily repurposed for “opportunistic” computing tasks (such as batch jobs) until it is needed again, or it can be shut down entirely if there is no additional work or if there is a need to reduce energy consumption.
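As a rough illustration of how instrumented metrics could drive resource allocation, consider the minimal sketch below. It is not the Marlowe framework or its API: the `AppMetrics` class, the `desired_instances` function, and every parameter are hypothetical names invented for this example, and the sizing rule is a deliberately simple one.

```python
import math
from dataclasses import dataclass


@dataclass
class AppMetrics:
    """Hypothetical metrics exposed by developer instrumentation."""
    requests_per_second: float  # demand as observed by the application itself
    p95_latency_ms: float       # observed 95th-percentile response time


def desired_instances(metrics: AppMetrics,
                      current_instances: int,
                      capacity_per_instance_rps: float,
                      sla_latency_ms: float,
                      headroom: float = 0.2,
                      min_instances: int = 1) -> int:
    """Return how many instances to run for the observed demand.

    capacity_per_instance_rps is assumed to have been measured on the
    underlying hardware; headroom is an assumed safety buffer (20% by default).
    """
    # Size the pool from measured demand instead of a guessed peak.
    needed = math.ceil(metrics.requests_per_second * (1 + headroom)
                       / capacity_per_instance_rps)
    # If the SLA is currently being missed, grow by at least one instance.
    if metrics.p95_latency_ms > sla_latency_ms:
        needed = max(needed, current_instances + 1)
    return max(needed, min_instances)


# Example: 10 instances are running, but measured demand needs far fewer.
target = desired_instances(AppMetrics(requests_per_second=450, p95_latency_ms=120),
                           current_instances=10,
                           capacity_per_instance_rps=100,
                           sla_latency_ms=200)
print(target)
```

In a scheme like this, the difference between the current and target instance counts is the excess capacity that could be repurposed for opportunistic work or shut down.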
In addition, breaking down applications into fine-grained units of service delivery—as demonstrated by another Microsoft Research initiative, the Orleans project—enables far more effective IT resource usage and control than is typically possible with large, monolithic application components and services. The higher level of abstraction away from the computing resources that Orleans provides allows developers to focus on business value rather than on the complexities of scaling and reliability, and it should markedly increase overall IT resource utilization.
Because the demand for many applications is initially unknown and unpredictable, developers should consider implementing administrator controls that can limit the number of simultaneous users and perhaps dynamically degrade the fidelity of the application experience if cost containment is important or if IT resources are limited. In this way, operating costs can be better managed, and active users of an application will have a predictable experience rather than experiencing random slowdowns resulting from more simultaneous active users than the allocated IT resources can accommodate. Applications should also be able to defer noncritical work such as batch processing and other maintenance tasks when specified by the IT operator. This will provide additional “virtual capacity” for other applications that have unanticipated spikes in demand. It will also provide additional options for IT departments to temporarily reduce power use, such as when responding to a demand response event from a utility.
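The controls described above might look something like the following minimal sketch. All class, field, and method names here are hypothetical, and the thresholds are arbitrary; a real application would wire these controls into its own session handling and job scheduling.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OperatorControls:
    """Hypothetical knobs an application could expose to the IT operator."""
    max_active_users: int = 100      # hard cap on simultaneous users
    degrade_above: int = 80          # serve reduced fidelity beyond this many users
    defer_noncritical: bool = False  # set True to postpone batch/maintenance work


@dataclass
class Application:
    controls: OperatorControls
    active_users: int = 0
    deferred_tasks: List[Callable[[], None]] = field(default_factory=list)

    def admit_user(self) -> str:
        """Admit a new session and return 'full', 'reduced', or 'rejected'."""
        if self.active_users >= self.controls.max_active_users:
            return "rejected"  # a predictable refusal instead of random slowdowns
        self.active_users += 1
        if self.active_users > self.controls.degrade_above:
            return "reduced"   # e.g. fewer live updates or lower-resolution media
        return "full"

    def release_user(self) -> None:
        """Record that a session has ended."""
        self.active_users = max(0, self.active_users - 1)

    def submit_noncritical(self, task: Callable[[], None]) -> bool:
        """Run the task now, or queue it if the operator requested deferral.

        Returns True if the task ran immediately.
        """
        if self.controls.defer_noncritical:
            self.deferred_tasks.append(task)
            return False
        task()
        return True

    def resume_deferred(self) -> int:
        """Run queued tasks once deferral is lifted; return how many ran."""
        if self.controls.defer_noncritical:
            return 0
        tasks, self.deferred_tasks = self.deferred_tasks, []
        for task in tasks:
            task()
        return len(tasks)
```

An operator responding to a demand response event could set `defer_noncritical` to `True`, lower `max_active_users`, and later restore both and call `resume_deferred()` once power constraints ease.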
Applications that are designed with these IT efficiency goals in mind are also generally much easier to operate in degraded or partially recovered states during disaster recovery situations. Similarly, such applications do not need dedicated capacity to sustain service during maintenance events: IT resources can be claimed and released as required, keeping the costs of maintenance events to a minimum. It can also be cost-effective for developers and testers of resource-intensive applications to invest in increasing the amount of work produced per watt-hour, with the goal of making energy use scale roughly linearly with useful output.
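Work per watt-hour is straightforward to compute once useful output and power draw are both measured. The sketch below is illustrative only: the function names are hypothetical and the figures are invented to show what a departure from linear scaling looks like.

```python
def work_per_watt_hour(units_of_work: float, avg_power_watts: float, hours: float) -> float:
    """Useful output divided by the energy consumed to produce it."""
    energy_wh = avg_power_watts * hours
    if energy_wh <= 0:
        raise ValueError("energy consumed must be positive")
    return units_of_work / energy_wh


def idle_power_fraction(idle_watts: float, peak_watts: float) -> float:
    """Share of peak power drawn while doing no useful work (0.0 would be ideal)."""
    return idle_watts / peak_watts


# Hypothetical one-hour measurements of the same application at two load levels.
at_peak = work_per_watt_hour(units_of_work=90_000, avg_power_watts=300, hours=1)
at_light_load = work_per_watt_hour(units_of_work=9_000, avg_power_watts=180, hours=1)

print(at_peak, at_light_load, idle_power_fraction(idle_watts=150, peak_watts=300))
```

In this made-up case the application delivers 300 units of work per watt-hour at peak but only 50 at light load, because power draw fell far less than output did. If energy use scaled linearly with useful output, the two figures would be the same.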
In addition, nearly all applications should be tested to ensure that they do not waste energy while performing little or no useful work. Clearly, for applications that will be rarely used or are experimental in nature, the design investments described above are probably not necessary. Existing management infrastructures are more than sufficient to take care of scaling for them, and inadvertently starving these applications of IT resources on a temporary basis will not cause significant harm. However, for the many applications that are likely to be long-lived and periodically resource-intensive, designing them so their use of IT resources can be dynamically scaled based on demand and other constraints can easily pay off.
We’ll be exploring these issues in depth in the coming weeks. Follow along here.