What is IPMI? An Overview of the Intelligent Platform Management Interface
If you are an IT professional or system administrator, you may have heard of IPMI. IPMI is a standard for system management that a large number of manufacturers and vendors implement, and it is growing in popularity. If you have not yet seen it, it is quite possible that it will arrive on your servers soon. Major manufacturers that use it in their servers include HP, Dell, IBM, Intel, Oracle, and Supermicro. The following is a basic overview of IPMI to give you an understanding of what it can do.
What is IPMI?
IPMI stands for Intelligent Platform Management Interface. The IPMI standard was jointly created by Intel, Hewlett-Packard, NEC, and Dell. The specification for IPMI 1.0 was released in 1998, with major new versions released in 2001 (the 1.5 specification) and 2004 (the 2.0 specification). The last major version, IPMI 2.0, is what most manufacturers and vendors support today.
IPMI specifies a set of interfaces for standardized platform management. It supports the ability for customers to manage their systems, monitor them, and diagnose problems. These management features function even if an operating system is not working or if the system is not turned on. This is accomplished by putting the platform management in system firmware, rather than the operating system, and using standby power when the system is turned off. This allows system administrators to manage the system or diagnose problems regardless of issues with the operating system.
In most implementations, IPMI is managed by a controller called the Baseboard Management Controller, or BMC. The BMC sits on the motherboard and is connected to the Ethernet port of the system. IPMI communication between user software and system hardware components runs through the BMC. Some vendors do not implement a plain BMC, but instead have an alternate controller that usually supports additional management functionality. For example, HP supports IPMI through its iLO chip (Integrated Lights-Out), Oracle through ILOM (Integrated Lights Out Manager), and Dell through DRAC (Dell Remote Access Controller). While not called BMCs, they serve a nearly identical function.
There are two major interfaces for IPMI communication: in-band and out-of-band. In-band communication happens locally on the system, usually through the operating system. An IPMI driver is typically required for this, but may not be necessary on some operating systems.
Out-of-band communication happens remotely via IPMI over LAN. It runs entirely over the network, and software talks directly to the BMC via Ethernet. The operating system is not involved in IPMI over LAN, which allows IPMI features to operate even if the operating system is not working or the system is powered off. IPMI over LAN comes in two protocol versions, IPMI 1.5 and IPMI 2.0. The latter adds encryption, which some of the features listed below require.
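The difference between the two interfaces can be sketched with a small helper that builds a command line for ipmitool, a widely used open-source IPMI client. The article itself does not name any client software, so the choice of ipmitool is an assumption made for illustration, and the function name, host name, and user name below are hypothetical. The sketch only builds the argument list; it does not contact a BMC.

```python
from typing import List, Optional, Sequence


def ipmitool_argv(command: Sequence[str],
                  host: Optional[str] = None,
                  user: Optional[str] = None,
                  ipmi_v2: bool = True) -> List[str]:
    """Build an ipmitool argument list (hypothetical helper).

    host=None means in-band; otherwise out-of-band over the LAN.
    """
    if host is None:
        # In-band: reach the local BMC through the operating system's
        # IPMI driver ("open" is ipmitool's Linux OpenIPMI interface).
        prefix = ["ipmitool", "-I", "open"]
    else:
        # Out-of-band: "lanplus" is IPMI 2.0 over LAN, "lan" is IPMI 1.5.
        interface = "lanplus" if ipmi_v2 else "lan"
        prefix = ["ipmitool", "-I", interface, "-H", host]
        if user is not None:
            prefix += ["-U", user]
        # -E makes ipmitool read the password from the IPMI_PASSWORD
        # environment variable instead of the command line.
        prefix.append("-E")
    return prefix + list(command)


if __name__ == "__main__":
    # Local query through the OS driver.
    print(ipmitool_argv(["chassis", "power", "status"]))
    # Remote query straight to the BMC ("bmc.example" is a placeholder).
    print(ipmitool_argv(["chassis", "power", "status"],
                        host="bmc.example", user="admin"))
```

The resulting list could be passed to subprocess.run on a machine that has ipmitool installed. Note that the in-band form needs a working operating system and driver, while the out-of-band form needs only network access to the BMC.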
What are the features of IPMI?
IPMI specifies the interfaces for a number of platform management features. The majority of vendors support the following features, although the depth of support varies by vendor.
Sensors - IPMI supports the ability to read sensors such as temperature, voltage, fan speed, power supply status, and many more. Sensors allow users to monitor the current state of the system.
System Event Log (SEL) - IPMI supports the ability to read a log of hardware events stored in firmware. The log may record the times and dates of CPU failures, temperatures exceeding normal ranges, memory errors, lost redundancy, and more. The SEL allows users to diagnose problems on the system after they have occurred.
Watchdog Timer - IPMI supports the ability to manage a system watchdog timer. In the event of a system error, the watchdog timer may be able to perform a timeout action, such as system shutoff or system restart.
Power Control - IPMI supports the ability to remotely turn on or turn off the system. This may allow users to restart systems in the event of system errors.
Field Replaceable Unit (FRU) Inventory - IPMI supports the ability to read FRU information, which may be useful for inventory management or an RMA process. For example, FRU records may include hardware type, manufacturer and model details, serial numbers, etc.
Serial Over LAN (SOL) - Beginning with IPMI 2.0, IPMI supports a protocol for reading and writing serial traffic over Ethernet. This allows users to connect to a system console for system error diagnosis.
Boot management - IPMI supports the ability to configure the boot process. This is useful because boot settings can be changed without entering the BIOS.
Most of the features listed above may be performed in-band or out-of-band, with obvious exceptions (e.g. a system that is powered off cannot be turned on in-band, because no operating system is running to issue the command).
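As a rough illustration of how the features above map onto an actual client, the sketch below pairs each feature with an ipmitool subcommand. ipmitool is an assumption here, since the article does not name any software, and the dictionary keys, function name, and host name are hypothetical. The one restriction encoded is that ipmitool offers Serial Over LAN only through its IPMI 2.0 LAN interface (lanplus); vendor support for each subcommand may vary.

```python
from typing import Dict, List, Sequence

# Feature name (hypothetical label) -> ipmitool subcommand.
FEATURE_COMMANDS: Dict[str, List[str]] = {
    "sensors":      ["sensor", "list"],             # read sensor values
    "sel":          ["sel", "list"],                # read the System Event Log
    "watchdog":     ["mc", "watchdog", "get"],      # show watchdog timer state
    "power_status": ["chassis", "power", "status"],
    "power_cycle":  ["chassis", "power", "cycle"],
    "fru":          ["fru", "print"],               # read FRU inventory
    "sol":          ["sol", "activate"],            # attach to serial console
    "boot_pxe":     ["chassis", "bootdev", "pxe"],  # next boot from network
}

# Default connection: in-band through the local OS driver.
IN_BAND = ("ipmitool", "-I", "open")


def feature_argv(feature: str,
                 connection: Sequence[str] = IN_BAND) -> List[str]:
    """Return the full argument list for one feature (hypothetical helper)."""
    if feature not in FEATURE_COMMANDS:
        raise KeyError(f"unknown feature: {feature}")
    # Serial Over LAN is an out-of-band, IPMI 2.0 feature.
    if feature == "sol" and "lanplus" not in connection:
        raise ValueError("Serial Over LAN requires the lanplus interface")
    return list(connection) + FEATURE_COMMANDS[feature]


if __name__ == "__main__":
    print(feature_argv("sel"))
    # "bmc.example" is a placeholder host name.
    remote = ("ipmitool", "-I", "lanplus", "-H", "bmc.example", "-E")
    print(feature_argv("sol", connection=remote))
```

Keeping the connection prefix separate from the feature subcommand mirrors the point made above: the same feature can usually be reached either in-band or out-of-band, and only the transport changes.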
What are the Pros of IPMI?
Because IPMI is a standard interface, software that you use on the servers of one manufacturer should also work on the servers of another. There should be no need to learn or use new software whenever different hardware is purchased.
The cost of IPMI, compared to traditional hardware management, should prove to be much lower. Many manufacturers and vendors supply IPMI for free; it is simply something that exists on the system. If your organization previously purchased remotely controllable power strips or KVM switches for remote management, you may no longer need them. If your organization did not previously purchase remote management hardware, you will find you have a significant number of remote management and diagnostic features you did not have before. Either way, IPMI should help reduce total cost of ownership.
What are the Cons of IPMI?
While IPMI is a standard, implementations vary widely between vendors. For example, one system may log a CPU failure to the SEL, while another may not, so hardware error diagnosis may be more or less useful depending on the vendor. In addition, many vendors have implemented IPMI differently or incorrectly, so IPMI software that works on the hardware of one vendor may not work on the hardware of another. The IPMI specification also allows vendors to add OEM-specific extensions to their platforms. These extensions support hardware-specific functionality or error reporting that is not in the IPMI specification, so software supplied by the vendor may be the only way to use them. Potential users of IPMI should be aware that just because a platform supports IPMI does not mean their software of choice will function on it.
The reliability of IPMI has been poorer than that of traditional remote management. The BMC must implement and perform far more complex tasks than older serial and sensor chips, and this complexity has made BMCs far less reliable. Many users of IPMI have complained of BMC failures that can only be cleared by rebooting the system. Since the BMC manages IPMI, remote power control is itself unavailable at exactly the moment a reboot is needed, which is frustrating.
In addition, IPMI over LAN has shown itself to be less reliable than traditional remote management. IPMI over LAN is typically implemented on the same Ethernet port that the server uses for its own traffic, so IPMI over LAN traffic competes with the server's operational traffic. If the server or network is busy, IPMI over LAN may be slow or unresponsive. IPMI over LAN also depends on multiple components of the platform and network all working. For example, if a power supply or Ethernet port fails, many of the remote management features listed above may not be available. Many users of IPMI consider this frustrating, since IPMI is most needed during system failures.