System Event Log Troubleshooting Guide for Intel®
Introduction
The BMC allows access to the SEL through in-band and out-of-band mechanisms. Various tools and utilities can be used to access the SEL, including the Intel® SELViewer and multiple open source IPMI tools.
1.2.3 Intel® Intelligent Power Node Manager Version 1.5
Intel® Intelligent Power Node Manager version 1.5 (NM) is a platform-resident technology that enforces power and thermal policies for the platform. These policies are applied by exploiting subsystem knobs (such as processor P and T states) that can be used to control power consumption. Intel® Intelligent Power Node Manager enables data center power and thermal management by exposing an external interface to management software through which platform policies can be specified. It also enables specific data center power management usage models such as power limiting.
The configuration and control commands are used by the external management software or the BMC to configure and control the Intel® Intelligent Power Node Manager feature. Because the Platform Services firmware does not have any external interface, external commands are first received by the BMC over LAN and then relayed to the Platform Services firmware over the IPMB channel. The BMC acts as a relay and as the transport conversion device for these commands. For simplicity, the commands from the management console might be encapsulated in a generic CONFIG packet format (config data length, config data blob) sent to the BMC, so that the BMC does not have to parse the actual configuration data.
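The length-plus-blob encapsulation described above can be illustrated with a minimal Python sketch. The guide only names the two fields (config data length, config data blob); the 2-byte little-endian length prefix and the function names `wrap_config` and `relay_config` are assumptions made for illustration, not part of any Intel specification.

```python
import struct

# Assumption: a 2-byte little-endian length prefix. The source only says the
# packet carries (config data length, config data blob).
_LEN = struct.Struct("<H")


def wrap_config(blob: bytes) -> bytes:
    """Management console side: prefix the opaque blob with its length."""
    if len(blob) > 0xFFFF:
        raise ValueError("blob too large for a 16-bit length field")
    return _LEN.pack(len(blob)) + blob


def relay_config(packet: bytes) -> bytes:
    """BMC side: check the length, then pass the blob on without parsing it."""
    if len(packet) < _LEN.size:
        raise ValueError("packet shorter than its length field")
    (length,) = _LEN.unpack_from(packet)
    blob = packet[_LEN.size:]
    if len(blob) != length:
        raise ValueError("declared length does not match payload size")
    return blob  # in a real BMC this would be forwarded unchanged over IPMB


packet = wrap_config(b"\x01\x02\x03")
forwarded = relay_config(packet)
```

The point of the design is visible in `relay_config`: the relay validates only the framing and never interprets the payload, so the configuration format can change without touching the BMC.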
The BMC provides the access point for remote commands from external management software and generates alerts to it. Intel® Intelligent Power Node Manager on the Intel® Manageability Engine (Intel® ME) is an IPMI satellite controller. A mechanism needs to exist to forward commands to the Intel® ME and send the response back to the originator. Similarly, events from the Intel® ME have to be sent as alerts outside of the BMC. It is the responsibility of the BMC to implement these mechanisms for communication with Intel® Intelligent Power Node Manager.
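The two responsibilities described above (forwarding commands to the satellite controller and returning the response, and passing satellite events outward as alerts) can be modeled with a small Python sketch. Every name here (`SatelliteController`, `BmcBridge`, and their methods) is hypothetical, and the echo response is a stand-in; real traffic would use IPMI message framing over IPMB, which this sketch does not attempt to reproduce.

```python
from typing import Callable, List


class SatelliteController:
    """Hypothetical stand-in for the ME; the echo reply is illustrative only."""

    def handle(self, request: bytes) -> bytes:
        # Illustrative reply: a zero status byte followed by the request bytes.
        return b"\x00" + request


class BmcBridge:
    """Hypothetical model of the BMC acting as relay between LAN and the ME."""

    def __init__(self, satellite: SatelliteController) -> None:
        self._satellite = satellite
        self._alert_sinks: List[Callable[[bytes], None]] = []

    def subscribe_alerts(self, sink: Callable[[bytes], None]) -> None:
        """Register an external consumer of alerts."""
        self._alert_sinks.append(sink)

    def forward_command(self, request: bytes) -> bytes:
        """Relay a request to the satellite and return its response."""
        return self._satellite.handle(request)

    def on_satellite_event(self, event: bytes) -> None:
        """Send an event raised by the satellite to every alert consumer."""
        for sink in self._alert_sinks:
            sink(event)


bridge = BmcBridge(SatelliteController())
received: List[bytes] = []
bridge.subscribe_alerts(received.append)
response = bridge.forward_command(b"\x10")
bridge.on_satellite_event(b"\xaa")
```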
The full specification can be downloaded from the following link: http://www.intel.com/content/dam/doc/technical-specification/intelligent-power-node-manager-1-5-specification.pdf
Revision 1.1 Intel order number G74211-002
Table of contents
- Introduction
- Purpose
- Industry Standard
- Intelligent Platform Management Interface (IPMI)
- Baseboard Management Controller (BMC)
- Intelligent Power Node Manager Version
- Basic Decoding of a SEL Record
- Default Values in the SEL Records
- Sensor Cross Reference List
- BMC owned Sensors (GID = 0020h)
- BIOS POST owned Sensors (GID = 0001h)
- BIOS SMI owned Sensors (GID = 0033h)
- Hot Swap Controller Firmware owned Sensors (GID = 00C0h/00C2h)
- Node Manager / ME Firmware owned Sensors (GID = 002Ch or 602Ch)
- Microsoft* OS owned Events (GID = 0041)
- Linux* Kernel Panic Events (GID = 0021)
- Power Subsystems
- Voltage Sensors
- Power Unit
- Power Unit Status Sensor
- Power Unit Redundancy Sensor
- Power Supply
- Power Supply Status Sensors
- Power Supply AC Power Input Sensors
- Power Supply Current Output % Sensors
- Power Supply Temperature Sensors
- Cooling Subsystem
- Fan Sensors
- Fan Speed Sensors
- Fan Presence and Redundancy Sensors
- Temperature Sensors
- Regular Temperature Sensors
- Thermal Margin Sensors
- Processor Thermal Control % Sensors
- Discrete Thermal Sensors
- Processor Subsystem
- Processor Status Sensor
- Catastrophic Error Sensor
- Catastrophic Error Sensor – Next Steps
- CPU Missing Sensor
- CPU Missing Sensor – Next Steps
- QuickPath Interconnect Error Sensors
- QPI Correctable Error Sensor
- QPI Non-Fatal Error Sensor
- QPI Fatal and Fatal
- Memory Subsystem
- Memory RAS Mirroring and Sparing
- Mirroring Configuration Status
- Mirrored Redundancy State Sensor
- Sparing Configuration Status
- Sparing Redundancy State Sensor
- ECC and Address Parity
- Memory Correctable and Uncorrectable ECC Error
- Memory Address Parity Error
- PCI Express* and Legacy PCI Subsystem
- PCI Express* Errors
- PCI Express* Correctable Errors
- PCI Express* Fatal Errors
- Legacy PCI Errors
- System BIOS Events
- System Events
- System Boot
- Timestamp Clock Synchronization
- System Firmware Progress (Formerly Post Error)
- System Firmware Progress (Formerly Post Error) – Next Steps
- Chassis Subsystem
- Physical Security
- Chassis Intrusion
- LAN Leash Lost
- FP (NMI) Interrupt
- FP (NMI) Interrupt – Next Steps
- Button Press Events
- Miscellaneous Events
- IPMI Watchdog
- SMI Timeout
- SMI Timeout – Next Steps
- System Event Log Cleared
- System Event – PEF Action
- System Event – PEF Action – Next Steps
- Hot Swap Controller Events
- HSC Backplane Temperature Sensor
- HSC Drive Slot Status Sensor
- HSC Drive Slot Status Sensor – Next Steps
- HSC Drive Presence Sensor
- HSC Drive Presence Sensor – Next Steps
- Manageability Engine (ME) Events
- Node Manager Exception Event
- Node Manager Exception Event – Next Steps
- Node Manager Health Event
- Node Manager Health Event – Next Steps
- Node Manager Operational Capabilities Change
- Node Manager Operational Capabilities Change – Next Steps
- Node Manager Alert Threshold Exceeded
- Node Manager Alert Threshold Exceeded – Next Steps
- ME Firmware Health Event
- ME Firmware Health Event – Next Steps
- Microsoft Windows* Records
- Boot-up Event Records
- Shutdown Event Records
- Bug Check / Blue Screen Event Records
- Linux* Kernel Panic Records
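
The generator IDs (GID) named in the Sensor Cross Reference List entries of the table of contents can be collected into a small lookup table, which may help when sorting SEL records by the firmware or software that logged them. The Python sketch below is illustrative only: the names `GENERATOR_OWNERS` and `owner_of` are hypothetical, and it assumes that the two values printed without an "h" suffix (0041 and 0021) are hexadecimal like the others.

```python
# GID values as listed in the table of contents. Assumption: 0041 and 0021
# are hexadecimal even though they are printed without the "h" suffix.
GENERATOR_OWNERS = {
    0x0020: "BMC",
    0x0001: "BIOS POST",
    0x0033: "BIOS SMI",
    0x00C0: "Hot Swap Controller Firmware",
    0x00C2: "Hot Swap Controller Firmware",
    0x002C: "Node Manager / ME Firmware",
    0x602C: "Node Manager / ME Firmware",
    0x0041: "Microsoft OS",
    0x0021: "Linux Kernel Panic",
}


def owner_of(gid: int) -> str:
    """Return the owner listed for a generator ID, or 'unknown'."""
    return GENERATOR_OWNERS.get(gid, "unknown")
```

For example, `owner_of(0x0020)` returns `"BMC"`, and any GID not in the list returns `"unknown"`.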