System Event Log Troubleshooting Guide for Intel® S5500/S3420



Cooling Subsystem

Table 28: Fan Speed Sensor – Event Trigger Offset – Next Steps

| Hex | Event Trigger Offset | Assertion Severity | Deassert Severity | Description |
|---|---|---|---|---|
| 00h | Lower non-critical going low | Degraded | OK | The fan speed has dropped below its lower non-critical threshold. |
| 02h | Lower critical going low | non-fatal | Degraded | The fan speed has dropped below its lower critical threshold. |

Next Steps

A fan speed error on a new system build is typically not caused by the fan spinning too slowly. Instead, it is usually caused by the fan being connected to the wrong header: the BMC expects fans on specific headers for each chassis and logs this event if there is no fan on an expected header.

1. Refer to the Quick Start Guide or the Service Guide to identify the correct fan headers to use.

2. Ensure the latest FRUSDR update has been run and the correct chassis was detected or selected.

3. If you are sure this was done, the event may be a sign of impending fan failure, although this normally applies only if the system has been in use for a while. Replace the fan.
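As a rough illustration of how the Table 28 severities could be applied when parsing a SEL dump, the sketch below maps an event trigger offset and event direction to a severity. It assumes that fan speed events follow the same byte layout the guide gives for the other fan sensors (direction in bit 7 of byte 13, offset in bits [3:0] of Event Data 1). The function and dictionary names are hypothetical and not part of any Intel tool.

```python
# Table 28 as a lookup: offset -> (description, assertion severity, deassert severity)
FAN_SPEED_OFFSETS = {
    0x00: ("Lower non-critical going low", "Degraded", "OK"),
    0x02: ("Lower critical going low", "non-fatal", "Degraded"),
}


def classify_fan_speed_event(event_dir_type: int, event_data1: int) -> dict:
    """Classify a fan speed event using the severities from Table 28.

    Assumption: bit 7 of the event direction/type byte is the direction
    (0 = assertion, 1 = deassertion) and bits [3:0] of Event Data 1 carry
    the event trigger offset.
    """
    deassert = bool(event_dir_type & 0x80)
    offset = event_data1 & 0x0F
    description, assert_sev, deassert_sev = FAN_SPEED_OFFSETS.get(
        offset, ("Offset not listed in Table 28", "unknown", "unknown")
    )
    return {
        "offset": offset,
        "description": description,
        "direction": "deassertion" if deassert else "assertion",
        "severity": deassert_sev if deassert else assert_sev,
    }


# Hypothetical sample values: assertion of offset 02h.
event = classify_fan_speed_event(0x01, 0x52)
print(event["description"], "/", event["direction"], "/", event["severity"])
```

A deassertion of the same offset reports the lower severity from the Deassert Severity column, which can help confirm that a fan has recovered after being moved to the correct header.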

5.1.2 Fan Presence and Redundancy Sensors

Fan presence sensors are only implemented for hot-swap fans and require an additional pin on the fan header. Fan redundancy is an aggregate of the fan presence sensors and warns when redundancy is lost. Typically, the redundancy mode on Intel® servers is n+1 (if one fan fails, there are still sufficient fans to cool the system, but it is no longer redundant), although other modes are also possible.
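The n+1 idea can be sketched in a few lines. This is only an illustration of the concept, not the BMC's actual algorithm: `fans_required` (the number of fans needed to cool the system) is a hypothetical parameter, and the returned strings simply borrow the vocabulary used for the redundancy event offsets.

```python
def redundancy_state(fans_present: int, fans_required: int) -> str:
    """Illustrative n+1 classification (not the BMC's real logic).

    fans_required is the hypothetical minimum number of working fans
    needed to cool the system; n+1 means one spare beyond that.
    """
    if fans_present > fans_required:
        return "fully redundant"
    if fans_present == fans_required:
        return "non-redundant, sufficient"
    return "non-redundant, insufficient"


# Hypothetical chassis that needs 5 fans and ships with 6.
for present in (6, 5, 4):
    print(present, "fans present:", redundancy_state(present, fans_required=5))
```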

Table 29: Fan Presence Sensors Typical Characteristics

| Byte | Field | Description |
|---|---|---|
| 11 | Sensor Type | 04h = Fan |
| 12 | Sensor Number | 40h-45h (Chassis specific) |
| 13 | Event Direction and Event Type | [7] Event direction: 0b = Assertion Event, 1b = Deassertion Event; [6:0] Event Type = 08h (Generic “digital” Discrete) |
| 14 | Event Data 1 | [7:6]: 00b = Unspecified Event Data 2; [5:4]: 00b = Unspecified Event Data 3; [3:0]: Event Trigger Offset as described in Table 30 |
| 15 | Event Data 2 | Not used |
| 16 | Event Data 3 | Not used |

Intel order number G74211-002, Revision 1.1

The following table describes the severity of each of the event triggers for both assertion and deassertion.

Table 30: Fan Presence Sensors – Event Trigger Offset – Next Steps

| Hex | Event Trigger Offset | Event | Severity | Description | Next Steps |
|---|---|---|---|---|---|
| 01h | Device Present | Assertion | OK | A fan was inserted. This event may also get logged when the BMC initializes when AC is applied. | Informational only |
| 01h | Device Present | Deassert | Degraded | A fan was removed, or was not present at the expected location when the BMC initialized. | See Next Steps below |

Next Steps

These events only get generated in systems with hot-swappable fans, and normally only when a fan is physically inserted or removed. If fans were not physically removed:

1. Use the Quick Start Guide to check whether the right fan headers were used.

2. Swap the fans around to see whether the problem stays with the location or follows the fan.

3. Replace the fan or fan wiring/housing depending on the outcome of step 2.

4. Ensure the latest FRUSDR update has been run and the correct chassis was detected or selected.
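To make the byte layout for fan presence events concrete, here is a minimal decoding sketch. It assumes the record is a raw 16-byte SEL entry and that the byte numbers in the characteristics table are 1-based (so byte 11 is index 10). The function name and the sample record are hypothetical, and the sensor number is reported but not validated because the 40h-45h range is chassis specific.

```python
def decode_fan_presence_event(record: bytes):
    """Decode bytes 11-16 of a 16-byte SEL record as a fan presence event.

    Returns None when the record is not a fan presence event
    (sensor type 04h with event type 08h).
    """
    if len(record) != 16:
        raise ValueError("expected a 16-byte SEL record")
    # Table byte numbers are 1-based, so byte 11 is record[10].
    sensor_type, sensor_number, dir_type, data1 = record[10:14]
    if sensor_type != 0x04 or (dir_type & 0x7F) != 0x08:
        return None
    deassert = bool(dir_type & 0x80)   # bit 7: 1b = deassertion
    offset = data1 & 0x0F              # bits [3:0]: event trigger offset
    if offset != 0x01:
        severity, meaning = "unknown", "Offset not listed for fan presence"
    elif deassert:
        severity, meaning = "Degraded", "Fan removed or not present at BMC initialization"
    else:
        severity, meaning = "OK", "Fan inserted, or BMC initialized after AC was applied"
    return {
        "sensor_number": sensor_number,
        "direction": "deassertion" if deassert else "assertion",
        "severity": severity,
        "meaning": meaning,
    }


# Hypothetical record: sensor 41h, deassertion of "Device Present".
sample = bytes([0x01, 0x00, 0x02, 0, 0, 0, 0, 0x20, 0x00, 0x04,
                0x04, 0x41, 0x88, 0x01, 0xFF, 0xFF])
print(decode_fan_presence_event(sample))
```

A deassertion logged without anyone touching the fans is the case the steps above address: the sensor number identifies the header location to check first.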

Table 31: Fan Redundancy Sensors Typical Characteristics

| Byte | Field | Description |
|---|---|---|
| 11 | Sensor Type | 04h = Fan |
| 12 | Sensor Number | 46h |
| 13 | Event Direction and Event Type | [7] Event direction: 0b = Assertion Event, 1b = Deassertion Event; [6:0] Event Type = 0Bh (Generic Discrete) |
| 14 | Event Data 1 | [7:6]: 00b = Unspecified Event Data 2; [5:4]: 00b = Unspecified Event Data 3; [3:0]: Event Trigger Offset as described in Table 32 |
| 15 | Event Data 2 | Not used |
| 16 | Event Data 3 | Not used |

The following table describes the severity of each of the event triggers for both assertion and deassertion.

Table 32: Fan Redundancy Sensor – Event Trigger Offset – Next Steps

| Hex | Event Trigger Offset | Description |
|---|---|---|
| 00h | Fully redundant | |
| 01h | Redundancy lost | |
| 02h | Redundancy degraded | |
| 03h | Non-redundant, sufficient from redundant | System has lost one or more fans and is running in non-redundant mode. There are enough fans to keep the system properly cooled, but fan speeds will boost. |
| 04h | Non-redundant, sufficient from insufficient | |
| 05h | Non-redundant, insufficient | System has lost fans and may no longer be able to cool itself adequately. Overheating may occur if this situation persists for an extended period. |
| 06h | Non-redundant, degraded from fully redundant | System has lost one or more fans and is running in non-redundant mode. There are enough fans to keep the system properly cooled, but fan speeds will boost. |
| 07h | Redundant, degraded from non-redundant | System has lost one or more fans and is running in a degraded mode, but is still redundant. There are enough fans to keep the system properly cooled. |

Next Steps

Fan redundancy loss indicates failure of one or more fans.

Look for lower non-critical or lower critical fan speed events, or fan removal events, in the SEL to identify which fan is causing the problem, then follow the troubleshooting steps for those event types.
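The search described above can be sketched as a simple scan over raw SEL records. This is a minimal illustration, assuming 16-byte records with 1-based byte numbering, fan speed events using the standard IPMI threshold event type 01h (not stated in this section), and fan removal appearing as a deassertion of offset 01h on an event type 08h sensor. The names `REDUNDANCY_OFFSETS` and `suspect_fan_sensors` are hypothetical.

```python
# Offset names for the fan redundancy sensor (event type 0Bh).
REDUNDANCY_OFFSETS = {
    0x00: "Fully redundant",
    0x01: "Redundancy lost",
    0x02: "Redundancy degraded",
    0x03: "Non-redundant, sufficient from redundant",
    0x04: "Non-redundant, sufficient from insufficient",
    0x05: "Non-redundant, insufficient",
    0x06: "Non-redundant, degraded from fully redundant",
    0x07: "Redundant, degraded from non-redundant",
}


def suspect_fan_sensors(records):
    """Return sensor numbers of fan events that may explain a redundancy loss."""
    suspects = []
    for rec in records:
        # Skip malformed records and anything that is not sensor type 04h (Fan).
        if len(rec) != 16 or rec[10] != 0x04:
            continue
        sensor, dir_type = rec[11], rec[12]
        deassert = bool(dir_type & 0x80)
        event_type = dir_type & 0x7F
        offset = rec[13] & 0x0F
        # Assumption: 01h is the threshold event type used by fan speed sensors.
        speed_low = event_type == 0x01 and not deassert and offset in (0x00, 0x02)
        fan_removed = event_type == 0x08 and deassert and offset == 0x01
        if (speed_low or fan_removed) and sensor not in suspects:
            suspects.append(sensor)
    return suspects
```

The returned sensor numbers still need to be mapped to physical fan locations using the chassis documentation, since the numbering is chassis specific.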
