Problem Determination and Service Guide for the IBM Power PS700

Power Systems
Problem Determination and Service
Guide for the
IBM Power PS700 (8406-70Y)
IBM
GI11-9831-00
Note
Before using this information and the product it supports, read the information in “Notices,” on page 271, “Safety notices”
on page v, the IBM Systems Safety Notices manual, G229-9054, and the IBM Environmental Notices and User Guide, Z125–5823.
This edition applies to IBM Power Systems servers that contain the POWER7 processor and to all associated
models.
© Copyright IBM Corporation 2010, 2011.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents
Safety notices . . . . . . . . . . . . v
Chapter 1. Introduction . . . . . . . . 1
Related documentation . . . . . . . . . . 1
Notices and statements . . . . . . . . . . 2
Features and specifications . . . . . . . . 2
Supported DIMMs . . . . . . . . . . . . 4
Blade server control panel buttons and LEDs . . 5
Turning on the blade server . . . . . . . . 6
Turning off the blade server . . . . . . . . 7
System-board layouts . . . . . . . . . . 8
System-board connectors . . . . . . . . . 8
System-board LEDs . . . . . . . . . . . 9
Chapter 2. Diagnostics . . . . . . . . 11
Diagnostic tools . . . . . . . . . . . . . 11
Collecting dump data . . . . . . . . . . . 13
Location codes . . . . . . . . . . . . . 14
Reference codes . . . . . . . . . . . . . 15
System reference codes (SRCs) . . . . . . . 16
1xxxyyyy SRCs . . . . . . . . . . . 17
6xxxyyyy SRCs . . . . . . . . . . . 21
A1xxyyyy service processor SRCs . . . . . 24
AA00E1A8 to AA260005 Partition firmware
attention codes . . . . . . . . . . . 25
Bxxxxxxx Service processor early termination
SRCs . . . . . . . . . . . . . . 28
B200xxxx Logical partition SRCs . . . . . 29
B700xxxx Licensed internal code SRCs . . . 39
BA000010 to BA400002 Partition firmware
SRCs . . . . . . . . . . . . . . 48
POST progress codes (checkpoints) . . . . . 84
C1001F00 to C1645300 Service processor
checkpoints . . . . . . . . . . . . 85
C2001000 to C20082FF Virtual service
processor checkpoints . . . . . . . . . 93
IPL status progress codes . . . . . . . 102
C700xxxx Server firmware IPL status
checkpoints . . . . . . . . . . . 102
CA000000 to CA2799FF Partition firmware
checkpoints . . . . . . . . . . . . 102
D1001xxx to D1xx3FFF Service processor
dump codes . . . . . . . . . . . . 120
D1xx3y01 to D1xx3yF2 Service processor
dump codes . . . . . . . . . . . 125
D1xx900C to D1xxC003 Service processor
power-off checkpoints . . . . . . . . 128
Service request numbers (SRNs) . . . . . . 129
Using the SRN tables . . . . . . . . . 129
101-711 through FFC-725 SRNs . . . . . 129
A00-FF0 through A24-xxx SRNs . . . . . 157
SCSD Devices SRNs (ssss-102 to ssss-640) . . 177
Failing function codes 151 through 2E33 . . 181
Error logs . . . . . . . . . . . . . . 183
Checkout procedure . . . . . . . . . . . 184
About the checkout procedure . . . . . . 184
Performing the checkout procedure . . . . 184
Verifying the partition configuration . . . . 186
Running the diagnostics program . . . . . 186
Starting AIX concurrent diagnostics . . . . 186
Starting stand-alone diagnostics from a CD . . 187
Starting stand-alone diagnostics from a NIM
server . . . . . . . . . . . . . . 188
Using the diagnostics program . . . . . . 189
Boot problem resolution . . . . . . . . . 190
Troubleshooting tables . . . . . . . . . 191
General problems . . . . . . . . . . . 191
Drive problems . . . . . . . . . . . . 192
Intermittent problems . . . . . . . . . 192
Management module service processor
problems . . . . . . . . . . . . . 193
Memory problems . . . . . . . . . . . 193
Microprocessor problems . . . . . . . . 194
Network connection problems . . . . . . 194
PCI expansion card (PIOCARD) problem
isolation procedure . . . . . . . . . . 194
Optional device problems . . . . . . . . 196
Power problems . . . . . . . . . . . 196
POWER Hypervisor (PHYP) problems . . . . 198
Service processor problems . . . . . . . 200
Software problems . . . . . . . . . . . 213
Universal Serial Bus (USB) port problems . . 213
Light path diagnostics . . . . . . . . . 214
Viewing the light path diagnostic LEDs . . . 214
Light path diagnostics LEDs . . . . . . . 215
Isolating firmware problems . . . . . . . 218
Save vfchost map data . . . . . . . . . 218
Restore vfchost map data . . . . . . . . 219
Recovering the system firmware . . . . . . 220
Starting the PERM image . . . . . . . . 220
Starting the TEMP image . . . . . . . . 221
Recovering the TEMP image from the PERM
image . . . . . . . . . . . . . . 221
Verifying the system firmware levels . . . . 221
Committing the TEMP system firmware image . . 222
Solving shared BladeCenter resource problems . . 222
Solving shared media tray problems . . . . 223
Solving shared network connection problems . . 225
Solving shared power problems . . . . . . 225
Solving shared video problems . . . . . . 226
Solving undetermined problems . . . . . . 227
Calling IBM for service . . . . . . . . . 228
Chapter 3. Parts listing, Type 8406 . . . 229
Chapter 4. Removing and replacing
blade server components . . . . . . 233
Installation guidelines . . . . . . . . . 233
System reliability guidelines . . . . . . . 234
Handling static-sensitive devices . . . . . 234
Returning a device or component . . . . . 234
Removing the blade server from a BladeCenter
unit . . . . . . . . . . . . . . . 235
Installing the blade server in a BladeCenter unit . . 236
Removing and replacing Tier 1 CRUs . . . . 237
Removing the blade server cover . . . . . 237
Installing and closing the blade server cover . . 239
Removing the bezel assembly . . . . . . . 240
Installing the bezel assembly . . . . . . . 240
Removing a drive . . . . . . . . . . . 241
Installing a drive . . . . . . . . . . . 242
Removing a memory module . . . . . . . 244
Installing a memory module . . . . . . . 245
Removing and installing an I/O expansion card . . 246
Removing a CIOv form-factor expansion card . . 247
Installing a CIOv form-factor expansion card . . 247
Removing a combination-form-factor
expansion card . . . . . . . . . . . 249
Installing a combination-form-factor
expansion card . . . . . . . . . . . 249
Removing the battery . . . . . . . . . 250
Installing the battery . . . . . . . . . . 251
Removing the disk drive tray . . . . . . . 252
Installing the disk drive tray . . . . . . . 253
Removing the tier 2 management card . . . . 255
Installing the tier 2 management card . . . . 256
Obtaining a PowerVM Virtualization Engine
system technologies activation code . . . . 257
Replacing the FRU system-board and chassis
assembly . . . . . . . . . . . . . 260
Chapter 5. Configuring . . . . . . . 263
Updating the firmware . . . . . . . . . 263
Configuring the blade server . . . . . . . 264
Using the SMS utility . . . . . . . . . . 265
Starting the SMS utility . . . . . . . . 265
SMS utility menu choices . . . . . . . . 265
Creating a CE login . . . . . . . . . . 266
Configuring the Gigabit Ethernet controllers . . 266
Blade server Ethernet controller enumeration . . 267
MAC addresses for host Ethernet adapters . . 268
Configuring a RAID array . . . . . . . . 269
Updating IBM Director . . . . . . . . . 269
Appendix. Notices . . . . . . . . . 271
Trademarks . . . . . . . . . . . . . 272
Electronic emission notices . . . . . . . 273
Class A Notices . . . . . . . . . . . 273
Class B Notices . . . . . . . . . . . 277
Terms and conditions . . . . . . . . . 280
Safety notices
Safety notices may be printed throughout this guide:
v DANGER notices call attention to a situation that is potentially lethal or extremely hazardous to
people.
v CAUTION notices call attention to a situation that is potentially hazardous to people because of some
existing condition.
v Attention notices call attention to the possibility of damage to a program, device, system, or data.
World Trade safety information
Several countries require the safety information contained in product publications to be presented in their
national languages. If this requirement applies to your country, a safety information booklet is included
in the publications package shipped with the product. The booklet contains the safety information in
your national language with references to the U.S. English source. Before using a U.S. English publication
to install, operate, or service this product, you must first become familiar with the related safety
information in the booklet. You should also refer to the booklet any time you do not clearly understand
any safety information in the U.S. English publications.
German safety information
Das Produkt ist nicht für den Einsatz an Bildschirmarbeitsplätzen im Sinne § 2 der
Bildschirmarbeitsverordnung geeignet.
Laser safety information
IBM® servers can use I/O cards or features that are fiber-optic based and that utilize lasers or LEDs.
Laser compliance
IBM servers may be installed inside or outside of an IT equipment rack.
DANGER
When working on or around the system, observe the following precautions:
Electrical voltage and current from power, telephone, and communication cables are hazardous. To
avoid a shock hazard:
v Connect power to this unit only with the IBM provided power cord. Do not use the IBM
provided power cord for any other product.
v Do not open or service any power supply assembly.
v Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration
of this product during an electrical storm.
v The product might be equipped with multiple power cords. To remove all hazardous voltages,
disconnect all power cords.
v Connect all power cords to a properly wired and grounded electrical outlet. Ensure that the outlet
supplies proper voltage and phase rotation according to the system rating plate.
v Connect any equipment that will be attached to this product to properly wired outlets.
v When possible, use one hand only to connect or disconnect signal cables.
v Never turn on any equipment when there is evidence of fire, water, or structural damage.
v Disconnect the attached power cords, telecommunications systems, networks, and modems before
you open the device covers, unless instructed otherwise in the installation and configuration
procedures.
v Connect and disconnect cables as described in the following procedures when installing, moving,
or opening covers on this product or attached devices.
To Disconnect:
1. Turn off everything (unless instructed otherwise).
2. Remove the power cords from the outlets.
3. Remove the signal cables from the connectors.
4. Remove all cables from the devices.
To Connect:
1. Turn off everything (unless instructed otherwise).
2. Attach all cables to the devices.
3. Attach the signal cables to the connectors.
4. Attach the power cords to the outlets.
5. Turn on the devices.
(D005)
DANGER
Observe the following precautions when working on or around your IT rack system:
v Heavy equipment–personal injury or equipment damage might result if mishandled.
v Always lower the leveling pads on the rack cabinet.
v Always install stabilizer brackets on the rack cabinet.
v To avoid hazardous conditions due to uneven mechanical loading, always install the heaviest
devices in the bottom of the rack cabinet. Always install servers and optional devices starting
from the bottom of the rack cabinet.
v Rack-mounted devices are not to be used as shelves or work spaces. Do not place objects on top
of rack-mounted devices.
v Each rack cabinet might have more than one power cord. Be sure to disconnect all power cords in
the rack cabinet when directed to disconnect power during servicing.
v Connect all devices installed in a rack cabinet to power devices installed in the same rack
cabinet. Do not plug a power cord from a device installed in one rack cabinet into a power
device installed in a different rack cabinet.
v An electrical outlet that is not correctly wired could place hazardous voltage on the metal parts of
the system or the devices that attach to the system. It is the responsibility of the customer to
ensure that the outlet is correctly wired and grounded to prevent an electrical shock.
CAUTION
v Do not install a unit in a rack where the internal rack ambient temperatures will exceed the
manufacturer's recommended ambient temperature for all your rack-mounted devices.
v Do not install a unit in a rack where the air flow is compromised. Ensure that air flow is not
blocked or reduced on any side, front, or back of a unit used for air flow through the unit.
v Consideration should be given to the connection of the equipment to the supply circuit so that
overloading of the circuits does not compromise the supply wiring or overcurrent protection. To
provide the correct power connection to a rack, refer to the rating labels located on the
equipment in the rack to determine the total power requirement of the supply circuit.
v (For sliding drawers.) Do not pull out or install any drawer or feature if the rack stabilizer brackets
are not attached to the rack. Do not pull out more than one drawer at a time. The rack might
become unstable if you pull out more than one drawer at a time.
v (For fixed drawers.) This drawer is a fixed drawer and must not be moved for servicing unless
specified by the manufacturer. Attempting to move the drawer partially or completely out of the
rack might cause the rack to become unstable or cause the drawer to fall out of the rack.
(R001)
CAUTION:
Removing components from the upper positions in the rack cabinet improves rack stability during
relocation. Follow these general guidelines whenever you relocate a populated rack cabinet within a
room or building:
v Reduce the weight of the rack cabinet by removing equipment starting at the top of the rack
cabinet. When possible, restore the rack cabinet to the configuration of the rack cabinet as you
received it. If this configuration is not known, you must observe the following precautions:
– Remove all devices in the 32U position and above.
– Ensure that the heaviest devices are installed in the bottom of the rack cabinet.
– Ensure that there are no empty U-levels between devices installed in the rack cabinet below the
32U level.
v If the rack cabinet you are relocating is part of a suite of rack cabinets, detach the rack cabinet from
the suite.
v Inspect the route that you plan to take to eliminate potential hazards.
v Verify that the route that you choose can support the weight of the loaded rack cabinet. Refer to the
documentation that comes with your rack cabinet for the weight of a loaded rack cabinet.
v Verify that all door openings are at least 760 x 2030 mm (30 x 80 in.).
v Ensure that all devices, shelves, drawers, doors, and cables are secure.
v Ensure that the four leveling pads are raised to their highest position.
v Ensure that there is no stabilizer bracket installed on the rack cabinet during movement.
v Do not use a ramp inclined at more than 10 degrees.
v When the rack cabinet is in the new location, complete the following steps:
– Lower the four leveling pads.
– Install stabilizer brackets on the rack cabinet.
– If you removed any devices from the rack cabinet, repopulate the rack cabinet from the lowest
position to the highest position.
v If a long-distance relocation is required, restore the rack cabinet to the configuration of the rack
cabinet as you received it. Pack the rack cabinet in the original packaging material, or equivalent.
Also lower the leveling pads to raise the casters off of the pallet and bolt the rack cabinet to the
pallet.
(R002)
(L001)
(L002)
(L003)
All lasers are certified in the U.S. to conform to the requirements of DHHS 21 CFR Subchapter J for class
1 laser products. Outside the U.S., they are certified to be in compliance with IEC 60825 as a class 1 laser
product. Consult the label on each part for laser certification numbers and approval information.
CAUTION:
This product might contain one or more of the following devices: CD-ROM drive, DVD-ROM drive,
DVD-RAM drive, or laser module, which are Class 1 laser products. Note the following information:
v Do not remove the covers. Removing the covers of the laser product could result in exposure to
hazardous laser radiation. There are no serviceable parts inside the device.
v Use of the controls or adjustments or performance of procedures other than those specified herein
might result in hazardous radiation exposure.
(C026)
CAUTION:
Data processing environments can contain equipment transmitting on system links with laser modules
that operate at greater than Class 1 power levels. For this reason, never look into the end of an optical
fiber cable or open receptacle. (C027)
CAUTION:
This product contains a Class 1M laser. Do not view directly with optical instruments. (C028)
CAUTION:
Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following
information: laser radiation when open. Do not stare into the beam, do not view directly with optical
instruments, and avoid direct exposure to the beam. (C030)
Power and cabling information for NEBS (Network Equipment-Building System)
GR-1089-CORE
The following comments apply to the IBM servers that have been designated as conforming to NEBS
(Network Equipment-Building System) GR-1089-CORE:
The equipment is suitable for installation in the following:
v Network telecommunications facilities
v Locations where the NEC (National Electrical Code) applies
The intrabuilding ports of this equipment are suitable for connection to intrabuilding or unexposed
wiring or cabling only. The intrabuilding ports of this equipment must not be metallically connected to the
interfaces that connect to the OSP (outside plant) or its wiring. These interfaces are designed for use as
intrabuilding interfaces only (Type 2 or Type 4 ports as described in GR-1089-CORE) and require isolation
from the exposed OSP cabling. The addition of primary protectors is not sufficient protection to connect
these interfaces metallically to OSP wiring.
Note: All Ethernet cables must be shielded and grounded at both ends.
The ac-powered system does not require the use of an external surge protection device (SPD).
The dc-powered system employs an isolated DC return (DC-I) design. The DC battery return terminal
shall not be connected to the chassis or frame ground.
Chapter 1. Introduction
This problem determination and service information helps you solve problems that might occur in your
PS700 blade server. The information describes the diagnostic tools that come with the blade server, error
codes and suggested actions, and instructions for replacing failing components.
Replaceable components are of three types:
v Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If IBM
installs a Tier 1 CRU at your request, you are charged for the installation.
v Tier 2 customer replaceable unit: You can install a Tier 2 CRU yourself or request IBM to install it, at
no additional charge, under the type of warranty service that is designated for your blade server.
v Field replaceable unit (FRU): FRUs must be installed only by trained service technicians.
The serial number for the PS700 blade server can be found in the following locations:
v The bottom front of the blade server in the right corner on the 1S label.
v The bottom rear of the blade server in the right corner.
v Under the front cover door.
For information about the terms of the warranty and getting service and assistance, see the information
center or the Warranty and Support Information document on the IBM BladeCenter Documentation CD.
Related documentation
Documentation for the PS700 blade server includes documents in Portable Document Format (PDF) on
the IBM BladeCenter Documentation CD and the online information center.
The most recent version of all BladeCenter documentation is in the BladeCenter information center, which
is available online at http://publib.boulder.ibm.com/infocenter/bladectr/documentation/index.jsp.
PDF versions of the following documents are on the IBM BladeCenter Documentation CD and in the online
information center:
v Installation and User's Guide
This document contains general information about the blade server, including how to install supported
options and how to configure the blade server.
v Safety Information
This document contains translated caution and danger statements. Each caution and danger statement
that appears in the documentation has a number that you can use to locate the corresponding
statement in your language in the Safety Information document.
v Warranty and Support Information
This document contains information about the terms of the warranty and about getting service and
assistance.
Additional documents might be included in the online information center and on the IBM BladeCenter
Documentation CD.
The blade server might have features that are not described in the documentation that comes with the
blade server. Occasional updates to the documentation might include information about those features, or
technical updates might be available to provide additional information that is not included in the
documentation that comes with the blade server.
Review the online information or the Planning Guide and the Installation Guide for your IBM BladeCenter
unit. The information can help you prepare for system installation and configuration. The most current
version of each document is available in the BladeCenter information center.
Notices and statements
The caution and danger statements in this document are also in the multilingual Safety Information. Each
statement is numbered for reference to the corresponding statement in your language in the Safety
Information document.
The following notices and statements are used in this document:
v Note: These notices provide important tips, guidance, or advice.
v Important: These notices provide information or advice that might help you avoid inconvenient or
problem situations.
v Attention: These notices indicate potential damage to programs, devices, or data. An attention notice is
placed just before the instruction or situation in which damage might occur.
v Caution: These statements indicate situations that can be potentially hazardous to you. A caution
statement is placed just before the description of a potentially hazardous procedure step or situation.
v Danger: These statements indicate situations that can be potentially lethal or extremely hazardous to
you. A danger statement is placed just before the description of a potentially lethal or extremely
hazardous procedure step or situation.
Features and specifications
Features and specifications of the IBM BladeCenter PS700 blade server are summarized in this overview.
The PS700 Type 8406 is a single-wide (non-expandable) blade server. The PS700 blade server is used in an
IBM BladeCenter H (8852 and 7989), BladeCenter HT (8740 and 8750), or BladeCenter S (8886 and 7779)
chassis unit.
Notes:
v Power, cooling, removable-media drives, external ports, and advanced system management are
provided by the BladeCenter unit.
v The operating system in the blade server must provide support for the Universal Serial Bus (USB), to
enable the blade server to recognize and communicate internally with the removable-media drives and
front-panel USB ports.
Core electronics:
v 64-bit POWER7 processors (12S technology)
v Four core, single socket (4-way) processors @ 3.0 GHz
v 64 GB maximum in 8 very low profile (VLP) DIMM slots; supports 4 GB DDR3 at 1066 MHz and 8 GB
DDR3 at 800 MHz
v P5IOC2 I/O hub
On-board, integrated features:
v Two 1 Gb Ethernet ports (HEA) (two on each side)
v SAS controller
v USB 2.0
v 1 Serial over LAN (SOL) console using FSP
FSP1 Service Processor - IPMI and SOL:
v The baseboard management controller (BMC) is a flexible service processor (FSP1) with Intelligent
Platform Management Interface (IPMI), Serial over LAN (SOL), and Wake on LAN (WOL) firmware
support.
Integrated functions:
v RS-485 interface for communication with the management module
v Automatic server restart (ASR)
v SOL through FSP
v Two Universal Serial Bus (USB 2.0) buses on base planar for communication with removable-media
drives
v Optical media available by shared chassis feature
Environment:
v Air temperature:
– Blade server on: 10° to 35°C (50° to 95°F). Altitude: 0 to 914 m (3000 ft)
– Blade server on: 10° to 32°C (50° to 90°F). Altitude: 914 m to 2133 m (3000 ft to 7000 ft)
– Blade server off: -40° to 60°C (-40° to 140°F)
v Humidity:
– Blade server on: 8% to 80%
– Blade server off: 8% to 80%
PS700 Size:
v Height: 24.5 cm (9.7 inches)
v Depth: 44.6 cm (17.6 inches)
v Width: 30 mm (1.14 inches)
Local Storage:
v First DASD bay: zero or one 2.5" SAS HDD
v Second DASD bay: zero or one 2.5" SAS HDD
v SAS HDDs are 300 GB and 600 GB
v Hardware mirroring
Daughter card I/O options:
v 1 1Xe expansion card (CIOv)
v SAS Pass-through using 1Xe
v 1 High-Speed expansion card (CFFh)
Systems management:
v Supported by BladeCenter chassis management module
v Front panel LEDs
v IBM Director
v Hardware Management Console (HMC)
v Integrated Virtualization Manager (IVM)
v EnergyScale thermal management for power management/oversubscription (throttling) and
environmental sensing
v Active Energy Manager
Clusters support for:
v IBM Director
v xCAT
Virtualization support for:
PowerVM® Standard Edition hardware feature, which provides the Integrated Virtualization Manager,
Virtual I/O Server, and Director Power Systems™ Manager (DPSM).
Reliability and service features:
v Dual alternating current power supply
v BladeCenter chassis redundant and hot plug power and cooling modules
v Boot-time processor deallocation
v Blade server hot plug
v Customer setup and expansion
v Automatic reboot on power loss
v Internal and ambient temperature monitors
v ECC, chipkill memory
v System management alerts
Electrical input: 12 V dc
See the ServerProven Web site for information about supported operating-system versions and all PS700
blade server optional devices.
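The environment limits listed above can be expressed as a simple range check. The following Python sketch is illustrative only and is not part of any IBM tool. The function name `within_spec` and the data layout are assumptions. Treating an altitude of exactly 914 m as part of the lower band is also an assumption, because the two published altitude ranges share that boundary.

```python
# Operating envelope transcribed from the Environment list above.
# Each entry: (min altitude m, max altitude m, min temp C, max temp C).
OPERATING_BANDS = [
    (0, 914, 10.0, 35.0),
    (914, 2133, 10.0, 32.0),
]
STORAGE_TEMP_C = (-40.0, 60.0)  # blade server off
HUMIDITY_PCT = (8.0, 80.0)      # same range whether on or off


def within_spec(powered_on, temp_c, altitude_m, humidity_pct):
    """Return True if the conditions fall inside the published ranges."""
    if not HUMIDITY_PCT[0] <= humidity_pct <= HUMIDITY_PCT[1]:
        return False
    if not powered_on:
        # No altitude limit is published for a blade server that is off.
        return STORAGE_TEMP_C[0] <= temp_c <= STORAGE_TEMP_C[1]
    for alt_lo, alt_hi, t_lo, t_hi in OPERATING_BANDS:
        # The first matching band wins, so 914 m uses the lower band (assumption).
        if alt_lo <= altitude_m <= alt_hi:
            return t_lo <= temp_c <= t_hi
    return False  # altitude outside the documented operating range
```

For example, 34°C is acceptable for a running blade server at 500 m but not at 1500 m, where the upper limit drops to 32°C.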
Supported DIMMs
Each planar in the PS700 blade server contains eight very low profile (VLP) memory connectors for
registered dual inline memory modules (RDIMMs). The maximum size for a single DIMM is 8 GB. The
total memory capacity of the PS700 ranges from a minimum of 4 GB to a maximum of 64 GB.
See Chapter 3, “Parts listing, Type 8406,” on page 229 for memory modules that you can order from IBM.
Memory module rules:
v Install DIMM fillers in unused DIMM slots for proper cooling.
v Install DIMMs in pairs (1 and 3, 6 and 8, 2 and 4, 5 and 7).
v Both DIMMs in a pair must be the same size, speed, type, and technology. You can mix compatible
DIMMs from different manufacturers.
v Each DIMM within a processor-support group (1-4 and 5-8) must be the same size and speed.
v Install only supported DIMMs, as described on the ServerProven Web site. See
http://www.ibm.com/servers/eserver/serverproven/compat/us/.
v Installing or removing DIMMs changes the configuration of the blade server. After you install or
remove a DIMM, the blade server is automatically re-configured, and the new configuration
information is stored.
v See “System-board connectors” on page 8 for DIMM connector locations.
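The pairing and grouping rules above can be checked mechanically before a configuration is installed. The following Python sketch is a hypothetical helper, not an IBM utility. The name `check_dimm_config` and the representation of a DIMM as a `(size_gb, speed_mhz)` tuple are assumptions. DIMM type and technology, which the rules also require to match within a pair, are not modeled here.

```python
# Pairs and processor-support groups transcribed from the memory module rules.
PAIRS = [(1, 3), (6, 8), (2, 4), (5, 7)]
GROUPS = [(1, 2, 3, 4), (5, 6, 7, 8)]


def check_dimm_config(installed):
    """Check a proposed DIMM layout against the pairing and grouping rules.

    installed maps slot number -> (size_gb, speed_mhz) for populated slots.
    Returns a list of rule violations; an empty list means none were found.
    """
    problems = []
    for a, b in PAIRS:
        if (a in installed) != (b in installed):
            problems.append(f"slots {a} and {b} must be populated together")
        elif a in installed and installed[a] != installed[b]:
            problems.append(f"slots {a} and {b} hold mismatched DIMMs")
    for group in GROUPS:
        specs = {installed[s] for s in group if s in installed}
        if len(specs) > 1:
            problems.append(f"slots {group[0]}-{group[-1]} mix sizes or speeds")
    return problems
```

A layout with 4 GB DIMMs in slots 1 and 3 and 8 GB DIMMs in slots 6 and 8 passes, because the two pairs belong to different processor-support groups.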
Table 1 shows allowable placement of DIMM modules:
Table 1. Memory module combinations
DIMM     PS700 Base blade planar (P1) DIMM slots
count    1    2    3    4    5    6    7    8
2        X         X
4        X         X              X         X
6        X    X    X    X         X         X
8        X    X    X    X    X    X    X    X
Figure 1. DIMM connectors. Base unit connectors
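The placement in Table 1 follows from populating the DIMM pairs one after another. The following Python sketch assumes that the pairs are filled in the order in which the memory module rules list them (1 and 3, then 6 and 8, then 2 and 4, then 5 and 7). That order is an inference from the table, and the function names `slots_for` and `fillers_for` are hypothetical.

```python
# Assumed fill order, taken from the order of the pairs in the memory module rules.
FILL_ORDER = [(1, 3), (6, 8), (2, 4), (5, 7)]
ALL_SLOTS = range(1, 9)


def slots_for(dimm_count):
    """Return the sorted DIMM slots to populate for a given DIMM count."""
    if dimm_count not in (2, 4, 6, 8):
        raise ValueError("DIMMs are installed in pairs: use 2, 4, 6, or 8")
    slots = []
    for pair in FILL_ORDER[: dimm_count // 2]:
        slots.extend(pair)
    return sorted(slots)


def fillers_for(dimm_count):
    """Return the slots that need DIMM fillers for proper cooling."""
    populated = set(slots_for(dimm_count))
    return [slot for slot in ALL_SLOTS if slot not in populated]
```

With four DIMMs, for instance, `slots_for(4)` gives slots 1, 3, 6, and 8, and `fillers_for(4)` gives the remaining slots 2, 4, 5, and 7.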
Blade server control panel buttons and LEDs
Blade server control panel buttons and LEDs provide operational controls and status indicators.
Note: Figure 2 shows the control-panel door in the closed (normal) position. To access the power-control
button, you must open the control-panel door.
Figure 2. Blade server control panel buttons and LEDs
▌1▐ Media-tray select button: Press this button to associate the shared BladeCenter unit media tray
(removable-media drives and front-panel USB ports) with the blade server. The LED on the button flashes
while the request is being processed, then is lit when the ownership of the media tray has been
transferred to the blade server. It can take approximately 20 seconds for the operating system in the blade
server to recognize the media tray.
If there is no response when you press the media-tray select button, use the management module to
determine whether local control has been disabled on the blade server.
Note: The operating system in the blade server must provide USB support for the blade server to
recognize and use the removable-media drives and USB ports.
▌2▐ Information LED: When this amber LED is lit, it indicates that information about a system error for
the blade server has been placed in the management-module event log. The information LED can be
turned off through the Web interface of the management module or through IBM Director Console.
▌3▐ Blade-error LED: When this amber LED is lit, it indicates that a system error has occurred in the
blade server. The blade-error LED will turn off after one of the following events:
v Correcting the error
v Reseating the blade server in the BladeCenter unit
v Cycling the BladeCenter unit power
▌4▐ Power-control button: This button is behind the control panel door. Press this button to turn on or
turn off the blade server.
The power-control button has effect only if local power control is enabled for the blade server. Local
power control is enabled and disabled through the Web interface of the management module.
Press the power button for 5 seconds to begin powering down the blade server.
▌5▐ NMI reset (recessed): The nonmaskable interrupt (NMI) reset dumps the partition. Use this recessed
button only as directed by IBM Support.
▌6▐ Power-on LED: This green LED indicates the power status of the blade server in the following
manner:
v Flashing rapidly: The service processor is initializing the blade server.
v Flashing slowly: The blade server has completed initialization and is waiting for a power-on command.
v Lit continuously: The blade server has power and is turned on.
Note: The enhanced service processor can take as long as three minutes to initialize after you install the
BladeCenter PS700 blade server, at which point the LED begins to flash slowly.
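The three power-on LED states described above map directly to a lookup table, which can be handy in service notes or monitoring scripts. The following Python sketch is illustrative only. The names `PowerLed`, `MEANING`, and `awaiting_power_on` are assumptions and do not correspond to any IBM interface.

```python
from enum import Enum


class PowerLed(Enum):
    FLASHING_RAPIDLY = "flashing rapidly"
    FLASHING_SLOWLY = "flashing slowly"
    LIT_CONTINUOUSLY = "lit continuously"


# Meanings transcribed from the power-on LED description above.
MEANING = {
    PowerLed.FLASHING_RAPIDLY: "The service processor is initializing the blade server.",
    PowerLed.FLASHING_SLOWLY: "Initialization is complete; waiting for a power-on command.",
    PowerLed.LIT_CONTINUOUSLY: "The blade server has power and is turned on.",
}


def awaiting_power_on(state):
    """True only when the blade server is initialized but not yet turned on."""
    return state is PowerLed.FLASHING_SLOWLY
```

A script that records the observed LED state can use `MEANING[state]` for the log message and `awaiting_power_on(state)` to decide whether a power-on request is appropriate.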
▌7▐ Activity LED: When this green LED is lit, it indicates that there is activity on the hard disk drive or
network.
▌8▐ Location LED: When this blue LED is lit, it has been turned on by the system administrator to aid in
visually locating the blade server. The location LED can be turned off through the Web interface of the
management module or through IBM Director Console.
Turning on the blade server
After you connect the blade server to power through the BladeCenter unit, you can start the blade server
after the discovery and initialization process is complete.
You can start the blade server in any of the following ways.
v Start the blade server by pressing the power-control button on the front of the blade server.
The power-control button is behind the control panel door, as described in “Blade server control panel
buttons and LEDs” on page 5.
After you push the power-control button, the power-on LED continues to blink slowly for about 15
seconds, then is lit solidly when the power-on process is complete.
Wait until the power-on LED on the blade server flashes slowly before you press the blade server
power-control button. If the power-on LED is flashing rapidly, the service processor is initializing the
blade server. The power-control button does not respond during initialization.
Note: The enhanced service processor can take as long as three minutes to initialize after you install
the BladeCenter PS700 blade server, at which point the LED begins to flash slowly.
v Start the blade server automatically when power is restored after a power failure.
If a power failure occurs, the BladeCenter unit and then the blade server can start automatically when
power is restored. You must configure the blade server to restart through the management module.
v Start the blade server remotely using the management module.
After you initiate the power-on process, the power-on LED blinks slowly for about 15 seconds, then is
lit solidly when the power-on process is complete.
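The power-on LED states described above can be summarized as a small decision helper. This is a minimal sketch for illustration only, not part of any IBM tool; the `PowerLed` enum and the `can_press_power_button` function are hypothetical names introduced here.

```python
from enum import Enum


class PowerLed(Enum):
    """Power-on LED states described in this guide (member names are illustrative)."""
    FLASHING_RAPIDLY = "service processor is initializing the blade server"
    FLASHING_SLOWLY = "initialization is complete; waiting for a power-on command"
    LIT_CONTINUOUSLY = "blade server has power and is turned on"


def can_press_power_button(led: PowerLed) -> bool:
    """Return True when a power-on request is expected to be accepted.

    The power-control button does not respond during initialization,
    so only the slow-flash state is treated as ready for power-on.
    """
    return led is PowerLed.FLASHING_SLOWLY


for state in PowerLed:
    ready = can_press_power_button(state)
    print(f"{state.name}: {state.value}; ready for power-on: {ready}")
```

The helper only encodes the rule stated in the text: wait for the slow flash before you request power-on.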
Turning off the blade server
When you turn off the blade server, it is still connected to power through the BladeCenter unit. The blade
server can respond to requests from the service processor, such as a remote request to turn on the blade
server. To remove all power from the blade server, you must remove it from the BladeCenter unit.
Shut down the operating system before you turn off the blade server. See the operating-system
documentation for information about shutting down the operating system.
You can turn off the blade server in one of the following ways.
v Turn off the blade server by pressing the power-control button for at least 5 seconds.
The power-control button is on the blade server behind the control panel door. See “Blade server
control panel buttons and LEDs” on page 5 for the location.
Note: The power-on LED can remain on solidly for up to 1 minute after you push the
power-control button. After you turn off the blade server, wait until the power-on LED is blinking
slowly before you press the power-control button to turn on the blade server again.
If the operating system stops functioning, press and hold the power-control button for more than 5
seconds to force the blade server to turn off.
v Use the management module to turn off the blade server.
The power-on LED can remain on solidly for up to 1 minute after you initiate the power-off
process. After you turn off the blade server, wait until the power-on LED is blinking slowly before
you initiate the power-on process from the management module to turn on the blade server again.
Use the management-module Web interface to configure the management module to turn off the blade
server if the system is not operating correctly.
For additional information, see the online documentation or the User's Guide for the management
module.
System-board layouts
Illustrations show the connectors and LEDs on the system board. The illustrations might differ slightly
from your hardware.
System-board connectors
Blade server components attach to the connectors on the system board.
Figure 3 shows the connectors on the base unit system board in the blade server.
Figure 3. PS700 system-board connectors
Table 2 shows connector descriptions.
Table 2. PS700 connectors
Callout   PS700 blade server connectors
▌1▐       Operator panel connector
▌2▐       DIMM 1-4 connectors (See Figure 4 on page 9 for individual connectors.) Expansion unit (SMP) connector
▌3▐       Management card connector (P1-C9)
▌4▐       SAS hard disk drive connector (P1-D2)
▌5▐       Light Path Blue Button
▌6▐       SAS hard disk drive (P1-C10)
▌7▐       CIOv (1Xe) expansion card connector (P1-C11)
▌8▐       High-Speed (CFFh) expansion card connector (P1-C12)
▌9▐       DIMM 5-8 connectors (See Figure 4 on page 9 for individual connectors.)
▌10▐      3V lithium battery connector (P1-E1)
Figure 4 on page 9 shows individual DIMM connectors.
Figure 4. DIMM connectors. Base unit connectors
System-board LEDs
Use the illustration of the LEDs on the system board to identify a light emitting diode (LED).
Remove the blade server from the BladeCenter unit, open the cover, press the blue button to see any
error LEDs that were turned on during error processing, and use Figure 5 to identify the failing
component.
Figure 5 shows the locations of LEDs on the system board.
Table 3 shows LED descriptions.
Figure 5. LED locations on the system board of the PS700 blade server
Table 3. PS700 LEDs
Callout   Base unit LEDs
▌1▐       3V lithium battery LED
▌2▐       DIMM 1-4 LEDs
▌3▐       Management card LED
▌4▐       Light path power LED
▌5▐       System board LED
▌6▐       HDD1 LED
▌7▐       Interposer LED
▌8▐       CIOv (1Xe) expansion card connector LED
▌9▐       High-Speed (CFFh) expansion card connector LED
▌10▐      HDD2 LED
▌11▐      DIMM 5-8 LEDs
Chapter 2. Diagnostics
Use the available diagnostic tools to help solve any problems that might occur in the blade server.
The first and most crucial component of a solid serviceability strategy is the ability to accurately and
effectively detect errors when they occur. While not all errors are a threat to system availability, those that
go undetected are dangerous because the system does not have the opportunity to evaluate and act if
necessary. POWER7® processor-based systems are specifically designed with error-detection mechanisms
that extend from processor cores and memory to power supplies and hard drives.
POWER7 processor-based systems contain specialized hardware detection circuitry for detecting
erroneous hardware operations. Error checking hardware ranges from parity error detection coupled with
processor instruction retry and bus retry, to ECC correction on caches and system buses.
IBM hardware error checkers have these distinct attributes:
v Continuous monitoring of system operations to detect potential calculation errors
v Attempted isolation of physical faults based on runtime detection of each unique failure
v Initiation of a wide variety of recovery mechanisms designed to correct a problem
POWER7 processor-based systems include extensive hardware and firmware recovery logic.
Machine check handling
Machine checks are handled by firmware. When a machine check occurs, the firmware analyzes the error
to identify the failing device and creates an error log entry.
If the system degrades to the point that the service processor cannot reach standby state, the ability to
analyze the error does not exist. If the error occurs during POWER® hypervisor (PHYP) activities, the
PHYP initiates a system reboot.
In partitioned mode, an error that occurs during partition activity is reported to the operating system in
the partition.
Diagnostic tools
Tools are available to help you diagnose and solve hardware-related problems.
v Power-on self-test (POST) progress codes (checkpoints), error codes, and isolation procedures
The POST checks out the hardware at system initialization. IPL diagnostic functions test some system
components and interconnections. The POST generates eight-digit checkpoints to mark the progress of
powering up the blade server.
Use the management module to view progress codes.
The documentation of a progress code includes recovery actions for system hangs. See “POST progress
codes (checkpoints)” on page 84 for more information.
If the service processor detects a problem during POST, an error code is logged in the management
module event log. Error codes are also logged in the Linux syslog or AIX® diagnostic log, if possible.
See “System reference codes (SRCs)” on page 16.
The service processor can generate codes that point to specific isolation procedures. See “Service
processor problems” on page 200.
v Light path diagnostics
Use the light path diagnostic LEDs on the system board to identify failing hardware. If the system
error LED on the system LED panel on the front or rear of the BladeCenter unit is lit, one or more
error LEDs on the BladeCenter unit components also might be lit.
Light path diagnostics help identify failing customer replaceable units (CRUs). CRU location codes are
included in error codes and the event log.
LED locations
See “System-board LEDs” on page 9.
Front panel
See “Blade server control panel buttons and LEDs” on page 5.
v Troubleshooting tables
Use the troubleshooting tables to find solutions to problems that have identifiable symptoms.
See “Troubleshooting tables” on page 191.
v Dump data collection
In some circumstances, an error might require a dump to show more data. The Integrated
Virtualization Manager (IVM) or Hardware Management Console (HMC) sets up a dump area. Specific
IVM or HMC information is included as part of the information that can optionally be sent to IBM
support for analysis.
See “Collecting dump data” on page 13 for more information.
v Stand-alone diagnostics
The AIX-based stand-alone diagnostics CD is in the ship package and is also available from the IBM
Web site. Boot the diagnostics from a CD drive or from an AIX network installation manager (NIM)
server if the blade server cannot boot to an operating system, no matter which operating system is
installed.
Functions provided by the stand-alone diagnostics include:
– Analysis of errors reported by platform, such as microprocessor and memory errors
– Testing of resources, such as I/O adapters and devices
– Service aids, such as firmware update, format disk, and Raid Manager
v Diagnostic utilities for the AIX operating system
If AIX is functioning, run AIX concurrent diagnostics instead of the stand-alone diagnostics. Functions
provided by disk-based AIX diagnostics include:
– Automatic error log analysis
– Analysis of errors reported by platform, such as microprocessor and memory errors
– Testing of resources, such as I/O adapters and devices
– Service aids, such as firmware update, format disk, and Raid Manager
v Diagnostic utilities for Linux operating systems
Linux on POWER service and productivity tools include hardware diagnostic aids, productivity
tools, and installation aids. The installation aids are provided in the IBM Installation Toolkit for Linux
on POWER, a set of tools that aids the installation of Linux on IBM servers with POWER architecture.
You can also use the tools to update the PS700 blade server firmware.
Diagnostic utilities for the Linux operating system are available from IBM at
https://www14.software.ibm.com/webapp/set2/sas/f/lopdiags/home.html.
v Diagnostic utilities for other operating systems
You can use the stand-alone diagnostics CD to perform diagnostics on the PS700 blade server, no matter
which operating system is loaded on the blade server. However, other supported operating systems
might have diagnostic tools that are available through the operating system. See the documentation for
your operating system for more information.
Collecting dump data
A dump might be critical for fault isolation when the built-in First Failure Data Capture (FFDC)
mechanisms are not capturing sufficient fault data. Even when a fault is identified, dump data can
provide additional information that is useful in problem determination.
All hardware state information is part of the dump if a hardware checkstop occurs. When a checkstop
occurs, the service processor attempts to dump data that is necessary to analyze the error from
appropriate parts of the system.
Note: If you power off the blade through the management module while the service processor is
performing a dump, platform dump data is lost.
You might be asked to retrieve a dump to send it to IBM Support for analysis. The location of the dump
data varies by operating system.
v Collect an AIX dump from the /var/adm/platform directory.
v Collect a Linux dump from the /var/log/dump directory.
v Collect an Integrated Virtualization Manager (IVM) dump from the IVM-managed PS700 blade server
through the Manage Dumps task in the IVM console.
v To collect a system dump by using the Hardware Management Console (HMC), complete these steps:
1. Perform a controlled shutdown of all partitions.
Note: A system dump will abnormally terminate any running partitions.
2. In the navigation area, open Systems Management.
3. Select the server and open it.
4. Select Serviceability > Manage Dumps > Action > Initiate System Dump. The dump is
automatically saved to the HMC. For details on how to copy, report, or delete a dump after you
have completed a dump, see Managing dumps.
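For the AIX and Linux cases, locating the dump files can be scripted. The following is a minimal sketch that assumes only the two directories named above; the function name `list_dump_files`, the OS labels, and the `root` parameter are hypothetical names introduced for illustration, and dump file naming is not assumed.

```python
from pathlib import Path
from typing import List

# Dump directories named in this guide, stored relative to the file-system root.
# The "aix" and "linux" keys are illustrative labels.
DUMP_DIRS = {
    "aix": "var/adm/platform",
    "linux": "var/log/dump",
}


def list_dump_files(os_name: str, root: Path = Path("/")) -> List[Path]:
    """Return the files found in the dump directory for the given OS label.

    `root` lets you point the lookup at a mounted or copied file system.
    An empty list is returned when the directory does not exist.
    """
    dump_dir = root / DUMP_DIRS[os_name.lower()]
    if not dump_dir.is_dir():
        return []
    return sorted(p for p in dump_dir.iterdir() if p.is_file())
```

Calling `list_dump_files("linux")` on the blade server returns whatever files are present in /var/log/dump, which you can then package for IBM Support.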
Location codes
Location codes identify components of the blade server. Location codes are displayed with some error
codes to identify the blade server component that is causing the error.
See “System-board connectors” on page 8 for component locations.
Notes:
1. Location codes do not indicate the location of the blade server within the BladeCenter unit. The codes
identify components of the blade server only.
2. For checkpoints with no associated location code, see “Light path diagnostics” on page 214 to identify
the failing component when there is a hang condition.
3. For checkpoints with location codes, use the following table to identify the failing component when
there is a hang condition.
4. For 8-digit codes not listed in Table 4, see the “Checkout procedure” on page 184.
Table 4. Location codes
Components                        Physical Location Code    CRU LED
Un location codes are for enclosure and VPD locations.
Un = Utttt.mmm.sssssss
tttt = system machine type
mmm = system model number
sssssss = system serial number
DIMM 1                            Un-P1-C1                  Yes
DIMM 2                            Un-P1-C2                  Yes
DIMM 3                            Un-P1-C3                  Yes
DIMM 4                            Un-P1-C4                  Yes
DIMM 5                            Un-P1-C5                  Yes
DIMM 6                            Un-P1-C6                  Yes
DIMM 7                            Un-P1-C7                  Yes
DIMM 8                            Un-P1-C8                  Yes
2.5" SAS HDD1                     Un-P1-D1                  Yes
2.5" SAS HDD2                     Un-P1-D2                  Yes
Management Card                   Un-P1-C9                  Yes
Battery                           Un-P1-E1                  Yes
PCIe High Speed Expansion Card    Un-P1-C12                 Yes
1Xe Card                          Un-P1-C11                 Yes
USB Port 1 (CDROM/FDD)            Un-P1-T1                  No
USB Port 2 (CDROM/FDD)            Un-P1-T2                  No
SAS controller                    Un-P1-T3                  No
Ethernet HEA0_A                   Un-P1-T4                  No
Ethernet HEA0_B                   Un-P1-T5                  No
Machine Location Code             Utttt.mmm.sssssss         No
Um codes are for firmware. The format is the same as for a Un location code.
Um = Utttt.mmm.sssssss
Firmware version                  Um-Y1
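The Un format and the Table 4 suffixes lend themselves to a simple lookup. The sketch below is illustrative only: `parse_location_code` and `LocationCode` are hypothetical names, the assumption that each field holds uppercase letters or digits is mine, and the serial number in the usage note is made up.

```python
import re
from typing import NamedTuple, Optional

# Suffix-to-component mapping copied from Table 4.
COMPONENTS = {
    "P1-C1": "DIMM 1", "P1-C2": "DIMM 2", "P1-C3": "DIMM 3", "P1-C4": "DIMM 4",
    "P1-C5": "DIMM 5", "P1-C6": "DIMM 6", "P1-C7": "DIMM 7", "P1-C8": "DIMM 8",
    "P1-D1": '2.5" SAS HDD1', "P1-D2": '2.5" SAS HDD2',
    "P1-C9": "Management Card", "P1-E1": "Battery",
    "P1-C12": "PCIe High Speed Expansion Card", "P1-C11": "1Xe Card",
    "P1-T1": "USB Port 1 (CDROM/FDD)", "P1-T2": "USB Port 2 (CDROM/FDD)",
    "P1-T3": "SAS controller",
    "P1-T4": "Ethernet HEA0_A", "P1-T5": "Ethernet HEA0_B",
    "Y1": "Firmware version",
}

# Utttt.mmm.sssssss, optionally followed by "-<suffix>".
# Assumption: each field consists of uppercase letters or digits.
_PATTERN = re.compile(
    r"^U(?P<tttt>[0-9A-Z]{4})\.(?P<mmm>[0-9A-Z]{3})\.(?P<serial>[0-9A-Z]{7})"
    r"(?:-(?P<suffix>.+))?$"
)


class LocationCode(NamedTuple):
    machine_type: str
    model: str
    serial: str
    suffix: str
    component: Optional[str]  # None when the suffix is not in Table 4


def parse_location_code(code: str) -> LocationCode:
    """Split a location code into its fields and look up the component."""
    match = _PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"not a Utttt.mmm.sssssss location code: {code!r}")
    suffix = match.group("suffix") or ""
    component = COMPONENTS.get(suffix) if suffix else "Machine Location Code"
    return LocationCode(
        match.group("tttt"), match.group("mmm"), match.group("serial"),
        suffix, component,
    )
```

For example, `parse_location_code("U8406.70Y.1234567-P1-C9")` (with a made-up serial number) identifies the management card.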
Reference codes
Reference codes are diagnostic aids that help you determine the source of a hardware or operating
system problem. To use reference codes effectively, use them in conjunction with other service and
support procedures.
The BladeCenter PS700 Type 8406 blade server produces several types of codes.
Progress codes: The power-on self-test (POST) generates eight-digit status codes that are known as
checkpoints or progress codes, which are recorded in the management-module event log. The checkpoints
indicate which blade server resource is initializing.
Error codes: The First Failure Data Capture (FFDC) error checkers capture fault data, which the service
processor then analyzes. For unrecoverable errors (UEs), for recoverable events that meet or exceed their
service thresholds, and for fatal system errors, an unrecoverable checkstop service event triggers the
service processor to analyze the error, log the system reference code (SRC), and turn on the system
attention LED.
The service processor logs the nine-word, eight-digit per word error code in the BladeCenter
management-module event log. Error codes are either system reference codes (SRCs) or service request
numbers (SRNs). A location code might also be included.
Isolation procedures: If the fault analysis does not determine a definitive cause, the service processor
might indicate a fault isolation procedure that you can use to isolate the failing component.
Viewing the codes
The PS700 blade server does not display checkpoints or error codes on the remote console. The shared
BladeCenter unit video also does not display the codes.
If the POST detects a problem, a 9-word, 8-digit error code is logged in the BladeCenter
management-module event log. A location code that identifies a component might also be included. See
“Error logs” on page 183 for information about viewing the management-module event log.
Service request numbers can be viewed using the AIX diagnostics CD, or various operating system
utilities, such as AIX diagnostics or the Linux service aid “diagela”, if it is installed.
System reference codes (SRCs)
System reference codes indicate a server hardware or software problem that can originate in hardware, in
firmware, or in the operating system.
A blade server component generates an error code when it detects a problem. An SRC identifies the
component that generated the error code and describes the error. Use the SRC information to identify a
list of possibly failing items and to find information about any additional isolation procedures.
The following table shows the syntax of a nine-word B700xxxx SRC as it might be displayed in the event
log of the management module.
The first word of the SRC in this example is the message identifier, B7001111. This example numbers each
word after the first word to show relative word positions. The seventh word is the direct select address,
which is 77777777 in the example.
Table 5. Nine-word system reference code in the management-module event log
Index  Sev  Source    Date/Time             Text
1      E    Blade_05  01/21/2008, 17:15:14  (PS700-BC1BLD5E) SYS F/W: Error. Replace UNKNOWN (5008FECF B7001111 22222222 33333333 44444444 55555555 66666666 77777777 88888888 99999999)
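The word positions in the Table 5 example can be extracted programmatically. This is a minimal sketch based only on that one example: the function name `extract_src_words` is hypothetical, and treating the leading word in the parentheses (5008FECF in the example) as a prefix that is not part of the SRC is an assumption drawn from the statement that B7001111 is the first word.

```python
import re
from typing import List

# An SRC word is eight hexadecimal digits.
HEX_WORD = re.compile(r"\b[0-9A-F]{8}\b")


def extract_src_words(event_text: str) -> List[str]:
    """Return the nine SRC words from a management-module event-log message."""
    start = event_text.rfind("(")
    end = event_text.rfind(")")
    if start == -1 or end < start:
        raise ValueError("no parenthesized word list found")
    words = HEX_WORD.findall(event_text[start:end])
    if len(words) == 10:
        # Assumption: the first of ten words precedes the SRC itself,
        # as in the Table 5 example.
        words = words[1:]
    if len(words) != 9:
        raise ValueError(f"expected 9 SRC words, found {len(words)}")
    return words


text = ("(PS700-BC1BLD5E) SYS F/W: Error. Replace UNKNOWN "
        "(5008FECF B7001111 22222222 33333333 44444444 55555555 "
        "66666666 77777777 88888888 99999999)")
words = extract_src_words(text)
print("message identifier:", words[0])     # word 1
print("direct select address:", words[6])  # word 7
```

Word positions in the text are 1-based, so the seventh word is index 6 in the returned list.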
Depending on your operating system and the utilities you have installed, error messages might also be
stored in an operating system log. See the documentation that comes with the operating system for more
information.
The management module can display the most recent 32 SRCs and time stamps. Manually refresh the list
to update it.
Select Blade Service Data > blade_name in the management module to see a list of the 32 most recent
SRCs.
Table 6. Management module reference code listing
Unique ID   System Reference Code   Timestamp
00040001    D1513901                2005-11-13 19:30:20
00000016    D1513801                2005-11-13 19:30:16
Any message with more detail is highlighted as a link in the System Reference Code column. Click the
message to cause the management module to present the additional message detail:
D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF
SRC formats
SRCs are strings of either six or eight alphanumeric characters. The leading characters designate the
reference code type. The first character, or in some cases the first two characters, indicates the type of
error:
v 1xxxxxxx - System power control network (SPCN) error
v 6xxxxxxx - Virtual optical device error
v A1xxxxxx - Attention required (Service processor)
v AAxxxxxx - Attention required (Partition firmware)
v B1xxxxxx - Service processor error, such as a boot problem
v B6xxxxxx - Licensed Internal Code or hardware event error
v B9xxxxxx - Software installation error or IBM i IPL error. See "Recovering from IPL or system failures"
in the IBM i Information Center at
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/ipha5_p5/iplprocedure.htm.
v BAxxxxxx - Partition firmware error
v Cxxxxxxx - Checkpoint (must hang to indicate an error)
v Dxxxxxxx - Dump checkpoint (must hang to indicate an error)
To find a description of an SRC that is not listed in this PS700 blade server documentation, refer to the
POWER7 Reference Code Lookup page at
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/ipha8/codefinder.htm.
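The prefix rules in the SRC formats list can be expressed as a lookup that tries the two-character prefix before the one-character prefix. This is a minimal sketch that covers only the prefixes in that list; `classify_src` is a hypothetical name, and SRC families documented elsewhere in this guide (for example B700xxxx) are deliberately not added.

```python
# Prefix-to-type mapping copied from the SRC formats list.
SRC_TYPES = {
    "1": "System power control network (SPCN) error",
    "6": "Virtual optical device error",
    "A1": "Attention required (Service processor)",
    "AA": "Attention required (Partition firmware)",
    "B1": "Service processor error, such as a boot problem",
    "B6": "Licensed Internal Code or hardware event error",
    "B9": "Software installation error or IBM i IPL error",
    "BA": "Partition firmware error",
    "C": "Checkpoint (must hang to indicate an error)",
    "D": "Dump checkpoint (must hang to indicate an error)",
}

NOT_LISTED = "Not in the SRC formats list"


def classify_src(src: str) -> str:
    """Return the SRC type, matching the two-character prefix first."""
    code = src.strip().upper()
    for length in (2, 1):
        description = SRC_TYPES.get(code[:length])
        if description is not None:
            return description
    return NOT_LISTED


for sample in ("D1513901", "AA00E1A8", "B7001111"):
    print(sample, "->", classify_src(sample))
```

Trying the longer prefix first keeps an Axxxxxxx or Bxxxxxxx code from being matched on its first character alone.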
1xxxyyyy SRCs
The 1xxxyyyy system reference codes are system power control network (SPCN) reference codes.
Look for the rightmost 4 characters (yyyy in 1xxxyyyy) in the error code; this is the reference code. Find
the reference code in Table 7.
Perform all actions before exchanging failing items.
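Extracting the reference code from a 1xxxyyyy SRC is a matter of taking the rightmost four characters. The sketch below illustrates this with a small excerpt of Table 7; `lookup_spcn` is a hypothetical name, and the SRC values in the usage note are made up to show the format.

```python
# Excerpt of Table 7; descriptions are copied from the table rows.
SPCN_CODES = {
    "00AC": "Informational message: AC loss was reported",
    "2620": "12V dc pGood input fault",
    "2649": "Blade power fault",
    "8423": "No processor VPD was found",
}


def lookup_spcn(src: str) -> str:
    """Return the Table 7 description for the yyyy part of a 1xxxyyyy SRC."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("1"):
        raise ValueError(f"not a 1xxxyyyy SRC: {src!r}")
    reference_code = code[-4:]  # rightmost four characters (yyyy)
    return SPCN_CODES.get(
        reference_code, f"{reference_code}: not in this excerpt; see Table 7"
    )
```

With the made-up SRC `11002620`, the function returns the 2620 description; a code that is absent from the excerpt is reported as such rather than guessed.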
Table 7. 1xxxyyyy SRCs
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
1xxxyyyy Error Codes
Description
Action
00AC
Informational message: AC loss was reported
No action is required.
00AD
Informational message: A
service processor reset caused
the blade server to power off
No action is required.
1F02
Informational message: The
trace logs reached 1K of data.
No action is required.
1F03
Informational message: Invalid
TMS of location code.
No action is required.
2600
Power good (pGood) master fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2610
pGood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2620
12V dc pGood input fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2629
1.5V reg_pgood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
262B
1.8V reg_pgood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
262C
5V reg_pgood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
262D
3.3V reg_pgood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
262E
2.5V reg_pgood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2630
VRM CP0 core pGood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2632
VRM CP0 cache pGood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2647
12V "or-ing" FET short
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2648
Blade power latch fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2649
Blade power fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2670
The BladeCenter encountered a problem, and the blade server
was automatically shut down as a result
1. Check the management-module event log for entries that were
made around the time that the PS700 blade server shut down.
2. Resolve any problems that are found.
3. Reboot the blade server.
4. If the problem is not resolved, replace the system-board and
chassis assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2671
12V power fault in the blade server
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2672
Blades PEU3 voltage alert
Perform the DTRCARD symbolic CRU isolation procedure by
completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to the “Checkout procedure” on page
184.
4. If the problem persists, replace the system-board and chassis
assembly, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
The DTRCARD symbolic CRU isolation procedure is in “Service
processor problems” on page 200.
2675
1.1 Reg_CPU0_P5IO2C_Vio_pGood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2676
VTTA pGood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2677
VTTA pGood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2678
PROC_Vmem_controller_pGood 1.0V fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2679
Vmem_controller_pGood 1.5V
reg fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
267A
HSDC/4xel_A0_pGood fault
Perform the DTRCARD symbolic CRU isolation procedure by
completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to the “Checkout procedure” on page
184.
4. If the problem persists, replace the system-board and chassis
assembly, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
The DTRCARD symbolic CRU isolation procedure is in “Service
processor problems” on page 200.
267B
HSDC/4xel_B0_pGood fault
Perform the DTRCARD symbolic CRU isolation procedure by
completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to the “Checkout procedure” on page
184.
4. If the problem persists, replace the system-board and chassis
assembly, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
The DTRCARD symbolic CRU isolation procedure is in “Service
processor problems” on page 200.
267C
REG_P5IO2C_core 1.2V pGood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
267D
2.0_PLL_pGood fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
2710
pGood output/P7_VRM_PVID_gate fault
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
3120
VRM voltage adjustment failure
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
3134
Fault on the hardware monitoring chip
Perform the DTRCARD symbolic CRU isolation procedure by
completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to the “Checkout procedure” on page
184.
4. If the problem persists, replace the system-board and chassis
assembly, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
The DTRCARD symbolic CRU isolation procedure is in “Service
processor problems” on page 200.
8400
Invalid configuration decode
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
8402
Unable to get VPD from the concentrator
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
8413
Invalid processor 1 VPD
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
8414
Invalid processor 2 VPD
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
8423
No processor VPD was found
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
8480
Bad or missing memory controller VID
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
84A0
No backplane VPD was found
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU
system-board and chassis assembly” on page 260.
6xxxyyyy SRCs
The 6xxxyyyy system reference codes are virtual optical reference codes.
Look for the rightmost 4 characters (yyyy in 6xxxyyyy) in the error code; this is the reference code. Find
the reference code in Table 8.
Table 8. 6xxxyyyy SRCs
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
6xxxyyyy
Error Codes
Description
Action
632Byyyy codes are Network File System (NFS) virtual optical SRCs
632BCFC1
A virtual optical device cannot access the file containing the list of volumes.
On this partition and on the Network File System server, verify that the proper file is
specified and that the proper authority is granted.
632BCFC2
A non-recoverable error was
detected while reading the list
of volumes.
632BCFC3
The data in the list of volumes is not valid.
On the Network File System server, verify that the proper file is specified, that all files
are entered correctly, that there are no blank lines, and that the character set used is valid.
632BCFC4
A virtual optical device cannot access the file containing the specified optical volume.
On the Network File System server, verify that the proper file is specified in the list of
volumes, and that the proper authority is granted.
632BCFC5
A non-recoverable error was
detected while reading a
virtual optical volume.
Resolve any errors on the Network File System server.
632BCFC6
The file specified does not
contain data that can be
processed as a virtual optical
volume.
On the Network File System server, verify that all the files specified
in the list of optical volumes are correct.
632BCFC7
A virtual optical device
detected an error reported by
the Network File System
server that cannot be
recovered.
Resolve any errors on the Network File System server.
632BCFC8
A virtual optical device encountered a non-recoverable error.
Install any available operating system updates.
Resolve any errors on the Network File System server.
632Cyyyy codes are virtual optical SRCs
632CC000
Informational system log entry only.
No corrective action is required.
632CC002
Self-configuring SCSI device (SCSD) selection or reselection timeout occurred.
Refer to the hosting partition for problem analysis.
632CC010
Undefined sense key returned
by device.
Refer to the hosting partition for problem analysis.
632CC020
Configuration error.
Refer to the hosting partition for problem analysis.
632CC100
SCSD bus error occurred.
Refer to the hosting partition for problem analysis.
632CC110
SCSD command timeout
occurred.
Refer to the hosting partition for problem analysis.
632CC210
Informational system log entry only.
No corrective action is required.
632CC300
Media or device error
occurred.
Refer to the hosting partition for problem analysis.
632CC301
Media or device error
occurred.
Refer to the hosting partition for problem analysis.
632CC302
Media or device error
occurred.
Refer to the hosting partition for problem analysis.
632CC303
Media has an unknown
format.
No corrective action is required.
632CC333
Incompatible media.
1. Verify that the disk has a supported format.
2. If the format is supported, clean the disk and attempt the
failing operation again.
3. If the operation fails again with the same system reference code,
ask your media source for a replacement disk.
632CC400
Physical link error detected by device.
Refer to the hosting partition for problem analysis.
632CC402
An internal program error occurred.
Install any available operating system updates.
632CCFF2
Informational system log entry only.
No corrective action is required.
632CCFF4
Internal device error occurred.
Refer to the hosting partition for problem analysis.
632CCFF6
Informational system log entry only.
No corrective action is required.
632CCFF7
Informational system log entry only.
No corrective action is required.
632CCFFE
Informational system log entry only.
No corrective action is required.
632CFF3D
Informational system log entry only.
No corrective action is required.
632CFF6D
Informational system log entry only.
No corrective action is required.
A1xxyyyy service processor SRCs
An A1xxyyyy system reference code (SRC) is an attention code that offers information about a platform
or service processor dump or confirms a control panel function request. Take the steps in the Action
column only if the BladeSystem appears to hang on an attention code.
Table 9 shows A1xxyyyy SRCs.
Table 9. A1xxyyyy service processor SRCs
Attention code
Description
Action
A1xxyyyy
Attention code
1. Go to “Checkout procedure” on page 184.
2. Replace the system board and chassis
assembly, as described in “Replacing the
FRU system-board and chassis assembly” on
page 260.
A2xxyyyy Logical partition SRCs
An A2xxyyyy SRC is a logical partition reference code that is related to logical partitioning.
Table 10. A2xxyyyy Logical partition SRCs
Reference Code
Description
Action
A2xxyyyy
See the description for the B200yyyy error
code with the same yyyy value.
Perform the action described in the B200yyyy
error code with the same yyyy value.
A2D03000
User-initiated immediate termination and MSD of a partition.
No corrective action is required.
A2D03001
User-initiated RSCDUMP of RPA partition's
PFW content.
No corrective action is required.
A2D03002
User-initiated RSCDUMP of IBM i partition's
SLIC bootloader and PFW content.
No corrective action is required.
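The first row of Table 10 redirects any A2xxyyyy code without its own entry to the B200yyyy code that has the same yyyy value. The Python sketch below shows that substitution under stated assumptions: the helper name `b200_lookup_code` is hypothetical, and the set of explicitly listed codes is copied from Table 10 as it appears in this guide.

```python
# A2 codes that Table 10 lists with their own description and action.
LISTED_A2_CODES = frozenset({"A2D03000", "A2D03001", "A2D03002"})


def b200_lookup_code(src: str):
    """Return the B200yyyy code to look up, or None if Table 10 lists the SRC itself."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("A2"):
        raise ValueError(f"not an A2xxyyyy SRC: {code!r}")
    if code in LISTED_A2_CODES:
        return None
    # Keep the rightmost four characters (yyyy) and prepend B200.
    return "B200" + code[-4:]
```

For example, `b200_lookup_code("A2001150")` returns `"B2001150"`, which is then looked up in the B200xxxx table.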
A700yyyy Licensed internal code SRCs
An A700yyyy system reference code (SRC) is an error/event code that is related to licensed internal code.
Table 11. A700yyyy Licensed internal code SRCs
Reference Code
Description
Action
A700173C
Informational system log entry only.
No corrective action is required.
A7003000
A user-initiated platform dump occurred.
No service action required.
A7004700
Informational system log entry only.
No corrective action is required.
A7004712
A problem occurred when initializing, reading, or using system VPD.
Replace the management card, as described in “Removing the tier 2 management card” on
page 255 and “Installing the tier 2 management card” on page 256.
A7004713
A problem occurred when initializing, reading, or using system VPD.
Replace the management card, as described in “Removing the tier 2 management card” on
page 255 and “Installing the tier 2 management card” on page 256.
A7004715
A problem occurred when initializing, reading, or using system VPD.
Replace the management card, as described in “Removing the tier 2 management card” on
page 255 and “Installing the tier 2 management card” on page 256.
A7004721
The World Wide Port Name (WWPN) Prefix is not valid.
https://www-912.ibm.com/supporthome.nsf/document/51455410
A7004730
Informational system log entry only.
No corrective action is required.
A7004740
Informational system log entry only.
No corrective action is required.
A7004741
Informational system log entry only.
No corrective action is required.
A7004788
Informational system log entry only.
No corrective action is required.
A70047FF
Informational system log entry only.
No corrective action is required.
A7013003
Partition-initiated PHYP-content RSCDUMP.
No corrective action is required.
A700yyyy
For any other A7xxyyyy SRC not listed here, see the description for the B7xxyyyy error code
with the same xxyyyy value.
Perform the action in the B7xxyyyy error code with the same xxyyyy value.
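The last row of Table 11 maps any unlisted A7xxyyyy code to the B7xxyyyy code with the same xxyyyy value, which means only the first character changes. The Python sketch below illustrates this; the helper name `b7_lookup_code` is hypothetical, and the listed codes are transcribed from Table 11 of this guide.

```python
# A7 codes that Table 11 lists with their own description and action.
LISTED_A7_CODES = frozenset({
    "A700173C", "A7003000", "A7004700", "A7004712", "A7004713",
    "A7004715", "A7004721", "A7004730", "A7004740", "A7004741",
    "A7004788", "A70047FF", "A7013003",
})


def b7_lookup_code(src: str):
    """Return the B7xxyyyy code to look up, or None if Table 11 lists the SRC itself."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("A7"):
        raise ValueError(f"not an A7xxyyyy SRC: {code!r}")
    if code in LISTED_A7_CODES:
        return None
    # Keep xxyyyy (the last six characters) and replace the leading A7 with B7.
    return "B7" + code[2:]
```

The code value `A7006900` used in a call such as `b7_lookup_code("A7006900")` is only an illustrative input, not a code documented in this section; the call returns `"B7006900"`.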
AA00E1A8 to AA260005 Partition firmware attention codes
AAxx attention codes provide information about the next target state for the platform firmware. These
codes might indicate that you need to perform an action.
Table 12 describes the partition firmware codes that might be displayed if the POST detects a problem.
Each message description includes a suggested action to correct the problem.
Table 12. AA00E1A8 to AA260005 Partition firmware attention codes
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Attention code
Description
Action
AA00E1A8
The system is booting to the open
firmware prompt.
At the open firmware prompt, type
dev /packages/gui obe
and press Enter; then, type 1 to select SMS Menu.
AA00E1A9
The system is booting to the System
Management Services (SMS) menus.
1. If the system or partition returns to the
SMS menus after a boot attempt failed, use
the SMS menus to check the progress
indicator history for a BAxx xxxx error,
which may indicate why the boot attempt
failed. Follow the actions for that error code
to resolve the boot problem.
2. Use the SMS menus to establish the boot
list and restart the blade server.
AA00E1B0
Waiting for the user to select the
language and keyboard. The menu
should be visible on the console.
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
AA00E1B1
Waiting for the user to accept or decline
the license
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
AA060007
A keyboard was not found.
Verify that a keyboard is attached to the USB
port that is assigned to the partition.
AA06000B
The system or partition was not able to
find an operating system on any of the
devices in the boot list.
1. Use the SMS menus to modify the boot list
so that it includes devices that have a
known-good operating system and restart
the blade server.
2. If the problem remains, go to “Boot
problem resolution” on page 190.
AA06000C
The media in a device in the boot list
was not bootable.
1. Replace the media in the device with
known-good media or modify the boot list
to boot from another bootable device.
2. If the problem remains, go to “Boot
problem resolution” on page 190.
AA06000D
The media in the device in the boot list
was not found under the I/O adapter
specified by the boot list.
1. Verify that the media from which you are
trying to boot is bootable or modify the
boot list to boot from another bootable
device.
2. If the problem remains, go to “Boot
problem resolution” on page 190.
AA06000E
The adapter specified in the boot list is
not present or is not functioning.
v For an AIX operating system:
1. Try booting the blade server from
another bootable device; then, run AIX
online diagnostics against the failing
adapter.
2. If AIX cannot be booted from another
device, boot the blade server using the
stand-alone diagnostics CD or a NIM
server; then, run diagnostics against the
failing adapter.
v For a Linux operating system, boot the blade
server using the stand-alone diagnostics CD
or a NIM server; then, run diagnostics
against the failing adapter.
AA060011
The firmware did not find an operating
system image and at least one hard disk
in the boot list was not detected by the
firmware. The firmware is retrying the
entries in the boot list.
This might occur if a disk enclosure that
contains the boot disk is not fully initialized or
if the boot disk belongs to another partition.
Verify that:
v The boot disk belongs to the partition from
which you are trying to boot.
v The boot list in the SMS menus is correct.
AA130013
Bootable media is missing from a USB
CD-ROM
Verify that a bootable CD is properly inserted
in the CD or DVD drive and retry the boot
operation.
AA130014
The media in a USB CD-ROM has changed.
1. Retry the operation.
2. Check for server firmware updates; then, install the updates if available and retry the
operation.
AA170210
Setenv/$setenv parameter error - the name contains a null character.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
AA170211
Setenv/$setenv parameter error - the value contains a null character.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
AA190001
The hypervisor function to get or set the time-of-day clock reported an error.
1. Use the operating system to set the system clock.
2. If the problem persists, check for server firmware updates.
3. Install any available updates and retry the operation.
AA260001
Enter the Type Model Number (Must be 8 characters)
Enter the machine type and model of the blade server at the prompt.
AA260002
Enter the Serial Number (Must be 7
characters)
Enter the serial number of the blade server at
the prompt.
AA260003
Enter System Unique ID (Must be 12
characters)
Enter the system unique ID number at the
prompt.
AA260004
Enter WorldWide Port Number (Must be 12 characters)
Enter the worldwide port number of the blade server at the prompt.
AA260005
Enter Brand (Must be 2 characters)
Enter the brand number of the blade server at
the prompt.
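The AA260001 to AA260005 prompts each state a fixed length for the value to enter. The Python sketch below records those lengths and checks a candidate value before it is typed at the prompt. The names `PROMPT_LENGTHS` and `has_required_length` are hypothetical, and the sketch checks length only; whether the firmware accepts a particular character set or format (for example, a hyphen in the type and model) is not stated in this guide and is an assumption.

```python
# Required value length for each prompt, as stated in Table 12.
PROMPT_LENGTHS = {
    "AA260001": 8,   # Type Model Number
    "AA260002": 7,   # Serial Number
    "AA260003": 12,  # System Unique ID
    "AA260004": 12,  # WorldWide Port Number
    "AA260005": 2,   # Brand
}


def has_required_length(attention_code: str, value: str) -> bool:
    """Return True if value has the length that the given prompt requires."""
    return len(value) == PROMPT_LENGTHS[attention_code.strip().upper()]


# "8406-70Y" is the machine type and model from the title of this guide.
print(has_required_length("AA260001", "8406-70Y"))
```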
Bxxxxxxx Service processor early termination SRCs
A Bxxxxxxx system reference code (SRC) is an error code that is related to an event or exception that
occurred in the service processor firmware.
To find a description of an SRC that is not listed in this PS700 blade server documentation, refer to the
POWER7 Reference Code Lookup page at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/
index.jsp?topic=/ipha8/codefinder.htm.
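Before searching a table or the lookup page, it can help to identify which family an SRC belongs to from its leading characters. The Python sketch below does this for the families whose headings appear in this part of the chapter. The mapping and the function name `src_family` are illustrative assumptions drawn from those headings, not an IBM-defined classification, and the mapping does not cover every SRC family in the guide.

```python
# Prefixes taken from the section headings in this part of the chapter.
SRC_FAMILIES = {
    "6": "virtual optical reference code",
    "A1": "service processor attention code",
    "A2": "logical partition SRC",
    "A7": "licensed internal code SRC",
    "AA": "partition firmware attention code",
    "B181": "service processor early termination SRC",
    "B200": "logical partition SRC",
}


def src_family(src: str):
    """Return the family name for an SRC, or None if the prefix is not in the mapping."""
    code = src.strip().upper()
    # Try longer prefixes first so the most specific match wins.
    for prefix in sorted(SRC_FAMILIES, key=len, reverse=True):
        if code.startswith(prefix):
            return SRC_FAMILIES[prefix]
    return None
```

A code that returns `None` here is a candidate for the Reference Code Lookup page mentioned above.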
Table 13 describes error codes that might occur if POST detects a problem. The description also includes
suggested actions to correct the problem.
Note: For problems persisting after completing the suggested actions, see “Solving undetermined
problems” on page 227.
Table 13. B181xxxx Service processor early termination SRCs
B181 xxxx Error
Code
Description
7200
Invalid boot request
7201
Service processor failure
7202
The permanent and temporary
firmware sides are both marked
invalid
7203
Error setting boot parameters
7204
Error reading boot parameters
7205
Boot code error
7206
Unit check timer was reset
7207
Error reading from NVRAM
7208
Error writing to NVRAM
7209
The service processor boot watchdog
timer expired and forced the service
processor to attempt a boot from the
other firmware image in the service
processor flash memory
720A
Power-off reset occurred. FipsDump
should be analyzed: Possible
software problem
Action (applies to every B181xxxx code listed in this table)
Go to “Checkout procedure” on page 184.
B200xxxx Logical partition SRCs
A B200xxxx SRC is a logical partition reference code that is related to logical partitioning.
Table 14 describes system reference codes that might be displayed if system firmware detects a problem.
Suggested actions to correct the problem are also listed.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page
184 and “Solving undetermined problems” on page 227.
Table 14. B200xxxx Logical partition SRCs
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
B200 xxxx
Reference Code
Description
Action
B2001130
A problem occurred during the migration of a partition. You attempted to migrate a partition
to a system that has a power or thermal problem. The migration will not continue.
Look for and fix power or thermal problems and then retry the migration.
B2001131
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001132
A problem occurred during the startup of a partition. A platform firmware error occurred
while it was trying to allocate memory. The startup will not continue.
Collect a platform dump and then go to “Isolating firmware problems” on page 218.
B2001133
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001134
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001140
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001141
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001142
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001143
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001144
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001148
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001150
During the startup of a partition, a partitioning configuration problem occurred.
Go to “Verifying the partition configuration” on page 186.
B2001151
A problem occurred during the migration of a partition. The migration of a partition did not
complete.
Check for server firmware updates; then, install the updates if available.
B2001170
During the startup of a partition, a failure occurred due to a validation error.
Go to “Verifying the partition configuration” on page 186.
B2001225
A problem occurred during the startup of a partition. The partition attempted to start up
prior to the platform fully initializing. Restart the partition after the platform has fully
completed and the platform is not in standby mode.
Restart the partition.
B2001230
During the startup of a partition, a partitioning configuration problem occurred; the
partition is lacking the necessary resources to start up.
Go to “Verifying the partition configuration” on page 186.
B2001260
A problem occurred during the startup of a partition. The partition could not start at the
Timed Power On setting because the partition was not set to Normal.
Set the partition to Normal.
B2001265
The partition could not start up. An operating system Main Storage Dump startup was
attempted with the startup side on D-mode, which is not a valid operating system startup
scenario. The startup will be halted. This SRC can occur when a D-mode SLIC installation
fails and attempts a Main Storage Dump.
Correct the startup settings.
B2001266
The partition could not start up. You are attempting to start up an operating system that is
not supported.
Install a supported operating system and restart the partition.
B2001280
A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not
complete due to a configuration mismatch.
Go to “Isolating firmware problems” on page 218.
B2001281
A partition memory error occurred. The failed memory will no longer be used.
Restart the partition.
B2001282
A problem occurred during the startup of a partition.
Go to “Isolating firmware problems” on page 218.
B2001320
A problem occurred during the startup of a partition. No default load source was selected.
The startup will attempt to continue, but there may not be enough information to find the
correct load source.
Configure a load source for the partition. Then restart the partition.
B2001321
A problem occurred during the
startup of a partition.
Verify that the correct slot is specified for the load source.
Then restart the partition.
B2001322
In the partition startup, code failed
during a check of the load source
path.
Verify that the path for the load source is specified
correctly. Then restart the partition.
B2002048
A problem occurred during a
partition Main Storage Dump. A
mainstore dump startup did not
complete due to a copy error.
Go to “Isolating firmware problems” on page 218.
B2002054
A problem occurred during a
partition Main Storage Dump. A
mainstore dump IPL did not
complete due to a configuration
mismatch.
Go to “Isolating firmware problems” on page 218.
B2002058
A problem occurred during a
partition Main Storage Dump. A
mainstore dump startup did not
complete due to a copy error.
Go to “Isolating firmware problems” on page 218.
B2002210
Informational system log entry only.
No corrective action is required.
B2002220
Informational system log entry only.
No corrective action is required.
B2002250
During the startup of a partition, an
attempt to toggle the power state of
a slot has failed.
Check for server firmware updates; then, install the
updates if available.
B2002260
During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 218.
B2002300
During the startup of a partition, an attempt to toggle the power state of a slot has failed.
Check for server firmware updates; then, install the updates if available.
B2002310
During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 218.
B2002320
During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 218.
B2002425
During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 218.
B2002426
During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 218.
B2002475
During the startup of a partition, a slot that was needed for the partition was either empty
or the device in the slot has failed.
Check for server firmware updates; then, install the updates if available.
B2002485
During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 218.
B2003000
Informational system log entry only.
No corrective action is required.
B2003081
During the startup of a partition, the startup did not complete due to a copy error.
Check for server firmware updates; then, install the updates if available.
B2003084
A problem occurred during the startup of a partition. The adapter type might not be
supported.
Verify that the adapter type is supported.
B2003088
Informational system log entry only.
No corrective action is required.
B200308C
A problem occurred during the startup of a partition. The adapter type cannot be determined.
Verify that a valid I/O Load Source is tagged.
B2003090
A problem occurred during the
startup of a partition.
Go to “Isolating firmware problems” on page 218.
B2003110
A problem occurred during the
startup of a partition.
Go to “Isolating firmware problems” on page 218.
B2003113
A problem occurred during the
startup of a partition.
Look for B7xx xxxx errors and resolve them.
B2003114
A problem occurred during the
startup of a partition.
Look for other errors and resolve them.
B2003120
Informational system log entry only.
No corrective action is required.
B2003123
Informational system log entry only.
No corrective action is required.
B2003125
During the startup of a partition, the blade server firmware could not obtain a segment of
main storage within the blade server to use for managing the creation of a partition.
Check for server firmware updates; then, install the updates if available.
B2003128
A problem occurred during the
startup of a partition. A return code
for an unexpected failure was
returned when attempting to query
the load source path.
Look for and resolve B700 69xx errors.
B2003130
A problem occurred during the
startup of a partition.
Check for server firmware updates; then, install the
updates if available.
B2003135
A problem occurred during the
startup of a partition.
Check for server firmware updates; then, install the
updates if available.
B2003140
A problem occurred during the
startup of a partition. This is a
configuration problem in the
partition.
Reconfigure the partition to include the intended load
source path.
B2003141
Informational system log entry only.
No corrective action is required.
B2003142
Informational system log entry only.
No corrective action is required.
B2003143
Informational system log entry only.
No corrective action is required.
B2003144
Informational system log entry only.
No corrective action is required.
B2003145
Informational system log entry only.
No corrective action is required.
B2003200
Informational system log entry only.
No corrective action is required.
B2004158
Informational system log entry only.
No corrective action is required.
B2004400
A problem occurred during the
startup of a partition.
Check for server firmware updates; then, install the
updates if available.
B2005106
A problem occurred during the
startup of a partition. There is not
enough space to contain the partition
main storage dump. The startup will
not continue.
Verify that there is sufficient memory available to start the
partition as it is configured. If there is already enough
memory, then go to “Isolating firmware problems” on
page 218.
B2005109
A problem occurred during the
startup of a partition. There was a
partition main storage dump
problem. The startup will not
continue.
Go to “Isolating firmware problems” on page 218.
B2005114
A problem occurred during the startup of a partition. There is not enough space to contain
the partition main storage dump. The startup will not continue.
Go to “Isolating firmware problems” on page 218.
B2005115
A problem occurred during the startup of a partition. There was an error reading the
partition main storage dump from the partition load source into main storage. The startup
will attempt to continue.
If the startup does not continue, look for and resolve other errors.
B2005117
A problem occurred during the startup of a partition. A partition main storage dump has
occurred but cannot be written to the load source device because a valid dump already exists.
Use the Main Storage Dump Manager to rename or copy the current main storage dump.
B2005121
A problem occurred during the startup of a partition. There was an error writing the
partition main storage dump to the partition load source. The startup will not continue.
Look for related errors in the "Product Activity Log" and fix any problems found. Use
virtual control panel function 34 to retry the current Main Store Dump startup while the
partition is still in the failed state.
B2005122
Informational system log entry only.
No corrective action is required.
B2005123
Informational system log entry only.
No corrective action is required.
B2005135
A problem occurred during the startup of a partition. There was an error writing the
partition main storage dump to the partition load source. The main store dump startup will
continue.
Look for other errors and resolve them.
B2005137
A problem occurred during the startup of a partition. There was an error writing the
partition main storage dump to the partition load source. The main store dump startup will
continue.
Look for other errors and resolve them.
B2005145
A problem occurred during the
startup of a partition. There was an
error writing the partition main
storage dump to the partition load
source. The main store dump startup
will continue.
Look for other errors and resolve them.
B2005148
A problem occurred during the
startup of a partition. An error
occurred while doing a main storage
dump that would have caused
another main storage dump. The
startup will not continue.
Go to “Isolating firmware problems” on page 218.
B2005149
A problem occurred during the
startup of a partition while doing a
Firmware Assisted Dump that would
have caused another Firmware
Assisted Dump.
Check for server firmware updates; then, install the
updates if available.
B200514A
A Firmware Assisted Dump did not
complete due to a copy error.
Check for server firmware updates; then, install the
updates if available.
B200542A
A Firmware Assisted Dump did not
complete due to a read error.
Check for server firmware updates; then, install the
updates if available.
B200542B
A Firmware Assisted Dump did not
complete due to a copy error.
Check for server firmware updates; then, install the
updates if available.
B200543A
A Firmware Assisted Dump did not
complete due to a copy error.
Check for server firmware updates; then, install the
updates if available.
B200543B
A Firmware Assisted Dump did not
complete due to a copy error.
Check for server firmware updates; then, install the
updates if available.
B200543C
Informational system log entry only.
No corrective action is required.
B200543D
A Firmware Assisted Dump did not
complete due to a copy error.
Check for server firmware updates; then, install the
updates if available.
B2006006
During the startup of a partition, a
system firmware error occurred
when the partition memory was
being initialized; the startup will not
continue.
Go to “Isolating firmware problems” on page 218.
B2006006
A problem occurred during the
startup of a partition. The partition
could not reserve the memory
required for IPL.
B2006012
During the startup of a partition, the
partition LID failed to completely
load into the partition main storage
area.
Go to “Isolating firmware problems” on page 218.
Contact IBM support.
B2006015
A problem occurred during the
startup of a partition. The load
source media is corrupted or not
valid.
Replace the load source media.
B2006025
A problem occurred during the
startup of a partition. This is a
problem with the load source media
being corrupt or not valid.
Replace the load source media.
B2006027
During the startup of a partition, a
failure occurred when allocating
memory for an internal object used
for firmware module load
operations.
1. Make sure that enough main storage was allocated to
the partition.
B2006110
A problem occurred during the
startup of a partition. There was an
error on the load source device. The
startup will attempt to continue.
Look for other errors and resolve them.
B200690A
During the startup of a partition, an
error occurred while copying open
firmware into the partition load area.
Go to “Isolating firmware problems” on page 218.
B2007200
Informational system log entry only.
No corrective action is required.
B2008080
Informational system log entry only.
No corrective action is required.
B2008081
During the startup of a partition, an
internal firmware time-out occurred;
the partition might continue to start
up but it can experience problems
while running.
Check for server firmware updates; then, install the
updates if available.
B2008105
During the startup of a partition,
there was a failure loading the VPD
areas of the partition; the load source
media has been corrupted or is
unsupported on this server.
Check for server firmware updates; then, install the
updates if available.
B2008106
A problem occurred during the
startup of a partition. The startup
will not continue.
B2008107
During the startup of a partition,
there was a problem getting a
segment of main storage in the blade
server main storage.
Check for server firmware updates; then, install the
updates if available.
B2008109
During the startup of a partition, a
failure occurred. The startup will not
continue.
1. Make sure that there is enough memory to start up
the partition.
2. Check for server firmware updates; then, install the
updates if available.
B2008111
A problem occurred during the
startup of a partition.
2. Retry the operation.
Replace the load source media.
Check for server firmware updates; then, install the
updates if available.
B2008112
During the startup of a partition, a
failure occurred; the startup will not
continue.
Check for server firmware updates; then, install the
updates if available.
B2008113
During the startup of a partition, an
error occurred while mapping
memory for the partition startup.
Check for server firmware updates; then, install the
updates if available.
B2008114
During the startup of a partition,
there was a failure verifying the
VPD for the partition resources
during startup.
Check for server firmware updates; then, install the
updates if available.
B2008115
During the startup of a partition,
there was a low level
partition-to-partition communication
failure.
Check for server firmware updates; then, install the
updates if available.
B2008117
During the startup of a partition, the
partition did not start up due to a
system firmware error.
Check for server firmware updates; then, install the
updates if available.
B2008121
During the startup of a partition, the
partition did not start up due to a
system firmware error.
Go to “Isolating firmware problems” on page 218.
B2008123
During the startup of a partition, the
partition did not start up due to a
system firmware error.
Go to “Isolating firmware problems” on page 218.
B2008125
During the startup of a partition, the
partition did not start up due to a
system firmware error.
Go to “Isolating firmware problems” on page 218.
B2008127
During the startup of a partition, the
partition did not start up due to a
system firmware error.
Go to “Isolating firmware problems” on page 218.
B2008129
During the startup of a partition, the
partition did not start up due to a
system firmware error.
Go to “Isolating firmware problems” on page 218.
B200813A
There was a problem establishing a
console.
Go to “Isolating firmware problems” on page 218.
B2008140
Informational system log entry only.
No corrective action is required.
B2008141
Informational system log entry only.
No corrective action is required.
B2008142
Informational system log entry only.
No corrective action is required.
B2008143
Informational system log entry only.
No corrective action is required.
B2008144
Informational system log entry only.
No corrective action is required.
B2008145
Informational system log entry only.
No corrective action is required.
B2008150
System firmware detected an error.
Collect a platform dump and then go to “Isolating
firmware problems” on page 218.
B2008151
System firmware detected an error.
Use the Integrated Virtualization Manager (IVM) or
management console to increase the Logical Memory
Block (LMB) size, and to reduce the number of virtual
devices for the partition.
B2008152
No active system processors.
Verify that processor resources are assigned to the
partition.
B2008160
A problem occurred during the
migration of a partition.
Contact IBM support.
B2008161
A problem occurred during the
migration of a partition.
Contact IBM support.
B200A100
A partition ended abnormally; the
partition could not stay running and
shut itself down.
1. Check the error logs and take the actions for the error
codes that are found.
2. Go to “Isolating firmware problems” on page 218.
B200A101
A partition ended abnormally; the
partition could not stay running and
shut itself down.
1. Check the error logs and take the actions for the error
codes that are found.
2. Go to “Isolating firmware problems” on page 218.
B200A140
A lower priority partition lost a
usable processor to supply it to a
higher priority partition with a bad
processor.
Evaluate the entire LPAR configuration. Adjust partition
profiles with the new number of processors available in
the system.
B200B07B
Informational system log entry only.
No corrective action is required.
B200B215
A problem occurred after a partition
ended abnormally.
There was a communications
problem between this partition's
service processor and the platform's
service processor.
Restart the platform.
B2005127
Timeout occurred during a main
store dump IPL.
There was not enough memory available for the dump to
complete before the timeout occurred. Retry the main
store dump IPL, or else power on the partition normally.
B2D03001
Informational system log entry only.
No corrective action is required.
B2D03002
Informational system log entry only.
No corrective action is required.
B200C1F0
An internal system firmware error
occurred during a partition
shutdown or a restart.
Go to “Isolating firmware problems” on page 218.
B200D150
A partition ended abnormally; there
was a communications problem
between this partition and the code
that handles resource allocation.
Check for server firmware updates; then, install the
updates if available.
B200E0AA
A problem occurred during the
power off of a partition.
Go to “Isolating firmware problems” on page 218.
B200F001
A problem occurred during the
startup of a partition. An operation
has timed out.
Look for other errors and resolve them.
B200F003
During the startup of a partition, the
partition processor(s) did not start
the firmware within the time-out
window.
Collect the partition dump information; then, go to
“Isolating firmware problems” on page 218.
B200F004
Informational system log entry only.
No corrective action is required.
B200F005
Informational system log entry only.
No corrective action is required.
B200F006
During the startup of a partition, the
code load operation for the partition
startup timed out.
1. Check the error logs and take the actions for the error
codes that are found.
2. Go to “Isolating firmware problems” on page 218.
B200F007
During a shutdown of the partition,
a time-out occurred while trying to
stop a partition.
Check for server firmware updates; then, install the
updates if available.
B200F008
Informational system log entry only.
No corrective action is required.
B200F009
Informational system log entry only.
No corrective action is required.
B200F00A
Informational system log entry only.
No corrective action is required.
B200F00B
Informational system log entry only.
No corrective action is required.
B200F00C
Informational system log entry only.
No corrective action is required.
B200F00D
Informational system log entry only.
No corrective action is required.
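The rule that heads Table 14 (perform the listed actions in order and stop as soon as one solves the problem) can be sketched in a few lines of Python. This is only an illustration: the three entries are copied from Table 14, while the dictionary, the function names, and the `problem_solved` callback are hypothetical and are not part of any IBM tool.

```python
# Illustrative sketch only. Entries are copied from Table 14; the data
# structure and function names are hypothetical, not an IBM interface.
INFORMATIONAL = "No corrective action is required."

B200_ACTIONS = {
    "B2005122": [INFORMATIONAL],
    "B2008109": [
        "Make sure that there is enough memory to start up the partition.",
        "Check for server firmware updates; then, install the updates if available.",
    ],
    "B200A100": [
        "Check the error logs and take the actions for the error codes that are found.",
        'Go to "Isolating firmware problems" on page 218.',
    ],
}


def actions_for(src):
    """Return the ordered action list for an SRC, or None if it is not in this sketch."""
    return B200_ACTIONS.get(src.strip().upper())


def work_through(src, problem_solved):
    """Perform actions in the listed order and stop once one solves the problem.

    problem_solved is a caller-supplied (hypothetical) callback that returns
    True when the action it is given has fixed the problem.
    Returns the list of actions that were performed.
    """
    performed = []
    for action in actions_for(src) or []:
        if action == INFORMATIONAL:
            break  # informational entries need no service action
        performed.append(action)
        if problem_solved(action):
            break  # remaining actions can be skipped
    return performed
```

For example, `work_through("B2008109", lambda action: True)` performs only the first action, because the callback reports the problem as solved immediately.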
B700xxxx Licensed internal code SRCs
A B700xxxx system reference code (SRC) is an error code or event code that is related to licensed internal
code.
Table 15 describes the system reference codes that might be displayed if system firmware detects a
problem. Suggested actions to correct the problem are also listed.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page
184 and “Solving undetermined problems” on page 227.
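Table 15 lists only the last four characters (the xxxx portion) of each B700xxxx SRC. The following Python sketch shows one way a displayed SRC might be split into its prefix and error code before the table is consulted. It assumes that an SRC is written as eight hexadecimal characters. The function names and the three-entry dictionary (copied from Table 15) are hypothetical and are not part of any IBM tool.

```python
import re

# Assumption for this sketch: an SRC is eight hexadecimal characters.
SRC_PATTERN = re.compile(r"^[0-9A-F]{8}$")

# A few entries copied from Table 15: error code -> (description, actions).
TABLE_15 = {
    "0611": (
        "There is a problem with the system hardware clock; the clock time is invalid.",
        ["Use the operating system to set the system clock."],
    ),
    "0621": (
        "Informational system log entry only.",
        ["No corrective action is required."],
    ),
    "5602": (
        "The system has out-of-date VPD LIDs.",
        ["Check for server firmware updates; then, install the updates if available."],
    ),
}


def split_src(src):
    """Split an SRC such as 'B7000611' into ('B700', '0611')."""
    normalized = src.replace(" ", "").upper()
    if not SRC_PATTERN.match(normalized):
        raise ValueError("not an 8-character hexadecimal SRC: %r" % src)
    return normalized[:4], normalized[4:]


def lookup_b700(src):
    """Return (description, actions) for a B700xxxx SRC, or None if not found."""
    prefix, code = split_src(src)
    if prefix != "B700":
        return None  # belongs to a different SRC table
    return TABLE_15.get(code)
```

With these definitions, `lookup_b700("B700 0611")` returns the clock entry, and an SRC with a different prefix, such as `B2005115`, returns `None` so that the caller can consult the appropriate table instead.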
Table 15. B700xxxx Licensed internal code SRCs
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
B700 xxxx Error Codes
Description
Action
0102
System firmware detected an error. A
machine check occurred during startup.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 218.
0103
System firmware detected a failure
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on
page 218.
0104
System firmware failure. Machine check,
undefined error occurred.
1. Check for server firmware updates.
2. Update the firmware.
0105
System firmware detected an error.
More than one request to terminate the
system was issued.
Go to “Isolating firmware problems” on page
218.
0106
System firmware failure.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on
page 218.
0107
System firmware failure. The system
detected an unrecoverable machine
check condition.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on
page 218.
0200
System firmware has experienced a low
storage condition
No immediate action is necessary.
Continue running the system normally. At the
earliest convenient time or service window,
work with IBM Support to collect a platform
dump and restart the system; then, go to
“Isolating firmware problems” on page 218.
0201
System firmware detected an error.
No immediate action is necessary.
Continue running the system normally. At the
earliest convenient time or service window,
work with IBM Support to collect a platform
dump and restart the system; then, go to
“Isolating firmware problems” on page 218.
0202
Informational system log entry only.
No corrective action is required.
0302
System firmware failure
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 218.
0441
Service processor failure. The platform
encountered an error early in the
startup or termination process.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
0443
Service processor failure.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
0601
Informational system log entry only.
No corrective action is required.
Note: This code and associated data can be
used to determine why the time of day for a
partition was lost.
0602
System firmware detected an error
condition.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 218.
0611
There is a problem with the system
hardware clock; the clock time is
invalid.
Use the operating system to set the system
clock.
0621
Informational system log entry only.
No corrective action is required.
0641
System firmware detected an error.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 218.
0650
System firmware detected an error.
Resource management was unable to
allocate main storage. A platform dump
was initiated.
1. Collect the event log.
2. Collect the platform dump data.
3. Collect the partition configuration
information.
4. Go to “Isolating firmware problems” on
page 218.
0651
The system detected an error in the
system clock hardware
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
0803
Informational system log entry only.
No corrective action is required.
0804
Informational system log entry only.
No corrective action is required.
0A00
Informational system log entry only.
No corrective action is required.
0A01
Informational system log entry only.
No corrective action is required.
0A10
Informational system log entry only.
No corrective action is required.
1150
Informational system log entry only.
No corrective action is required.
1151
Informational system log entry only.
No corrective action is required.
1152
Informational system log entry only.
No corrective action is required.
1160
Service processor failure
1. Go to “Isolating firmware problems” on
page 218.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
1161
Informational system log entry only.
No corrective action is required.
1730
The VPD for the system is not what was
expected at startup.
Replace the management card, as described in
“Removing the tier 2 management card” on
page 255 and “Installing the tier 2 management
card” on page 256.
1731
The VPD on a memory DIMM is not
correct and the memory on the DIMM
cannot be used, resulting in reduced
memory.
Replace the MEMDIMM symbolic CRU, as
described in “Service processor problems” on
page 200.
1732
The VPD on a processor card is not
correct and the processor card cannot be
used, resulting in reduced processing
power.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
1733
System firmware failure. The startup
will not continue.
Look for and correct B1xxxxxx errors. If there
are no serviceable B1xxxxxx errors, or if
correcting the errors does not correct the
problem, contact IBM support to reset the
server firmware settings.
Attention: Resetting the server firmware
settings results in the loss of all of the partition
data that is stored on the service processor.
Before continuing with this operation,
manually record all settings that you intend to
preserve.
The service processor reboots after IBM
Support resets the server firmware settings.
If the problem persists, replace the
system-board, as described in “Replacing the
FRU system-board and chassis assembly” on
page 260.
173A
A VPD collection overflow occurred.
1. Look for and resolve other errors.
2. If there are no other errors:
a. Update the firmware to the current
level, as described in “Updating the
firmware” on page 263.
b. You might also have to update the
management module firmware to a
compatible level.
173B
A system firmware failure occurred
during VPD collection.
Look for and correct other B1xxxxxx errors.
4091
Informational system log entry only.
No corrective action is required.
4400
There is a platform dump to collect
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 218.
4401
System firmware failure. The system
firmware detected an internal problem.
Go to “Isolating firmware problems” on page
218.
4402
A system firmware error occurred while
attempting to allocate the memory
necessary to create a platform dump.
Go to “Isolating firmware problems” on page
218.
4705
System firmware failure. A problem
occurred when initializing, reading, or
using the system VPD. The Capacity on
Demand function is not available.
Restart the system.
4710
Informational system log entry only.
No corrective action is required.
4714
Informational system log entry only.
No corrective action is required.
4715
A problem occurred when initializing,
reading, or using system VPD.
Replace the management card, as described in
“Removing the tier 2 management card” on
page 255 and “Installing the tier 2 management
card” on page 256.
4750
Informational system log entry only.
No corrective action is required.
4788
Informational system log entry only.
No corrective action is required.
47CB
Informational system log entry only.
No corrective action is required.
5120
System firmware detected an error
If the system is not exhibiting problematic
behavior, you can ignore this error. Otherwise,
go to “Isolating firmware problems” on page
218.
5121
System firmware detected a
programming problem for which a
platform dump may have been initiated.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on
page 218.
5122
An error occurred during a search for
the load source.
If the partition fails to start up, go to “Isolating
firmware problems” on page 218. Otherwise,
no corrective action is required.
5123
Informational system log entry only.
No corrective action is required.
5190
Operating system error. The server
firmware detected a problem in an
operating system.
Check for error codes in the partition that is
reporting the error and take the appropriate
actions for those error codes.
5191
System firmware detected a virtual I/O
configuration error.
1. Use the Integrated Virtualization Manager
(IVM) or management console to verify or
reconfigure the invalid virtual I/O
configuration.
2. Check for server firmware updates; then,
install the updates if available.
5209
Informational system log entry only.
No corrective action is required.
5219
Informational system log entry only.
No corrective action is required.
5300
System firmware detected a failure
while partitioning resources. The
platform partitioning code encountered
an error.
Check the management-module event log for
error codes; then, take the actions associated
with those error codes.
5301
User intervention required. The system
detected a problem with the partition
configuration.
Use the Integrated Virtualization Manager
(IVM) or management console to reallocate the
system resources.
5302
An unsupported Preferred Operating
System was detected.
The Preferred Operating System
specified is not supported. The IPL will
not continue.
Work with IBM support to select a supported
Preferred Operating System; then, re-IPL the
system.
5303
An unsupported Preferred Operating
System was detected.
The Preferred Operating System
specified is not supported. The IPL will
continue.
Work with IBM support to select a supported
Preferred Operating System; then, re-IPL the
system.
5304
The number of available World Wide
Port Names (WWPN) is low.
https://www-912.ibm.com/supporthome.nsf/document/51455410
5305
The number of available World Wide
Port Names (WWPN) is low.
https://www-912.ibm.com/supporthome.nsf/document/51455410
5400
System firmware detected a problem
with a processor.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
5442
System firmware detected an error.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
54DD
Informational system log entry only.
No corrective action is required.
5600
Informational system log entry only.
No corrective action is required.
5601
System firmware failure. There was a
problem initializing, reading, or using
system location codes.
Go to “Isolating firmware problems” on page
218.
5602
The system has out-of-date VPD LIDs.
Check for server firmware updates; then,
install the updates if available.
5603
Enclosure feature code and/or serial
number not valid.
Verify that the machine type, model, and serial
number are correct for this server.
6900
PCI host bridge failure
1. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
2. If the problem persists, use the “PCI
expansion card (PIOCARD) problem
isolation procedure” on page 194 to
determine the failing component.
6906
System bus error
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6907
System bus error
1. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
2. Go to “Isolating firmware problems” on
page 218.
6908
System bus error
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6909
System bus error
1. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
2. Go to “Isolating firmware problems” on
page 218.
6911
Platform LIC unable to find or retrieve
VPD LID file.
Check for server firmware updates; then,
install the updates if available.
6912
Platform LIC unable to find or retrieve
VPD LID file.
Check for server firmware updates; then,
install the updates if available.
6944
Informational system log entry only.
No corrective action is required.
6950
A platform dump has occurred.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 218.
6951
An error occurred because a partition
needed more NVRAM than was
available.
Use the Integrated Virtualization Manager
(IVM) or management console to delete one or
more partitions.
6952
Informational system log entry only.
No corrective action is required.
6953
PHYP NVRAM is unavailable after a
service processor reset and reload.
Go to “Isolating firmware problems” on page
218.
6954
Informational system log entry only.
No corrective action is required.
6955
Informational system log entry only.
No corrective action is required.
6956
An NVRAM failure was detected.
Go to “Isolating firmware problems” on page
218.
6965
Informational system log entry only.
No corrective action is required.
6970
PCI host bridge failure
1. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
2. If the problem persists, use the “PCI
expansion card (PIOCARD) problem
isolation procedure” on page 194 to
determine the failing component.
6971
PCI bus failure
1. Use the “PCI expansion card (PIOCARD)
problem isolation procedure” on page 194
to determine the failing component.
2. If the problem persists, replace the
system-board and chassis assembly, as
described in “Replacing the FRU
system-board and chassis assembly” on
page 260.
6972
System bus error
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6973
System bus error
1. Use the “PCI expansion card (PIOCARD)
problem isolation procedure” on page 194
to determine the failing component.
2. If the problem persists, replace the
system-board and chassis assembly, as
described in “Replacing the FRU
system-board and chassis assembly” on
page 260.
6974
Informational system log entry only.
No corrective action is required.
6978
Informational system log entry only.
No corrective action is required.
6979
Informational system log entry only.
No corrective action is required.
697C
Connection from service processor to
system processor failed.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6980
RIO, HSL or 12X controller failure
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6981
System bus error.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6984
Informational system log entry only.
No corrective action is required.
6985
Remote I/O (RIO), high-speed link
(HSL), or 12X loop status message.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6987
Remote I/O (RIO), high-speed link
(HSL), or 12X connection failure.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6990
Service processor failure.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6991
System firmware failure
Go to “Isolating firmware problems” on page
218.
6993
Service processor failure
1. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
2. Go to “Isolating firmware problems” on
page 218.
6994
Service processor failure.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
6995
Informational system log entry only.
No corrective action is required.
69C2
Informational system log entry only.
No corrective action is required.
69C3
Informational system log entry only.
No corrective action is required.
69D9
Host Ethernet Adapter (HEA) failure.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
69DA
Informational system log entry only.
No corrective action is required.
69DB
System firmware failure.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 218.
BAD1
The platform firmware detected an
error.
Go to “Isolating firmware problems” on page
218.
BAD2
System firmware detected an error.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 218.
F103
System firmware failure
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on
page 218.
F104
Operating system error. System
firmware terminated a partition.
Check the management-module event log for
partition firmware error codes (especially
BA00F104); then, take the appropriate actions
for those error codes.
F105
System firmware detected an internal
error
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on
page 218.
F106
System firmware detected an error
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
F107
System firmware detected an error.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 218.
F108
A firmware error caused the system to
terminate.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 218.
F10A
System firmware detected an error
Look for and correct B1xxxxxx errors.
F10B
A processor resource has been disabled
due to hardware problems
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
F10C
The platform LIC detected an internal
problem performing Partition Mobility.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 218.
F120
Informational system log entry only.
No corrective action is required.
F130
Thermal Power Management Device
firmware error was detected.
Check for server firmware updates; then,
install the updates if available.
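The B700xxxx rows above pair the last four characters of the SRC with a description and an action. As a minimal sketch only, assuming SRCs are handled as eight-character strings, a few of those rows could be held in a lookup table. The names `B700_ACTIONS` and `lookup_b700` are hypothetical and are not part of any IBM tool; the entries are a small illustrative subset copied from the table.

```python
# Illustrative subset of the B700xxxx table; texts are copied from the rows above.
# B700_ACTIONS and lookup_b700 are hypothetical names used only for this sketch.
B700_ACTIONS = {
    "6972": ("System bus error",
             "Replace the system-board and chassis assembly"),
    "6974": ("Informational system log entry only",
             "No corrective action is required"),
    "6991": ("System firmware failure",
             "Go to 'Isolating firmware problems'"),
    "F10A": ("System firmware detected an error",
             "Look for and correct B1xxxxxx errors"),
}


def lookup_b700(src):
    """Return (description, action) for a B700xxxx SRC, or None if the
    suffix is not in the illustrative subset."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("B700"):
        raise ValueError("not a B700xxxx SRC: %r" % (src,))
    # The last four characters select the table row.
    return B700_ACTIONS.get(code[4:])
```

A suffix that is missing from the subset returns None, which a caller would treat as "consult the full table" rather than as "no action required".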
BA000010 to BA400002 Partition firmware SRCs
The power-on self-test (POST) might display an error code that the partition firmware detects.
Table 16 describes error codes that might be displayed if POST detects a problem. The description also
includes suggested actions to correct the problem.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page
184 and “Solving undetermined problems” on page 227.
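To decide whether a displayed code belongs in Table 16, the code can be compared against the range in the table title. The following is a rough sketch, assuming that the range BA000010 to BA400002 is a plain numeric ordering of eight-digit hexadecimal values; the function name `in_src_range` is hypothetical.

```python
def in_src_range(src, low="BA000010", high="BA400002"):
    """Return True if the eight-digit hexadecimal SRC lies between low and
    high, inclusive. Assumes the range is ordered numerically."""
    code = src.strip().upper()
    if len(code) != 8:
        raise ValueError("an SRC is expected to have eight characters: %r" % (src,))
    # int(..., 16) raises ValueError if the string is not hexadecimal.
    return int(low, 16) <= int(code, 16) <= int(high, 16)
```

Codes with a different prefix, such as B700xxxx, fall outside the range and are described in their own tables.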
Table 16. BA000010 to BA400002 Partition firmware SRCs
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Error code
Description
Action
BA000010
The device data structure is corrupted
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA000020
Incompatible firmware levels were
found
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000030
An lpevent communication failure
occurred
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA000031
An lpevent communication failure
occurred
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA000032
The firmware failed to register the
lpevent queues
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000034
The firmware failed to exchange
capacity and allocate lpevents
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000038
The firmware failed to exchange virtual
continuation events
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000040
The firmware was unable to obtain the
RTAS code lid details
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000050
The firmware was unable to load the
RTAS code lid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000060
The firmware was unable to obtain the
open firmware code lid details
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000070
The firmware was unable to load the
open firmware code lid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000080
The user did not accept the license
agreement
Accept the license agreement and restart the
blade server.
If the problem persists:
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA000081
Failed to get the firmware license policy
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000082
Failed to set the firmware license policy
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000091
Unable to load a firmware code update
module
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA00E820
An lpevent communication failure
occurred
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E830
Failure when initializing ibm,event-scan
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E840
Failure when initializing PCI hot-plug
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E843
Failure when initializing the interface to
AIX or Linux
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E850
Failure when initializing dynamic
reconfiguration
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E860
Failure when initializing sensors
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA010000
There is insufficient information to boot
the system
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA010001
The client IP address is already in use
by another network device
Verify that all of the IP addresses on the
network are unique; then, retry the operation.
BA010002
Cannot get gateway IP address
Perform the following actions that checkpoint
CA00E174 describes:
1. Verify that:
v The bootp server is correctly configured;
then, retry the operation.
v The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010003
Cannot get server hardware address
Perform the following actions that checkpoint
CA00E174 describes:
1. Verify that:
v The bootp server is correctly configured;
then, retry the operation.
v The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010004
Bootp failed
Perform the following actions that checkpoint
CA00E174 describes:
1. Verify that:
v The bootp server is correctly configured;
then, retry the operation.
v The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010005
File transmission (TFTP) failed
Perform the following actions that checkpoint
CA00E174 describes:
1. Verify that:
v The bootp server is correctly configured;
then, retry the operation.
v The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010006
The boot image is too large
Start up from another device with a bootable
image.
BA010007
The device does not have the required
device_type property.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010008
The device_type property for this device
is not supported by the iSCSI initiator
configuration specification.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010009
The arguments specified for the ping
function are invalid.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA01000A
The itname parameter string exceeds the
maximum length allowed.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA01000B
The ichapid parameter string exceeds
the maximum length allowed.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA01000C
The ichappw parameter string exceeds
the maximum length allowed.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA01000D
The iname parameter string exceeds the
maximum length allowed.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA01000E
The LUN specified is not valid.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA01000F
The chapid parameter string exceeds the
maximum length allowed.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA010010
The chappw parameter string exceeds
the maximum length allowed.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA010011
SET-ROOT-PROP could not find / (root)
package
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010013
The information in the error log entry
for this SRC provides network trace
data.
Informational message. No action is required.
BA010014
The information in the error log entry
for this SRC provides network trace
data.
Informational message. No action is required.
BA010015
The information in the error log entry
for this SRC provides network trace
data.
Informational message. No action is required.
BA010020
A trace entry addition failed because of
a bad trace type.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA012010
Opening the TCP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA012011
TCP failed to read from the network
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA012012
TCP failed to write to the network.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA012013
Closing TCP failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA017020
Failed to open the TFTP package
Verify that the Trivial File Transfer Protocol
(TFTP) parameters are correct.
BA017021
Failed to load the TFTP file
Verify that the TFTP server and network
connections are correct.
BA01B010
Opening the BOOTP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01B011
BOOTP failed to read from the network
Perform the following actions that checkpoint
CA00E174 describes:
1. Verify that:
v The bootp server is correctly configured;
then, retry the operation.
v The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01B012
BOOTP failed to write to the network
Perform the following actions that checkpoint
CA00E174 describes:
1. Verify that:
v The bootp server is correctly configured;
then, retry the operation.
v The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01B013
The discover mode is invalid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01B014
Closing the BOOTP node failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01B015
The BOOTP discover server timed out
Perform the following actions that checkpoint
CA00E174 describes:
1. Verify that:
v The bootp server is correctly configured;
then, retry the operation.
v The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01D001
Opening the DHCP node failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01D020
DHCP failed to read from the network
1. Verify that the network cable is connected,
and that the network is active.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01D030
DHCP failed to write to the network
1. Verify that the network cable is connected,
and that the network is active.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01D040
The DHCP discover server timed out
1. Verify that the DHCP server has addresses
available.
2. Verify that the DHCP server configuration
file is not overly constrained. An
over-constrained file might prevent a server
from meeting the configuration requested
by the client.
3. Perform the following actions that
checkpoint CA00E174 describes:
a. Verify that:
v The bootp server is correctly
configured; then, retry the operation.
v The network connections are correct;
then, retry the operation.
b. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA01D050
DHCP::discover no good offer
DHCP discovery did not receive any DHCP
offers from the servers that meet the client
requirements.
Verify that the DHCP server configuration file
is not overly constrained. An over-constrained
file might prevent a server from meeting the
configuration requested by the client.
BA01D051
DHCP::discover DHCP request timed
out
DHCP discovery did receive a DHCP offer
from a server that met the client requirements,
but the server did not send the DHCP
acknowledgement (DHCP ack) to the client
DHCP request.
Another client might have used the address
that was served.
Verify that the DHCP server has addresses
available.
BA01D052
DHCP::discover: 10 incapable servers
were found
Ten DHCP servers have sent DHCP offers,
none of which met the requirements of the
client. Check the compatibility of the
configuration that the client is requesting and
the server DHCP configuration files.
BA01D053
DHCP::discover received a reply, but
without a message type
Verify that the DHCP server is properly
configured.
BA01D054
DHCP::discover: DHCP nak received
DHCP discovery did receive a DHCP offer
from a server that meets the client
requirements, but the server sent a DHCP not
acknowledged (DHCP nak) to the client DHCP
request.
Another client might be using the address that
was served.
This situation can occur when there are
multiple DHCP servers on the same network,
and server A does not know the subnet
configuration of server B, and vice-versa.
This situation can also occur when the pool of
addresses is not truly divided.
Set the DHCP server configuration file to
"authoritative".
Verify that the DHCP server is functioning
properly.
BA01D055
DHCP::discover: DHCP decline
DHCP discovery did receive a DHCP offer
from one or more servers that meet the client
requirements. However, the client performed
an ARP test on the address and found that
another client was using the address.
The client sent a DHCP decline to the server,
but the client did not receive an additional
DHCP offer from a server. The client still does
not have a valid address.
Verify that the DHCP server is functioning
properly.
BA01D056
DHCP::discover: unknown DHCP
message
DHCP discovery received an unknown DHCP
message type. Verify that the DHCP server is
functioning properly.
BA01D0FF
Closing the DHCP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA030011
RTAS attempt to allocate memory failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA04000F
Self test failed on device; no error or
location code information available
1. If a location code is identified with the
error, replace the device specified by the
location code.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA040010
Self test failed on device; can't locate
package.
1. If a location code is identified with the
error, replace the device specified by the
location code.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA040020
The machine type and model are not
recognized by the blade server
firmware.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA040030
The firmware was not able to build the
UID properly for this system. As a
result, problems may occur with the
licensing of the AIX operating system.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA040035
The firmware was unable to find the
“plant of manufacture” in the VPD. This
may cause problems with the licensing
of the AIX operating system.
Verify that the machine type, model, and serial
number are correct for this server. If this is a
new server, check for server firmware updates;
then, install the updates if available.
BA040040
Setting the machine type, model, and
serial number failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA040050
The h-call to switch off the boot
watchdog timer failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA040060
Setting the firmware boot side for the
next boot failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA050001
Failed to reboot a partition in logical
partition mode
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA050004
Failed to locate service processor device
tree node.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA05000A
Failed to send boot failed message
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060008
No configurable adapters found by the
Remote IPL menu in the SMS utilities
This error occurs when the firmware cannot
locate any LAN adapters that are supported by
the remote IPL function. Verify that the devices
in the remote IPL device list are correct using
the SMS menus.
BA06000B
The system was not able to find an
operating system on the devices in the
boot list.
Go to “Boot problem resolution” on page 190.
BA06000C
A pointer to the operating system was
found in non-volatile storage.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060020
The environment variable “boot-device”
exceeded the allowed character limit.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060021
The environment variable “boot-device”
contained more than five entries.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060022
The environment variable “boot-device”
contained an entry that exceeded 255
characters in length
1. Using the SMS menus, set the boot list to
the default boot list.
2. Shut down; then, start up the blade server.
3. Use SMS menus to customize the boot list
as required.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
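The “boot-device” SRCs in this table (BA060021 and BA060022) state two explicit limits on the environment variable: no more than five entries, and no entry longer than 255 characters. As a rough illustration only, the following Python sketch checks a candidate value against those two limits. The function name is hypothetical, and the assumption that entries are separated by whitespace is not taken from this guide. The overall character limit behind BA060020 is not stated here, so the sketch does not check it.

```python
MAX_ENTRIES = 5          # BA060021: more than five entries is reported as an error
MAX_ENTRY_LENGTH = 255   # BA060022: an entry longer than 255 characters is reported


def check_boot_device(value):
    """Return the SRCs from this table that a boot-device value would trigger.

    Hypothetical helper; assumes entries are separated by whitespace.
    """
    entries = value.split()
    problems = []
    if len(entries) > MAX_ENTRIES:
        problems.append("BA060021")
    if any(len(entry) > MAX_ENTRY_LENGTH for entry in entries):
        problems.append("BA060022")
    return problems
```

An empty result means that neither of the two limits is exceeded; it does not mean that the boot list is otherwise valid.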
BA060030
Logical partitioning with shared
processors is enabled and the operating
system does not support it.
1. Install or boot a level of the operating
system that supports shared processors.
2. Disable logical partitioning with shared
processors in the operating system.
3. If the problem remains:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060060
The operating system expects an IOSP
partition, but it failed to make the
transition to alpha mode.
1. Verify that:
• The alpha-mode operating system image
is intended for this partition.
• The configuration of the partition
supports an alpha-mode operating
system.
2. If the problem remains:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060061
The operating system expects a
non-IOSP partition, but it failed to make
the transition to MGC mode.
1. Verify that:
• The alpha-mode operating system image
is intended for this partition.
• The configuration of the partition
supports an alpha-mode operating
system.
2. If the problem remains:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060070
The operating system does not support
this system's processor(s)
Boot a supported version of the operating
system.
BA060071
An invalid number of vectors was
received from the operating system
Boot a supported version of the operating
system.
BA060072
Client-arch-support hcall error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060075
Client-arch-support firmware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060200
Failed to set the operating system boot
list from the management module boot
list
1. Using the SMS menus, set the boot list to
the default boot list.
2. Shut down; then, start up the blade server.
3. Use SMS menus to customize the boot list
as required.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060201
Failed to read the VPD “boot path” field
value
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060202
Failed to update the VPD with the new
“boot path” field value
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060300
An I/O error on the adapter from which
the boot was attempted prevented the
operating system from being booted.
1. Using the SMS menus, select another
adapter from which to boot the operating
system, and reboot the system.
2. Attempt to reboot the system.
3. Go to “Boot problem resolution” on page
190.
BA07xxxx
Self-configuring SCSI device (SCSD)
controller failure
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA090001
SCSD DASD: test unit ready failed;
hardware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA090002
SCSD DASD: test unit ready failed;
sense data available
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA090003
SCSD DASD: send diagnostic failed;
sense data available
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA090004
SCSD DASD: send diagnostic failed;
devofl command
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA09000A
There was a vendor specification error.
1. Check the vendor specification for
additional information.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000B
Generic SCSD sense error
1. Verify that the SCSD cables and devices are
properly plugged.
2. Correct any problems that are found.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000C
The media is write-protected
1. Change the setting of the media to allow
writing, then retry the operation.
2. Insert new media of the correct type.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000D
The media is unsupported or not
recognized.
1. Insert new media of the correct type.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000E
The media is not formatted correctly.
1. Insert the media.
2. Insert new media of the correct type.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000F
Media is not present
1. Insert new media with the correct format.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA090010
The request sense command failed.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA090011
The retry limit has been exceeded.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA090012
There is a SCSD device that is not
supported.
1. Replace the SCSD device that is not
supported with a supported device.
2. If the problem persists:
a. Troubleshoot the SCSD devices.
b. Verify that the SCSD cables and devices
are properly plugged. Correct any
problems that are found.
c. Replace the SCSD cables and devices.
d. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA120001
On an undetermined SCSD device, test
unit ready failed; hardware error
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA120002
On an undetermined SCSD device, test
unit ready failed; sense data available
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA120003
On an undetermined SCSD device, send
diagnostic failed; sense data available
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA120004
On an undetermined SCSD device, send
diagnostic failed; devofl command
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA120010
Failed to generate the SAS device
physical location code. The event log
entry has the details.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130010
USB CD-ROM in the media tray: device
remained busy longer than the time-out
period
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130011
USB CD-ROM in the media tray:
execution of ATA/ATAPI command was
not completed within the allowed time.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130012
USB CD-ROM in the media tray:
execution of ATA/ATAPI command
failed.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130013
USB CD-ROM in the media tray:
bootable media is missing from the
drive
1. Insert a bootable CD in the drive and retry
the operation.
2. If the problem persists:
a. Retry the operation.
b. Reboot the blade server.
c. Troubleshoot the media tray and
CD-ROM drive.
d. Replace the USB CD or DVD drive.
e. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA130014
USB CD-ROM in the media tray: the
media in the USB CD-ROM drive has
been changed.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130015
USB CD-ROM in the media tray:
ATA/ATAPI packet command execution
failed.
1. Remove the CD or DVD in the drive and
replace it with a known-good disk.
2. If the problem persists:
a. Retry the operation.
b. Reboot the blade server.
c. Troubleshoot the media tray and
CD-ROM drive.
d. Replace the USB CD or DVD drive.
e. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA131010
The USB keyboard has been removed.
1. Reseat the keyboard cable in the
management module USB port.
2. Check for server firmware updates; then,
install the updates if available.
BA140001
The SCSD read/write optical test unit
ready failed; hardware error.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA140002
The SCSD read/write optical test unit
ready failed; sense data available.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA140003
The SCSD read/write optical send
diagnostic failed; sense data available.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA140004
The SCSD read/write optical send
diagnostic failed; devofl command.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA150001
PCI Ethernet BNC/RJ-45 or PCI
Ethernet AUI/RJ-45 adapter: internal
wrap test failure
Replace the adapter specified by the location
code.
BA151001
10/100 Mbps Ethernet PCI adapter:
internal wrap test failure
Replace the adapter specified by the location
code.
BA151002
10/100 Mbps Ethernet card failure
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA153002
Gigabit Ethernet adapter failure
Verify that the MAC address programmed in
the FLASH/EEPROM is correct.
BA153003
Gigabit Ethernet adapter failure
1. Check for server firmware updates; then,
install the updates if available.
2. Replace the Gigabit Ethernet adapter.
BA154010
HEA software error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA154020
The required open firmware property
was not found.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154030
Invalid parameters were passed to the
HEA device driver.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154040
The TFTP package open failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154050
The transmit operation failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154060
Failed to initialize the HEA port or
queue
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154070
The receive operation failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170000
NVRAMRC initialization failed; device
test failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA170100
NVRAM data validation check failed
1. Shut down the blade server; then, restart it.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170201
The firmware was unable to expand
target partition - saving configuration
variable
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170202
The firmware was unable to expand
target partition - writing event log entry
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170203
The firmware was unable to expand
target partition - writing VPD data
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170210
Setenv/$Setenv parameter error - name
contains a null character
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA170211
Setenv/$Setenv parameter error - value
contains a null character
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA170220
Unable to write a variable value to
NVRAM due to lack of free memory in
NVRAM.
1. Reduce the number of partitions, if
possible, to add more NVRAM memory to
this partition.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170221
Setenv/$setenv had to delete stored
firmware network boot settings to free
memory in NVRAM.
Enter the adapter and network parameters
again for the network boot or network
installation.
BA170998
NVRAMRC script evaluation error -
command line execution error.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA180008
PCI device Fcode evaluation error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA180009
The Fcode on a PCI adapter left a data
stack imbalance
1. Reseat the PCI adapter card.
2. Check for adapter firmware updates; then,
install the updates if available.
3. Check for server firmware updates; then,
install the updates if available.
4. Replace the PCI adapter card.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA180010
PCI probe error, bridge in freeze state
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA180011
PCI bridge probe error, bridge is not
usable
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA180012
PCI device runtime error, bridge in
freeze state
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA180014
MSI software error
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA180020
No response was received from a slot
during PCI probing.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA180099
PCI probe error; bridge in freeze state,
slot in reset state
1. Reseat the PCI adapter card.
2. Check for adapter firmware updates; then,
install the updates if available.
3. Check for server firmware updates; then,
install the updates if available.
4. Replace the PCI adapter card.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA180100
The FDDI adapter Fcode driver is not
supported on this server.
IBM may produce a compatible driver in the
future, but does not guarantee one.
BA180101
Stack underflow from fibre-channel
adapter
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA190001
Firmware function to get/set
time-of-day reported an error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA201001
The serial interface dropped data
packets
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA201002
The serial interface failed to open
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA201003
The firmware failed to handshake
properly with the serial interface
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210000
Partition firmware reports a default
catch
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210001
Partition firmware reports a stack
underflow was caught
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210002
Partition firmware was ready before
standard output (stdout) was ready
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210003
A data storage error was caught by
partition firmware
1. If the location code reported with the error
points to an adapter, check for adapter
firmware updates.
2. Apply any available updates.
3. Check for server firmware updates.
4. Apply any available updates.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210004
An open firmware stack-depth assert
failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210010
The transfer of control to the SLIC
loader failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA210011
The transfer of control to the IO
Reporter failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA210012
There was an NVRAMRC forced-boot
problem; unable to load the previous
boot's operating system image
1. Use the SMS menus to verify that the
partition firmware can still detect the
operating system image.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210013
There was a partition firmware error
when in the SMS menus.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210020
I/O configuration exceeded the
maximum size allowed by partition
firmware.
1. Increase the logical memory block size to
256 MB and restart the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210100
An error may not have been sent to the
management module event log.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210101
The partition firmware event log queue
is full
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210102
There was a communication failure
between partition firmware and the
hypervisor. The lpevent that was
expected from the hypervisor was not
received.
1. Review the event log for errors that
occurred around the time of this error.
2. Correct any errors that are found and
reboot the blade server.
3. If the problem persists:
a. Reboot the blade server.
b. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA210103
There was a communication failure
between partition firmware and the
hypervisor. There was a failing return
code with the lpevent acknowledgement
from the hypervisor.
1. Review the event log for errors that
occurred around the time of this error.
2. Correct any errors that are found and
reboot the blade server.
3. If the problem persists:
a. Reboot the blade server.
b. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA220010
There was a partition firmware error
during a USB hotplug probing. USB
hotplug may not work properly on this
partition.
1. Look for EEH-related errors in the event
log.
2. Resolve any EEH event log entries that are
found.
3. Correct any errors that are found and
reboot the blade server.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA220020
CRQ registration error; partner vslot
may not be valid
Verify that this client virtual slot device has a
valid server virtual slot device in a hosting
partition.
BA278001
Failed to flash firmware: invalid image
file
Download a new firmware update image and
retry the update.
BA278002
Flash file is not designed for this
platform
Download a new firmware update image and
retry the update.
BA278003
Unable to lock the firmware update lid
manager
1. Restart the blade server.
2. Verify that the operating system is
authorized to update the firmware. If the
system is running multiple partitions, verify
that this partition has service authority.
BA278004
An invalid firmware update lid was
requested
Download a new firmware update image and
retry the update.
BA278005
Failed to flash a firmware update lid
Download a new firmware update image and
retry the update.
BA278006
Unable to unlock the firmware update
lid manager
Restart the blade server.
BA278007
Failed to reboot the system after a
firmware flash update
Restart the blade server.
BA278009
The operating system's server firmware
update management tools are
incompatible with this system.
Go to the IBM download site at
www14.software.ibm.com/webapp/set2/sas/
f/lopdiags/home.html to download the latest
version of the service aids package for Linux.
BA27800A
The firmware installation failed due to a
hardware error that was reported.
1. Look for hardware errors in the event log.
2. Resolve any hardware errors that are found.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA280000
RTAS discovered an invalid operation
that may cause a hardware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA290000
RTAS discovered an internal stack
overflow
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA290001
RTAS low memory corruption was
detected
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA290002
RTAS low memory corruption was
detected
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA310010
Unable to obtain the SRC history
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA310020
An invalid SRC history was obtained.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA310030
Writing the MAC address to the VPD
failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA330000
Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA330001
Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA330002
Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA330003
Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA330004
Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340001
There was a logical partition event
communication failure reading the
BladeCenter open fabric manager
parameter data structure from the
service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340002
There was a logical partition event
communication failure reading the
BladeCenter open fabric manager
location code mapping data from the
service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340003
An internal firmware error occurred;
unable to allocate memory for the open
fabric manager location code mapping
data.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340004
An internal firmware error occurred; the
open fabric manager parameter data
was corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340005
An internal firmware error occurred; the
location code mapping table was
corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340006
An LP event communication failure
occurred reading the system initiator
capability data from the service
processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340007
An internal firmware error occurred; the
open fabric manager system initiator
capability data was corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340008
An internal firmware error occurred; the
open fabric manager system initiator
capability data version was not correct.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340009
An internal firmware error occurred; the
open fabric manager system initiator
capability processing encountered an
unexpected error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340010
An internal firmware error was detected
during open fabric manager processing.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340011
Assignment of fabric ID to the I/O
adapter failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340020
A logical partition event communication
failure occurred when writing the
BladeCenter open fabric manager
parameter data to the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340021
A logical partition event communication
failure occurred when writing the
BladeCenter open fabric manager system
initiator capabilities data to the service
processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA400001
Informational message: DMA trace
buffer full.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA400002
Informational message: DMA map-out
size mismatch.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
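The rule stated for Table 16 (work through the actions in the listed order and stop as soon as one solves the problem) can be illustrated with a short Python sketch. The sketch copies only a few entries from the table. The names `SRC_ACTIONS`, `resolve`, and the `is_resolved` callback are hypothetical and are not part of any IBM tool or interface.

```python
from typing import Callable, Dict, List, Optional

# A few entries copied from Table 16; the dictionary itself is illustrative only.
SRC_ACTIONS: Dict[str, List[str]] = {
    "BA278001": ["Download a new firmware update image and retry the update."],
    "BA278007": ["Restart the blade server."],
    "BA330000": [
        "Reboot the blade server.",
        'Go to "Checkout procedure" on page 184.',
        "Replace the system-board (see page 260).",
    ],
}


def resolve(src: str, is_resolved: Callable[[str], bool]) -> Optional[str]:
    """Try each action for an SRC in table order; return the action that solved it."""
    for action in SRC_ACTIONS.get(src.upper(), []):
        if is_resolved(action):
            return action  # stop here: the remaining actions are not needed
    return None  # SRC not in this sketch, or no listed action helped
```

In practice, `is_resolved` would be a person confirming that the SRC no longer appears after the action; the callback only marks where that check belongs.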
POST progress codes (checkpoints)
When you turn on the blade server, the power-on self-test (POST) performs a series of tests to check the
operation of the blade server components. Use the management module to view progress codes that offer
information about the stages involved in powering on and performing an initial program load (IPL).
Progress codes do not indicate an error, although in some cases the blade server can pause indefinitely
(hang) on one. Progress codes for blade servers are 8-digit hexadecimal numbers that start with C or D.
Checkpoints are generated by various components. The baseboard management controller (BMC) service
processor and the partitioning firmware are key contributors. The service processor provides additional
isolation procedure codes for troubleshooting.
A checkpoint might have an associated location code as part of the message. The location code provides
information that identifies the failing component when there is a hang condition.
Notes:
1. For checkpoints with no associated location code, see “Light path diagnostics” on page 214 to identify
the failing component when there is a hang condition.
2. For checkpoints with location codes, see “Location codes” on page 14 to identify the failing
component when there is a hang condition.
3. For eight-digit codes not listed here, see “Checkout procedure” on page 184 for information.
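As a minimal sketch of the format rule and the three notes above, the following Python checks whether a string looks like a progress code (8 hexadecimal digits starting with C or D) and picks the section to consult when the system hangs. The function names, the `listed` flag, and the sample location code in the usage note are assumptions made for illustration. They do not come from the management module.

```python
import re

# Progress codes are described as 8-digit hexadecimal numbers starting with C or D.
PROGRESS_CODE = re.compile(r"[CD][0-9A-F]{7}")


def is_progress_code(code: str) -> bool:
    """Return True if the text looks like a POST progress code (checkpoint)."""
    return PROGRESS_CODE.fullmatch(code.strip().upper()) is not None


def hang_reference(code: str, location_code: str = "", listed: bool = True) -> str:
    """Pick the section to consult for a hang, following the three notes."""
    if not is_progress_code(code):
        raise ValueError(f"not a progress code: {code!r}")
    if not listed:
        return "Checkout procedure"      # note 3: code is not in the tables
    if location_code:
        return "Location codes"          # note 2: a location code was reported
    return "Light path diagnostics"      # note 1: no location code
```

For example, `hang_reference("C1001F00")` selects light path diagnostics, while passing a non-empty `location_code` (such as the made-up placeholder `"Un-P1"`) selects the location-code section instead.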
The management module can display the 32 most recent SRCs and their time stamps. To see the list, select
Blade Service Data > blade_name in the management module. Refresh the list manually to update it.
Table 17. Management module reference code listing
Unique ID    System Reference Code    Timestamp
00040001     D1513901                 2005-11-13 19:30:20
00000016     D1513801                 2005-11-13 19:30:16
Any message with more detail is highlighted as a link in the System Reference Code column. Click the
message to cause the management module to present the additional message detail:
D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF
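If the detail text is copied out of the management module for record keeping, it can be split into fields with a few lines of Python. This is only a sketch: the field labels are taken from the example above, while `parse_src_detail` and the dictionary keys are hypothetical names, and the real display might differ in spacing or line breaks.

```python
from datetime import datetime
from typing import Any, Dict

SAMPLE = """D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF"""


def parse_src_detail(text: str) -> Dict[str, Any]:
    """Split the SRC detail text into the code, time stamp, version, and hex words."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    detail: Dict[str, Any] = {"src": lines[0]}
    for line in lines[1:]:
        # Split at the first colon only, so the time of day stays in the value.
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "Created at":
            detail["created_at"] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        elif key == "SRC Version":
            detail["version"] = int(value, 16)
        elif key.startswith("Hex Words"):
            detail["hex_words"] = [int(word, 16) for word in value.split()]
    return detail
```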
C1001F00 to C1645300 Service processor checkpoints
The C1xx progress codes, or checkpoints, offer information about the initialization of both the service
processor and the server. Service processor checkpoints are typical reference codes that occur during the
initial program load (IPL) of the server.
Table 18 lists the progress codes that might be displayed during the power-on self-test (POST), along with
suggested actions to take if the system hangs on the progress code. Only when you experience a hang
condition should you take any of the actions described for a progress code.
In the following progress codes, x can be any number or letter.
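To make the wildcard convention concrete, the Python sketch below compares a table entry such as C1009x28 with a displayed code, treating each lowercase x as any single number or letter. The function name `matches_progress_code` is hypothetical, and the sketch assumes that x never appears as a literal character in a real code.

```python
def matches_progress_code(pattern: str, code: str) -> bool:
    """Compare a table entry such as 'C1009x28' with a displayed code.

    A lowercase 'x' in the pattern stands for any single number or letter;
    every other position must match exactly (case-insensitively).
    """
    if len(pattern) != len(code):
        return False
    for p, c in zip(pattern, code):
        if p == "x":
            if not c.isalnum():
                return False
        elif p.upper() != c.upper():
            return False
    return True
```

With this helper, `matches_progress_code("C1009x28", "C1009A28")` is true, while a displayed code of `C1009A2C` does not match that entry.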
Table 18. C1001F00 to C1645300 checkpoints
• If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in
the Action column until the problem is solved. If an action solves the problem, you can stop performing the
remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Progress code
Description
Action
C10010xx
Pre-standby
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1001F00
Pre-standby: starting initial transition
file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1001F0D
Pre-standby: discovery completed in
initial transition file
While the blade server displays this
checkpoint, the service processor reads
the system vital product data (VPD). The
service processor must complete reading
the system VPD before the system
displays the next progress code.
1. Wait at least 15 minutes for this checkpoint
to change before you decide that the system
is hung.
Reading the system VPD might take as long
as 15 minutes on systems with maximum
configurations or many disk drives.
2. Go to “Checkout procedure” on page 184.
3. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1001F0F
Pre-standby: waiting for standby
synchronization from initial transition
file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1001FFF
Pre-standby: completed initial transition
file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x01
Hardware object manager (HOM): the
cancontinue flag is being cleared
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x02
Hardware object manager (HOM):
erase HOM IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x04
Hardware object manager (HOM):
build cards IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x08
Hardware object manager (HOM):
build processors IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x0C
Hardware object manager (HOM):
build chips IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x10
Hardware object manager (HOM):
initialize HOM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x14
Hardware object manager (HOM):
validate HOM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x18
Hardware object manager (HOM):
GARD in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x1C
Hardware object manager (HOM):
clock test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x20
Frequency control IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x24
Asset protection IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x28
Memory configuration IPL step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x2C
Processor CFAM initialization in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x30
Processor self-synchronization in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009034
Processor mask attentions being
initialized
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x38
Processor check ring IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x39
Processor L2 line delete in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x3A
Load processor gptr IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x3C
Processor ABIST step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x40
Processor LBIST step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x44
Processor array initialization step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x46
Processor AVP initialization step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x48
Processor flush IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x4C
Processor wiretest IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x50
Processor long scan IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x54
Start processor clocks IPL step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x58
Processor SCOM initialization step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x5C
Processor interface alignment procedure
in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x5E
Processor AVP L2 test case in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x60
Processor random data test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x64
Processor enable machine check test in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x66
Concurrent initialization in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x68
Processor fabric initialization step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x6C
Processor PSI initialization step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x70
ASIC CFAM initialization step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x74
ASIC mask attentions being set up
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x78
ASIC check rings being set up
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x7C
ASIC ABIST test being run
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x80
ASIC LBIST test being run
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x82
ASIC RGC being reset
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x84
ASIC being flushed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x88
ASIC long scan initialization in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x8C
ASIC start clocks in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x90
Wire test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x92
ASIC restore erepair in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x94
ASIC transmit/receive initialization step
in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x98
ASIC wrap test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x9C
ASIC SCOM initialization step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x9E
ASIC HSS set up in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xA0
ASIC onyx BIST in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xA4
ASIC interface alignment step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xA8
ASIC random data test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xAC
ASIC enable machine check step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xB0
ASIC I/O initialization step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xB4
ASIC DRAM initialization step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xB8
ASIC memory diagnostic step in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xB9
PSI diagnostic step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xBB
Restore L3 line delete step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xBD
AVP memory test case in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xC0
Node interface alignment procedure in
progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xC4
Dump initialization step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xC8
Start PRD step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xCC
Message passing waiting period has
begun
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xD0
Message passing waiting period has
begun
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009xD4
Starting elastic interface calibration
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C103A1xx
Hypervisor code modules are being
transferred to system storage
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C103A2xx
Hypervisor data areas are being built in
system storage
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C103A3xx
Hypervisor data structures are being
transferred to system storage
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C103A400
Special purpose registers are loaded and
instructions are started on the system
processors
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C103A401
Instructions have been started on the
system processors
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C116C2xx
System power interface is listening for
power fault events from SPCN. The last
byte (xx) will increment up from 00 to
1F every second while it waits.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C162E4xyy
VPD is being collected; yy indicates the
type of device from which VPD is being
collected
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1645300
Starting a data synchronization operation
between the primary service processor
and the secondary service processor.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
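The C116C2xx entry in Table 18 states that the last byte counts up from 00 to 1F, one step per second, while the system power interface waits. As a minimal sketch only, the counter could be read from a displayed code as follows. The function name spcn_wait_seconds is hypothetical and is not part of any IBM tool, and the sketch assumes that the code is available as an 8-character string.

```python
import re
from typing import Optional

# Hypothetical helper: extracts the wait counter from a C116C2xx progress code.
_SPCN_WAIT = re.compile(r"C116C2([0-9A-F]{2})")


def spcn_wait_seconds(code: str) -> Optional[int]:
    """Return the last-byte counter (0 to 31), or None if the code does not apply."""
    match = _SPCN_WAIT.fullmatch(code.strip().upper())
    if match is None:
        return None  # not a C116C2xx progress code
    count = int(match.group(1), 16)
    # The table describes the counter as running from 00 to 1F only.
    return count if count <= 0x1F else None


if __name__ == "__main__":
    print(spcn_wait_seconds("C116C20A"))  # counter value 0x0A
```

A counter that stops changing while the code stays on the display would be consistent with the hang condition that the table's actions address.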
C2001000 to C20082FF Virtual service processor checkpoints
The C2xx progress codes indicate the progress of a partition IPL that is controlled by the virtual service
processor. The virtual service processor progress codes end after the environment setup completes and
the specific operating system code continues the IPL.
The virtual service processor can start a variety of operating systems. Some codes are specific to an
operating system and therefore do not apply to all operating systems.
Table 19 lists the progress codes that might be displayed during the power-on self-test (POST), along with
suggested actions to take if the system hangs on the progress code. Take the actions described for a
progress code only if the system hangs on that code.
In the following progress codes, x can be any number or letter.
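Because x stands for any number or letter, a code read from the display has to be compared against the table entries position by position. The following Python sketch shows one way to do that comparison. The function names are hypothetical, and the sketch assumes that a lowercase x in a table entry is the wildcard while every other character is literal.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a table entry such as 'C1009x46' into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "x":
            parts.append("[0-9A-Z]")  # x: any number or letter
        else:
            parts.append(re.escape(ch.upper()))  # literal character
    return re.compile("".join(parts))


def matches(pattern: str, code: str) -> bool:
    """Return True if the displayed code fits the table entry."""
    return pattern_to_regex(pattern).fullmatch(code.strip().upper()) is not None


if __name__ == "__main__":
    print(matches("C1009x46", "C1009A46"))  # True
    print(matches("C1009x46", "C1009A48"))  # False
```

Fully specified entries, such as C2001000, contain no wildcard and match only themselves.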
Table 19. C2001000 to C20082FF checkpoints
v If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in
the Action column until the problem is solved. If an action solves the problem, you can stop performing the
remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Progress code
Description
Action
C2001000
Partition auto-startup during a platform
startup
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2001010
Startup source
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2001100
Adding partition resources to the
secondary configuration
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20011FF
Partition resources added successfully
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2001200
Checking if startup is allowed
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20012FF
Partition startup is allowed to proceed
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2001300
Initializing ISL roadmap
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20013FF
ISL roadmap initialized successfully
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2001400
Initializing SP Communication Area #1
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2001410
Initializing startup parameters
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20014FF
Startup parameters initialized
successfully
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2002100
Power on racks
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2002110
Issuing a power on command
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C200211F
Power on command successful
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20021FF
Power on phase complete
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2002200
Begin acquiring slot locks
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20022FF
End acquiring slot locks
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2002300
Begin acquiring VIO slot locks
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20023FF
End acquiring VIO slot locks
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2002400
Begin powering on slots
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2002450
Waiting for power on of slots to
complete
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20024FF
End powering on slots
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2002500
Begin power on VIO slots
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20025FF
End powering on VIO slots
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2003100
Validating ISL command parameters
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2003111
Waiting for bus object to become
operational
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2003112
Waiting for bus unit to become disabled
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2003115
Waiting for creation of bus object
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2003150
Sending ISL command to bus unit
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20031FF
Waiting for ISL command completion
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20032FF
ISL command completed successfully
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2003300
Start SoftPOR of a failed ISL slot
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2003350
Waiting for SoftPOR of a failed ISL slot
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20033FF
Finish SoftPOR of a failed ISL slot
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2004100
Waiting for load source device to enlist
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2004200
Load source device has enlisted
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2004300
Preparing connection to load source
device
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20043FF
Load source device is connected
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006000
Locating first LID information on the
load source
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006005
Clearing all partition main store
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006010
Locating next LID information on the
load source
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006020
Verifying LID information
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006030
Priming LP configuration LID
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006040
Preparing to initiate LID load from load
source
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006050
LP configuration LID primed
successfully
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006060
Waiting for LID load to complete
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006100
LID load completed successfully
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2006200
Loading raw kernel memory image
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20062FF
Loading raw kernel memory image
completed successfully
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008040
Begin transfer slot locks to partition
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008060
End transfer slot locks to partition
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008080
Begin transfer VIO slot locks to partition
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20080A0
End transfer VIO slot locks to partition
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20080FF
Hypervisor low-level session manager
object is ready
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008100
Initializing service processor
communication area #2
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008104
Loading data structures into main store
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008110
Initializing event paths
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008120
Starting processor(s)
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008130
Begin associate of system ports
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008138
Associating system ports to the partition
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C200813F
End associate of system ports
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20081FF
Processors started successfully, now
waiting to receive the continue
acknowledgement from system firmware
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C2008200
Continue acknowledgement received
from system firmware
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C20082FF
VSP startup completed successfully
1. Go to “Recovering the system firmware” on
page 220.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
IPL status progress codes
A server that stalls during an initial program load (IPL) of the operating system indicates a problem with
the operating system code or hardware configuration.
The Systems Hardware Information center at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp
describes IPL status progress codes C3yx, C500, C5yx, C600, and C6xx.
C700xxxx Server firmware IPL status checkpoints:
A server that stalls during an initial program load (IPL) of the server firmware indicates a problem with
the server firmware code. If the C700 progress code that you see is not C700 4091, your only service action is
to collect the information in words 3 and 4 of the SRC and to call your next level of support.
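The rule above can be sketched in Python as follows. This is an illustration only: the function name and the returned dictionary are hypothetical, the SRC is assumed to be supplied as a list of word strings in which the first entry is the progress code, and the sample values in the usage line are placeholders rather than real SRC data.

```python
from typing import Dict, List, Optional


def c700_service_action(src_words: List[str]) -> Dict[str, Optional[object]]:
    """Apply the C700 rule: unless the code is C700 4091, collect words 3 and 4."""
    if len(src_words) < 4:
        raise ValueError("need at least SRC words 1 to 4")
    code = src_words[0].replace(" ", "").upper()
    if not code.startswith("C700"):
        raise ValueError(f"not a C700 progress code: {code}")
    if code == "C7004091":
        # The exception named in the text; this sketch does not assign it an action.
        return {"escalate": False, "word3": None, "word4": None}
    # Any other C700 code: record words 3 and 4 and call the next level of support.
    return {"escalate": True, "word3": src_words[2], "word4": src_words[3]}


if __name__ == "__main__":
    # Placeholder SRC words for illustration only.
    print(c700_service_action(["C7001234", "00000000", "0000AAAA", "0000BBBB"]))
```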
Table 20 shows the form of the C700xxxx progress codes, where xxxx can be any number or letter.
• If the system hangs on a progress code, follow the suggested actions in the order in which they are
listed in the Action column until the problem is solved. If an action solves the problem, you can stop
performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and
which components are FRUs.
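The ordered-action rule above can be pictured as a small loop. The following Python sketch is illustrative only and is not part of any IBM tool: the function name `perform_until_solved` and the `is_solved` callback are hypothetical, and the action list simply repeats the Action column of Table 20 in abbreviated form.

```python
from typing import Callable, List

# Action column of Table 20 (C700xxxx), abbreviated, in the order listed.
C700_ACTIONS = [
    "Shut down and restart the blade server from the permanent-side image.",
    "Check for updates to the system firmware.",
    "Update the firmware.",
    "Go to the checkout procedure (page 184).",
    "Replace the system-board and chassis assembly (page 260).",
]


def perform_until_solved(actions: List[str],
                         is_solved: Callable[[str], bool]) -> List[str]:
    """Walk the actions in listed order and stop at the first one that
    solves the problem. Returns the actions that were performed.

    is_solved is a hypothetical callback: in practice it stands for the
    technician checking whether the server still hangs after the action.
    """
    performed = []
    for action in actions:
        performed.append(action)
        if is_solved(action):
            break  # remaining actions are not needed
    return performed


# Example: pretend that updating the firmware clears the hang.
done = perform_until_solved(C700_ACTIONS, lambda a: a.startswith("Update"))
```

In this example `done` holds the first three actions; the checkout procedure and the system-board replacement are never reached because the third action solved the problem.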
Table 20. C700xxxx Server firmware IPL status checkpoints
Progress code
Description
Action
C700xxxx
A problem has occurred with the system
firmware during startup.
1. Shut down and restart the blade server from
the permanent-side image.
2. Check for updates to the system firmware.
3. Update the firmware.
4. Go to “Checkout procedure” on page 184.
5. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000000 to CA2799FF Partition firmware checkpoints
The CAxx partition firmware progress codes provide information about the progress of partition
firmware as it is initializing. In some cases, a server might hang (or stall) at one of these progress codes
without displaying an 8-character system reference code (SRC).
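When a code is read from a log or a console, it can help to decide which table to consult before looking up the action. The following Python sketch shows one way to do that for the two families described in this section. It assumes that progress codes are 8-character hexadecimal strings; the function name `classify_progress_code` and the returned labels are hypothetical and do not come from IBM firmware or tooling.

```python
import re

# Range taken from the section title "CA000000 to CA2799FF".
CA_FIRST = 0xCA000000
CA_LAST = 0xCA2799FF


def classify_progress_code(code: str) -> str:
    """Return a label naming the table that covers the given code.

    Assumption: codes are 8 hexadecimal characters; embedded spaces
    (as in "C700 4091") are ignored.
    """
    text = code.replace(" ", "").upper()
    if not re.fullmatch(r"[0-9A-F]{8}", text):
        raise ValueError(f"not an 8-character hexadecimal code: {code!r}")
    if text.startswith("C700"):
        return "server firmware IPL status checkpoint"
    if CA_FIRST <= int(text, 16) <= CA_LAST:
        return "partition firmware checkpoint"
    return "other"


family = classify_progress_code("CA00E175")
```

Codes outside these two families, such as C20082FF, fall through to "other" and are covered by other tables in this chapter.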
Table 21 lists the progress codes that might be displayed during the power-on self-test (POST), along with
suggested actions to take if the system hangs on the progress code. Only when you experience a hang
condition should you take any of the actions described for a progress code.
In the following progress codes, x can be any number or letter.
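The x placeholder convention can be checked mechanically. The following Python sketch compares an observed code against a table pattern such as C700xxxx, treating each lowercase x as "any number or letter". The helper name `matches_code_pattern` is hypothetical, and the sketch assumes that codes are compared without regard to case.

```python
def matches_code_pattern(code: str, pattern: str) -> bool:
    """Return True if code fits pattern, where a lowercase 'x' in the
    pattern stands for any single ASCII letter or digit."""
    if len(code) != len(pattern):
        return False
    for actual, expected in zip(code, pattern):
        if expected == "x":
            # Placeholder position: accept any ASCII letter or digit.
            if not (actual.isascii() and actual.isalnum()):
                return False
        elif actual.upper() != expected.upper():
            # Fixed position: must match the pattern character.
            return False
    return True


is_c700 = matches_code_pattern("C7004091", "C700xxxx")
```

Because only a lowercase x is treated as a placeholder, the fixed characters in a pattern (for example the leading C700) must still match exactly.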
Table 21. CA000000 to CA2799FF checkpoints
• If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in
the Action column until the problem is solved. If an action solves the problem, you can stop performing the
remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Progress code
Description
Action
CA000000
Process control now owned by partition
firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000020
Checking firmware levels
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000030
Attempting to establish a
communication link by using lpevents
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000032
Attempting to register lpevent queues
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000034
Attempting to exchange cap and allocate
lpevents
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000038
Attempting to exchange virtual continue
events
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000040
Attempting to obtain RTAS firmware
details
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000050
Attempting to load RTAS firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000060
Attempting to obtain open firmware
details
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000070
Attempting to load open firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000080
Preparing to start open firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA000090
Open firmware package corrupted
(phase 1)
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA000091
Attempting to load the second pass of C
code
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA0000A0
Open firmware package corrupted
(phase 2)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D001
PCI probe process completed, create PCI
bridge interrupt routing properties
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D002
PCI adapter NVRAM hint created;
system is rebooting
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D003
PCI probing complete
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D004
Beginning of install-console, loading
GUI package
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D008
Initialize console and flush queues
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D00C
The partition firmware is about to
search for an NVRAM script
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D00D
Evaluating NVRAM script
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D010
First pass open firmware initialization
complete; establish parameters for
restart
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D011
First pass open firmware initialization
complete; control returned to
initialization firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D012
Second pass open firmware initialization
complete; control returned to
initialization firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D013
Runtime open firmware initialization
complete; control returned to
initialization firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D020
About to download and run the SLIC
loader (IOP-less boot)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00D021
About to download and run the IO
Reporter (for VPD collection)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E101
Create RTAS node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E102
Load and initialize RTAS
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E105
Transfer control to operating system
(normal mode boot)
Go to “Boot problem resolution” on page 190.
CA00E10A
Load RTAS device tree
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E10B
Set RTAS device properties
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E110
Create KDUMP properties
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E130
Build device tree
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E131
Create root node properties
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E134
Create memory node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E135
Create HCA node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E136
Create BSR node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E137
Create HEA node
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E138
Create options node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E139
Create aliases node and system aliases
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E13A
Create packages node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E13B
Create HEA node
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E13C
Create HEA port node
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E140
Loading operating system
Go to “Boot problem resolution” on page 190.
CA00E141
Synchronizing the operating system
bootlist to the management module
bootlist
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E142
The management module bootlist is
being set from the operating system
bootlist
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E143
The operating system bootlist is being
set from the management module
bootlist
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E149
Create boot manager node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E14C
Create terminal emulator node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E14D
Load boot image
Go to “Boot problem resolution” on page 190.
CA00E150
Create host (primary) node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E151
Probing PCI bus
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E152
Probing for adapter FCODE; evaluate if
present
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E153
End adapter FCODE probing and
evaluation
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E154
Create PCI bridge node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E155
Probing PCI bridge secondary bus
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E156
Create plug-in PCI bridge node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E15B
Transfer control to operating system
(service mode boot)
Go to “Boot problem resolution” on page 190.
CA00E15F
Adapter VPD evaluation
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E170
Start of PCI bus probe
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E172
First pass of PCI device probe
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E174
Establishing host connection
1. Verify that:
• The bootp server is correctly configured;
then, retry the operation.
• The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E175
Bootp request
1. Verify that:
• The bootp server is correctly configured;
then, retry the operation.
• The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E176
TFTP file transfer
1. Verify that:
• The bootp server is correctly configured;
then, retry the operation.
• The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E177
Transfer failure due to TFTP error
condition
1. Verify that:
• The bootp server is correctly configured;
then, retry the operation.
• The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E178
Initiating TFTP file transfer
1. Verify that:
• The bootp server is correctly configured;
then, retry the operation.
• The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E179
Closing BOOTP
1. Verify that:
• The bootp server is correctly configured;
then, retry the operation.
• The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E17B
Microprocessor clock speed
measurement
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E198
The system is rebooting to enact changes
that were specified in
ibm,client-architecture-support
Go to “Boot problem resolution” on page 190.
CA00E199
The system is rebooting to enact changes
that were specified in the boot image
ELF header
1. Verify that:
• The bootp server is correctly configured;
then, retry the operation.
• The network connections are correct;
then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E19A
NVRAM auto-boot? variable not found -
assume FALSE
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E19B
NVRAM menu? variable not found -
assume FALSE
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E19D
Create NVRAM node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1A0
User requested boot to SMS menus
using keyboard entry
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1A1
User requested boot to open firmware
prompt using keyboard entry
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1A2
User requested boot using default
service mode boot list using keyboard
entry
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1A3
User requested boot using customized
service mode boot list using keyboard
entry
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1A4
User requested boot to SMS menus
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1A5
User requested boot to open firmware
prompt
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1A6
User requested boot using default
service mode boot list
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1A7
User requested boot using customized
service mode boot list
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1AA
System boot check for NVRAM settings
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1AB
System booting using default service
mode boot list
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1AC
System booting using customized
service mode boot list
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1AD
System booting to the operating system
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1AE
System booted to SMS multiboot menu
using NVRAM settings
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1AF
System booted to SMS utilities menu
using NVRAM settings
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1B1
System booting system-directed
boot-device repair
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1B2
XOFF received, waiting for XON
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1B3
XON received
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1B4
System-directed boot-string didn't load
an operating system
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1B5
Checking for iSCSI disk aliases
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1D0
Create PCI self configuring SCSI device
(SCSD) node
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1D3
Create SCSD block device node (SD)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1D4
Create SCSD byte device node (ST)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1DC
Dynamic console selection
1. Verify the video session and the SOL
session. The console might be redirected to
the video controller.
2. Start a remote control session or access the
local KVM to see the status.
3. Go to “Checkout procedure” on page 184.
4. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1DD
A graphics adapter has been selected as
the firmware console, but the USB
keyboard is not attached.
1. Verify that there is a USB keyboard
attached to a USB port that is assigned to
the partition.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E1F0
Start out-of-box experience
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1F1
Start self test sequence on one or more
devices
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1F2
Power on password prompt
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1F3
Privileged-access password prompt
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1F4
End self-test sequence on one or more
boot devices; begin system management
services
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1F5
Build boot device list
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1F6
Determine boot device sequence
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1F7
No boot image located
Go to “Boot problem resolution” on page 190.
CA00E1F8
Build boot device list for SCSD adapters.
(The location code of the SCSD adapter
being scanned is also displayed.)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1F9
Build boot device list for fibre-channel
adapters. (The location code of the SAN
adapter being scanned is also
displayed.)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1FA
Building device list for SCSD adapters.
(The device ID and device LUN of the
device being scanned is also displayed.)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1FB
Scan SCSD bus for attached devices
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E1FC
CA00E1FE
CA00E1FF
Build boot device list for SSA adapters.
(The location code of the SSA adapter
being scanned is also displayed.)
1. Go to “Checkout procedure” on page 184.
Building device list for fibre-channel
(SAN) adapters. (The WWPN of the
SAN adapter being scanned is also
displayed.)
1. Go to “Checkout procedure” on page 184.
Build device list for fibre-channel (SAN)
adapters. (The LUN of the SAN adapter
being scanned is also displayed.)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E440
Validate NVRAM, initialize partitions as
needed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E441
Generate /options node NVRAM
configuration variable properties
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E442
Validate NVRAM partitions
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E443
Generate NVRAM configuration variable
dictionary words
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E444
The NVRAM size is less than 8K bytes
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA00E701
Create memory VPD
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E800
Initialize RTAS
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E810
Initializing ioconfig pfds
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E820
Initializing lpevent
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E830
Initializing event scan
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E840
Initializing hot plug
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E843
Initializing interface/AIX access
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E850
Initializing dynamic reconfiguration
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E860
Initializing sensors
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E865
Initializing VPD
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E870
Initializing pfds memory manager
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E875
Initializing rtas_last_error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E876
Initializing rtas_error_inject
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E877
Initializing dump interface
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E879
Initializing the platform-assisted kdump
interface
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E885
Initializing set-power-level
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E886
Initializing exit2c
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E887
Initialize gdata for activate_firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E890
Starting to initialize open firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00E891
Finished initializing open firmware
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA00EAA1
Probe PCI-PCI bridge bus
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA060203
An alias was modified or created
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA26ttss
Waiting for lpevent of type tt and
subtype ss.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA26FFFF
An extended item was required for
lpevent to complete.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
CA279001
The firmware update image contains an
update module that is not already on
the system.
1. Look at the event log for a BA27xxxx error
code to determine if a firmware installation
error occurred.
2. If a firmware installation error did occur,
resolve the problem.
3. Retry the firmware update.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
CA2799FD
A firmware update module is being
read.
This checkpoint alternates in the control panel
with CA2799FF.
This pair of checkpoints might stay in the
display for up to 30 minutes with no indication
of activity other than the alternating codes. Do
not assume that the system is hung until the
alternation stops and only one of the
checkpoints remains in the control panel for at
least 30 minutes, with no other indication of
activity.
If the system is hung on this checkpoint, then
CA2799FD and CA2799FF are not alternating
and you must perform the following
procedure:
1. Shut down the blade server.
2. Restart the blade server using the
permanent boot image, as described in
“Starting the PERM image” on page 220.
3. Use the Update and Manage System Flash
menu to reject the temporary image.
CA2799FF
A firmware update module is being
written.
This checkpoint alternates in the control panel
with CA2799FD.
This pair of checkpoints might stay in the
display for up to 30 minutes with no indication
of activity other than the alternating codes. Do
not assume that the system is hung until the
alternation stops and only one of the
checkpoints remains in the control panel for at
least 30 minutes, with no other indication of
activity.
If the system is hung on this checkpoint, then
CA2799FD and CA2799FF are not alternating
and you must perform the following
procedure:
1. Shut down the blade server.
2. Restart the blade server using the
permanent boot image, as described in
“Starting the PERM image” on page 220.
3. Use the Update and Manage System Flash
menu to reject the temporary image.
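The hang rule for the CA2799FD and CA2799FF checkpoints can be sketched in code. The following Python example is a minimal illustration only, not part of any IBM tool: it assumes a hypothetical, time-ordered list of control panel readings recorded by an operator, and the function name looks_hung is invented for this sketch. It cannot check the "no other indication of activity" condition, which still requires observation of the blade server.

```python
from datetime import datetime, timedelta

# The two checkpoints that alternate during a firmware update (from Table 21).
ALTERNATING = {"CA2799FD", "CA2799FF"}
# The guide allows up to 30 minutes before a steady code is treated as a hang.
HANG_THRESHOLD = timedelta(minutes=30)


def looks_hung(observations, now, threshold=HANG_THRESHOLD):
    """Return True if one checkpoint has stayed on the panel for the threshold.

    observations is a hypothetical list of (datetime, code) panel readings
    in time order; now is the time of the check.
    """
    if not observations:
        return False
    last_time, last_code = observations[-1]
    if last_code not in ALTERNATING:
        return False
    # Walk backwards to find when the panel last switched to the current code.
    steady_since = last_time
    for when, code in reversed(observations):
        if code != last_code:
            break
        steady_since = when
    return now - steady_since >= threshold
```

While the two codes keep alternating, the time since the last switch stays short and the function returns False. It returns True only after the same code has been displayed for the full threshold.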
D1001xxx to D1xx3FFF Service processor dump codes
D1xx service processor dump status codes indicate the cage or node ID that the dump component is
processing, the node from which the hardware data is collected, and a counter that increments each time
that the dump processor stores 4K of dump data.
Service processor dump status codes use the format, D1yy1xxx, where yy and xxx can be any number or
letter.
The yy part of the code indicates the cage or node ID that the dump component is processing. The node
varies depending on the node from which the hardware data is collected. The node is 0xFF when
collecting the mainstore memory data.
The xxx part of the code is a counter that increments each time that the dump processor stores 4K of
dump data.
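As an illustration of the D1yy1xxx format, the following Python sketch splits a status code into its yy and xxx parts. It rests on assumptions that the text does not state: that yy and xxx are hexadecimal digits, that the counter starts at zero, and that 4K means 4096 bytes. The function name decode_dump_status is hypothetical.

```python
import re

# D1, two characters for yy, a literal 1, three characters for xxx.
# Assumption: yy and xxx are hexadecimal digits.
DUMP_STATUS = re.compile(r"^D1([0-9A-F]{2})1([0-9A-F]{3})$")


def decode_dump_status(code):
    """Split a D1yy1xxx code into its parts, or return None if it does not match."""
    match = DUMP_STATUS.match(code.strip().upper())
    if match is None:
        return None
    yy = int(match.group(1), 16)
    counter = int(match.group(2), 16)
    return {
        "yy": yy,                         # cage or node ID, per the text
        "mainstore": yy == 0xFF,          # 0xFF when collecting mainstore memory data
        "counter": counter,               # increments per 4K of dump data stored
        "approx_bytes": counter * 4096,   # assumes the counter starts at zero
    }
```

A code such as D101C00F does not fit the D1yy1xxx pattern, so the sketch returns None for it rather than guessing at its fields.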
Table 22 on page 121 lists the progress codes that might be displayed during the power-on self-test
(POST), along with suggested actions to take if the system hangs on the progress code. Take the actions
described for a progress code only if the system hangs on that code.
Table 22. D1001xxx to D1xx3FFF dump codes
v If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in
the Action column until the problem is solved. If an action solves the problem, you can stop performing the
remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Progress code
Description
Action
D1001xxx
Dump error data
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1011xxx
Dump dump header
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D101C00F
No power off to allow debugging for
CPU controls
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1021xxx
Dump dump header directory
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1031xxx
Dump dump header fips header
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1041xxx
Dump dump header entry header
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1051xxx
Dump core file for failing component
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1061xxx
Dump all NVRAM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1071xxx
Dump component trace for failing
component
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1081xxx
Dump component data from /opt/p0
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1091xxx
Dump /opt/p1//*
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1111xxx
Dump /opt/p0/*
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1121xxx
Dump /opt/p1/*
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1131xxx
Dump all traces
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1141xxx
Dump code version
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1151xxx
Dump all /opt/p3 except rtbl
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1161xxx
Dump pddcustomize -r command
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1171xxx
Dump registry -l command
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1181xxx
Dump all /core/core.* files
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1191xxx
Dump BDMP component trace (after
dump if enough space)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D11A1xxx
Dump any state information before
dumping starts
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D11B1xxx
Dump /proc filesystem
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D11C1xxx
Dump mounted filesystem statistics
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D11D1xxx
Dump environment
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1231xxx
Dump update dump headers
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1241xxx
Dump CRC1 calculation off
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1251xxx
Dump CRC1 calculation on
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1261xxx
Dump CRC2 calculation off
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1271xxx
Dump CRC2 calculation on
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1281xxx
Dump output the calculated CRC1
(dump headers)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1291xxx
Dump output the calculated CRC2 (data
and data headers)
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D12A1xxx
Jump to the position in dump directly
after CRC1
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D12B1xxx
Initialize the headers dump time and
serial numbers
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D12C1xxx
Display final SRC to panel
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D12D1xxx
Remove /core/core.app.time.pid
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D12E1xxx
Remove /core/core.*
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D12F1xxx
Display beginning SRC to panel
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1301xxx
Turn off error log capture into dump
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1311xxx
Turn on error log capture into dump
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1321xxx
Store information about existing core
files
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1381xxx
Invalidate the dump
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1391xxx
Check for valid dump sequence
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D13A1xxx
Get dump identity sequence
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D13B1xxx
Get dump length sequence
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1FF1xxx
Dump complete
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3000 - D1xx3FFF
Platform dump status codes are described in “D1xx3y01 to D1xx3yF2 Service processor
dump codes”
D1xx3y01 to D1xx3yF2 Service processor dump codes:
These D1xx3yxx service processor dump codes use the format: D1xx3yzz, where xx indicates the cage or
node ID that the dump component is processing, y increments from 0 to F to indicate that the system is
not hung, and zz indicates the command being processed.
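The D1xx3yzz format can be illustrated with a short Python sketch that separates the three fields and looks up the command name. The command names are the descriptions listed in Table 23. The assumption that xx, y, and zz are hexadecimal digits, and the function name decode_platform_dump, are made up for this sketch and are not taken from IBM firmware.

```python
import re

# Command descriptions from Table 23, keyed by the zz value.
COMMANDS = {
    0x01: "Get SCOM",
    0x02: "Get scan ring",
    0x03: "Get array values",
    0x04: "Stop the clocks",
    0x05: "Flush the cache",
    0x06: "Get CFAM",
    0x07: "Put SCOM",
    0x08: "Send command",
    0x09: "Get optimized cache",
    0x0A: "Get general purpose (GP) register",
    0x0B: "Processor clean-up",
    0x0C: "Get JTAG register",
    0x0D: "Stop clocks without quiescing",
    0xF0: "Memory collection set-up",
    0xF1: "Memory collection DMA step",
    0xF2: "Memory collection cleanup",
}

# D1, two characters for xx, a literal 3, one character for y, two for zz.
PLATFORM_DUMP = re.compile(r"^D1([0-9A-F]{2})3([0-9A-F])([0-9A-F]{2})$")


def decode_platform_dump(code):
    """Split a D1xx3yzz code into node ID, activity counter, and command."""
    match = PLATFORM_DUMP.match(code.strip().upper())
    if match is None:
        return None
    zz = int(match.group(3), 16)
    return {
        "node_id": int(match.group(1), 16),   # xx: cage or node ID
        "activity": int(match.group(2), 16),  # y: increments 0 to F while not hung
        "command": COMMANDS.get(zz),          # None if zz is not listed in Table 23
    }
```

Because y keeps incrementing while the dump is progressing, two readings with the same xx and zz but different activity values suggest that the system is not hung.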
Table 23 lists the progress codes that might be displayed during the power-on self-test (POST), along with
suggested actions to take if the system hangs on the progress code. Take the actions described for a
progress code only if the system hangs on that code.
Table 23. D1xx3y01 to D1xx3yF2 checkpoints
v If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in
the Action column until the problem is solved. If an action solves the problem, you can stop performing the
remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Progress code
Description (Command Being Processed)
Action
D1xx3y01
Get SCOM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y02
Get scan ring
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y03
Get array values
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y04
Stop the clocks
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y05
Flush the cache
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y06
Get CFAM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y07
Put SCOM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y08
Send command
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y09
Get optimized cache
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y0A
Get general purpose (GP) register
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y0B
Processor clean-up
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y0C
Get JTAG register
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3y0D
Stop clocks without quiescing
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3yF0
Memory collection set-up
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3yF1
Memory collection DMA step
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx3yF2
Memory collection cleanup
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xx900C to D1xxC003 Service processor power-off checkpoints
These D1xx service processor power-off status codes offer information about the status of the service
processor during a power-off operation.
Table 24 lists the progress codes that might be displayed during the power-on self-test (POST), along with
suggested actions to take if the system hangs on the progress code. Take the actions described for a
progress code only if the system hangs on that code.
Table 24. D1xx900C to D1xxC003 checkpoints
v If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in
the Action column until the problem is solved. If an action solves the problem, you can stop performing the
remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Progress code
Description (Command Being Processed)
Action
D1xx900C
Breakpoint set in CPU controls has been
hit
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xxB0FF
Request to initiate power-off program
has been sent
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xxC000
Indicates a message is ready to send to
the hypervisor to power off
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xxC001
Waiting for the hypervisor to
acknowledge the delayed power off
notification
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xxC002
Waiting for the hypervisor to send the
power off message
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
D1xxC003
Hypervisor handshaking is complete
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
Service request numbers (SRNs)
Service request numbers (SRNs) are error codes that the operating system generates. The codes have
three digits, a hyphen, and three or four digits after the hyphen. SRNs can be viewed using the AIX
diagnostics or the Linux service aid “diagela” if it is installed.
Note: The “diagela” service aid is part of the Linux service aids for hardware diagnostics. The service
aids are separate from the operating system and are available for download from the Service and
productivity tools for Linux on POWER systems site.
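The SRN format described above (three digits, a hyphen, and three or four digits) can be checked with a short pattern. The following Python sketch is illustrative only. It assumes that a "digit" can also be a hexadecimal letter, because SRNs such as FFC-725 appear in this guide. The function name split_srn is hypothetical.

```python
import re

# Three characters, a hyphen, then three or four characters.
# Assumption: each character is a hexadecimal digit (0-9, A-F).
SRN_PATTERN = re.compile(r"^([0-9A-F]{3})-([0-9A-F]{3,4})$")


def split_srn(text):
    """Return (before_hyphen, after_hyphen) for an SRN, or None if text is not one."""
    match = SRN_PATTERN.match(text.strip().upper())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A code without a hyphen, such as an eight-character SRC, does not match the pattern, which mirrors the rule of looking up an SRN only when the error code contains a hyphen.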
Using the SRN tables
The service request number (SRN) list is in numerical sequence. The failing function codes (FFCs) are
provided to aid in locating a failing component.
1. Look up a service request number when you see an error code with a hyphen.
The SRN is in the first column of the SRN table in numerical order.
The SRN might have an associated FFC number. Possible FFC values for SRNs are displayed in the
second column of the table. FFC numbers might be the first three digits of the SRN or the last three
digits, or might not be in the SRN.
The third column describes the problem and an action to take to try to fix the problem. The
description also includes how to find the FFC number for an SRN if one exists.
2. See “Failing function codes 151 through 2E33” on page 181 for a description of each FFC value.
3. If the SRN does not appear in the table, see “Solving undetermined problems” on page 227.
4. After replacing a component, verify the replacement part and perform a log-repair action using the
AIX diagnostics.
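The lookup rule in step 1 can be illustrated with a short Python sketch. It is not part of the AIX diagnostics or any IBM tool; the function name candidate_ffcs and the pattern it accepts are assumptions made for illustration only. The sketch splits an SRN at the hyphen and returns the first three and last three characters as possible FFC values. The second column of the SRN table remains the authoritative source, because an FFC might not be embedded in the SRN at all.

```python
import re

# Assumption: an SRN is "<prefix>-<suffix>", each part 3 or 4 alphanumeric
# characters (for example "101-711", "111-78C", or "2506-102E").
SRN_PATTERN = re.compile(r"^([0-9A-Za-z]{3,4})-([0-9A-Za-z]{3,4})$")


def candidate_ffcs(srn):
    """Return possible FFC values embedded in an SRN, or [] if it does not parse.

    Hypothetical helper: the candidates are the first three characters of
    the SRN and the last three characters, as described in step 1.
    """
    match = SRN_PATTERN.match(srn.strip())
    if match is None:
        return []
    prefix = match.group(1).upper()
    suffix = match.group(2).upper()
    candidates = [prefix[:3], suffix[-3:]]
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(candidates))


if __name__ == "__main__":
    for srn in ("101-711", "887-101", "B7001234"):
        print(srn, candidate_ffcs(srn))
```

A code without a hyphen, such as an 8-digit system reference code, yields an empty list, which matches the guidance to use the SRN table only for codes that contain a hyphen.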
101-711 through FFC-725 SRNs
AIX might generate service request numbers (SRNs) from 101-711 to FFC-725.
Replace any parts in the order that the codes are listed in Table 25.
Note: An x in the following SRNs represents a digit or character that might have any value.
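As a minimal sketch of how the x placeholder can be applied when searching the table, the following Python example turns a table entry such as 651-90x into a regular expression. The helper names srn_entry_to_regex and srn_matches are hypothetical, and the sketch assumes that a lowercase x stands for exactly one digit or character. Some table entries describe xxxx as "three or four digits", so a strict one-for-one match like this one is narrower than the table text.

```python
import re


def srn_entry_to_regex(entry):
    """Compile a table entry (for example "651-90x") into a regex.

    Assumption: each lowercase "x" stands for exactly one digit or letter;
    every other character must match literally.
    """
    parts = []
    for ch in entry:
        if ch == "x":
            parts.append("[0-9A-Za-z]")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


def srn_matches(entry, srn):
    """Return True if the reported SRN fits the table entry."""
    return srn_entry_to_regex(entry).match(srn.strip()) is not None


if __name__ == "__main__":
    print(srn_matches("651-90x", "651-905"))    # entry with one placeholder
    print(srn_matches("650-xxx", "650-1A2"))    # entry with three placeholders
    print(srn_matches("651-90x", "651-915"))    # literal "0" does not match "1"
```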
Table 25. 101-711 through FFC-725 SRNs
SRN
FFC
Description and Action
101-711 to 101-726
711 to 726
The system hung while trying to configure an unknown resource.
1. Run the stand-alone diagnostics problem determination procedure.
2. If the problem remains, refer to “Failing function codes 151 through 2E33” on
page 181 to find the FFC that matches the last three digits of the SRN.
3. Suspect the device adapter or device itself.
101-888
210 227
The system does not IPL. Go to “Performing the checkout procedure” on page 184 or to the
undetermined problem procedure.
101-2020
The system hung while trying to configure the InfiniBand Communication Manager.
This problem may be attributed to software. Report this problem to the AIX Support
Center.
101-2021
The system hung while trying to configure the InfiniBand TCP/IP Interface. This
problem may be attributed to software. Report this problem to the AIX Support
Center.
101-xxxx
xxxx
The system hung while configuring a resource. The last three or four digits after the
dash (-) identify the failing function code for the resource being configured. Go to
the undetermined problem procedure.
103-151
151
The time-of-day battery failed.
1. Go to “Removing the battery” on page 250 to start the battery replacement
procedure.
2. Go to “Installing the battery” on page 251 to complete the procedure.
109-200
The system crashed while you were running it.
1. Go to “Performing the checkout procedure” on page 184.
2. If the 8-digit error and location codes were NOT reported, run AIX diagnostics in
problem determination procedure and record and report the 8-digit error and
location codes for this SRN.
110-101
The diagnostics did not detect an installed resource. If this SRN appeared when
running concurrent diagnostics, then run concurrent diagnostics using the diag -a
command.
110-921 to 110-926
812 xxx
The system halted while diagnostics were executing.
Note: xxx corresponds to the last three digits of the SRN.
Go to “Performing the checkout procedure” on page 184 or problem resolution.
110-935
812
The system halted while diagnostics were executing. Use the problem determination
procedure.
110-xxxx
xxxx 221
The system halted while diagnostics were executing.
Note: xxxx corresponds to the last three or four digits of the SRN following the dash
(-).
1. If your 110 SRN is not listed, substitute the last three or four digits of the SRN
for xxxx and go to “Failing function codes 151 through 2E33” on page 181 to
identify the failing feature.
2. Run stand-alone diagnostics and the problem determination procedure for your
operating system.
111-107
A machine check occurred. Go to “Performing the checkout procedure” on page 184.
111-108
An encoded SRN was displayed. Go to “Performing the checkout procedure” on
page 184.
111-121
There is a display problem. Go to “Performing the checkout procedure” on page 184.
111-78C
227
PCI adapter I/O bus problem. Go to “Performing the checkout procedure” on page
184. Perform “Solving undetermined problems” on page 227.
111-999
210
System does not perform a soft reset. Go to “Performing the checkout procedure” on
page 184.
650-xxx
650
Disk drive configuration failed.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Update the disk drive firmware.
4. Troubleshoot the disk drive.
5. Replace the system-board.
651-xxx
The CEC reported a non-critical error.
1. Schedule deferred maintenance.
2. Refer to the entry MAP in this system unit system service guide, with the 8-digit
error and location codes, for the necessary repair action.
3. If the 8-digit error and location codes were NOT reported, then run diagnostics
in problem determination mode and record and report the 8-digit error and
location codes for this SRN.
651-140
221
Display Character test failed.
Note: The diagnostics provide this SRN, but no action is required. Do not
perform the operator panel test from diagnostics.
651-151
152 2E2
Sensor indicates a voltage is outside the normal range. Go to “Performing the
checkout procedure” on page 184.
651-152
2E1
Sensor indicates an abnormally high internal temperature. Verify that:
1. The room ambient temperature is within the system operating environment.
2. There is unrestricted air flow around the system.
3. All system covers are closed.
4. All fans in the BladeCenter unit are operating correctly.
651-159
210
Sensor indicates a FRU has failed. Use the failing function codes and the physical
location codes from the diagnostic problem report screen to determine the FRUs.
651-161
2E2
Sensor indicates a voltage is outside the normal range. Go to “Performing the
checkout procedure” on page 184.
651-162
2E1
Sensor indicates an abnormally high internal temperature. Verify that:
1. The room ambient temperature is within the system operating environment.
2. There is unrestricted air flow around the system.
3. There are no fan or blower failures in the BladeCenter unit.
If the problem remains, check the management-module event log for possible causes
of overheating.
651-169
Sensor indicates a FRU has failed. Go to “Performing the checkout procedure” on
page 184.
651-170
Sensor status not available. Go to “Performing the checkout procedure” on page 184.
651-171
Sensor status not available. Go to “Performing the checkout procedure” on page 184.
651-600
Uncorrectable memory or unsupported memory.
1. Examine the memory modules and determine if they are supported types.
2. If the modules are supported, then reseat the DIMMs.
3. Replace the appropriate memory modules.
651-601
Missing or bad memory. If the installed memory matches the reported memory size,
then replace the memory; otherwise, add the missing memory.
651-602
2C7
Failed memory module. Go to “Performing the checkout procedure” on page 184.
651-603
2C6 2C7
Failed memory module. Go to “Performing the checkout procedure” on page 184.
651-605
2C6
Memory module has no matched pair. The most probable failure is the memory
module paired with the memory module identified by the location code.
1. Examine the memory modules and determine if they are supported types.
2. If the modules are supported, then reseat the DIMMs.
3. Replace the appropriate memory modules.
651-608
D01
Bad L2 cache. Go to “Performing the checkout procedure” on page 184.
651-609
D01
Missing L2 cache. Go to “Performing the checkout procedure” on page 184.
651-610
210
CPU internal error. Go to “Performing the checkout procedure” on page 184.
651-611
210
CPU internal cache controller error. Go to “Performing the checkout procedure” on
page 184.
651-612
D01
External cache parity or multi-bit ECC error. Go to “Performing the checkout
procedure” on page 184.
651-613
D01
External cache ECC single-bit error. Go to “Performing the checkout procedure” on
page 184.
651-614
214
System bus time-out error. Go to “Performing the checkout procedure” on page 184.
651-615
292
Time-out error waiting for I/O. Go to “Performing the checkout procedure” on page
184.
651-619
Error log analysis indicates an error detected by the CPU. Use failing function codes
and the physical location codes from the diagnostic problem report screen to
determine the FRUs.
651-621
2C6
ECC correctable error. Go to “Performing the checkout procedure” on page 184.
651-623
2C6
Correctable error threshold exceeded. Go to “Performing the checkout procedure” on
page 184.
651-624
214
Memory control subsystem internal error. Go to “Performing the checkout
procedure” on page 184.
651-625
214
Memory address error (invalid address or access attempt). Go to “Performing the
checkout procedure” on page 184.
651-626
214
Memory data error (bad data going to memory). Go to “Performing the checkout
procedure” on page 184.
651-627
214
System bus time-out error. Go to “Performing the checkout procedure” on page 184.
651-628
210
System bus protocol/transfer error. Go to “Performing the checkout procedure” on
page 184.
651-629
210
Error log analysis indicates an error detected by the memory controller. Go to
“Performing the checkout procedure” on page 184.
651-632
308
Internal device error. Go to “Performing the checkout procedure” on page 184.
651-639
210
Error log analysis indicates an error detected by the I/O. Use the problem
determination procedure, failing function codes, and the physical location codes
from the diagnostic problem report to determine the FRUs.
651-640
2D5
I/O general bus error. Go to “Performing the checkout procedure” on page 184.
651-641
2D6
Secondary I/O general bus error. Go to “Performing the checkout procedure” on
page 184.
651-642
2D3
Internal service processor memory error. Go to “Performing the checkout procedure”
on page 184.
651-643
2D3
Internal service processor firmware error. Go to “Performing the checkout
procedure” on page 184.
651-644
2D3
Other internal service processor hardware error. Go to “Performing the checkout
procedure” on page 184.
651-659
2CD
ECC correctable error. Go to “Performing the checkout procedure” on page 184.
651-65A
2CE
ECC correctable error. Go to “Performing the checkout procedure” on page 184.
651-65B
2CC
ECC correctable error. Go to “Performing the checkout procedure” on page 184.
651-664
302
Correctable error threshold exceeded. Go to “Performing the checkout procedure” on
page 184.
651-665
303
Correctable error threshold exceeded. Go to “Performing the checkout procedure” on
page 184.
651-666
304
Correctable error threshold exceeded. Go to “Performing the checkout procedure” on
page 184.
651-669
2CD
Correctable error threshold exceeded. Go to “Performing the checkout procedure” on
page 184.
651-66A
2CE
Correctable error threshold exceeded. Go to “Performing the checkout procedure” on
page 184.
651-66B
2CC
Correctable error threshold exceeded. Go to “Performing the checkout procedure” on
page 184.
651-674
302
Failed memory module. Go to “Performing the checkout procedure” on page 184.
651-675
303
Failed memory module. Go to “Performing the checkout procedure” on page 184.
651-676
304
Failed memory module. Go to “Performing the checkout procedure” on page 184.
651-679
2CD
Failed memory module. Go to “Performing the checkout procedure” on page 184.
651-67A
2CE
Failed memory module. Go to “Performing the checkout procedure” on page 184.
651-67B
2CC
Failed memory module. Go to “Performing the checkout procedure” on page 184.
651-685
303
Memory module has no matched pair. The most probable failure is the memory
module paired with the memory module identified by the location code. Go to
“Performing the checkout procedure” on page 184.
651-686
304
Memory module has no matched pair. The most probable failure is the memory
module paired with the memory module identified by the location code. Go to
“Performing the checkout procedure” on page 184.
651-710
214 2C4
System bus parity error. Go to “Performing the checkout procedure” on page 184.
651-711
210 2C4
System bus parity error. Go to “Performing the checkout procedure” on page 184.
651-712
214
System bus parity error. Go to “Performing the checkout procedure” on page 184.
651-713
214
System bus protocol/transfer error. Go to “Performing the checkout procedure” on
page 184.
651-714
2C4
System bus protocol/transfer error. Go to “Performing the checkout procedure” on
page 184.
651-715
2C4
System bus protocol/transfer error. Go to “Performing the checkout procedure” on
page 184.
651-720
2C7 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-721
2C6 2C7 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-722
2C4
System bus parity error. Go to “Performing the checkout procedure” on page 184.
651-723
2C4
System bus protocol/transfer error. Go to “Performing the checkout procedure” on
page 184.
651-724
292
I/O host bridge time-out error. Go to “Performing the checkout procedure” on page
184.
651-725
292
I/O host bridge address/data parity error. Go to “Performing the checkout
procedure” on page 184.
651-726
Software
I/O host bridge timeout caused by software. This error is caused by a software or
operating system attempt to access an invalid memory address. Go to “Performing
the checkout procedure” on page 184.
651-731
2C8
Intermediate or system bus address parity error. Go to “Performing the checkout
procedure” on page 184.
651-732
2C8
Intermediate or system bus data parity error. Go to “Performing the checkout
procedure” on page 184.
651-733
2C8
Intermediate or system bus address parity error. Go to “Performing the checkout
procedure” on page 184.
651-734
292
Intermediate or system bus data parity error. Go to “Performing the checkout
procedure” on page 184.
651-735
292
Intermediate or system bus time-out error. Go to “Performing the checkout
procedure” on page 184.
651-736
292
Intermediate or system bus time-out error. Go to “Performing the checkout
procedure” on page 184.
651-740
2D3
Note: Ensure that the system IPLROS and service processor are at the latest
firmware level before removing any parts from the system.
651-741
2D3
Service processor error accessing special registers. Go to “Performing the checkout
procedure” on page 184.
651-742
2D3
Service processor reports unknown communication error. Go to “Performing the
checkout procedure” on page 184.
651-743
2D5
Service processor error accessing Vital Product Data EEPROM. Go to “Performing
the checkout procedure” on page 184.
651-745
2D9
Service processor error accessing power controller. Go to “Performing the checkout
procedure” on page 184.
651-746
2D4
Service processor error accessing fan sensor. Go to “Performing the checkout
procedure” on page 184.
651-747
2D5
Service processor error accessing thermal sensor. Go to “Performing the checkout
procedure” on page 184.
651-748
2E2
Service processor error accessing voltage sensor. Go to “Performing the checkout
procedure” on page 184.
651-750
2D4
Service processor detected NVRAM error. Go to “Performing the checkout
procedure” on page 184.
651-751
2D4
Service processor error accessing real-time clock/time-of-day clock. Go to
“Performing the checkout procedure” on page 184.
651-752
2D4
Service processor error accessing JTAG/COP controller/hardware. Go to
“Performing the checkout procedure” on page 184.
651-753
151 2D4
Service processor detects loss of voltage from the time-of-day clock backup battery.
Go to “Performing the checkout procedure” on page 184.
651-770
292
Intermediate or system bus address parity error. Go to “Performing the checkout
procedure” on page 184.
651-771
292
Intermediate or system bus data parity error. Go to “Performing the checkout
procedure” on page 184.
651-772
292
Intermediate or system bus time-out error. Go to “Performing the checkout
procedure” on page 184.
651-773
227
Intermediate or system bus data parity error. Go to “Performing the checkout
procedure” on page 184.
651-780
2C7 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-781
2C7 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-784
302 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-785
303 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-786
304 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-789
2CD 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-78A
2CE 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-78B
2CC 214
Uncorrectable memory error. Go to “Performing the checkout procedure” on page
184.
651-809
Power fault warning due to unspecified cause.
Go to “Performing the checkout procedure” on page 184.
651-810
2E2
Over-voltage condition was detected.
Do the following procedure before replacing any FRUs:
1. Shut the system down.
2. Visually inspect the power cables and reseat the connectors.
3. Run the following command: diag -Avd sysplanar0. When the Resource Repair
Action menu displays, select sysplanar0.
651-811
2E2
Under-voltage condition was detected.
Do the following procedure before replacing any FRUs:
1. Shut the system down.
2. Visually inspect the power cables and reseat the connectors.
3. Run the following command: diag -Avd sysplanar0. When the Resource Repair
Action menu displays, select sysplanar0.
651-813
System shutdown due to loss of ac power to the site. System resumed normal
operation, no action required.
651-818
Power fault due to manual activation of power-off request. Resume normal
operation.
651-820
2E1
An over-temperature condition was detected.
1. Make sure that:
- The room ambient temperature is within the system operating environment
- There is unrestricted air flow around the system
2. Replace the system-board.
651-821
2E1
System shutdown due to an over maximum temperature condition being reached.
1. Make sure that:
- The room ambient temperature is within the system operating environment
- There is unrestricted air flow around the system
2. Replace the system-board.
651-822
2E1
System shutdown due to over temperature condition and fan failure. Use the
physical FRU location(s) as the probable cause(s). Use the physical location codes to
replace the FRUs that are identified on the diagnostics problem report screen.
651-831
2E2
Sensor detected a voltage outside of the normal range. Go to “Performing the
checkout procedure” on page 184.
651-832
2E1
Sensor detected an abnormally high internal temperature. Make sure that:
1. The room ambient temperature is within the system operating environment.
2. There is unrestricted air flow around the system.
3. There are no fan failures.
651-841
152 2E2
Sensor detected a voltage outside of the normal range. Go to “Performing the
checkout procedure” on page 184.
651-842
2E1
Sensor detected an abnormally high internal temperature. Make sure that:
1. The room ambient temperature is within the system operating environment.
2. There is unrestricted air flow around the system.
3. All system covers are closed.
4. There are no fan failures.
651-90x
Platform-specific error. Call your support center.
652-600
A non-critical error has been detected: uncorrectable memory or unsupported
memory. Schedule deferred maintenance. Examine the memory modules and
determine if they are supported types. If the modules are supported, then replace
the appropriate memory modules.
652-610
210
A non-critical error has been detected: CPU internal error. Schedule deferred
maintenance. Go to “Performing the checkout procedure” on page 184.
652-611
210
A non-critical error has been detected: CPU internal cache or cache controller error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-612
D01
A non-critical error has been detected: external cache parity or multi-bit ECC error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-613
D01
A non-critical error has been detected: external cache ECC single-bit error. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-623
2C6
A non-critical error has been detected: correctable error threshold exceeded. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-630
307
A non-critical error has been detected: I/O expansion bus parity error. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-631
307
A non-critical error has been detected: I/O expansion bus time-out error. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-632
307
A non-critical error has been detected: I/O expansion bus connection failure.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-633
307
A non-critical error has been detected: I/O expansion unit not in an operating state.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-634
307
A non-critical error has been detected: internal device error. Schedule deferred
maintenance. Go to “Performing the checkout procedure” on page 184.
652-664
302
A non-critical error has been detected: correctable error threshold exceeded. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-665
303
A non-critical error has been detected: correctable error threshold exceeded. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-666
304
A non-critical error has been detected: correctable error threshold exceeded. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-669
2CD
A non-critical error has been detected: correctable error threshold exceeded. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-66A
2CE
A non-critical error has been detected: correctable error threshold exceeded. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-66B
2CC
A non-critical error has been detected: correctable error threshold exceeded. Schedule
deferred maintenance. Go to “Performing the checkout procedure” on page 184.
652-731
2C8
A non-critical error has been detected: intermediate or system bus address parity
error. Schedule deferred maintenance. Go to “Performing the checkout procedure”
on page 184.
652-732
2C8
A non-critical error has been detected: intermediate or system bus data parity error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-733
2C8 292
A non-critical error has been detected: intermediate or system bus address parity
error. Schedule deferred maintenance. Go to “Performing the checkout procedure”
on page 184.
652-734
2C8 292
A non-critical error has been detected: intermediate or system bus data parity error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-735
2D2 292
A non-critical error has been detected: intermediate or system bus time-out error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-736
2D2 292
A non-critical error has been detected: intermediate or system bus time-out error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-770
2C8 292
A non-critical error has been detected: intermediate or system bus address parity error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-771
2C8 292
A non-critical error has been detected: intermediate or system bus data parity error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-772
2D2 292
A non-critical error has been detected: intermediate or system bus time-out error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-773
227
A non-critical error has been detected: intermediate or system bus data parity error.
Schedule deferred maintenance. Go to “Performing the checkout procedure” on page
184.
652-88x
The CEC or SPCN reported a non-critical error.
1. Schedule deferred maintenance.
2. Refer to the entry MAP in this system unit system service guide, with the 8-digit
error and location codes, for the necessary repair action.
3. If the 8-digit error and location codes were NOT reported, then run diagnostics
in problem determination mode and record and report the 8-digit error and
location codes for this SRN.
652-89x
The CEC or SPCN reported a non-critical error.
1. Schedule deferred maintenance.
2. Refer to the entry MAP in this system unit system service guide, with the 8-digit
error and location codes, for the necessary repair action.
3. If the 8-digit error and location codes were NOT reported, then run diagnostics
in problem determination mode and record and report the 8-digit error and
location codes for this SRN.
814-112
814
The NVRAM test failed. Go to “Performing the checkout procedure” on page 184.
814-113
221
The VPD test failed. Go to “Performing the checkout procedure” on page 184.
814-114
814
I/O Card NVRAM test failed. Go to “Performing the checkout procedure” on page
184.
815-100
815
The floating-point processor test failed. Go to “Performing the checkout procedure”
on page 184.
815-101
815
Floating point processor failed. Go to “Performing the checkout procedure” on page
184.
815-102
815
Floating point processor failed. Go to “Performing the checkout procedure” on page
184.
815-200
815 7C0
Power-on self-test indicates a processor failure. Go to “Performing the checkout
procedure” on page 184.
815-201
815
Processor has a status of failed. Processors with a failed status are deconfigured and
therefore cannot be tested or used by the system. Go to “Performing the checkout
procedure” on page 184.
817-123
817
The I/O planar time-of-day clock test failed. Go to “Performing the checkout
procedure” on page 184.
817-124
817
Time of day RAM test failed. Go to “Performing the checkout procedure” on page
184.
817-210
817
The time-of-day clock is at POR. Go to “Performing the checkout procedure” on
page 184.
817-211
817
Time of day POR test failed. Go to “Performing the checkout procedure” on page
184.
817-212
151
The battery is low. Go to “Performing the checkout procedure” on page 184.
817-213
817
The real-time clock is not running. Go to “Performing the checkout procedure” on
page 184.
817-215
817
Time of day clock not running test failed. Go to “Performing the checkout
procedure” on page 184.
817-217
817
Time of day clock not running. Go to “Performing the checkout procedure” on page
184.
887-101
887
POS register test failed. Go to “Performing the checkout procedure” on page 184.
887-102
887
I/O register test failed. Go to “Performing the checkout procedure” on page 184.
887-103
887
Local RAM test failed. Go to “Performing the checkout procedure” on page 184.
887-104
887
Vital Product Data (VPD) failed. Go to “Performing the checkout procedure” on page
184.
887-105
887
LAN coprocessor internal tests failed. Go to “Performing the checkout procedure” on
page 184.
887-106
887
Internal loopback test failed. Go to “Performing the checkout procedure” on page
184.
887-107
887
External loopback test failed. Go to “Performing the checkout procedure” on page
184.
887-108
887
External loopback test failed. Go to “Performing the checkout procedure” on page
184.
887-109
887
External loopback parity tests failed. Go to “Performing the checkout procedure” on
page 184.
887-110
887
External loopback fairness test failed. Go to “Performing the checkout procedure” on
page 184.
887-111
887
External loopback fairness and parity tests failed. Go to “Performing the checkout
procedure” on page 184.
887-112
887
External loopback (twisted pair) test failed. Go to “Performing the checkout
procedure” on page 184.
887-113
887
External loopback (twisted pair) parity test failed. Go to “Performing the checkout
procedure” on page 184.
887-114
887
Ethernet loopback (twisted pair) fairness test failed. Go to “Performing the checkout
procedure” on page 184.
887-115
887
External loopback (twisted pair) fairness and parity tests failed. Go to “Performing
the checkout procedure” on page 184.
887-116
887
Twisted pair wrap data failed. Go to “Performing the checkout procedure” on page
184.
887-117
887
Software device configuration fails. Go to “Performing the checkout procedure” on
page 184.
887-118
887
Device driver indicates a hardware problem. Go to “Performing the checkout
procedure” on page 184.
887-120
887
Device driver indicates a hardware problem. Go to “Performing the checkout
procedure” on page 184.
887-121
B08
Ethernet transceiver test failed. Go to “Performing the checkout procedure” on page
184.
887-122
B09
Ethernet 10 base-2 transceiver test failed. Go to “Performing the checkout procedure”
on page 184.
887-123
887
Internal loopback test failed. Go to “Performing the checkout procedure” on page
184.
887-124
887
Software error log indicates a hardware problem. Go to “Performing the checkout
procedure” on page 184.
887-125
887
Fuse test failed. Go to “Performing the checkout procedure” on page 184.
887-202
887
Vital Product Data test failed. Go to “Performing the checkout procedure” on page
184.
887-203
887
Vital Product Data test failed. Go to “Performing the checkout procedure” on page
184.
887-209
887
RJ-45 converter test failed. Go to “Performing the checkout procedure” on page 184.
887-304
887
Coprocessor internal test failed. Go to “Performing the checkout procedure” on page
184.
887-305
887
Internal loopback test failed. Go to “Performing the checkout procedure” on page
184.
887-306
887
Internal loopback test failed. Go to “Performing the checkout procedure” on page
184.
887-307
887
External loopback test failed. Go to “Performing the checkout procedure” on page
184.
887-319
887
Software device driver indicates a hardware failure. Go to “Performing the checkout
procedure” on page 184.
887-400
887
Fuse test failed. Go to “Performing the checkout procedure” on page 184.
887-401
887
Circuit breaker for Ethernet test failed. Go to “Performing the checkout procedure”
on page 184.
887-402
887
Ethernet 10 Base-2 transceiver test failed. Go to “Performing the checkout
procedure” on page 184.
887-403
887
Ethernet 10 Base-T transceiver test failed. Go to “Performing the checkout
procedure” on page 184.
887-405
887
Ethernet network: Rerun diagnostics in advanced mode for accurate problem
determination. Go to “Performing the checkout procedure” on page 184.
950-2506
2506 221
Missing options resolution for 3Gb SAS Adapter card.
Try each of the following steps. After reseating, removing, or replacing a part, retry
the operation.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Reseat the 3Gb SAS Adapter card.
4. Replace the 3Gb SAS Adapter card.
5. Replace the system-board.
2506-102E
722
Out of alternate disk storage for storage. Go to “Performing the checkout procedure”
on page 184.
2506-3002
722
Addressed device failed to respond to selection. Go to “Performing the checkout
procedure” on page 184.
2506-3010
722
Disk returned wrong response to adapter. Go to “Performing the checkout
procedure” on page 184.
2506-3020
-
Storage subsystem configuration error. Go to “Performing the checkout procedure”
on page 184.
2506-3100
-
Controller detected device bus interface error. Go to “Performing the checkout
procedure” on page 184.
2506-3109
-
Controller timed out a device command. Go to “Performing the checkout procedure”
on page 184.
2506-3110
-
Device bus interface error. Go to “Performing the checkout procedure” on page 184.
2506-4010
-
Configuration error, incorrect connection between cascaded enclosures. Go to
“Performing the checkout procedure” on page 184.
2506-4020
-
Configuration error, connections exceed IOA design limits. Go to “Performing the
checkout procedure” on page 184.
2506-4030
-
Configuration error, incorrect multipath connection. Go to “Performing the checkout
procedure” on page 184.
2506-4040
-
Configuration error, incomplete multipath connection between controller and
enclosure detected. Go to “Performing the checkout procedure” on page 184.
2506-4041
-
Configuration error, incomplete multipath connection between enclosure and device
detected. Go to “Performing the checkout procedure” on page 184.
2506-4050
-
Attached enclosure does not support required multipath function. Go to “Performing
the checkout procedure” on page 184.
2506-4060
-
Multipath redundancy level got worse. Go to “Performing the checkout procedure”
on page 184.
2506-4100
-
Device bus fabric error. Go to “Performing the checkout procedure” on page 184.
2506-4101
-
Temporary device bus fabric error. Go to “Performing the checkout procedure” on
page 184.
2506-4110
-
Unsupported enclosure function detected. Go to “Performing the checkout
procedure” on page 184.
2506-4150
2506
PCI bus error detected by controller. Go to “Performing the checkout procedure” on
page 184.
2506-4160
2506
PCI bus error detected by controller. Go to “Performing the checkout procedure” on
page 184.
2506-7001
722
Temporary disk data error. Go to “Performing the checkout procedure” on page 184.
2506-8008
BAT
A permanent Cache Battery Pack failure occurred. Go to “Performing the checkout
procedure” on page 184.
2506-8009
BAT
Impending Cache Battery Pack failure. Go to “Performing the checkout procedure”
on page 184.
2506-8150
2506
Controller failure. Go to “Performing the checkout procedure” on page 184.
2506-8157
2506
Temporary controller failure. Go to “Performing the checkout procedure” on page
184.
2506-9000
-
Controller detected device error during configuration discovery. Go to “Performing
the checkout procedure” on page 184.
2506-9001
-
Controller detected device error during configuration discovery. Go to “Performing
the checkout procedure” on page 184.
2506-9002
-
Controller detected device error during configuration discovery. Go to “Performing
the checkout procedure” on page 184.
2506-9008
-
Controller does not support function expected for one or more disks. Go to
“Performing the checkout procedure” on page 184.
2506-9010
-
Cache data associated with attached disks cannot be found. Go to “Performing the
checkout procedure” on page 184.
2506-9011
-
Cache data belongs to disks other than those attached. Go to “Performing the
checkout procedure” on page 184.
2506-9020
-
Two or more disks are missing from a RAID-5 or RAID-6 Disk Array. Go to
“Performing the checkout procedure” on page 184.
2506-9021
-
Two or more disks are missing from a RAID-5 or RAID-6 Disk Array. Go to
“Performing the checkout procedure” on page 184.
2506-9022
-
Two or more disks are missing from a RAID-5 or RAID-6 Disk Array. Go to
“Performing the checkout procedure” on page 184.
2506-9023
-
One or more Disk Array members are not at required physical locations. Go to
“Performing the checkout procedure” on page 184.
2506-9024
-
Physical location of Disk Array members conflict with another Disk Array. Go to
“Performing the checkout procedure” on page 184.
2506-9025
-
Incompatible disk installed at degraded disk location in Disk Array. Go to
“Performing the checkout procedure” on page 184.
2506-9026
-
Previously degraded disk in Disk Array not found at required physical location. Go
to “Performing the checkout procedure” on page 184.
2506-9027
-
Disk Array is or would become degraded and parity data is out of synchronization.
Go to “Performing the checkout procedure” on page 184.
2506-9028
-
Maximum number of functional Disk Arrays has been exceeded. Go to “Performing
the checkout procedure” on page 184.
2506-9029
-
Maximum number of functional Disk Array disks has been exceeded. Go to
“Performing the checkout procedure” on page 184.
2506-9030
-
Disk Array is degraded due to missing/failed disk. Go to “Performing the checkout
procedure” on page 184.
2506-9031
-
Automatic reconstruction initiated for Disk Array. Go to “Performing the checkout
procedure” on page 184.
2506-9032
-
Disk Array is degraded due to missing/failed disk. Go to “Performing the checkout
procedure” on page 184.
2506-9041
-
Background Disk Array parity checking detected and corrected errors. Go to
“Performing the checkout procedure” on page 184.
2506-9042
-
Background Disk Array parity checking detected and corrected errors on specified
disk. Go to “Performing the checkout procedure” on page 184.
2506-9050
-
Required cache data cannot be located for one or more disks. Go to “Performing the
checkout procedure” on page 184.
2506-9051
-
Cache data exists for one or more missing/failed disks. Go to “Performing the
checkout procedure” on page 184.
2506-9052
-
Cache data exists for one or more modified disks. Go to “Performing the checkout
procedure” on page 184.
2506-9054
-
RAID controller resources not available due to previous problems. Go to
“Performing the checkout procedure” on page 184.
2506-9060
-
One or more disk pairs are missing from a RAID-10 Disk Array. Go to “Performing
the checkout procedure” on page 184.
2506-9061
-
One or more disks are missing from a RAID-0 Disk Array. Go to “Performing the
checkout procedure” on page 184.
2506-9062
-
One or more disks are missing from a RAID-0 Disk Array. Go to “Performing the
checkout procedure” on page 184.
2506-9063
-
Maximum number of functional Disk Arrays has been exceeded. Go to “Performing
the checkout procedure” on page 184.
2506-9073
-
Multiple controllers connected in an invalid configuration. Go to “Performing the
checkout procedure” on page 184.
2506-9074
-
Multiple controllers not capable of similar functions or controlling same set of
devices. Go to “Performing the checkout procedure” on page 184.
2506-9075
-
Incomplete multipath connection between controller and remote controller. Go to
“Performing the checkout procedure” on page 184.
2506-9076
-
Missing remote controller. Go to “Performing the checkout procedure” on page 184.
2506-9081
-
Controller detected device error during internal media recovery. Go to “Performing
the checkout procedure” on page 184.
2506-9082
-
Controller detected device error during internal media recovery. Go to “Performing
the checkout procedure” on page 184.
2506-9090
-
Disk has been modified after last known status. Go to “Performing the checkout
procedure” on page 184.
2506-9091
-
Incorrect disk configuration change has been detected. Go to “Performing the
checkout procedure” on page 184.
2506-9092
-
Disk requires Format before use. Format the disk and retry the operation.
2506-FF3D
2506
Temporary controller failure. Retry the operation.
2506-FFF3
-
Disk media format bad. Reformat the disk and retry the operation.
2506-FFF4
722
Device problem. Perform diagnostics on the device and retry the operation.
2506-FFF6
722
Device detected recoverable error. Retry the operation.
2506-FFFA
722
Temporary device bus error. Retry the operation.
2506-FFFE
-
Temporary device bus error. Retry the operation.
252B-101
252B
Adapter configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-710
252B
Permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-711
252B
Adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-712
252B
Adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-713
252B
Adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-714
252B
Temporary adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-715
252B
Temporary adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-716
252B 293
PCI bus error detected by EEH.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-717
252B 293
PCI bus error detected by adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-718
252B 293
Temporary PCI bus error detected by adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-719
252B
Device bus termination power lost or not detected.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-720
252B
Adapter detected device bus failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-721
252B
Temporary adapter detected device bus failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-722
252B
Device bus interface problem.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
252B-723
252B
Device bus interface problem.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
254E-201
254E 221
Adapter configuration error. Go to “Performing the checkout procedure” on page
184.
254E-601
254
Error log analysis indicates adapter failure. Go to “Performing the checkout
procedure” on page 184.
254E-602
254
Error log analysis indicates an error attention condition. Go to “Performing the
checkout procedure” on page 184.
254E-603
254
Error log analysis indicates that the microcode could not be loaded on the adapter.
Go to “Performing the checkout procedure” on page 184.
254E-604
254
Error log analysis indicates a permanent adapter failure. Go to “Performing the
checkout procedure” on page 184.
254E-605
254
Error log analysis indicates permanent adapter failure is reported on the other port
of this adapter. Go to “Performing the checkout procedure” on page 184.
254E-606
254
Error log analysis indicates adapter failure. Go to “Performing the checkout
procedure” on page 184.
254E-701
254E 221
Error log analysis indicates permanent adapter failure. Go to “Performing the
checkout procedure” on page 184.
254E-702
254E 221
Error log analysis indicates permanent adapter failure is reported on the other port
of this adapter. Go to “Performing the checkout procedure” on page 184.
2567-xxx
2567
USB integrated system-board and chassis assembly. Go to “Performing the checkout
procedure” on page 184.
256D-201
256D 221
Adapter configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
256D-601
256D
Error log analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
256D-602
256D
Error log analysis indicates an error attention condition.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
256D-603
256D
Error Log Analysis indicates that the microcode could not be loaded on the adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
256D-604
256D 210
Error Log Analysis indicates a permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
256D-605
256D
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
256D-606
256D
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
256D-701
256D 221
Error Log Analysis indicates permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
256D-702
256D 221
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
25C4-xxx
25C4
Generic reference for Broadcom adapter. Go to “Performing the checkout procedure”
on page 184.
25C4-201
25C4
Configuration error. Go to “Performing the checkout procedure” on page 184.
25C4-701
25C4
Permanent adapter failure. Go to “Performing the checkout procedure” on page 184.
25C4-601
25C4
Download firmware error. Go to “Performing the checkout procedure” on page 184.
25C4-602
25C4
EEPROM read error. Go to “Performing the checkout procedure” on page 184.
2604-xxx
2604
Generic reference for 4Gb Fibre Channel Adapter card. Go to “Performing the
checkout procedure” on page 184.
2604-102
2604
Reset Test failure for the Fibre Channel adapter card. Replace the 4Gb Fibre Channel
Adapter card.
2604-103
2604
Register Test failure for the Fibre Channel adapter card. Replace the 4Gb Fibre
Channel Adapter card.
2604-104
2604
SRAM Test failure for the Fibre Channel adapter card. Replace the 4Gb Fibre
Channel Adapter card.
2604-105
2604
Internal Wrap Test failure for the Fibre Channel adapter card. Replace the 4Gb Fibre
Channel Adapter card.
2604-106
2604
Gigabit Link Module (GLM) Wrap Test failure for the Fibre Channel adapter card.
Replace the 4Gb Fibre Channel Adapter card.
2604-108
2604 221
Enhanced Error Handling Failure on the bus for the Fibre Channel adapter card. Go
to “Performing the checkout procedure” on page 184.
2604-110
2604
Enhanced Error Handling Failure on the Fibre Channel adapter card. Replace the
4Gb Fibre Channel Adapter card.
2604-201
2604 221
Configuration Register Test Failure for the Fibre Channel adapter card. Go to
“Performing the checkout procedure” on page 184.
2604-203
2604
PCI Wrap Test Failure for the Fibre Channel adapter card. Replace the 4Gb Fibre
Channel Adapter card.
2604-204
2604 221
DMA Test Failure for the Fibre Channel adapter card. Go to “Performing the
checkout procedure” on page 184.
2604-205
2604 221
Error on Read/Write Operation for the Fibre Channel adapter card. Go to
“Performing the checkout procedure” on page 184.
2604-701
2604
Error Log Analysis indicates that the adapter self-test failed for the Fibre Channel
adapter card. Go to “Performing the checkout procedure” on page 184.
2604-703
2604
Error Log Analysis indicates that an unknown adapter error has occurred for the
Fibre Channel adapter card. Go to “Performing the checkout procedure” on page
184.
2604-704
2604
Error Log Analysis indicates that an adapter error has occurred for the Fibre Channel
adapter card. Go to “Performing the checkout procedure” on page 184.
2604-705
2604
Error Log Analysis indicates that a parity error has been detected for the Fibre
Channel adapter card.
The adapter must be replaced immediately. Failure to do so could result in data
being read or written incorrectly.
2604-706
2604
Error Log Analysis indicates that a fatal hardware error has occurred for the Fibre
Channel adapter card.
This adapter was successfully taken off-line. It will remain off-line until reconfigured
or the system is rebooted. This adapter must be replaced and not brought back
on-line. Failure to adhere to this action could result in data being read or written
incorrectly or in the loss of data.
2607-xxx
2607
Generic reference for 8Gb PCIe Fibre Channel Expansion Card. Go to “Performing
the checkout procedure” on page 184.
2607-102
2607
Reset Test failure for the Fibre Channel adapter card. Replace the 8Gb PCIe Fibre
Channel Expansion Card.
2607-103
2607
Register Test failure for the Fibre Channel adapter card. Replace the 8Gb PCIe Fibre
Channel Expansion Card.
2607-104
2607
SRAM Test failure for the Fibre Channel adapter card. Replace the 8Gb PCIe Fibre
Channel Expansion Card.
2607-105
2607
Internal Wrap Test failure for the Fibre Channel adapter card. Replace the 8Gb PCIe
Fibre Channel Expansion Card.
2607-106
2607
Gigabit Link Module (GLM) Wrap Test failure for the Fibre Channel adapter card.
Replace the 8Gb PCIe Fibre Channel Expansion Card.
2607-108
2607 221
Enhanced Error Handling Failure on the bus for the Fibre Channel adapter card. Go
to “Performing the checkout procedure” on page 184.
2607-110
2607
Enhanced Error Handling Failure on the Fibre Channel adapter card. Replace the
8Gb PCIe Fibre Channel Expansion Card.
2607-201
2607 221
Configuration Register Test Failure for the Fibre Channel adapter card. Go to
“Performing the checkout procedure” on page 184.
2607-203
2607
PCI Wrap Test Failure for the Fibre Channel adapter card. Replace the 8Gb PCIe
Fibre Channel Expansion Card.
2607-204
2607 221
DMA Test Failure for the Fibre Channel adapter card. Go to “Performing the
checkout procedure” on page 184.
2607-205
2607 221
Error on Read/Write Operation for the Fibre Channel adapter card. Go to
“Performing the checkout procedure” on page 184.
2607-701
2607
Error Log Analysis indicates that the adapter self-test failed for the Fibre Channel
adapter card. Go to “Performing the checkout procedure” on page 184.
2607-703
2607
Error Log Analysis indicates that an unknown adapter error has occurred for the
Fibre Channel adapter card. Go to “Performing the checkout procedure” on page
184.
2607-704
2607
Error Log Analysis indicates that an adapter error has occurred for the Fibre Channel
adapter card. Go to “Performing the checkout procedure” on page 184.
2607-705
2607
Error Log Analysis indicates that a parity error has been detected for the Fibre
Channel adapter card.
The adapter must be replaced immediately. Failure to do so could result in data
being read or written incorrectly.
2607-706
2607
Error Log Analysis indicates that a fatal hardware error has occurred for the Fibre
Channel adapter card.
This adapter was successfully taken off-line. It will remain off-line until reconfigured
or the system is rebooted. This adapter must be replaced and not brought back
on-line. Failure to adhere to this action could result in data being read or written
incorrectly or in the loss of data.
2624-xxx
2624
Generic reference for 4X PCI-E DDR InfiniBand Host Channel Adapter - system-board
and chassis assembly. Go to “Performing the checkout procedure” on page 184.
2624-101
2624
Configuration failure - system-board and chassis assembly. Go to “Performing the
checkout procedure” on page 184.
2624-102
2624
Queue pair create failure - system-board and chassis assembly. Go to “Performing
the checkout procedure” on page 184.
2624-103
2624
Loop back test failure - system-board and chassis assembly. Go to “Performing the
checkout procedure” on page 184.
2624-201
cable network
Loop back test failure. Do the following steps one at a time, in order, and rerun the
test after each step:
1. Reseat the cable.
2. Replace the cable.
3. Verify that the network is functional.
4. Verify that the network switch is functional.
2624-301
cable network 2624
Loop back test failure. Do the following steps one at a time, in order, and rerun the
test after each step:
1. Reseat the cable.
2. Replace the cable.
3. Verify that the network is functional.
4. Verify that the network switch is functional.
5. Go to “Performing the checkout procedure” on page 184.
2624-701
2624
Error Log Analysis indicates that this adapter has failed due to an internal error -
system-board and chassis assembly. Go to “Performing the checkout procedure” on
page 184.
2624-702
2624
Error Log Analysis indicates that this adapter has failed due to a failure with the
uplink interface used to connect this device to the host processor - system-board and
chassis assembly. Go to “Performing the checkout procedure” on page 184.
2624-703
2624
Error Log Analysis indicates that this adapter has failed due to a memory error -
system-board and chassis assembly. Go to “Performing the checkout procedure” on
page 184.
2624-704
2624
Error Log Analysis indicates that this adapter has failed due to an unrecoverable
internal parity error - system-board and chassis assembly. Go to “Performing the
checkout procedure” on page 184.
2624-705
2624
Error Log Analysis indicates that this adapter has failed due to an internal error -
system-board and chassis assembly. Go to “Performing the checkout procedure” on
page 184.
2624-706
2624
Error Log Analysis indicates that this adapter has failed due to a memory error -
system-board and chassis assembly. Go to “Performing the checkout procedure” on
page 184.
2640-121
2640
Physical volume hardware error. Go to “Performing the checkout procedure” on
page 184.
2640-131
2640
Smart status threshold exceeded. Go to “Performing the checkout procedure” on
page 184.
2640-132
2640
Command timeouts threshold exceeded. Go to “Performing the checkout procedure”
on page 184.
2640-133
2640
Command timeout with error condition. Go to “Performing the checkout procedure”
on page 184.
2640-134
2640
Hardware command or DMA failure. Go to “Performing the checkout procedure” on
page 184.
2640-136
2640 2631
Timeout waiting for controller or drive with no busy status. Go to “Performing the
checkout procedure” on page 184.
2D02-xxx
2631
Generic reference for USB controller/adapter - system-board and chassis assembly.
Go to “Performing the checkout procedure” on page 184.
2E00-201
2E00 221
Configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E00-601
2E00
EEPROM read error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E00-701
2E00 221
Permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-201
2E10 221
Adapter configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-601
2E10
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-602
2E10
Error Log Analysis indicates an Error Attention condition.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-603
2E10
Error Log Analysis indicates that the microcode could not be loaded on the adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-604
2E10
Error Log Analysis indicates a permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-605
2E10
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-606
2E10
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-701
2E10 221
Error Log Analysis indicates permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E10-702
2E10 221
Error Log Analysis indicates permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-201
2E13 221
Adapter configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-601
2E13
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-602
2E13
Error Log Analysis indicates an Error Attention condition.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-603
2E13
Error Log Analysis indicates that the microcode could not be loaded on the adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-604
2E13
Error Log Analysis indicates a permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-605
2E13
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-606
2E13
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-701
2E13 221
Error Log Analysis indicates permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E13-702
2E13 221
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-201
2E14 221
Adapter configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-601
2E14
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-602
2E14
Error Log Analysis indicates an Error Attention condition.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-603
2E14
Error Log Analysis indicates that the microcode could not be loaded on the adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-604
2E14
Error Log Analysis indicates a permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-605
2E14
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-606
2E14
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-701
2E14 221
Error Log Analysis indicates permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E14-702
2E14 221
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-201
2E15 221
Adapter configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-601
2E15
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-602
2E15
Error Log Analysis indicates an Error Attention condition.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-603
2E15
Error Log Analysis indicates that the microcode could not be loaded on the adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-604
2E15
Error Log Analysis indicates a permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-605
2E15
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-606
2E15
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-701
2E15 221
Error Log Analysis indicates permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E15-702
2E15 221
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-201
2E21 221
Adapter configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-601
2E21
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-602
2E21
Error Log Analysis indicates an Error Attention condition.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-603
2E21
Error Log Analysis indicates that the microcode could not be loaded on the adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-604
2E21
Error Log Analysis indicates a permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-605
2E21
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-606
2E21
Error Log Analysis indicates adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-701
2E21 221
Error Log Analysis indicates permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E21-702
2E21 221
Error Log Analysis indicates permanent adapter failure is reported on the other port
of this adapter.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-101
2E23
Register Test Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-102
2E23
VPD Checksum Test Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-103
2E23
Flash Test Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-104
2E23
Internal Wrap Test Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-105
2E23
External Wrap Test Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-106
2E23
External Wrap with IP Checksum Test Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-107
2E23
External Wrap with TCP Checksum Test Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-108
2E23
External Wrap with UDP Checksum Test Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-109
241
Network link test failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-201
2E23 221
Enhanced Error Handling Failure
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-202
241 2E23
Network link test failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-601
2E23
Error log analysis indicates a hardware error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-602
2E23
Error log analysis indicates an EEH error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-603
2E23
Error log analysis indicates an EEPROM error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E23-604
2E23
Error log analysis indicates transmission errors.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E33-201
2E33 221
Adapter configuration error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E33-601
2E33
Download firmware error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E33-602
2E33
EEPROM read error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
2E33-701
2E33 221
Permanent adapter failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system-board.
FFC-724
FFC
Temporary device bus interface problem.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Go to “Performing the checkout procedure” on page 184.
FFC-725
FFC
Temporary device bus interface problem.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Go to “Performing the checkout procedure” on page 184.
A00-FF0 through A24-xxx SRNs
AIX might generate service request numbers (SRNs) from A00-FF0 to A24-xxx.
Note: Some SRNs in this sequence might have 4 rather than 3 digits after the dash (–).
Table 26 shows the meaning of an x in any of the following SRNs, such as A01-00x.
Table 26. Meaning of the last character (x) after the hyphen
Number
Meaning
1
Replace all FRUs listed
2
Hot swap supported
4
Software might be the cause
8
Reserved
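The values in Table 26 are powers of two, which suggests that the final character can combine several meanings as bit flags. The following is a minimal sketch under that assumption; the helper name decode_srn_suffix and the treatment of the last character as a hexadecimal digit are illustrative assumptions, not part of the AIX diagnostics.

```python
# Hypothetical helper: the bit-flag interpretation of Table 26 is an assumption.
FLAGS = {
    1: "Replace all FRUs listed",
    2: "Hot swap supported",
    4: "Software might be the cause",
    8: "Reserved",
}


def decode_srn_suffix(srn: str) -> list[str]:
    """Return the Table 26 meanings encoded in the last character of an SRN."""
    _, _, suffix = srn.partition("-")
    if not suffix:
        raise ValueError(f"no hyphen-separated suffix in {srn!r}")
    # Assumes the final character is a single hexadecimal digit (0-F).
    value = int(suffix[-1], 16)
    return [meaning for bit, meaning in FLAGS.items() if value & bit]


if __name__ == "__main__":
    # 5 = 1 + 4, so two meanings apply under the bit-flag assumption.
    print(decode_srn_suffix("A01-005"))
```

Under this reading, an SRN such as A01-005 would mean both "Replace all FRUs listed" and "Software might be the cause", while a final 0 would carry none of the listed meanings.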
Table 27 describes each SRN and provides a recommended corrective action.
Table 27. A00-FF0 through A24-xxx SRNs
SRN
Description
FRU/action
A00-FF0
Error log analysis is unable to
determine the error. The error log
indicates the following physical FRU
locations as the probable causes.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A01-00x
Error log analysis indicates an error
detected by the microprocessor, but the
failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A01-01x
CPU internal error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A01-02x
CPU internal cache or cache controller
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A01-05x
System bus time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A01-06x
Time-out error waiting for I/O.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A01-07x
System bus parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A01-08x
System bus protocol/transfer error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-00x
Error log analysis indicates an error
detected by the memory controller, but
the failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-01x
Uncorrectable Memory Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-03x
Correctable error threshold exceeded.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-04x
Memory Control subsystem internal
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-05x
Memory Address Error (invalid address
or access attempt).
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-06x
Memory Data error (Bad data going to
memory).
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-09x
System bus parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-10x
System bus time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-11x
System bus protocol/transfer error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-12x
I/O Host Bridge time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A02-13x
I/O Host Bridge address/data parity
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-00x
Error log analysis indicates an error
detected by the I/O device, but the
failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-01x
I/O Bus Address parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-05x
I/O Error on non-PCI bus.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-07x
System bus address parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-09x
System bus data parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-11x
System bus time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-12x
Error on System bus.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-13x
I/O Expansion bus parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-14x
I/O Expansion bus time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-15x
I/O Expansion bus connection failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A03-16x
I/O Expansion unit not in an operating
state.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-00x
Error log analysis indicates an
environmental and power warning, but
the failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-01x
Sensor indicates a fan has failed.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-02x
System shutdown due to a fan failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-03x
Sensor indicates a voltage outside
normal range.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-04x
System shutdown due to voltage
outside normal range.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-05x
Sensor indicates an abnormally high
internal temperature.
1. Make sure that:
a. The room ambient temperature is within
the system operating environment.
b. There is unrestricted air flow around the
system.
c. All system covers are closed.
d. There are no fan failures.
2. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
3. If no entry is found, replace the
system-board.
A05-06x
System shutdown due to abnormally
high internal temperature.
1. Make sure that:
a. The room ambient temperature is within
the system operating environment.
b. There is unrestricted air flow around the
system.
c. All system covers are closed.
d. There are no fan failures.
2. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
3. If no entry is found, replace the
system-board.
A05-07x
Sensor indicates a power supply has
failed.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-08x
System shutdown due to power supply
failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-10x
System shutdown due to FRU that has
failed.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-14x
System shutdown due to power fault
with an unspecified cause.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-19x
System shutdown due to Fan failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-21x
System shutdown due to Over
temperature condition.
1. Make sure that:
a. The room ambient temperature is within
the system operating environment.
b. There is unrestricted air flow around the
system.
c. All system covers are closed.
d. There are no fan failures.
2. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
3. If no entry is found, replace the
system-board.
A05-22x
System shutdown due to over
temperature and fan failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A05-24x
Power Fault specifically due to internal
battery failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-00x
Error log analysis indicates an error
detected by the Service Processor, but
the failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-06x
Service Processor reports unknown
communication error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-07x
Internal service processor firmware
error or incorrect version.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-08x
Other internal Service Processor
hardware error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-09x
Service Processor error accessing Vital
Product Data EEPROM.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-18x
Service Processor detected NVRAM
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-19x
Service Processor error accessing Real
Time Clock/Time-of-Day Clock.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-21x
Service Processor detected an error with the
Time-of-Day Clock backup battery.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-23x
Loss of heartbeat from Service
Processor.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-24x
Service Processor detected a
surveillance time-out.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-31x
Error detected while handling an
attention/interrupt from the system
hardware.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-35x
Mainstore or Cache IPL Diagnostic
Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-36x
Other IPL Diagnostic Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-37x
Clock or PLL Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-38x
Hardware Scan or Initialization Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A0D-40x
FRU Presence/Detect Error
(Mis-Plugged).
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A10-100
The resource is unavailable due to an
error. System is operating in degraded
mode.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A10-200
The resource was marked failed by the
platform. The system is operating in
degraded mode.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A10-210
The processor has been deconfigured.
The system is operating in degraded
mode.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A11-00x
A non-critical error has been detected.
Error log analysis indicates an error
detected by the microprocessor, but the
failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A11-01x
A non-critical error has been detected, a
CPU internal error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A11-02x
A non-critical error has been detected, a
CPU internal cache or cache controller
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A11-03x
A non-critical error has been detected,
an external cache parity or multi-bit
ECC error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, see “Solving
undetermined problems” on page 227.
A11-05x
A non-critical error has been detected, a
system bus time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A11-06x
A non-critical error has been detected, a
time-out error waiting for an I/O
device.
Check the BladeCenter management-module
event log for an entry around this time. If no
entry is found, replace the system-board and
chassis assembly.
A11-50x
Recoverable errors on resource indicate
a trend toward an unrecoverable error.
However, the resource could not be
deconfigured and is still in use. The
system is operating with the potential
for an unrecoverable error.
1. If repair is not immediately available, reboot
and the resource will be deconfigured;
operations can continue in a degraded mode.
2. Check the BladeCenter management-module
event log for an entry around this time. If no
entry is found, replace the system-board and
chassis assembly.
A11-510
Resource has been deconfigured and is
no longer in use due to a trend toward
an unrecoverable error.
1. Schedule maintenance; the system is
operating in a degraded mode.
2. Check the BladeCenter management-module
event log for an entry around this time. If no
entry is found, replace the system-board and
chassis assembly.
A11-540
Recoverable errors on resource indicate
a trend toward an unrecoverable error.
However, the resource could not be
deconfigured and is still in use. The
system is operating with the potential
for an unrecoverable error.
1. If repair is not immediately available, reboot
and the resource will be deconfigured;
operations can continue in a degraded mode.
2. Check the BladeCenter management-module
event log for an entry around this time. If no
entry is found, replace the system-board and
chassis assembly.
A11-550
Recoverable errors on resource indicate
a trend toward an unrecoverable error.
However, the resource could not be
deconfigured and is still in use. The
system is operating with the potential
for an unrecoverable error.
1. If repair is not immediately available, reboot
and the resource will be deconfigured;
operations can continue in a degraded mode.
2. Check the BladeCenter management-module
event log for an entry around this time. If no
entry is found, replace the system-board and
chassis assembly.
A12-00x
A non-critical error has been detected.
Error log analysis indicates an error
detected by the memory controller, but
the failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-01x
A non-critical error has been detected,
an uncorrectable memory error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-02x
A non-critical error has been detected,
an ECC correctable error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-03x
A non-critical error has been detected, a
correctable error threshold exceeded.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-04x
A non-critical error has been detected, a
memory control subsystem internal
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-05x
A non-critical error has been detected, a
memory address error (invalid address
or access attempt).
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-06x
A non-critical error has been detected, a
memory data error (bad data going to
memory).
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-07x
A non-critical error has been detected, a
memory bus/switch internal error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-08x
A non-critical error has been detected, a
memory time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-09x
A non-critical error has been detected, a
system bus parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-10x
A non-critical error has been detected, a
system bus time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-11x
A non-critical error has been detected, a
system bus protocol/transfer error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-12x
A non-critical error has been detected,
an I/O host bridge time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-13x
A non-critical error has been detected,
an I/O host bridge address/data parity
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-15x
A non-critical error has been detected, a
system support function error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-16x
A non-critical error has been detected, a
system bus internal hardware/switch
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A12-50x
Recoverable errors on resource indicate
a trend toward an unrecoverable error.
However, the resource could not be
deconfigured and is still in use. The
system is operating with the potential
for an unrecoverable error.
1. If repair is not immediately available, reboot
and the resource will be deconfigured;
operations can continue in a degraded mode.
2. Check the BladeCenter management-module
event log for an entry around this time. If no
entry is found, replace the system-board and
chassis assembly.
A13-00x
A non-critical error has been detected.
Error log analysis indicates an error
detected by the I/O device, but the
failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-01x
A non-critical error has been detected,
an I/O bus address parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-02x
A non-critical error has been detected,
an I/O bus data parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-03x
A non-critical error has been detected,
an I/O bus time-out, access or other
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-04x
A non-critical error has been detected,
an I/O bridge/device internal error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-05x
A non-critical error has been detected,
an I/O error on non-PCI bus.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-06x
A non-critical error has been detected, a
mezzanine bus address parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-07x
A non-critical error has been detected, a
system bus address parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-09x
A non-critical error has been detected, a
system bus data parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-11x
A non-critical error has been detected, a
system bus time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-12x
A non-critical error has been detected,
an error on system bus.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-13x
A non-critical error has been detected,
an I/O expansion bus parity error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-14x
A non-critical error has been detected,
an I/O expansion bus time-out error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-15x
A non-critical error has been detected,
an I/O expansion bus connection
failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-16x
A non-critical error has been detected,
an I/O expansion unit not in an
operating state.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A13-50x
Recoverable errors on resource indicate
a trend toward an unrecoverable error.
However, the resource could not be
deconfigured and is still in use. The
system is operating with the potential
for an unrecoverable error.
1. If repair is not immediately available, reboot
and the resource will be deconfigured;
operations can continue in a degraded mode.
2. Check the BladeCenter management-module
event log for an entry around this time. If no
entry is found, replace the system-board and
chassis assembly.
A15-01x
Sensor indicates a fan is turning too
slowly.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-03x
Sensor indicates a voltage outside
normal range.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-05x
Sensor indicates an abnormally high
internal temperature.
1. Make sure that:
a. The room ambient temperature is within
the system operating environment.
b. There is unrestricted air flow around the
system.
c. All system covers are closed.
d. There are no fan failures.
2. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
3. If no entry is found, replace the
system-board.
A15-07x
Sensor indicates a power supply has
failed.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-11x
Sensor detected a redundant fan failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-12x
Sensor detected redundant power
supply failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-13x
Sensor detected a redundant FRU that
has failed.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-14x
Power fault due to unspecified cause.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-17x
Internal redundant power supply
failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-19x
Fan failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-20x
Non-critical cooling problem, loss of
redundant fan.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-21x
Over temperature condition.
1. Make sure that:
a. The room ambient temperature is within
the system operating environment.
b. There is unrestricted air flow around the
system.
c. All system covers are closed.
d. There are no fan failures.
2. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
3. If no entry is found, replace the
system-board.
A15-22x
Fan failure and over temperature
condition.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-23x
Non-critical power problem, loss of
redundant power supply.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-24x
Power Fault specifically due to internal
battery failure.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A15-50x
Recoverable errors on resource indicate
a trend toward an unrecoverable error.
However, the resource could not be
deconfigured and is still in use. The
system is operating with the potential
for an unrecoverable error.
1. If repair is not immediately available, reboot
and the resource will be deconfigured;
operations can continue in a degraded mode.
2. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
3. If no entry is found, replace the
system-board.
A1D-00x
A non-critical error has been detected.
Error log analysis indicates an error
detected by the Service Processor, but
the failure could not be isolated.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-02x
A non-critical error has been detected,
an I/O (I2C) general bus error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-04x
A non-critical error has been detected,
an internal service processor memory
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-05x
A non-critical error has been detected, a
service processor error accessing special
registers.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-06x
A non-critical error has been detected, a
service processor reports unknown
communication error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-07x
A non-critical error has been detected:
Internal service processor firmware
error or incorrect version.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-08x
A non-critical error has been detected,
another internal service processor
hardware error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-09x
A non-critical error has been detected, a
service processor error accessing vital
product data EEPROM.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-12x
A non-critical error has been detected, a
service processor error accessing fan
sensor.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-13x
A non-critical error has been detected, a
service processor error accessing a
thermal sensor.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-18x
A non-critical error has been detected, a
service processor detected NVRAM
error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-19x
A non-critical error has been detected, a
service processor error accessing real
time clock/time-of-day clock.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-20x
A non-critical error has been detected:
Service processor error accessing scan
controller/hardware.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-21x
A non-critical error has been detected, a
service processor detected error with
time-of-day clock backup battery.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-23x
A non-critical error has been detected:
Loss of heartbeat from Service
Processor.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-24x
A non-critical error has been detected, a
service processor detected a
surveillance time-out.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-29x
A non-critical error has been detected, a
service processor error accessing power
control network.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-30x
A non-critical error has been detected:
Non-supported hardware.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-31x
A non-critical error has been detected:
Error detected while handling an
attention/interrupt from the system
hardware.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-34x
A non-critical error has been detected:
Wire Test Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-35x
A non-critical error has been detected:
Mainstore or Cache IPL Diagnostic
Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-37x
A non-critical error has been detected:
Clock or PLL Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-38x
A non-critical error has been detected:
Hardware Scan or Initialization Error.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-40x
A non-critical error has been detected:
Presence/Detect Error (Mis-Plugged).
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. If no entry is found, replace the
system-board.
A1D-50x
Recoverable errors on resource indicate
a trend toward an unrecoverable error.
However, the resource could not be
deconfigured and is still in use. The
system is operating with the potential
for an unrecoverable error.
1. If repair is not immediately available, reboot
and the resource will be deconfigured;
operations can continue in a degraded mode.
2. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
3. If no entry is found, replace the
system-board.
A24-000
Spurious interrupts on shared interrupt
level have exceeded threshold.
1. Check the BladeCenter management-module
event log. If an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. Replace any parts reported by the
diagnostic program.
3. If no entry is found, replace the
system-board.
A24-xxx
Spurious interrupts have exceeded
threshold.
1. Check the BladeCenter management-module
event log; if an error was recorded by the
system, see “POST progress codes
(checkpoints)” on page 84.
2. Replace any parts reported by the
diagnostic program.
3. If no entry is found, replace the
system-board.
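A reported SRN has to be matched against the patterns in Table 27, several of which end in x or xxx. The following Python sketch shows one possible way to do that lookup. It assumes that each x stands for exactly one hexadecimal digit (this part of the table does not define x), and the names SRN_TABLE, pattern_to_regex, and find_srn_entry are hypothetical. Only a few rows are copied in for illustration.

```python
import re

# A few rows copied from Table 27; the real table has many more entries.
SRN_TABLE = {
    "A0D-06x": "Service Processor reports unknown communication error.",
    "A10-100": "The resource is unavailable due to an error.",
    "A24-000": "Spurious interrupts on shared interrupt level have exceeded threshold.",
    "A24-xxx": "Spurious interrupts have exceeded threshold.",
}


def pattern_to_regex(pattern):
    """Compile a table pattern such as 'A0D-06x' into a regular expression.

    Assumption: each lowercase 'x' matches exactly one hexadecimal digit.
    """
    parts = ["[0-9A-F]" if ch == "x" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def find_srn_entry(srn):
    """Return (pattern, description) for the most specific match, or None."""
    srn = srn.strip().upper()
    candidates = [p for p in SRN_TABLE if pattern_to_regex(p).fullmatch(srn)]
    if not candidates:
        return None
    # Prefer the pattern with the fewest wildcards, so that A24-000
    # is chosen over A24-xxx when both match.
    best = min(candidates, key=lambda p: p.count("x"))
    return best, SRN_TABLE[best]


if __name__ == "__main__":
    for reported in ("A24-000", "A24-123", "A0D-061", "B00-000"):
        print(reported, "->", find_srn_entry(reported))
```

Choosing the pattern with the fewest wildcards keeps the exact A24-000 row from being shadowed by the more general A24-xxx row.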
SCSD Devices SRNs (ssss-102 to ssss-640)
These service request numbers (SRNs) identify a Self-Configuring SCSI Device (SCSD) problem.
Use Table 28 to identify an SRN when you suspect a SAS hard disk device problem. Replace the parts in
the order that the failing function codes (FFCs) are listed.
Notes:
1. Some SRNs might have 4 digits rather than 3 digits after the dash (–).
2. The ssss before the dash (–) represents the 3-digit or 4-digit SCSD SRN.
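The SRN layout described in these notes can be sketched in Python as a small parser that separates the ssss prefix from the digits after the dash. This is a minimal sketch and not an IBM utility: the function name is hypothetical, and it assumes that the "digits" can be hexadecimal, because failing function codes such as 252B appear elsewhere in this guide.

```python
import re

# Assumption: "digits" may be hexadecimal, since FFCs such as 252B appear in this guide.
# Both the ASCII hyphen and the en dash are accepted as the separator.
_SRN_RE = re.compile(r"\A([0-9A-Fa-f]{3,4})[-\u2013]([0-9A-Fa-f]{3,4})\Z")


def split_scsd_srn(srn):
    """Split an SRN such as '2553-102' into (ssss, suffix), upper-cased.

    Raises ValueError if the text does not look like ssss-xxx.
    """
    m = _SRN_RE.match(srn.strip())
    if m is None:
        raise ValueError(f"not an SCSD-style SRN: {srn!r}")
    return m.group(1).upper(), m.group(2).upper()
```

The value 2553-102 used in the docstring is a made-up example built from entries in this guide; it is not taken from a real log.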
Table 28. ssss-102 through ssss-640 SRNs
SRN
FFC
Description and action
ssss-102
ssss
An unrecoverable media error occurred.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-104
ssss
The motor failed to restart.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-105
ssss
The drive did not become ready.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-106
ssss
The electronics card test failed.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-108
ssss
The bus test failed.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-110
ssss
The media format is corrupted.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-112
ssss
The diagnostic test failed.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-114
ssss
An unrecoverable hardware error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-116
ssss
A protocol error.
1. Make sure that the device, adapter and diagnostic firmware, and the application
software levels are compatible.
2. If you do not find a problem, call your operating-system support person.
ssss-117
ssss
A write-protect error occurred.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-118
ssss 252B
A SCSD command time-out occurred.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-120
ssss
A SCSD busy or command error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-122
ssss
A SCSD reservation conflict error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-124
ssss
A SCSD check condition error occurred.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-126
ssss 252B
A software error was caused by a hardware failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-128
252B ssss software
The error log analysis indicates a hardware failure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-129
252B ssss software
Error log analysis indicates a SCSD bus problem.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-130
ssss
Error log analysis indicates a problem reported by the disk drive's self-monitoring
function.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-132
ssss
A disk drive hardware error occurred.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-134
252B software
The adapter failed to configure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-135
ssss 252B software
The device failed to configure.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-136
ssss
The certify operation failed.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-137
ssss 252B
Unit attention condition has occurred on the Send Diagnostic command.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-138
ssss
Error log analysis indicates that the disk drive is operating at a higher than
recommended temperature.
1. Make sure that:
• The ventilation holes in the blade server bezel are not blocked.
• The management-module event log is not reporting any system environmental
warnings.
2. If the problem remains, call IBM support.
ssss-140
199 252B ssss
Error log analysis indicates poor signal quality.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
ssss-640
ssss
Error log analysis indicates a path error.
1. Check the BladeCenter management-module event log. If an error was recorded
by the system, see “POST progress codes (checkpoints)” on page 84.
2. Replace any parts reported by the diagnostic program.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
Failing function codes 151 through 2E33
Failing function codes (FFCs) identify a function within the system unit that is failing.
Table 29 describes the component that each function code identifies.
Note: When replacing a component, perform system verification for the component. See “Using the
diagnostics program” on page 189.
Table 29. Failing function codes 151 through 2E33
FFC
Description and notes
151
1. Battery
Note: After replacing the battery:
a. Set the time and date.
b. Set the Network IP addresses (for blade servers that start up from a network).
2. System-board and chassis assembly
152
System-board and chassis assembly
166
Check management-module event log for a BladeCenter blower or fan fault. See the
documentation that comes with the BladeCenter unit.
210
System-board and chassis assembly
212
System-board and chassis assembly (cache problem)
214
System-board and chassis assembly
217
System-board and chassis assembly
219
Common Memory Logic problem for memory DIMMs.
Note: If more than one pair of memory DIMMs are reported missing:
1. Replace the memory DIMM at the physical location code that is reported
2. Replace the system-board
221
System-board and chassis assembly
226
System-board and chassis assembly
227
System-board and chassis assembly
241
Ethernet network problem
282
System-board and chassis assembly
292
System-board and chassis assembly (Host – PCI bridge problem)
293
System-board and chassis assembly (PCI – PCI bridge problem)
294
System-board and chassis assembly (MPIC interrupt controller problem)
296
PCI device or adapter problem.
Note: The replacement part can only be identified by the location code reported by diagnostics.
2C4
System-board and chassis assembly
2C6
2 GB DIMM, 4 GB DIMM, 8 GB DIMM
2C7
System-board and chassis assembly (Memory controller)
2C8
System-board and chassis assembly
2C9
System-board and chassis assembly
2D2
System-board and chassis assembly (Bus arbiter problem)
2D3
System-board and chassis assembly
2D4
System-board and chassis assembly (System/SP interface logic problem)
2D5
System-board and chassis assembly (I2C primary)
2D6
System-board and chassis assembly (I2C secondary)
2D7
System-board and chassis assembly (VPD module)
2D9
System-board and chassis assembly (Power controller)
2E0
System-board and chassis assembly (Fan sensor problem)
2E1
System-board and chassis assembly (Thermal sensor problem)
2E2
System-board and chassis assembly (Voltage sensor problem)
2E3
System-board and chassis assembly (Serial port controller problem)
2E4
System-board and chassis assembly (JTAG/COP controller problem)
2E8
System-board and chassis assembly (Cache controller)
308
System-board and chassis assembly (I/O bridge problem)
650
Unknown hard disk drive.
Note: This FFC indicates that the hard disk drive could not configure properly.
711
Unknown adapter
722
Unknown disk drive
7C0
System-board and chassis assembly (microprocessor/system interface)
812
System-board and chassis assembly (Common standard adapter logic problem)
814
System-board and chassis assembly (NVRAM problem)
815
System-board and chassis assembly (floating point processor problem)
817
System-board and chassis assembly (time-of-day logic)
820
System-board and chassis assembly (interprocessor related testing problem)
887
System-board and chassis assembly (integrated Ethernet adapter)
893
LAN adapter
D01
System-board and chassis assembly (cache problem)
E19
System-board and chassis assembly (power supply sensor failed)
2506
3Gb SAS Passthrough Expansion Card
2506-101
Adapter configuration error indicated by test failures. Reconfigure the adapter. If the
problem persists, replace the adapter.
2506-710
Error log analysis reveals a permanent controller failure. Replace the adapter.
2506-713
Error log analysis reveals a controller failure. Replace the adapter.
2506-720
Error log analysis reveals a controller device bus configuration error. Replace the
adapter. If the problem persists, replace the system board and chassis assembly.
252B
System-board and chassis assembly (SAS controller)
2553
SAS 73 GB or SAS 146 GB hard disk drive
2567
System-board and chassis assembly (USB integrated adapter)
25A0
System-board and chassis assembly
25C4
Broadcom Ethernet adapter
2607
Emulex 8Gb PCI-Express Fibre Channel Expansion Card
2624
System-board and chassis assembly (InfiniBand Host Channel Adapter)
2631
System-board and chassis assembly
2D02
System-board and chassis assembly (generic USB reference to controller/adapter)
2E00
Qlogic 4Gb Fibre Channel and Broadcom 1 Gb Ethernet Combo
2E10
Qlogic 4Gb Fibre Channel and Broadcom 1 Gb Ethernet Combo
2E12
QLogic 8Gb Fibre Channel Expansion Card, (CFFh/PCIe)
2E13
QLogic 4Gb Fibre Channel 1Xe PCI-Express Expansion Card (CIOv)
2E14
QLogic 8Gb Fibre Channel 1Xe PCI-Express Expansion Card (CIOv)
2E15
Qlogic 8Gb PCI-E FC Blade Expansion Adapter
2E21
QLogic 10Gb FCoCEE daughtercard
2E22
QLogic 10Gb FCoCEE daughtercard
2E33
Gigabit Ethernet-SX CFFh Adapter
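The FFC column of an SRN table and the descriptions in Table 29 can be combined in a simple lookup. The Python sketch below resolves an FFC list in the order in which the codes are listed. It is a sketch under stated assumptions: the function name is hypothetical, the dictionary holds only three entries copied from Table 29, and it assumes that an ssss entry in the FFC column stands for the part of the SRN before the dash.

```python
# Illustrative subset of Table 29; extend it from the table as needed.
FFC_TABLE = {
    "252B": "System-board and chassis assembly (SAS controller)",
    "2553": "SAS 73 GB or SAS 146 GB hard disk drive",
    "722": "Unknown disk drive",
}


def resolve_ffcs(srn, ffc_column):
    """Return [(ffc, description)] in the order the FFCs are listed.

    Assumption: an 'ssss' entry in the FFC column stands for the part of
    the SRN before the dash.
    """
    prefix = srn.split("-", 1)[0].upper()
    resolved = []
    for entry in ffc_column:
        code = prefix if entry.lower() == "ssss" else entry
        description = FFC_TABLE.get(code.upper(), "not in this subset of Table 29")
        resolved.append((code, description))
    return resolved
```

For example, calling resolve_ffcs("2553-118", ["ssss", "252B"]) lists the hard disk drive entry first and the SAS controller entry second, which follows the rule of replacing parts in the order that the FFCs are listed. The SRN value in this call is made up for illustration.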
Error logs
The power-on self-test (POST), the POWER Hypervisor™ (PHYP), and the service processor write errors
to the BladeCenter management module event log.
Select the Monitors > Event Log option in the management module Web interface to view entries that are
currently stored in the management-module event log. This log includes entries for events that are
detected by the blade servers. The log displays the most recent entries first.
The following table shows the syntax of a nine-word B700xxxx SRC as it might be displayed in the event
log of the management module.
The first word of the SRC in this example is the message identifier, B7001111. This example numbers each
word after the first word to show relative word positions. The seventh word is the direct select address,
which is 77777777 in the example.
Table 30. Nine-word system reference code in the management-module event log
Index  Sev  Source    Date/Time             Text
1      E    Blade_05  01/21/2008, 17:15:14  (PS700-BC1BLD5E) SYS F/W: Error. Replace UNKNOWN
                                            (5008FECF B7001111 22222222 33333333 44444444 55555555
                                            66666666 77777777 88888888 99999999)
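The word positions described for Table 30 can be illustrated with a short Python sketch that pulls the nine SRC words out of the event text. The function name and the returned dictionary keys are hypothetical. The sketch assumes that the SRC is made up of the last nine 8-digit hexadecimal words in the final parenthesized group, because the example shows one additional word (5008FECF) ahead of the message identifier; other event-log layouts might need a different rule.

```python
import re

HEX_WORD = re.compile(r"\b[0-9A-Fa-f]{8}\b")


def parse_nine_word_src(event_text):
    """Extract the nine SRC words from management-module event text.

    Assumption: the SRC words are the last nine 8-digit hexadecimal words
    inside the final parenthesized group, as in the Table 30 example, where
    one extra word (5008FECF) precedes the message identifier.
    """
    start = event_text.rfind("(")
    end = event_text.rfind(")")
    if start == -1 or end < start:
        raise ValueError("no parenthesized SRC group found")
    words = HEX_WORD.findall(event_text[start + 1:end])
    if len(words) < 9:
        raise ValueError("fewer than nine SRC words found")
    src = [w.upper() for w in words[-9:]]
    return {
        "message_id": src[0],             # word 1
        "direct_select_address": src[6],  # word 7
        "words": src,
    }
```

Applied to the text in Table 30, the sketch reports B7001111 as the message identifier and 77777777 as the direct select address, which matches the description above.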
Depending on your operating system and the utilities you have installed, error messages might also be
stored in an operating system log. See the documentation that comes with the operating system for more
information.
See the online information or the BladeCenter Management Module User's Guide for more information about
the event log.
Checkout procedure
The checkout procedure is the sequence of tasks that you should follow to diagnose a problem in the
blade server.
About the checkout procedure
Review this information before performing the checkout procedure.
• Read the Safety topic and the “Installation guidelines” on page 233.
• The firmware diagnostic program provides the primary methods of testing the major components of
the blade server. If you are not sure whether a problem is caused by the hardware or by the software,
you can use the firmware diagnostic program to confirm that the hardware is working correctly. The
firmware diagnostic program runs automatically when the blade server is turned on.
• A single problem might cause more than one error message. When this happens, correct the cause of
the first error message. The other error messages usually will not occur the next time you run the
diagnostic programs.
Exception: If there are multiple error codes or light path diagnostic LEDs that indicate a
microprocessor error, the error might be in a microprocessor or in a microprocessor socket. See
“Microprocessor problems” on page 194 for information about diagnosing microprocessor problems.
• If the blade server hangs on a POST checkpoint, see “POST progress codes (checkpoints)” on page 84.
If the blade server is halted and no error message is displayed, see “Troubleshooting tables” on page
191 and “Solving undetermined problems” on page 227.
• For intermittent problems, check the management-module event log and “POST progress codes
(checkpoints)” on page 84.
• If the blade server front panel shows no LEDs, verify the blade server status and errors in the
BladeCenter Web interface; also see “Solving undetermined problems” on page 227.
• If device errors occur, see “Troubleshooting tables” on page 191.
Performing the checkout procedure
Follow this procedure to perform the checkout.
Step ▌001▐
Perform the following steps:
1. Update the firmware to the current level, as described in “Updating the firmware” on page
263.
2. You might also have to update the management module firmware.
3. If you did not update the firmware for some reason, power off the blade server for 45 seconds
before powering it back on.
4. Establish an SOL session; then continue to Step ▌002▐. If the blade server does not start, see
“Troubleshooting tables” on page 191.
Step ▌002▐
Verify that you have looked up each error code or hung checkpoint and attempted the corrective
action before going to Step ▌003▐:
1. If the firmware hangs on an eight-digit progress code, see “POST progress codes
(checkpoints)” on page 84.
2. If the firmware records an eight-digit error code, see “System reference codes (SRCs)” on page
16.
3. If the AIX operating system records a service request number (SRN), see “Service request
numbers (SRNs)” on page 129.
4. Check the BladeCenter management-module event log. If an error was recorded by the
system, see “POST progress codes (checkpoints)” on page 84 or “System reference codes
(SRCs)” on page 16.
5. If no error was recorded, or if the login prompt appears and you still suspect a problem,
continue to Step ▌003▐.
Step ▌003▐
Is the operating system AIX?
Yes
Record any information or messages that may be in the management module event log;
then go to Step ▌005▐.
No
Go to Step ▌004▐.
Step ▌004▐
Is the operating system Linux?
Yes
Record any information or messages that may be in the management module event log;
then go to Step ▌007▐. If you cannot load the stand-alone Diagnostics CD, answer this
question No.
No
Go to “Solving undetermined problems” on page 227.
Step ▌005▐
Perform the following steps:
Note: When possible, run AIX online diagnostics in concurrent mode. AIX online diagnostics
perform more functions than the stand-alone Diagnostics.
1. Perform the AIX online diagnostics; see “Starting AIX concurrent diagnostics” on page 186.
Record any diagnostic results and see “Service request numbers (SRNs)” on page 129 to
identify the failing component.
Note: When replacing a component, perform system verification for the component. See
“Using the diagnostics program” on page 189.
2. If you cannot perform AIX concurrent online diagnostics, continue to Step ▌006▐.
Step ▌006▐
Perform the following steps:
1. Use the management-module Web interface to make sure that the device from which you load
the stand-alone diagnostics is set as the first device in the blade server boot sequence.
2. Turn off the system unit power and wait 45 seconds before proceeding.
3. Turn on the blade server and establish an SOL session.
4. Check for the following responses:
a. Progress codes are recorded in the management-module event log.
b. Record any messages or diagnostic information that might be in the log.
5. Load the stand-alone diagnostics. Go to “Starting stand-alone diagnostics from a CD” on page
187 or “Starting stand-alone diagnostics from a NIM server” on page 188.
6. If you have replaced the failing component, perform system verification for the component.
See “Using the diagnostics program” on page 189.
This ends the AIX procedure.
Step ▌007▐
Perform the following steps:
1. Use the management-module Web interface to make sure that the device from which you load
the stand-alone diagnostics is set as the first device in the blade server boot sequence.
2. Turn off the blade server and wait 45 seconds before proceeding.
3. Turn on the blade server and establish an SOL session.
4. Check for the following responses:
a. Progress codes are recorded in the management-module event log.
b. Record any messages or diagnostic information that might be in the log.
Continue with Step ▌008▐.
Step ▌008▐
Load the stand-alone diagnostics. Go to “Starting stand-alone diagnostics from a CD” on page
187 or “Starting stand-alone diagnostics from a NIM server” on page 188.
Can you load the stand-alone diagnostics?
No
Go to “Solving undetermined problems” on page 227.
Yes
Select the resources to be tested and record any SRNs; then go to “Service request
numbers (SRNs)” on page 129.
This ends the Linux procedure.
For more information about installing and using all supported operating systems, search the IBM Support
Site.
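The operating-system branch of the checkout procedure (Step 003 and Step 004) can be summarized in a few lines of Python. This is only a sketch of the decision described above; the function and argument names are illustrative, and the sketch does not cover the earlier firmware and error-code steps.

```python
def next_checkout_step(operating_system, can_load_standalone_cd=True):
    """Return the step or topic that Steps 003 and 004 lead to.

    This mirrors only the operating-system branch of the checkout
    procedure; the argument names are illustrative.
    """
    os_name = operating_system.strip().lower()
    if os_name == "aix":
        return "Step 005"  # AIX online diagnostics
    if os_name == "linux" and can_load_standalone_cd:
        return "Step 007"  # stand-alone diagnostics
    # Neither AIX nor Linux, or the stand-alone Diagnostics CD cannot be loaded.
    return "Solving undetermined problems"
```

A Linux blade server for which the stand-alone Diagnostics CD cannot be loaded is treated the same way as an unlisted operating system, as Step 004 instructs.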
Verifying the partition configuration
Perform this procedure if there is a configuration problem with the system or a logical partition.
1. Check the processor and memory allocations of the system or the partition. Processor or memory
resources that fail during system startup could cause the startup problem in the partition. Make sure
that there are enough functioning processor and memory resources in the system for all the partitions.
2. Check the bus and virtual adapter allocations for the partition. Make sure that the partition has load
source and console I/O resources.
3. Make sure that the Boot Mode partition properties are set to Normal.
4. If the problem remains, contact your software service provider for further assistance.
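The resource check in step 1 amounts to comparing what the partitions request with what is still functioning. The following Python sketch shows that comparison for simple, dedicated allocations. The data layout, key names, and units are assumptions made for illustration; they do not correspond to any management interface, and shared-processor configurations are not modeled.

```python
def find_resource_shortfalls(functioning, partitions):
    """Compare partition allocations with the functioning resources.

    `functioning` is a dict such as {"processors": 4, "memory_gb": 32};
    `partitions` maps a partition name to a dict with the same keys.
    All names and units here are illustrative assumptions.
    Returns {resource: amount_short} for every over-committed resource.
    """
    shortfalls = {}
    for resource, available in functioning.items():
        requested = sum(p.get(resource, 0) for p in partitions.values())
        if requested > available:
            shortfalls[resource] = requested - available
    return shortfalls
```

An empty result means that the listed partitions fit within the functioning resources; any other result names the resource that is short and by how much.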
Running the diagnostics program
You can start or run the diagnostics program from the AIX operating system, from a CD, or from a
management server.
Starting AIX concurrent diagnostics
Perform this procedure to start AIX concurrent diagnostics from the AIX operating system.
1. Log in to the AIX operating system as root user, or use the CE login. See “Creating a CE login” on
page 266 for more information. If you need help, contact the system operator.
2. Type diag and press Enter at the operating system prompt to start the diagnostics program and
display its Function Selection menu. See “Using the diagnostics program” on page 189 for more
information about running the diagnostics program.
3. When testing is complete, press F3 until the Diagnostic Operating Instructions panel is displayed, then
press F3 to exit the diagnostic program.
Starting stand-alone diagnostics from a CD
Perform these procedures to start the stand-alone diagnostics from a CD. These procedures can be used if
the blade server is running a Linux operating system or if an AIX operating system cannot start the
concurrent diagnostics program.
You can download the latest version of the stand-alone diagnostics from the Standalone Diagnostics CD
page.
1. Verify with the system administrator and system users that the blade server can be shut down. Stop
all programs; then, shut down the operating system and shut down the blade server. Refer to the
documentation that comes with your operating system for information about shutting down the
operating system.
2. Press the CD button on the front of the blade server to give it ownership of the BladeCenter media
tray.
3. Using the management module Web interface, make sure that:
• The blade server firmware is at the latest version.
• SOL is enabled for the blade server.
• The CD or DVD drive is selected as the first boot device for the blade server.
4. Insert the stand-alone diagnostics CD into the CD or DVD drive.
5. Turn on the blade server and establish an SOL session.
Note: It can take from 3 to 5 minutes to load the stand-alone diagnostics from the CD. Please be
patient.
The screen will display “Please define the System Console.”
6. Type 1 and press Enter to continue.
The Diagnostic Operating Instructions screen will display.
7. Press Enter to continue.
The Function Selection screen will display. See “Using the diagnostics program” on page 189 for more
information about running the diagnostics program.
Note: If the Define Terminal screen is displayed, type the terminal type and press Enter. The use of
“vs100” as the terminal type is recommended; however, the function keys (F#) may not work. In this
case, press Esc and the number in the screen menus. For example, instead of F3 you can press Esc and
3.
8. When testing is complete, press F3 until the Diagnostic Operating Instructions screen is displayed;
then press F3 again to exit the diagnostic program.
9. Remove the CD from the CD or DVD drive.
Starting stand-alone diagnostics from a NIM server
Perform this procedure to start the stand-alone diagnostics from a network installation management
(NIM) server.
Note: See Network Installation Management in the AIX Information Center for information about
configuring the blade server as a NIM server client. Also see the Configuring the NIM Master and
Creating Basic Installation Resources Web page.
1. Verify with the system administrator and system users that the blade server can be shut down. Stop
all programs; then, shut down the operating system and shut down the blade server. Refer to the
documentation that comes with your operating system for information about shutting down the
operating system.
2. If the system is running in a full-machine partition, turn on the blade server and establish an SOL
session.
3. Perform the following steps to check the NIM server boot settings:
a. When the POST menu is displayed, press 1 to start the SMS utility.
b. From the SMS main menu, select Setup Remote IPL (Initial Program Load).
c. From the NIC Adapters menu, select the network adapter that is attached to the NIM server.
d. From the Network Parameters menu, select IP Parameters.
e. Enter the client, server, and gateway IP addresses (if applicable), and enter the subnet mask. If
there is no gateway between the NIM server and the client, set the gateway address to 0.0.0.0. See
your network administrator to determine if there is a gateway.
f. If the NIM server is set up to allow pinging the client system, use the Ping Test option on the
Network Parameters menu to verify that the client system can ping the NIM server.
Note: If the ping fails, see “Boot problem resolution” on page 190; then, follow the steps for
network boot problems.
4. When the ping is successful, start the blade server from the NIM server.
5. Establish an SOL session.
If the Diagnostic Operating Instructions screen is displayed, the diagnostics program has started
successfully.
Note: If the AIX login prompt is displayed, the diagnostics program did not load. See “Boot problem
resolution” on page 190; then, follow the steps for network boot problems.
6. Press Enter to continue.
The Function Selection screen will display. See “Using the diagnostics program” on page 189 for more
information about running the diagnostics program.
Note: If the Define Terminal screen is displayed, type the terminal type and press Enter. The use of
“vs100” as the terminal type is recommended; however, the function keys (F#) may not work. In this
case, press Esc and the number in the screen menus. For example, instead of F3 you can press Esc and
3.
7. When testing is complete, press F3 until the Diagnostic Operating Instructions screen is displayed;
then press F3 again to exit the diagnostic program.
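The gateway rule in step 3e can be sketched with the Python standard library ipaddress module. The sketch approximates "no gateway between the NIM server and the client" as both addresses being in the same IPv4 subnet, which is an assumption; your network administrator remains the authority on whether a gateway is present. The function name is hypothetical.

```python
import ipaddress


def suggested_gateway(client_ip, server_ip, subnet_mask, gateway_ip=None):
    """Return the gateway value to enter in the SMS IP Parameters menu.

    Assumption: "no gateway between the NIM server and the client" is
    approximated as both addresses being in the same IPv4 subnet.
    """
    network = ipaddress.ip_network(f"{client_ip}/{subnet_mask}", strict=False)
    if ipaddress.ip_address(server_ip) in network:
        return "0.0.0.0"
    if gateway_ip is None:
        raise ValueError("server is on another subnet; a gateway address is needed")
    return gateway_ip
```

For a client at 192.168.1.10 and a NIM server at 192.168.1.2 with mask 255.255.255.0, the sketch returns 0.0.0.0; these addresses are examples only.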
Using the diagnostics program
Follow the basic procedures for running the diagnostics program.
1. Start the diagnostics from the AIX operating system, from a CD, or from a management server. See
“Starting AIX concurrent diagnostics” on page 186, “Starting stand-alone diagnostics from a CD” on
page 187, or “Starting stand-alone diagnostics from a NIM server” on page 188.
2. The Function Selection menu is displayed. Use the steps listed to perform one of the following tasks:
v Problem Determination
a. From the Function Selection menu, select Diagnostic Routines and press Enter.
b. From the Diagnostic Mode Selection menu, select Problem Determination
c. Select the resource to be tested and press F7=Commit.
d. Record any results provided and go to “Service request numbers (SRNs)” on page 129 to
identify the failure and perform the action(s).
e. When testing is complete, press F3 to return to the Diagnostic Selection menu. If you want to
run another test, press F3 again to return to the Function Selection menu.
v System Verification
a. From the Function Selection menu, select Diagnostic Routines and press Enter.
b. From the Diagnostic Mode Selection menu, select System Verification.
c. Select the resource to be tested and press F7=Commit.
d. Record any results provided and go to “Service request numbers (SRNs)” on page 129 to
identify the failure and perform the action(s).
e. When testing is complete, press F3 to return to the Diagnostic Selection menu. If you want to
run another test, press F3 again to return to the Function Selection menu.
v Task selection
a. From the Function Selection menu, select Task Selection and press Enter.
b. Select the task to be run and press Enter.
c. If the Resource Selection List menu is displayed, select the resource on which the task is to be
run and press F7=Commit.
d. Follow the instruction for the selected task.
e. When the task is complete, press F3 to return to the Task Selection List menu. If you want to
run another test, press F3 again to return to the Function Selection menu.
3. When testing is complete, press F3 until the Diagnostic Operating Instructions screen is displayed;
then press F3 again to exit the diagnostic program.
Boot problem resolution
Depending on the boot device, a checkpoint might be displayed in the list of checkpoints in the
management module for an extended period of time while the boot image is retrieved from the device.
This situation is particularly true for CD and network boot attempts. When booting from a CD, watch for
a blinking activity LED on the CD or DVD drive. A blinking activity LED indicates that the loading of
either the boot image, or additional information required by the operating system being booted, is still in
progress. If the checkpoint is displayed for an extended period of time and the CD-drive or DVD-drive
activity LED is not blinking, there might be a problem loading the boot image from the device.
Note: For network boot attempts, if the system is not connected to an active network, or if there is no
server configured to respond to the system's boot request, the system will still attempt to boot. Because
time-out durations are necessarily long to accommodate retries, the system might appear to be hung.
If you suspect a problem loading the boot image, complete the following steps.
1. Make sure that your boot list is correct.
a. From the BladeCenter management-module Web interface, display the boot sequences for the
blade servers in your BladeCenter unit: Blade Tasks > Configuration > Boot Sequence.
b. Find your blade server on the list that is displayed and make sure that the device from which you
are attempting to boot is the first device in the boot sequence. If it is not, select your blade server
from the list of servers and modify the boot sequence. Cycle power on your blade server to retry
the boot.
Note: If Network is selected, the blade server will try to boot from both Ethernet ports on the
system board.
c. If this boot attempt fails, do the following:
1) If you are attempting to boot from the network, go to step 2.
2) If you are attempting to boot from the CD or DVD drive, go to step 3.
3) If you are attempting to boot from a hard disk drive, go to step 4.
2. If you are attempting to boot from the network:
a. Make sure that the network cabling to the BladeCenter network switch is correct.
b. Check with the network administrator to make sure that the network is up.
c. Verify that the blade server for your system is running and configured to respond to your system.
d. Turn the blade server power off; then, turn it on and retry the boot operation.
e. If the boot still fails, replace the system-board and chassis assembly.
3. If you are attempting to boot from the CD or DVD drive:
a. From the BladeCenter management-module Web interface, make sure that the media tray is
assigned to your blade server: Blade Tasks > Remote Control.
b. Turn the blade server power off; then, turn it on and retry the boot operation.
c. If the boot fails, try a known-good bootable CD.
d. If possible, try to boot another blade server in the BladeCenter unit to verify that the CD or DVD
drive is functional.
v If the CD boots on the second server, replace the system-board and chassis assembly in the
PS700 blade server you were originally trying to boot.
v If the CD fails on the second server, replace the CD or DVD drive in the media tray.
e. If replacing the CD or DVD drive does not resolve the problem, replace the media tray.
f. If booting on all servers fails using the new media tray, replace the following in the BladeCenter
unit:
v Management module
v Midplane
4. If you are attempting to boot from a hard disk drive:
a. Verify that the hard disk drive is installed.
b. Select the CD or DVD drive as the boot device.
c. Go to “Performing the checkout procedure” on page 184.
d. Reload the operating system onto the hard disk drive if the boot attempts from that disk continue
to fail.
e. Replace the suspect hard disk drive if you are not able to load the operating system.
f. Replace the system-board; then, retry loading the operating system.
Troubleshooting tables
Use the troubleshooting tables to find solutions to problems that have identifiable symptoms.
If these symptoms relate to shared BladeCenter unit resources, see “Solving shared BladeCenter resource
problems” on page 222. If you cannot find the problem in these tables, see “Running the diagnostics
program” on page 186 for information about testing the blade server.
If you have just added new software or a new optional device and the blade server is not working,
complete the following steps before using the troubleshooting tables:
1. Remove the software or device that you just added.
2. Run the diagnostic tests to determine whether the blade server is running correctly.
3. Reinstall the new software or new device.
General problems
Identify general problem symptoms and corrective actions.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
An LED is not working or a
similar problem has occurred.
If the part is a CRU, replace it. If the part is a FRU, the part must be replaced by a
trained service technician.
Drive problems
Identify hard disk drive problem symptoms and what corrective actions to take.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
Not all drives are recognized by
the disk drive firmware or
operating system.
1. Run diagnostics.
2. Reseat the drive.
3. Run the diagnostics again.
4. If the remaining drives are recognized, replace the drive that you removed
with a new one.
System stops responding
during drive operating system
commands to test or look for
bad blocks.
1. Run diagnostics
2. Reseat the drive
3. Run the diagnostics again
4. If the drive diagnostic test runs successfully, replace the drive you removed
with a new one.
Intermittent problems
Identify intermittent problem symptoms and corrective actions.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
A problem occurs only
occasionally and is difficult to
diagnose.
1. Check the Advanced Management Module event log for errors.
2. Make sure that:
v When the blade server is turned on, air is flowing from the rear of the blade
server at the blower grill. If there is no airflow, the blower is not working.
This causes the blade server to overheat and shut down.
v The self-configuring SCSI device (SCSD) bus and devices are
configured correctly.
Management module service processor problems
Determine if a problem is a management module service processor problem and, if so, the corrective
action to take.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
Service processor in the
management module reports a
general monitor failure.
Disconnect the BladeCenter unit from all electrical sources, wait for 30 seconds,
reconnect the BladeCenter unit to the electrical sources, and restart the blade
server.
If the problem remains, see “Solving undetermined problems” on page 227, or the
Advanced Management Module Guide.
Memory problems
Identify memory problem symptoms and what corrective actions to take.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
The amount of system memory
displayed is less than the
amount of physical memory
installed.
1. Make sure that:
v All installed memory is listed in the output of the lscfg -vp command.
v The memory modules are seated properly.
v You have installed the correct type of memory.
v All banks of memory on the DIMMs are enabled. The blade server might
have automatically disabled a DIMM bank when it detected a problem or a
DIMM bank could have been manually disabled.
2. Check the Advanced Management Module event log for an error message.
v If the DIMM was disabled by a system-management interrupt (SMI), replace
the DIMM.
v If the DIMM was disabled by the POST, obtain the eight-digit error code
and location code and replace the failing DIMM.
3. Reseat the DIMM.
4. Replace the DIMM.
5. Replace the system-board.
Microprocessor problems
Identify microprocessor problem symptoms and what corrective actions to take.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
The blade server will not boot
or a checkpoint or firmware
error code is logged in the
management-module event log
(the startup microprocessor is
not working correctly)
1. If a checkpoint or firmware error was logged in the Advanced Management
Module event log, correct that error.
2. If no error was logged, restart the blade server and check the management
module event log again for error codes.
3. Replace the system-board.
Network connection problems
Identify network connection problem symptoms and what corrective actions to take.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
One or more blade servers are
unable to communicate with
the network.
1. Verify that the I/O module is a switch or pass-thru module. Pass-thru
modules require an upstream switch for network traffic to pass.
2. Ensure the correct device drivers are installed.
3. Verify that the optional I/O expansion cards are correctly installed and
configured. See “Removing and installing an I/O expansion card” on page 246.
PCI expansion card (PIOCARD) problem isolation procedure
The hardware that controls PCI adapters and PCI card slots detected an error. The direct select address
(DSA) portion of the system reference code (SRC) identifies the location code of the failing component.
The following table shows the syntax of a nine-word B700xxxx SRC as it might be displayed in the event
log of the management module.
The first word of the SRC in this example is the message identifier, B7001111. This example numbers each
word after the first word to show relative word positions. The seventh word is the direct select address,
which is 77777777 in the example.
Table 31. Nine-word system reference code in the management-module event log
Index: 1
Sev: E
Source: Blade_05
Date/Time: 01/21/2008, 17:15:14
Text: (PS700-BC1BLD5E) SYS F/W: Error. Replace UNKNOWN
(5008FECF B7001111 22222222 33333333 44444444 55555555
66666666 77777777 88888888 99999999)
Depending on your operating system and the utilities you have installed, error messages might also be
stored in an operating system log. See the documentation that comes with the operating system for more
information.
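The word positions described above can be illustrated with a short sketch. It assumes that the text field has the layout shown in Table 31: the last parenthesized group holds one leading eight-digit value followed by the nine SRC words. The function names are hypothetical and are not part of any IBM tool.

```python
import re


def split_src_words(entry_text):
    """Return the SRC words from the text field of one event log entry.

    Assumption: the last parenthesized group contains one leading value
    (5008FECF in Table 31) followed by the SRC words, so the first item
    returned is the message identifier (B7001111 in Table 31).
    """
    groups = re.findall(r"\(([^()]*)\)", entry_text)
    if not groups:
        raise ValueError("no parenthesized SRC group found")
    tokens = groups[-1].split()
    if len(tokens) < 2:
        raise ValueError("SRC group is too short")
    return tokens[1:]  # drop the leading value; index 0 is SRC word 1


def direct_select_address(entry_text):
    """Return word 7 of the SRC, which the guide calls the DSA."""
    words = split_src_words(entry_text)
    if len(words) < 7:
        raise ValueError("SRC has fewer than seven words")
    return words[6]


if __name__ == "__main__":
    sample = ("(PS700-BC1BLD5E) SYS F/W: Error. Replace UNKNOWN "
              "(5008FECF B7001111 22222222 33333333 44444444 55555555 "
              "66666666 77777777 88888888 99999999)")
    print(direct_select_address(sample))
```

The sketch takes the last parenthesized group rather than the first because the entry begins with a parenthesized blade name.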
Table 32 shows the procedure for isolating which PCI expansion card is failing.
Table 32. PCI expansion card problem isolation procedure
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
A B700xxxx error message indicates a problem with a PCI
expansion card.
1. Collect the error log information.
2. Get the DSA, which is word 7 of the associated
B700xxxx SRC.
3. Use the hexadecimal value of the DSA to
determine the location code of the failing CRU.
v If the value is 05120010, the location code is
P1-C11.
v If the value is xxxx 0100, the location code is
P1-C12.
4. Reseat the device that you just installed.
5. Replace the device that you just installed.
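Step 3 of the procedure is a lookup from the DSA value to a location code. The following sketch shows one way to express that lookup. It assumes that "x" in the second pattern stands for any hexadecimal digit and that the space in "xxxx 0100" is only visual grouping; both are interpretations, and the names are hypothetical.

```python
# Patterns taken from the isolation procedure. A lowercase "x" is treated
# as a wildcard for one hexadecimal digit (an assumption of this sketch).
DSA_LOCATIONS = [
    ("05120010", "P1-C11"),
    ("xxxx0100", "P1-C12"),
]

HEX_DIGITS = set("0123456789ABCDEF")


def location_code_for_dsa(dsa):
    """Return the location code for a DSA value, or None if nothing matches."""
    value = dsa.replace(" ", "").upper()
    if len(value) != 8 or not set(value) <= HEX_DIGITS:
        return None
    for pattern, location in DSA_LOCATIONS:
        if all(p == "x" or p == v for p, v in zip(pattern, value)):
            return location
    return None


if __name__ == "__main__":
    print(location_code_for_dsa("05120010"))
```

A value that matches neither pattern returns None, which leaves the decision to the remaining steps of the procedure.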
Optional device problems
Identify optional device problem symptoms and what corrective actions to take.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
An IBM optional device that
was just installed does not
work.
1. Make sure that:
v The option is designed for the blade server. See the ServerProven list at
www.ibm.com/servers/eserver/serverproven/compat/us/.
v You followed the installation instructions that came with the option.
v The option is installed correctly.
v You have not loosened any other installed devices or cables.
2. If the option comes with its own test instructions, use those instructions to test
the option.
3. Reseat the device that you just installed.
4. Replace the device that you just installed.
Power problems
Identify power problem symptoms and what corrective actions to take.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
Power switch does not work.
1. Use the BladeCenter management module to verify that local power control for
the blade server is enabled.
2. Reseat the control-panel connector.
3. Replace the bezel assembly.
4. Replace the system-board.
The blade server does not turn
on.
1. Make sure that:
a. The power LED on the front of the BladeCenter unit is on.
b. The LEDs on all the BladeCenter power modules are on.
c. The blade server is in a blade bay that is supported by the power modules
installed in the BladeCenter unit.
d. If the power LED is flashing rapidly and continues to flash rapidly, the
blade server is not communicating with the management module; reseat
the blade server by following these procedures:
v “Removing the tier 2 management card” on page 255
v “Installing the tier 2 management card” on page 256
v If reseating the blade server or the management card does not resolve
the problem, go to step 3.
e. If the power LED is off instead of flashing slowly or rapidly, the blade bay
is not receiving power, the blade server is defective, or the LED
information panel is loose or defective.
f. Local power control for the blade server is enabled (use the BladeCenter
Advanced Management Module Web interface to verify), or the blade server
was instructed through the management module Web interface to start.
2. If you just installed a device in the blade server, remove it, and restart the
blade server. If the blade server now starts, you might have installed more
devices than the power to that blade bay supports.
3. Try another blade server in the blade bay; if it works, replace the faulty blade
server.
4. See “Solving undetermined problems” on page 227.
The blade server turns off for
no apparent reason.
1. Make sure that each blade bay has a blade server, expansion unit, or blade
filler correctly installed. If these components are missing or incorrectly
installed, an over-temperature condition might result in shutdown.
2. Check the Advanced Management Module event log for error messages.
3. Check the blade server light path. See “Light path diagnostics LEDs” on page
215.
4. If the system board error LED is lit, replace the system board.
The blade server does not turn
off.
1. Verify whether you are using an ACPI or non-ACPI operating system. If you
are using a non-ACPI operating system:
a. Press Ctrl+Alt+Delete.
b. Turn off the system by holding the power-control button for 4 seconds.
c. If the blade server fails during POST and the power-control button does
not work, remove the blade server from the bay and reseat it.
2. If the problem remains or if you are using an ACPI-aware operating system,
suspect the system board.
3. Verify the Power button is working correctly.
4. Ensure local power control for the blade server is enabled.
POWER Hypervisor (PHYP) problems
The POWER Hypervisor (PHYP) provides error diagnostics with associated error codes and fault
isolation procedures for troubleshooting.
When the POWER7 Hypervisor error analysis determines a specific fault, the hypervisor logs an error
code that identifies a failing component. When the analysis is not definitive, the hypervisor logs one or
more isolation procedures for you to run to identify and correct the problem.
Table 33 describes the isolation procedures.
Table 33. POWER Hypervisor isolation procedures
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Isolation
Procedure Code
Symptom
Action
LPARCFG
Symbolic CRU
There is a
configuration
problem with the
system or a logical
partition.
1. Perform the procedure associated with the SRC code that is called
out after the LPARCFG call.
2. Check processor and memory allocations of the system or the
partitions.
Verify that there are enough functioning processor and memory
resources in the system for all of the partitions. Processor or memory
resources that failed or were Garded during system IPL could cause
the IPL problem in the partition.
3. Check the bus and I/O adapter allocations for the partition.
Verify that the partition has load source and console I/O resources.
4. Check the IPL mode of the system or failing partition.
5. For further assistance, contact IBM Support.
MEMDIMM
Symbolic CRU
The failing
component is one of
the memory DIMMs.
1. Replace the failing CRU:
DIMM 1 (Px-C1)
P1-C1 is memory module 1;
P2-C1 is memory module 9.
DIMM 2 (Px-C2)
Memory module 2/10
DIMM 3 (Px-C3)
Memory module 3/11
DIMM 4 (Px-C4)
Memory module 4/12
DIMM 5 (Px-C5)
Memory module 5/13
DIMM 6 (Px-C6)
Memory module 6/14
DIMM 7 (Px-C7)
Memory module 7/15
DIMM 8 (Px-C8)
Memory module 8/16
2. See “Removing a memory module” on page 244 for location
information and the removal procedure.
3. Install new memory DIMMs, as described in “Installing a memory
module” on page 245.
See “Supported DIMMs” on page 4 for more information.
NEXTLVL
Symbolic CRU
Contact IBM Support.
PIOCARD
Symbolic CRU
The hardware that
controls PCI adapters
and PCI card slots
detected an error. The
direct select address
(DSA) portion of the
system reference code
(SRC) identifies the
location code of the
failing component.
1. Collect the error log information.
2. Get the DSA, which is word 7 of the associated B700xxxx SRC.
3. Use the hexadecimal value of the DSA to determine the location code
of the failing CRU.
v If the value is 05120010, the location code is P1-C11.
v If the value is xxxx 0100, the location code is P1-C12.
4. Reseat the device that you just installed.
5. Replace the device that you just installed.
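The MEMDIMM row of Table 33 pairs each connector location code with a memory module number: connectors on P1 map to modules 1 through 8, and connectors on P2 map to modules 9 through 16. The following sketch restates that numbering as arithmetic. The function name is hypothetical.

```python
import re


def memory_module_number(location_code):
    """Map a DIMM location code such as P2-C3 to its memory module number.

    Follows the MEMDIMM row of Table 33: P1-C1 is memory module 1 and
    P2-C1 is memory module 9, so P2 connectors are offset by 8.
    """
    match = re.fullmatch(r"P([12])-C([1-8])", location_code.strip())
    if match is None:
        raise ValueError(f"not a DIMM location code: {location_code!r}")
    planar = int(match.group(1))
    connector = int(match.group(2))
    return connector + 8 * (planar - 1)


if __name__ == "__main__":
    for code in ("P1-C1", "P2-C1", "P2-C8"):
        print(code, memory_module_number(code))
```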
Service processor problems
The baseboard management controller (BMC) is a flexible service processor that provides error
diagnostics with associated error codes, and fault isolation procedures for troubleshooting.
Note: Resetting the service processor causes a POWER7 reset/reload, which generates a dump. The
dump is recorded in the management module event log. The reset/reload dump occurs whenever the
service processor resets, such as when resetting the service processor through the management module
Web interface or through the management module command line interface.
When the advanced POWER7 service processor error analysis determines a specific fault, the service
processor logs an error code to identify the failing component. When the analysis is not definitive, the
service processor logs one or more isolation procedures for you to run to identify and correct the
problem.
The service processor reports fault isolation procedure codes to identify a specific service action. The
isolation procedure code is recorded in the management-module event log.
A message with three procedures might be similar to the following example, except that the entry would
be on one line in the event log:
(SN#YL31W7120029) SYS F/W: CEC Hardware VPD.
See procedure FSPSP07, FSPSP28 then FSP0200
(5000004C B15A3303 22222222 33333333 44444444 55555555
66666666 77777777 88888888 99999999)
B15A3303 is the identifier word of the associated SRC. The rest of the nine words in the SRC are shown in
sequence.
A message that identifies customer replaceable units (CRUs) might be similar to the following example:
(SN#YL31W7120029) SYS F/W: Error. Replace PIOCARD then Sys Brd
(500213A0 B7006973 22222222 33333333 44444444 55555555
66666666 77777777 88888888 99999999)
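The two message forms above list their callouts in order, separated by commas and the word "then". The following sketch pulls that ordered list out of an entry. It assumes that the entry is on one line, as it is in the event log, and that the callout list ends at the opening parenthesis of the SRC group. The function name is hypothetical.

```python
import re


def parse_callouts(entry_text):
    """Return (kind, names) for one event log entry.

    kind is "procedure" for a "See procedure ..." message, "replace" for
    a "Replace ..." message, and None when neither phrase is present.
    names keeps the callouts in the order in which they are listed.
    """
    match = re.search(r"(See procedure|Replace)\s+(.+?)\s*\(", entry_text)
    if match is None:
        return None, []
    kind = "procedure" if match.group(1) == "See procedure" else "replace"
    parts = re.split(r",|\bthen\b", match.group(2))
    names = [part.strip() for part in parts if part.strip()]
    return kind, names


if __name__ == "__main__":
    entry = ("(SN#YL31W7120029) SYS F/W: Error. Replace PIOCARD then Sys Brd "
             "(500213A0 B7006973 22222222 33333333 44444444 55555555 "
             "66666666 77777777 88888888 99999999)")
    print(parse_callouts(entry))
```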
A message with multiple replacement callouts might be too long to display. In such a case, the message
removes SRC words starting with word 2 and inserts an X for every removed word. The following
example shows an error log entry that did not have enough room for words 2 and 3:
(SN#YL31W7120029) SYS F/W: CEC Hardware VPD.
See procedure FSPSP07, FSPSP28 then FSP0200
(50000014 B15A3303 XX 44444444 55555555 66666666
77777777 88888888 99999999)
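Because removed words are replaced by X characters, word positions can still be recovered from a shortened entry. The following sketch restores the positions and marks each removed word as None. It assumes that each X stands for exactly one removed word, whether the X characters are run together or separated by spaces, and that a leading value precedes the identifier word as in the examples. The function name is hypothetical.

```python
def expand_truncated_src(entry_text):
    """Return the SRC words of an entry, with None for each removed word.

    Index 0 of the result is SRC word 1 (the identifier word), so word 7
    is at index 6 even when earlier words were removed from the message.
    """
    start = entry_text.rfind("(")
    end = entry_text.rfind(")")
    if start == -1 or end < start:
        raise ValueError("no parenthesized SRC group found")
    tokens = entry_text[start + 1:end].split()
    words = []
    for token in tokens[1:]:  # skip the leading value before the identifier
        if set(token) == {"X"}:
            words.extend([None] * len(token))  # one placeholder per X
        else:
            words.append(token)
    return words


if __name__ == "__main__":
    entry = ("(SN#YL31W7120029) SYS F/W: CEC Hardware VPD. "
             "See procedure FSPSP07, FSPSP28 then FSP0200 "
             "(50000014 B15A3303 XX 44444444 55555555 66666666 "
             "77777777 88888888 99999999)")
    print(expand_truncated_src(entry))
```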
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Isolation
Procedure Code
Symptom
Action
ANYPROC
Symbolic CRU
The failing
component is one of
the system
processors.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BCPROB
Symbolic CRU
Error code 1xxx2670
indicates that the
BladeCenter
encountered a
problem, and the
blade server was
automatically shut
down as a result.
1. Check the management-module event log for entries that were made
around the time that the PS700 blade server shut down.
2. Resolve any problems.
3. Remove the blade from the BladeCenter unit and then reinsert the
blade server.
4. Power on the blade server.
5. Monitor the blade server operation to verify that the problem is
solved.
6. If the BladeCenter unit is functioning normally, but the 1xxx2670
problem persists, replace the system board and chassis assembly, as
described in “Replacing the FRU system-board and chassis assembly”
on page 260.
CAPACTY
Symbolic CRU
The failing
component is the
management card.
1. Replace the management card, as described in “Removing the tier 2
management card” on page 255 and “Installing the tier 2
management card” on page 256.
2. After replacing the card and installing the blade server in the chassis
unit and before rebooting the blade server or performing other
operations, ensure that the initialization of the management card
VPD occurs by waiting for the management module to discover the
blade server. Otherwise, the system might fail to IPL.
CLCKMOD
Symbolic CRU
The logic oscillator is
failing.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
DTRCARD
Symbolic CRU
Error code 1xxx2625,
2626, or 2527
indicates that the
blade server is
reporting a problem
with the PCIe
expansion card.
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to “Checkout procedure” on page 184.
4. If the problem persists, replace the system board and chassis
assembly, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
FSPSP01
A part vital to system
function has been
deconfigured. Review
the system error logs
for errors that call out
CRUs that are
relevant to each
reason code.
If replacing parts does not resolve the error, perform one of the
following procedures, based on the SRC code that is called out after the
FSPSP01 call.
If the SRC is B1xxB10C or B1xxB10D
The system has detected a deconfigured memory controller that is
required for the system to function, or it has detected that there is not
enough memory or that the memory is plugged incorrectly.
1. Ensure the memory is plugged correctly. See “Supported DIMMs” on
page 4. Is the memory plugged correctly?
v Yes: Go to step 3.
v No: Correct the memory plugging problem and continue with the
next step.
2. Install the blade server into the BladeCenter unit and restart the
blade to verify whether the problem is solved. Does the problem
persist?
v Yes: Continue with the next step.
v No: This ends the procedure.
3. Perform the following steps:
a. Reseat all of the memory DIMMs but do not replace any memory
DIMMs at this time. Reseat the memory DIMMs as described in
“Installing a memory module” on page 245.
b. Install the blade server into the BladeCenter unit and restart the
blade to verify whether the problem is solved.
Does the problem persist?
v Yes: Continue with the next step.
v No: This ends the procedure.
4. Perform the following steps:
a. Replace the first memory DIMM pair.
b. Install the blade server into the BladeCenter unit and restart the
blade to verify whether the problem is solved.
Does the problem persist?
v Yes: Repeat this step and replace the next memory DIMM pair.
If you have replaced all of the memory DIMM pairs, then
continue with the next step.
v No: This ends the procedure.
5. Replace the system-board and chassis assembly. This ends the
procedure.
If the SRC is B1xxB107 or B1xxB108
The system has detected a problem with a clock card.
1. Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page
260.
If the SRC is B1xxB106
The system has detected that the planars are deconfigured.
1. Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page
260.
If the SRC is B1xxB110 or B1xxB111
The system has detected that all of the I/O bridges are deconfigured.
1. Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page
260.
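The FSPSP01 branches are selected by matching the SRC against patterns in which "xx" stands for any two characters. The following sketch shows that matching. It treats each lowercase "x" as a wildcard for one character, which is an interpretation, and the short condition labels and function names are hypothetical summaries of the descriptions above.

```python
# SRC patterns and the condition that each one indicates, summarized from
# the FSPSP01 procedure. Lowercase "x" matches any single character.
FSPSP01_CASES = [
    (("B1xxB10C", "B1xxB10D"), "memory"),
    (("B1xxB107", "B1xxB108"), "clock card"),
    (("B1xxB106",), "planars"),
    (("B1xxB110", "B1xxB111"), "I/O bridges"),
]


def matches(pattern, src):
    """Return True if src fits pattern, with "x" as a one-character wildcard."""
    return len(pattern) == len(src) and all(
        p == "x" or p == s for p, s in zip(pattern, src)
    )


def fspsp01_condition(src):
    """Return the condition label for an SRC, or None if no pattern matches."""
    value = src.strip().upper()
    for patterns, condition in FSPSP01_CASES:
        if any(matches(pattern, value) for pattern in patterns):
            return condition
    return None


if __name__ == "__main__":
    print(fspsp01_condition("B158B10C"))
```

Only the memory condition leads to a multi-step procedure; the other three conditions all lead to replacing the system board and chassis assembly.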
FSPSP02
This procedure is for
boot failures that
terminate very early
in the boot process or
when the
management card or
the VPD data on the
management card is
not operational or is
not present.
1. Replace the management card, as described in “Removing the tier 2
management card” on page 255 and “Installing the tier 2
management card” on page 256.
2. After replacing the card and installing the blade server in the chassis
unit and before rebooting the blade server or performing other
operations, ensure that the initialization of the management card
VPD occurs by waiting for the management module to discover the
blade server. Otherwise, the system might fail to IPL.
3. If the problem persists, replace the system board and chassis
assembly, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
FSPSP03
A system operator or
user error has
occurred.
Refer to the documentation for the function you were attempting to
perform.
FSPSP04
A problem has been detected in the service processor firmware.
1. Verify that the operating system is running. If it is running, perform an in-band firmware
update, as described in “Updating the firmware” on page 263.
2. If the problem persists, replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP05
The service processor has detected a problem in the platform firmware.
1. Verify that the operating system is running. If it is running, perform an in-band firmware
update, as described in “Updating the firmware” on page 263.
2. If the problem persists, replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP06
The service processor reported a suspected intermittent problem.
Contact IBM Support.
FSPSP07
The time of day has
been reset to the
default value.
1. Use the chdate command to set the Virtual I/O Server date and time, using one of the following
syntaxes:
chdate [-year YYyy] [-month mm] [-day dd] [-hour HH] [-minute MM] [-timezone TZ]
chdate mmddHHMM[YYyy|yy] [-timezone TZ]
2. If the problem persists, replace the battery, as described in
“Removing the battery” on page 250 and “Installing the battery” on
page 251.
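As an illustration of the compact chdate syntax in the FSPSP07 action, the following Python sketch formats a date and time as the mmddHHMMYYyy argument. The function name build_chdate_command and the time zone value used in the usage note are assumptions made for this example; they are not part of the Virtual I/O Server command set. Check the resulting command line against the chdate documentation before running it.

```python
from datetime import datetime
from typing import Optional


def build_chdate_command(when: datetime, timezone: Optional[str] = None) -> str:
    """Return a chdate command line in the compact mmddHHMMYYyy form."""
    # %m%d%H%M%Y yields month, day, hour, minute, and a four-digit year.
    stamp = when.strftime("%m%d%H%M%Y")
    command = f"chdate {stamp}"
    if timezone:
        # The -timezone flag is optional in both documented syntaxes.
        command += f" -timezone {timezone}"
    return command


if __name__ == "__main__":
    # 9 March 2011, 14:05 becomes 030914052011.
    print(build_chdate_command(datetime(2011, 3, 9, 14, 5)))
```

For example, build_chdate_command(datetime(2011, 12, 31, 23, 59), "CST6CDT") returns a command line that ends in -timezone CST6CDT; the value CST6CDT is only a placeholder for the time zone that applies to your system.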
FSPSP09
A problem has been
detected with a
memory DIMM, but
it cannot be isolated
to a specific memory
DIMM.
Replace the CRU called out after this FSPSP call. If the CRU that is
called out is a DIMM CRU, perform the following procedure:
1. Replace both memory DIMMs of the pair on the microprocessor that
contains the failing CRU:
DIMM 1 (Px-C1)
For P1-C1, replace DIMMs 1 and 3; for P2-C1, replace
DIMMs 9 and 11.
DIMM 2 (Px-C2)
Replace DIMMs 2 and 4, or DIMMs 10 and 12.
DIMM 3 (Px-C3)
Replace DIMMs 1 and 3, or DIMMs 9 and 11.
DIMM 4 (Px-C4)
Replace DIMMs 2 and 4, or DIMMs 10 and 12.
DIMM 5 (Px-C5)
Replace DIMMs 5 and 7, or DIMMs 13 and 15.
DIMM 6 (Px-C6)
Replace DIMMs 6 and 8, or DIMMs 14 and 16.
DIMM 7 (Px-C7)
Replace DIMMs 5 and 7, or DIMMs 13 and 15.
DIMM 8 (Px-C8)
Replace DIMMs 6 and 8, or DIMMs 14 and 16.
2. See “Removing a memory module” on page 244 for location
information and the removal procedure.
3. Install new memory DIMMs, as described in “Installing a memory
module” on page 245.
See “Supported DIMMs” on page 4 for more information.
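The DIMM pairing listed for FSPSP09 can be expressed as a small lookup. The following Python sketch is one possible way to do that, not an IBM tool; the names dimm_pair and P1_PAIRS are hypothetical, and the sketch assumes that location codes are written exactly as P1-Cn or P2-Cn with n from 1 to 8.

```python
import re
from typing import Tuple

# Connector number -> DIMM pair for P1, taken from the FSPSP09 list.
P1_PAIRS = {
    1: (1, 3), 2: (2, 4), 3: (1, 3), 4: (2, 4),
    5: (5, 7), 6: (6, 8), 7: (5, 7), 8: (6, 8),
}


def dimm_pair(location_code: str) -> Tuple[int, int]:
    """Return the pair of DIMM numbers to replace for a code such as 'P1-C3'."""
    match = re.fullmatch(r"P([12])-C([1-8])", location_code.strip().upper())
    if match is None:
        raise ValueError(f"not a DIMM location code: {location_code!r}")
    planar, connector = int(match.group(1)), int(match.group(2))
    # P2 connectors use the same pairing, numbered 8 higher (DIMMs 9 to 16).
    offset = 8 * (planar - 1)
    first, second = P1_PAIRS[connector]
    return (first + offset, second + offset)


if __name__ == "__main__":
    for code in ("P1-C1", "P2-C1", "P2-C6"):
        print(code, dimm_pair(code))
```

A lookup table is used instead of arithmetic on the connector number because the pairing (1 with 3, 2 with 4, 5 with 7, 6 with 8) is not a simple adjacent-slot rule.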
FSPSP10
The part indicated in
the CRU callout that
follows this
procedure is invalid
or missing for this
system's
configuration.
1. If there is only one CRU called out after this FSPSP10 call:
a. Verify that the CRU is installed, connected, and seated properly.
b. If the CRU is seated properly and the problem persists, replace
the CRU.
c. If the CRU is missing, add the CRU.
2. If multiple CRUs are called out, they have identical serial numbers.
Remove all but one of the CRUs.
FSPSP11
The service processor has detected an error on the RIO/HSL port in the system unit.
1. Verify that the operating system is running. If it is running, perform an in-band firmware
update, as described in “Updating the firmware” on page 263.
2. If the problem persists, replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP12
The DIMM CRU that was called out failed to correct the memory error.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP14
The service processor cannot establish communication with the server firmware. The server firmware
will continue to run the system and partitions while it attempts to recover the communications.
Server firmware recovery actions will continue for approximately 30 to 40 minutes.
1. View the event log in the management module to locate the system reference code (SRC) and the
time that the event was logged. See “Error logs” on page 183.
If progress codes are being displayed, the server firmware was able to reset the service processor
and solve the problem.
2. Record the time the log was created or when you first noticed this SRC.
3. If progress codes are not being displayed, examine the management module event log to see if an
A7006995 SRC has been displayed.
If an A7006995 SRC has been displayed, the blade server is powering off partitions and attempting a
server dump. Follow the action in the A7006995 SRC description if the partitions do not terminate as
requested.
4. If an A7006995 SRC has not been displayed, has the A1xx SRC remained for more than 40 minutes?
If so, the server firmware could not begin terminating the partitions. Contact your next level of
support to assist in attempting to terminate any remaining partitions and forcing a server dump.
Collect the dump for support and power off and power on the blade server.
5. If an A1xx SRC has not remained more than 40 minutes, call IBM Support.
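The FSPSP14 steps amount to a short decision sequence. The following Python sketch restates that sequence for illustration only; the function name and its three parameters are hypothetical, and the values must come from your own reading of the management-module event log.

```python
def fspsp14_next_step(progress_codes_displayed: bool,
                      a7006995_logged: bool,
                      minutes_a1xx_displayed: float) -> str:
    """Return the next action for FSPSP14, following the order of the steps."""
    if progress_codes_displayed:
        # The server firmware was able to reset the service processor.
        return "No further action: the server firmware solved the problem."
    if a7006995_logged:
        # The blade server is powering off partitions and attempting a dump.
        return "Follow the action in the A7006995 SRC description."
    if minutes_a1xx_displayed > 40:
        # The server firmware could not begin terminating the partitions.
        return "Contact your next level of support to force a server dump."
    return "Call IBM Support."


if __name__ == "__main__":
    print(fspsp14_next_step(False, False, 55))
```

The 40-minute threshold matches the upper end of the recovery window given in the symptom description.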
FSPSP16
Save any error log
and dump data and
contact your next
level of support for
assistance.
Contact IBM Support.
FSPSP17
A system uncorrectable error has occurred.
• Look for other serviceable events.
• Use the SRCs that those events call out to determine and fix any problems.
FSPSP18
A problem has been detected in the platform licensed internal code (LIC).
1. Verify that the operating system is running. If it is running, perform an in-band firmware
update, as described in “Updating the firmware” on page 263.
2. If the problem persists, replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP20
A failing item has been detected by a hardware procedure.
Call IBM Support.
FSPSP22
The system has detected that a processor chip is missing from the system configuration because
JTAG lines are not working.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP23
The system needs to perform a service processor dump.
1. Save the service processor dump to storage by using the partition dump pin control on the
control panel.
2. Once the dump is complete, attempt to re-IPL the system.
3. Call IBM Support.
FSPSP24
The system is running in a degraded mode. Array bit steering might be able to correct this problem
without replacing hardware.
1. Power off the blade server, as described in “Turning off the blade server” on page 7.
2. Remove the blade server from the BladeCenter unit and reinsert the blade server into the
BladeCenter unit.
3. Power on the blade server, as described in “Turning on the blade server” on page 6.
4. If the problem persists, replace the CRU that is called out after this procedure.
FSPSP27
An attention line has been detected as having a problem.
Replace the CRU that is called out before this FSPSP27 call. If the CRU
does not correct the problem, call IBM Support.
FSPSP28
The resource ID (RID) of the CRU could not be found in the Vital Product Data (VPD) table.
1. Find another callout that reads "FSPxxxx" where xxxx is a 4-digit hex number that represents the
resource ID. Record the resource ID and the model of the system.
2. Call IBM Support to find out what CRU the resource ID represents.
3. Replace the CRU that the resource ID represents.
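For FSPSP28, the resource ID is the 4-digit hex number in a callout of the form "FSPxxxx". The following Python sketch shows one way to recognize such a callout and extract the number. The function name and the sample callout FSP1A2B are assumptions for illustration; they do not come from a real error log.

```python
import re
from typing import Optional


def resource_id_from_callout(callout: str) -> Optional[int]:
    """Return the resource ID from an 'FSPxxxx' callout, or None if it does not match."""
    # Exactly four hex digits must follow the literal prefix FSP.
    match = re.fullmatch(r"FSP([0-9A-Fa-f]{4})", callout.strip())
    if match is None:
        return None
    return int(match.group(1), 16)


if __name__ == "__main__":
    for text in ("FSP1A2B", "FSPSP28"):
        print(text, resource_id_from_callout(text))
```

Isolation procedure codes such as FSPSP28 are not mistaken for resource ID callouts, because the letters S and P are not hex digits.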
FSPSP29
The system has
detected that all I/O
bridges are missing
from the system
configuration.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP30
A problem has been encountered accessing the management card or the VPD data found on the
management card has been corrupted.
This error occurred before VPD collection was completed, so no location codes have been created.
1. Replace the management card, as described in “Removing the tier 2 management card” on page 255
and “Installing the tier 2 management card” on page 256.
2. After replacing the card and installing the blade server in the chassis unit and before rebooting
the blade server or performing other operations, ensure that the initialization of the management
card VPD occurs by waiting for the management module to discover the blade server. Otherwise, the
system might fail to IPL.
3. If the problem persists, replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP31
The service processor has detected that one or more of the required fields in the system VPD has
not initialized.
1. When the system reaches the SMS, set the system VPD values that are required, which
automatically resets the service processor.
2. Power on the blade server, as described in “Turning on the blade server” on page 6.
FSPSP32
A problem with the
enclosure has been
found.
The problem is one of
the following
problems:
• The enclosure VPD cannot be found.
• The enclosure serial number is not programmed.
• The enclosure feature code is not programmed.
Record the reason code, which is the last four digits of the first word
from the SRC. Perform one of the following procedures based upon the
value of the reason code:
• Reason code A46F
1. Verify that the operating system is running. If it is running,
perform an in-band firmware update.
2. If the problem persists, replace the system board and chassis
assembly, as described in “Replacing the FRU system-board and
chassis assembly” on page 260.
3. If the problem persists, call IBM Support.
• Reason code A460
1. Set the enclosure serial number using SMS, which automatically
resets the service processor.
2. If the problem persists, call IBM Support.
• Reason code A45F
1. Set the enclosure feature code using SMS, which automatically
resets the service processor.
2. If the problem persists, call IBM Support.
If you do not see your reason code listed, call IBM Support.
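The reason-code lookup for FSPSP32 can be sketched as follows. This Python example is illustrative only: the function names are hypothetical, the sample word B181A460 is made up, and the sketch assumes that the first word of the SRC is written as eight characters.

```python
# Reason codes and the first action listed for each one under FSPSP32.
FSPSP32_FIRST_ACTIONS = {
    "A46F": "Verify that the operating system is running, then perform an in-band firmware update.",
    "A460": "Set the enclosure serial number using SMS.",
    "A45F": "Set the enclosure feature code using SMS.",
}


def reason_code(first_word: str) -> str:
    """Return the last four digits of the first word of an SRC."""
    word = first_word.strip().upper()
    # Assumption: the first word is eight characters long, for example B1xxA460.
    if len(word) != 8:
        raise ValueError(f"expected an 8-character SRC word, got {first_word!r}")
    return word[-4:]


def first_action(first_word: str) -> str:
    """Look up the first action for the reason code, or fall back to IBM Support."""
    return FSPSP32_FIRST_ACTIONS.get(reason_code(first_word), "Call IBM Support.")


if __name__ == "__main__":
    print(reason_code("B181A460"), "->", first_action("B181A460"))
```

Only the first action for each reason code is returned; the remaining steps for that reason code still apply if the problem persists.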
FSPSP34
The memory cards
are plugged in an
invalid configuration
and cannot be used
by the system.
Install a DIMM for each of the dual processors on the BladeCenter PS700
blade server. Install the first pair in DIMM connectors 2 and 4.
Look for the following error codes in order. Follow the procedure for the
first code you find.
SRC B1xx C02A A memory card is missing from the system.
The additional parts in the CRU callout list include all memory cards in
the group with the missing card. To correct the error, visually check the
system to determine which card is missing, and add the card.
SRC B1xx C029 A memory card is a different type than the other
memory cards in the same group.
The additional parts in the CRU callout list include all memory cards in
the group that contain the error. To correct the error, exchange the
memory cards of the incorrect type with those of the correct type.
SRC B1xx C02B A group of memory cards is missing and is required
so that other memory cards on the board can be configured.
The additional parts in the CRU callout list include all the missing
memory cards in the group. To correct the error, add or move the
memory cards to the correct locations.
SRC B1xx C036 A memory card is not supported in this system.
The additional parts in the CRU callout list include all memory cards in
the group that contain the unsupported cards. To correct the error,
remove the unsupported cards from the system or replace them with the
correct type.
FSPSP35
The system has
detected a problem
with a memory
controller.
Enable redundant utilization by performing the following procedure:
1. Power off the blade server, as described in “Turning off the blade
server” on page 7.
2. Remove the blade server from the BladeCenter unit and reinsert the
blade server.
3. Power on the blade server, as described in “Turning on the blade
server” on page 6.
FSPSP38
The system has
detected an error
within the JTAG
path.
Replace the CRU that is called out before this FSPSP38 call. If the CRU
that you replace does not correct the problem, call IBM Support.
FSPSP42
An error
communicating
between two system
processors was
detected.
Contact IBM Support.
FSPSP45
The system has
detected an error
within the FSI path.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP46
Some corrupt areas of flash or RAM have been detected on the Service Processor.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP47
The system has
detected an error
within the PSI link.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP48
A diagnostics
function detects an
external processor
interface problem.
If the CRUs called out before this procedure do not fix the problem,
contact IBM Support.
FSPSP49
A diagnostic function detects an internal processor interface problem.
If the CRUs called out before this procedure do not fix the problem,
contact IBM Support.
FSPSP50
A diagnostic function detects a connection problem between a processor chip and a GX chip.
If the CRUs called out before this procedure do not fix the problem,
contact IBM Support.
FSPSP51
Runtime diagnostics has detected a memory bus correctable error that is exceeding threshold. The
memory bus correctable error does not threaten the system operation at the moment. However, the
system is operating under degraded mode.
Replace the CRU called out after this FSPSP call. If the CRU that is
called out is a DIMM CRU, perform the following procedure:
1. Replace both memory DIMMs of the pair on the microprocessor that
contains the failing CRU:
DIMM 1 (Px-C1)
For P1-C1, replace DIMMs 1 and 3; for P2-C1, replace
DIMMs 9 and 11.
DIMM 2 (Px-C2)
Replace DIMMs 2 and 4, or DIMMs 10 and 12.
DIMM 3 (Px-C3)
Replace DIMMs 1 and 3, or DIMMs 9 and 11.
DIMM 4 (Px-C4)
Replace DIMMs 2 and 4, or DIMMs 10 and 12.
DIMM 5 (Px-C5)
Replace DIMMs 5 and 7, or DIMMs 13 and 15.
DIMM 6 (Px-C6)
Replace DIMMs 6 and 8, or DIMMs 14 and 16.
DIMM 7 (Px-C7)
Replace DIMMs 5 and 7, or DIMMs 13 and 15.
DIMM 8 (Px-C8)
Replace DIMMs 6 and 8, or DIMMs 14 and 16.
2. See “Removing a memory module” on page 244 for location
information and the removal procedure.
3. Install new memory DIMMs, as described in “Installing a memory
module” on page 245.
See “Supported DIMMs” on page 4 for more information.
FSPSP53
A network error has occurred between the service processor and the network switch on the blade
server.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
FSPSP54
A processor
over-temperature has
been detected. Check
for any
environmental issues
before replacing any
parts.
1. Measure the ambient room temperature to see if it is within the
upper limit of the normal operating range. The temperature must be less
than 35 degrees C (95 degrees F). If the temperature exceeds this
limit, you must bring down the room temperature until it is within
the limit. When the temperature is within range, retry the operation.
2. If the temperature is within the acceptable range, check the front and
rear of the BladeCenter unit to verify that each is free of
obstructions that would impede the airflow. If there are obstructions,
you must clear the obstructions. Also clean the air inlets and exits in
the BladeCenter unit drawer as required. If you cleared obstructions,
retry the operation.
3. Verify that the fans in the BladeCenter unit are working correctly. If
not, replace fans that are not turning or that are turning slowly. If
you replace fans, wait for the unit to cool and retry the operation.
4. If the cooling components are functioning correctly, replace the
system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
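The ambient temperature check in step 1 of FSPSP54 can be sketched as a short calculation. The following Python example is illustrative; the function names are hypothetical, and the sketch assumes that a reading of exactly 35 degrees C (95 degrees F) is treated as outside the limit, because the limit is stated as less than 35 degrees C.

```python
UPPER_LIMIT_C = 35.0  # Limit stated in step 1; equal to 95 degrees F.


def fahrenheit_to_celsius(degrees_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (degrees_f - 32.0) * 5.0 / 9.0


def ambient_within_limit(reading: float, unit: str = "C") -> bool:
    """Return True if the room temperature is below the upper limit."""
    unit = unit.upper()
    if unit == "C":
        celsius = reading
    elif unit == "F":
        celsius = fahrenheit_to_celsius(reading)
    else:
        raise ValueError(f"unit must be 'C' or 'F', got {unit!r}")
    # Assumption: a reading exactly at the limit counts as out of range.
    return celsius < UPPER_LIMIT_C


if __name__ == "__main__":
    print(ambient_within_limit(30))        # a 30 degrees C room
    print(ambient_within_limit(95, "F"))   # a 95 degrees F room
```

If the check fails, bring down the room temperature before retrying the operation; the airflow and fan checks in steps 2 and 3 apply only when the temperature is within range.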
IOHUB
Symbolic CRU
The failing component is the RIO/HSL NIC on the IPL path.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
IOBRDG
Symbolic CRU
The failing
component is the
RIO/HSL I/O bridge
on the IPL path.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
MEMBRD
Symbolic CRU
The failing
component is the
board the memory
DIMMs plug into.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
MEMCTLR
Symbolic CRU
The failing
component is one of
the memory
controllers.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
MEMDIMM
Symbolic CRU
The failing
component is one of
the memory DIMMs.
1. Replace the failing CRU:
DIMM 1 (Px-C1)
P1-C1 is memory module 1;
P2-C1 is memory module 9.
DIMM 2 (Px-C2)
Memory module 2/10
DIMM 3 (Px-C3)
Memory module 3/11
DIMM 4 (Px-C4)
Memory module 4/12
DIMM 5 (Px-C5)
Memory module 5/13
DIMM 6 (Px-C6)
Memory module 6/14
DIMM 7 (Px-C7)
Memory module 7/15
DIMM 8 (Px-C8)
Memory module 8/16
2. See “Removing a memory module” on page 244 for location
information and the removal procedure.
3. Install new memory DIMMs, as described in “Installing a memory
module” on page 245.
See “Supported DIMMs” on page 4 for more information.
NO12VDC
Symbolic CRU
Error code 1xxx2647
indicates that the
blade server is
reporting that 12V dc
is not present on the
BladeCenter
midplane.
1. Check the management-module event log for entries that indicate a
power problem with the BladeCenter unit.
2. Resolve any problems.
3. Remove the blade from the BladeCenter unit and then reinsert the
blade server.
4. Power on the blade server.
5. Monitor the blade server operation to verify that the problem is
solved.
6. If the BladeCenter unit is functioning normally, but the 1xxx2647
problem persists, replace the system board and chassis assembly, as
described in “Replacing the FRU system-board and chassis assembly”
on page 260.
NODEPL
Symbolic CRU
The failing component is the node midplane.
Replace the system board and chassis assembly, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
TOD_BAT
Symbolic CRU
The time-of-day battery is low or failing.
Replace the battery, as described in “Removing the battery” on page 250
and “Installing the battery” on page 251.
Software problems
Use this information to recognize software problem symptoms and to take corrective actions.
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
You suspect a software problem.
1. To determine whether the problem is caused by the software, make sure that:
• The server has the minimum memory that is needed to use the software. For
memory requirements, see the information that comes with the software.
Note: If you have just installed an adapter or memory, the blade server
might have a memory-address conflict.
• The software is designed to operate on the blade server.
• Other software works on the blade server.
• The software works on another server.
2. If you received any error messages when using the software, see the
information that comes with the software for a description of the messages and
suggested solutions to the problem.
3. Contact your place of purchase of the software.
Universal Serial Bus (USB) port problems
This topic describes USB port problem symptoms and corrective actions.
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom
Action
A USB device does not work.
Make sure that:
• The correct USB device driver is installed.
• The operating system supports USB devices.
Light path diagnostics
Light path diagnostics is a system of LEDs on the control panel and on the system board of the blade
server. When an error occurs, LEDs are lit throughout the blade server. If the control panel indicates an
error, use the descriptions of the LEDs to diagnose the problem and take corrective action.
LEDs are available for the following components:
• Battery
• SAS disk drive
• Management card
• Memory modules (DIMMs)
• PCIe high speed expansion card option
• 1Xe CIOv expansion card option
• System board and chassis assembly
Viewing the light path diagnostic LEDs
After reading required safety information, look at the control panel to determine if the LEDs indicate a
sub-optimal condition or an error.
View the system LEDs remotely through the Advanced Management Module Web interface. The main
LED page shows the external LEDs on the blade server. The internal blade LEDs are also available
through a blade hyperlink from the LED page. This enables you to see the status of the internal LEDs on
the blade server without having to turn off the blade server, remove it from the chassis, and activate the
light path indications.
Before working inside the blade server to view light path diagnostic LEDs, see the Safety topic and the
“Handling static-sensitive devices” on page 234 guidelines.
If an error occurs, view the light path diagnostic LEDs in the following order:
1. Look at the control panel on the front of the blade server. See “Blade server control panel buttons and
LEDs” on page 5.
• If the information LED is lit, it indicates that information about a suboptimal condition in the blade
server is available in the management-module event log.
• If the blade-error LED is lit, it indicates that an error has occurred and you should proceed to the
next step.
2. If an error has occurred, view the light path diagnostics panel and LEDs:
a. Remove the blade server from the BladeCenter unit.
b. Place the blade server on a flat, static-protective surface.
c. Remove the cover from the blade server.
d. Press and hold the light path diagnostics switch (blue button) to relight the LEDs that were lit
before you removed the blade server from the BladeCenter unit. The LEDs will remain lit for as
long as you press the switch, to a maximum of 25 seconds.
Figure 6 on page 215 shows the locations of LEDs on the system board.
Figure 6. LED locations on the system board of the PS700 blade server
Table 34 shows LED descriptions.
Table 34. PS700 LEDs
Callout  Base unit LEDs
▌1▐  3V lithium battery LED
▌2▐  DIMM 1-4 LEDs
▌3▐  Management card LED
▌4▐  Light path power LED
▌5▐  System board LED
▌6▐  HDD1 LED
▌7▐  Interposer LED
▌8▐  CIOv (1Xe) expansion card connector LED
▌9▐  High-Speed (CFFh) expansion card connector LED
▌10▐  HDD2 LED
▌11▐  DIMM 5-8 LEDs
Light path diagnostics LEDs
Light path diagnostics is a system of LEDs on the control panel and on the system board of the blade
server. When an error occurs, LEDs are lit throughout the blade server. If the control panel indicates an
error, use the descriptions of the LEDs to diagnose the problem and take corrective action.
Table 35 describes the LEDs on the system board and suggested actions for correcting any detected
problems.
Table 35. Light path diagnostic LED descriptions
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Lit light path diagnostics LED
Description
Action
None
An error has occurred and cannot be isolated, or the service processor has failed.
An error has occurred that is not represented by a light path diagnostics LED. Check the
management-module event log for information about the error.
Battery error
P1-E1 BATT
A battery error occurred.
1. Reseat the battery, as described in “Installing the battery” on page 251.
2. Replace the battery, as described in “Removing the battery” on page 250 and
“Installing the battery” on page 251.
DIMM x error
P1-C1 DIMM 1
P1-C2 DIMM 2
P1-C3 DIMM 3
P1-C4 DIMM 4
P1-C5 DIMM 5
P1-C6 DIMM 6
P1-C7 DIMM 7
P1-C8 DIMM 8
A memory error occurred.
1. Verify that the DIMM indicated by the lit LED is a supported memory module, as
described on the ServerProven Web site.
2. Reseat the DIMM indicated by the lit LED, as described in “Installing a memory
module” on page 245.
3. Replace the DIMM indicated by the lit LED, as described in “Removing a memory
module” on page 244 and “Installing a memory module” on page 245.
Note: Multiple DIMM LEDs do not necessarily indicate multiple DIMM failures. If
more than one DIMM LED is lit, reseat or replace one DIMM at a time until the error
goes away. See the online information or the Hardware Maintenance Manual and
Troubleshooting Guide or Problem Determination and Service Guide for your BladeCenter
unit for further isolation.
Hard disk drive
P1-D1 SAS 0
P2-D1 SAS 2
A hard disk drive error occurred.
1. Reseat the hard disk drive, as described in “Installing a drive” on page 242.
2. Replace the hard disk drive, as described in “Removing a drive” on page 241 and
“Installing a drive” on page 242.
PCIe high speed expansion card error
P1-C11 PCIe
An I/O expansion card option error occurred.
1. Make sure that the I/O expansion option is supported.
2. Reseat the I/O expansion option, as described in “Removing and installing an
I/O expansion card” on page 246.
3. Replace the I/O expansion option.
If you are still having problems, see the ServerProven Web site.
Management card error
P1-C9 Mgmt Crd
A system board error occurred.
1. Replace the blade server cover, reinsert the blade server in the BladeCenter unit, and
then restart the blade server.
2. Check the management-module event log for information about the error.
3. Replace the management card assembly, as described in “Removing the tier 2
management card” on page 255 and “Installing the tier 2 management card” on
page 256.
4. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
CIOv expansion card error
P1-C12 1Xe
An I/O expansion card option error occurred.
1. Make sure that the I/O expansion option is supported, as described on the
ServerProven Web site. See http://www.ibm.com/servers/eserver/serverproven/compat/us/.
2. Reseat the I/O expansion option, as described in “Removing and installing an
I/O expansion card” on page 246.
3. Replace the I/O expansion option.
See “PCI expansion card (PIOCARD) problem isolation procedure” on page 194 for more
information.
System board error
P1 Sys Brd
A system board and chassis assembly error has occurred. A microprocessor failure shows
up as a system board and chassis assembly error.
1. Replace the blade server cover, reinsert the blade server in the BladeCenter unit, and
then restart the blade server.
2. Check the management-module event log for information about the error.
3. Replace the system board and chassis assembly, as described in “Replacing the
FRU system-board and chassis assembly” on page 260.
Chapter 2. Diagnostics
217
Isolating firmware problems
You can use this procedure to isolate firmware problems.
To isolate a firmware problem, follow the procedure until the problem is solved.
1. If the blade server is operating, shut down the operating system and turn off the blade server.
2. Turn on the blade server.
If the problem no longer occurs, no further action is necessary. You are finished with this procedure.
3. If the blade server boots up far enough to allow the installation of server firmware updates, check for
appropriate updates and install them.
If you install updates, reboot the server and see if the problem still exists. If not, you are finished with
this procedure.
4. Recover the system firmware, as described in “Recovering the system firmware” on page 220.
5. After recovering the system firmware, check for and install any server firmware updates.
Save vfchost map data
Before you replace physical components on your blade server, it might be necessary to first save the
original virtual Fibre Channel vfchost data of your blade server.
To save the original vfchost data and the original physical Fibre Channel adapter port (fcs) location codes,
complete the following steps by using the padmin user account:
1. To save the original vfchost data, use the lsmap command as follows. Enter the command from the
command-line interface on the source Virtual I/O Server (VIOS):
lsmap -all -npiv | tee lsmap_OUTPUT_before
The output might look like the following example:
Name            Physloc                                  ClntID  ClntName   ClntOS
--------------- ---------------------------------------- ------- ---------- ------
vfchost1        U7895.42X.9999999-V1-C32                       3

Status:NOT_LOGGED_IN
FC name:fcs1                    FC loc code:U78AF.001.startSN-P1-C35-L1-T2
Ports logged in:0
Flags:4<NOT_LOGGED>
VFC client name:                VFC client DRC:

Name            Physloc                                  ClntID  ClntName   ClntOS
--------------- ---------------------------------------- ------- ---------- ------
vfchost2        U7895.42X.9999999-V1-C33                       4 cli_lpar_1 AIX

Status:LOGGED_IN
FC name:fcs0                    FC loc code:U78AF.001.startSN-P1-C35-L1-T1
Ports logged in:1
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:fcs1            VFC client DRC:U9999.999.9999999-V4-C32-T1
In this example, the original serial number is represented as startSN.
2. To save the location code information for each Fibre Channel adapter (fcs), enter the lsdev command
as follows:
lsdev -vpd | grep fcs | tee LSDEV_OUTPUT_before
The output might look like the following example:
fcs0 U78AF.001.startSN-P1-C35-L1-T1 Dual Port 8Gb FC Mezzanine Card (7710322577107501)
fcs1 U78AF.001.startSN-P1-C35-L1-T2 Dual Port 8Gb FC Mezzanine Card (7710322577107501)
In this example, the original serial number is represented as startSN.
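The saved lsdev output can be read back as a simple table of adapter names and location codes. The following Python sketch is illustrative only: it assumes that each saved line has the form shown in the example above (adapter name, location code, description), and the function names parse_lsdev and split_location are hypothetical helpers, not part of the VIOS command set.

```python
def parse_lsdev(text):
    """Map each fcs adapter name to its physical location code."""
    adapters = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected line form: "fcs0 U78AF.001.startSN-P1-C35-L1-T1 Dual Port ..."
        if len(fields) >= 2 and fields[0].startswith("fcs"):
            adapters[fields[0]] = fields[1]
    return adapters


def split_location(code):
    """Return (serial, slot_path) for a code such as U78AF.001.startSN-P1-C35-L1-T1."""
    unit, slot_path = code.split("-", 1)
    serial = unit.split(".")[2]  # the third dot-separated field is the serial number
    return serial, slot_path


# Sample text copied from the example output; in practice, read LSDEV_OUTPUT_before.
sample = """\
fcs0 U78AF.001.startSN-P1-C35-L1-T1 Dual Port 8Gb FC Mezzanine Card (7710322577107501)
fcs1 U78AF.001.startSN-P1-C35-L1-T2 Dual Port 8Gb FC Mezzanine Card (7710322577107501)
"""
saved = parse_lsdev(sample)
for name, code in sorted(saved.items()):
    print(name, *split_location(code))
```

Separating the serial number from the slot path is useful because component replacement changes only the serial number field; the slot path identifies the same physical port before and after the replacement.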
Restore vfchost map data
After you replace physical components on your blade server, you must review the virtual Fibre Channel
vfchost mapping. If the vfchost mapping has changed after component replacement, you must restore the
mapping.
To restore vfchost data, you must have first saved the original vfchost data and the physical Fibre
Channel adapter (fcs) location code information. For information, see “Save vfchost map data” on page
218.
To restore the vfchost mapping data, use the third field of the Fibre Channel adapter (fcs) location codes,
which is the serial number field, and compare the new Fibre Channel adapter (fcs) location codes to the
original location codes. Perform a vfcmap of the new Fibre Channel adapter (fcs) to the appropriate
vfchost.
To restore the vfchost mapping data, complete the following steps by using the padmin user account:
1. To save the new Fibre Channel adapter (fcs) location codes, use the lsdev -vpd command as follows.
Enter the command from the command-line interface on the source Virtual I/O Server (VIOS):
lsdev -vpd | grep fcs | tee LSDEV_OUTPUT_after1
The output might look like the following example:
fcs0 U78AF.001.newSN99-P1-C35-L1-T1 Dual Port 8Gb FC Mezzanine Card (7710322577107501)
fcs1 U78AF.001.newSN99-P1-C35-L1-T2 Dual Port 8Gb FC Mezzanine Card (7710322577107501)
2. To associate the new fcsx adapter with the old fcsx adapter, use the adapter slot number. Enter the grep
command as shown in the following example:
grep "P1-C35-L1-T1" LSDEV_OUTPUT_before LSDEV_OUTPUT_after1
The output might look like the following example:
LSDEV_OUTPUT_before:
fcs0 U78AF.001.startSN-P1-C35-L1-T1 Dual Port 8Gb FC Mezzanine Card (7710322577107501)
LSDEV_OUTPUT_after1:
fcs0 U78AF.001.newSN99-P1-C35-L1-T1 Dual Port 8Gb FC Mezzanine Card (7710322577107501)
In this example, the original serial number is represented as startSN, the new serial number is
represented as newSN99, and the fcsx remained fcs0; therefore, fcs0 will be mapped to the vfchost.
3. To restore the new fcsx device to the correct vfchost, enter the vfcmap command as follows:
vfcmap -vadapter vfchost2 -fcp fcs0
4. To check the vfchost mapping, enter the lsmap command as follows:
lsmap -vadapter vfchost2 -npiv
The output might look like the following example:
Name            Physloc                                  ClntID  ClntName   ClntOS
--------------- ---------------------------------------- ------- ---------- ------
vfchost2        U7895.42X.9999999-V1-C33                       4

Status:NOT_LOGGED_IN
FC name:fcs0                    FC loc code:U78AF.001.newSN99-P1-C35-L1-T1
Ports logged in:0
Flags:4<NOT_LOGGED>
VFC client name:                VFC client DRC:
In this example, the new serial number is represented as newSN99.
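The matching rule in the preceding steps (same slot path, different serial number) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an IBM tool: the dictionaries stand in for the saved lsmap and lsdev data, the function names slot_path and plan_vfcmap are hypothetical, and the sketch only prints vfcmap command lines for review; it does not run them.

```python
def slot_path(location_code):
    """Return the part of a location code that follows the type.model.serial unit field."""
    return location_code.split("-", 1)[1]


def plan_vfcmap(vfchost_to_old_fcs, before, after):
    """Build vfcmap command lines that map each vfchost to the adapter now in the same slot."""
    slot_to_new_fcs = {slot_path(loc): name for name, loc in after.items()}
    commands = []
    for vfchost, old_fcs in sorted(vfchost_to_old_fcs.items()):
        new_fcs = slot_to_new_fcs.get(slot_path(before[old_fcs]))
        if new_fcs is None:
            raise LookupError(f"no adapter found in the slot previously used by {old_fcs}")
        commands.append(f"vfcmap -vadapter {vfchost} -fcp {new_fcs}")
    return commands


# Values taken from the examples in this section (hypothetical serial numbers).
mapping_before = {"vfchost1": "fcs1", "vfchost2": "fcs0"}
lsdev_before = {
    "fcs0": "U78AF.001.startSN-P1-C35-L1-T1",
    "fcs1": "U78AF.001.startSN-P1-C35-L1-T2",
}
lsdev_after = {
    "fcs0": "U78AF.001.newSN99-P1-C35-L1-T1",
    "fcs1": "U78AF.001.newSN99-P1-C35-L1-T2",
}
plan = plan_vfcmap(mapping_before, lsdev_before, lsdev_after)
for command in plan:
    print(command)
```

Matching on the slot path rather than on the fcs name means that the plan stays correct even if the adapters are enumerated in a different order after the replacement.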
Recovering the system firmware
The system firmware is contained in separate temporary and permanent images in the flash memory of
the blade server. These images are referred to as TEMP and PERM, respectively. The blade server
normally starts from the TEMP image, and uses the PERM image as a backup. If the TEMP image
becomes damaged, such as from a power failure during a firmware update, you can recover the TEMP
image from the PERM image.
If your system hangs, access the management module and select Blade Tasks > Configuration > Boot
Mode to show the PS700 blade server in the list of blade servers in the BladeCenter unit. Click the
appropriate blade server and select Permanent to force the system to start from the PERM image.
See the documentation for the management module to learn more.
Starting the PERM image
You can force the blade server to start the PERM (permanent) image.
To force the blade server to start the PERM (permanent) image, complete the following procedure.
1. Access the Advanced Management Module menus.
2. Click Blade Tasks > Configuration > Boot Mode.
3. Click the appropriate PS700 blade server in the list of blade servers in the BladeCenter unit.
4. Select Permanent to force the system to start from the PERM image.
See the documentation for the management module to learn more.
Starting the TEMP image
Start the TEMP image before you update the firmware.
Perform the following procedure to start the TEMP image.
1. Access the advanced management module.
See the BladeCenter Management Module Command-Line Interface Reference Guide or the BladeCenter
Serial-Over-LAN Setup Guide for more information.
2. Click Blade Tasks > Configuration > Boot Mode.
3. Click the applicable PS700 blade server in the list of blade servers in the BladeCenter unit.
4. Select Temporary to force the system to start from the TEMP image.
5. Restart the blade server.
6. Verify that the system starts from the TEMP image, as described in “Verifying the system firmware
levels.”
Recovering the TEMP image from the PERM image
To recover the TEMP image from the PERM image, you must perform the reject function. The reject
function copies the PERM image into the TEMP image.
To perform the reject function, complete the following procedure.
1. If you have not started the system from the PERM image, do so now. See “Starting the PERM image”
on page 220.
2. Issue the appropriate command for your operating system to reject the TEMP image.
v If you are using the Red Hat Linux or SUSE Linux operating system, type the following command:
update_flash -r
v If you are using the AIX operating system, type the following command:
/usr/lpp/diagnostics/bin/update_flash -r
3. Start the TEMP image, as described in “Starting the TEMP image.”
You might need to update the firmware code to the latest version. See “Updating the firmware” on page
263 for more information about how to update the firmware code.
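The choice of reject command can be summarized in a short Python sketch. It is illustrative only: the command strings are the ones given in the procedure above, while the function name reject_command, the operating-system keys, and the booted_side argument are assumptions introduced for this example. The sketch returns the command text and does not run it.

```python
# Reject commands as listed in the procedure, keyed by a hypothetical OS label.
REJECT_COMMANDS = {
    "linux": "update_flash -r",
    "aix": "/usr/lpp/diagnostics/bin/update_flash -r",
}


def reject_command(os_name, booted_side):
    """Return the reject command for the operating system, if the server started from PERM."""
    if booted_side.upper() != "PERM":
        # The procedure requires starting from the PERM image before rejecting TEMP.
        raise RuntimeError("start the blade server from the PERM image first")
    try:
        return REJECT_COMMANDS[os_name.lower()]
    except KeyError:
        raise ValueError(f"no reject command is listed for {os_name!r}") from None


print(reject_command("AIX", "PERM"))
```

The guard on booted_side reflects step 1 of the procedure: the reject function copies the PERM image into the TEMP image, so the server must be running from the PERM image when the command is issued.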
Verifying the system firmware levels
The diagnostics program displays the current system firmware levels for the TEMP and PERM images.
This function also displays which image the blade server used to start up.
1. Start the diagnostics program.
See “Running the diagnostics program” on page 186.
The online information is available in the IBM BladeCenter Information Center at
http://publib.boulder.ibm.com/infocenter/bladectr/documentation/index.jsp.
2. From the Function Selection menu, select Task Selection and press Enter.
3. From the Tasks Selection List menu, select Update and Manage System Flash and press Enter.
The Update and Manage System Flash menu is displayed. The top of the window displays the
system firmware level for the PERM and the TEMP images and the image that the blade server used
to start.
Note: If the TEMP image level is more current than the PERM image, commit the TEMP image.
See “Committing the TEMP system firmware image.”
4. When you have verified the firmware levels, press F3 until the Diagnostic Operating Instructions
window is displayed; then press F3 again to exit the diagnostic program.
Committing the TEMP system firmware image
After updating the system firmware and successfully starting up the blade server from the TEMP image,
copy the TEMP image to the PERM image using the diagnostics program commit function.
Note: If you install the server firmware update permanently by committing the temporary firmware level
from the temporary side to the permanent side, the temporary and permanent sides contain the same
level of firmware. You cannot return to the level that was previously on the permanent side.
1. Load the diagnostics program. See “Running the diagnostics program” on page 186.
2. From the Function Selection menu, select Task Selection and press Enter.
3. From the Tasks Selection List menu, select Update and Manage System Flash and press Enter.
4. From the Update and Manage System Flash menu, select Commit the Temporary Image and press
Enter.
5. When the commit function is complete, press F3 until the Diagnostic Operating Instructions screen is
displayed; then press F3 again to exit the diagnostic program.
Solving shared BladeCenter resource problems
Problems with BladeCenter shared resources might appear to be in the blade server, but might actually be
a problem in a BladeCenter unit component.
This information provides procedures to help you isolate blade server problems from shared BladeCenter
resource problems.
If the problem is thought to be with a shared resource, see the online information center or the Problem
Determination and Service Guide or the Hardware Maintenance Manual and Troubleshooting Guide for your
BladeCenter unit, or see the documentation for BladeCenter unit components for additional information.
If the problem cannot be solved, see “Solving undetermined problems” on page 227.
To check the general function of shared BladeCenter resources, complete the following operations.
1. Verify that the BladeCenter unit has the required power modules installed and is connected to a
working power source.
2. Verify that power management is set correctly for your BladeCenter unit configuration.
3. Verify whether the problem is being experienced on more than one blade server.
4. Perform a test of the failing function on a blade server that is known to be operational.
5. Try the blade server in a different blade bay.
6. Try a blade server that is known to be operational in the blade bay with the failing blade server.
7. Verify that the blade server and the monitor are powered on.
8. Check for problems with the media tray (removable media drives and USB ports), as described in
“Solving shared media tray problems.”
9. Check for network connection problems, as described in “Solving shared network connection
problems” on page 225.
10. Check for power problems, as described in “Solving shared power problems” on page 225.
11. Check for video problems, as described in “Solving shared video problems” on page 226.
Solving shared media tray problems
Problems with BladeCenter shared resources might appear to be in the blade server, but might actually be
a problem in a BladeCenter unit media tray component.
To check the general function of shared BladeCenter media tray resources, perform the following
procedure.
1. Verify that the media-tray select button LED on the front of the blade server is lit.
A lit media-tray select button LED shows that the blade server is connected to the shared media tray.
2. Verify that the media tray devices work with another blade server.
3. Verify which components of the media tray are affected.
Components include:
v USB ports
v Diskette drive
v CD or DVD drive
4. Troubleshoot USB port problems if USB ports are the only failing component.
a. Make sure that the USB device is operational.
b. If using a USB hub, make sure that the hub is operating correctly and that any software the hub
requires is installed.
c. Plug the USB device directly into the USB port, bypassing the hub, to check its operation.
d. Reseat the following components:
v USB device cable
v Media tray cable (if applicable)
v Media tray
e. Replace the following components one at a time, in the order shown, restarting the blade server
each time:
1) USB cable (if applicable)
2) Media tray cable (if applicable)
3) Media tray
5. Troubleshoot the diskette drive if it is the only failing component. If there is a diskette in the drive,
make sure that:
v The diskette is inserted correctly in the drive.
v The diskette is good and not damaged; the drive LED light flashes once per second when the
diskette is inserted. (Try another diskette if you have one.)
v The diskette contains the necessary files to start the blade server.
v The software program is working properly.
v The distance between monitors and diskette drives is at least 76 mm (3 in).
6. Troubleshoot the CD or DVD drive if it is the only failing component.
v Verify that the CD or DVD is inserted correctly in the drive. If necessary, insert the end of a
straightened paper clip into the manual tray-release opening to eject the CD or DVD. The drive
LED light flashes once per second when the CD or DVD is inserted.
v Verify that the CD or DVD is clean and not damaged. (Try another CD or DVD if you have one.)
v Verify that the software program is working properly.
7. Troubleshoot one or more of the removable media drives if they are the only failing components.
a. Reseat the following components:
v Removable-media drive cable (if applicable)
v Removable-media drive
v Media tray cable (if applicable)
v Media tray
8. Replace the following components one at a time, in the order shown, restarting the blade server each
time:
a. Removable-media drive cable (if applicable)
b. Media tray cable (if applicable)
c. Removable-media drive
d. Media tray
9. Verify that the management module is operating correctly.
See the online information center or the Problem Determination and Service Guide or the Hardware
Maintenance Manual and Troubleshooting Guide for your BladeCenter unit.
Some BladeCenter unit types have several management-module components that you might test or
replace. See the online information or the Installation Guide for your management module for more
information.
10. Replace the management module.
See the online information center or the Problem Determination and Service Guide or the Hardware
Maintenance Manual and Troubleshooting Guide for your BladeCenter unit.
If these steps do not resolve the problem, it is likely a problem with the blade server. See “Universal
Serial Bus (USB) port problems” on page 213 for more information.
Solving shared network connection problems
Problems with BladeCenter shared resources might appear to be in the blade server, but might actually be
a problem in a BladeCenter unit network connection resource.
To check the general function of shared BladeCenter network connection resources, perform the following
procedure.
1. Verify that the network cables are securely connected to the I/O module.
2. Verify that the power configuration of the BladeCenter unit supports the I/O module configuration.
3. Verify that the installation of the I/O-module type is supported by the BladeCenter unit and blade
server hardware.
4. Verify that the I/O modules for the network interface are installed in the correct BladeCenter bays.
5. Verify that the I/O modules for the network interface are configured correctly.
6. Verify that the settings in the I/O module are correct for the blade server. Some settings in the I/O
module are specific to each blade server.
7. Verify that the I/O modules for the network interface are operating correctly.
Troubleshoot and replace the I/O module as indicated in the documentation for the I/O module.
8. Verify that the management module is operating correctly.
See the online information center or the Problem Determination and Service Guide or the Hardware
Maintenance Manual and Troubleshooting Guide for your BladeCenter unit.
Some BladeCenter unit types have several management-module components that you might test or
replace. See the online information or the Installation Guide for your management module for more
information.
9. Replace the management module.
See the online information center or the Problem Determination and Service Guide or the Hardware
Maintenance Manual and Troubleshooting Guide for your BladeCenter unit.
If these steps do not resolve the problem, it is likely a problem with the blade server. See “Network
connection problems” on page 194 for more information.
Solving shared power problems
Problems with BladeCenter shared resources might appear to be in the blade server, but might actually be
a problem in a BladeCenter unit power component.
To check the general function of shared BladeCenter power resources, perform the following procedure.
1. Verify that the LEDs on all the BladeCenter power modules are lit.
2. Verify that power is being supplied to the BladeCenter unit.
3. Verify that the installation of the blade server type is supported by the BladeCenter unit.
4. Verify that the power configuration of the BladeCenter unit supports the blade bay where your blade
server is installed. See the online documentation for your BladeCenter unit.
5. Verify that the BladeCenter unit power management configuration and status support blade server
operation.
See the online information for your management module or the Management Module User's Guide or
the Management Module Command-Line Interface Reference Guide for more information.
6. Verify that the local power control for the blade server is set correctly.
See the online information for your management module or the Management Module User's Guide or
the Management Module Command-Line Interface Reference Guide for more information.
7. Verify that the BladeCenter unit blowers are correctly installed and operational.
If these steps do not resolve the problem, it is likely a problem with the blade server. See “Power
problems” on page 196 for more information.
Solving shared video problems
Problems with BladeCenter shared resources might appear to be in the blade server, but might actually be
a problem in a BladeCenter unit video component.
Some IBM monitors have their own self-tests. If you suspect a problem with the monitor, see the
information that comes with the monitor for instructions for adjusting and testing the monitor.
To check for video problems, perform the following procedure.
1. Verify that the monitor brightness and contrast controls are correctly adjusted.
2. Verify that the keyboard/video select button LED on the front of the blade server is lit.
A lit indicator shows that the blade server is connected to the shared BladeCenter monitor.
3. Verify that the video cable is securely connected to the BladeCenter management module. Non-IBM
monitor cables might cause unpredictable problems.
4. Verify that the monitor works with another blade server.
5. Move the device and the monitor at least 305 mm (12 in.) apart, then turn on the monitor.
Attention:
Moving a color monitor while it is turned on might cause screen discoloration.
If the monitor self-tests show that the monitor is working correctly, the location of the monitor might
be affecting its operation. Magnetic fields around other devices (such as transformers, appliances,
fluorescent lights, and other monitors) can cause screen jitter or wavy, unreadable, rolling, or distorted
screen images. If this happens, turn off the monitor.
6. Verify that the management module is operating correctly.
See the documentation for your BladeCenter unit.
Some BladeCenter unit types have several management-module components that you might test or
replace.
See the online information or the Installation Guide for your management module for more
information.
7. Replace the monitor cable, if applicable.
8. Replace the monitor.
9. Replace the management module.
See the online information center or the Problem Determination and Service Guide or the Hardware
Maintenance Manual and Troubleshooting Guide for your BladeCenter unit.
Solving undetermined problems
When you are diagnosing a problem in the PS700 blade server, you must determine whether the problem
is in the blade server or in the BladeCenter unit.
v If all of the blade servers have the same symptom, it is probably a BladeCenter unit problem; for more
information, see the online information or the Hardware Maintenance Manual and Troubleshooting Guide
or Problem Determination and Service Guide for your BladeCenter unit.
v If the BladeCenter unit contains more than one blade server and only one of the blade servers has the
problem, troubleshoot the blade server that has the problem.
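The two rules above can be written as a small decision function. The following Python sketch is illustrative only: the function name locate_problem and the input format (a dictionary of blade bay labels to a true or false symptom flag) are assumptions made for this example, and any case that the two rules do not cover is reported as undetermined.

```python
def locate_problem(symptom_by_blade):
    """Suggest where to troubleshoot first, based on which blade servers show the symptom."""
    blades = len(symptom_by_blade)
    failing = sum(1 for has_symptom in symptom_by_blade.values() if has_symptom)
    if blades > 1 and failing == blades:
        return "BladeCenter unit"  # every blade server shows the same symptom
    if blades > 1 and failing == 1:
        return "blade server"      # only one of several blade servers is affected
    return "undetermined"          # not covered by the two rules


print(locate_problem({"bay1": True, "bay2": True, "bay3": True}))
print(locate_problem({"bay1": False, "bay2": True, "bay3": False}))
```

The result is only a starting point for diagnosis; the steps that follow still apply when the outcome is undetermined.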
Check the LEDs on all the power supplies of the BladeCenter unit where the blade server is installed. If
the LEDs indicate that the power supplies are working correctly, and reseating the blade server does not
correct the problem, complete the following steps:
1. Make sure that the control panel connector is correctly seated on the system board. See “System-board
connectors” on page 8 for the location of the connector.
2. If no LEDs on the control panel are working, replace the bezel assembly; then, try to power-on the
blade server from the BladeCenter Web interface. See the online information or the BladeCenter
Management Module User's Guide for more information.
3. Turn off the blade server.
4. Remove the blade server from the BladeCenter unit and remove the cover.
5. Remove or disconnect the following devices, one at a time, until you find the failure. Reinstall, turn
on, and reconfigure the blade server each time.
v I/O expansion option.
v Hard disk drives.
v Memory modules. The minimum configuration requirement is 2 GB (two 1 GB DIMMs).
The following minimum configuration is required for the blade server to start:
v System-board and chassis assembly (with two microprocessors)
v Two 2 GB DIMMs
v A functioning BladeCenter unit
6. Install and turn on the blade server. If the problem remains, suspect the following components in
order:
a. DIMM
b. System-board and chassis assembly
If the problem is solved when you remove an I/O expansion option from the blade server but the
problem recurs when you reinstall the same expansion option, suspect the expansion option; if the
problem recurs when you replace the expansion option with a different one, suspect the system-board
and chassis assembly.
If you suspect a networking problem and the blade server passes all the system tests, suspect a network
cabling problem that is external to the system.
Calling IBM for service
Call IBM for service after you collect as much as possible of the following information.
Before calling for service, collect as much as possible of the following available information:
v Machine type and model
v Hard disk drive upgrades
v Failure symptoms:
– Does the blade server fail the diagnostic programs? If so, what are the error codes?
– What occurs? When? Where?
– Is the failure repeatable?
– Has the current server configuration ever worked?
– What changes, if any, were made before it failed?
– Is this the original reported failure, or has this failure been reported before?
v Diagnostic program type and version level
v Hardware configuration (print screen of the system summary)
v Firmware level
v Operating-system type and version level
v Advanced Management Module service data. See Advanced Management Module Messages Guide.
v SNAP data. See Blade server Data Collection Guide.
You can solve some problems by comparing the configuration and software setups between working and
nonworking blade servers. When you compare blade servers to each other for diagnostic purposes,
consider them identical only if all the following factors are exactly the same in all of the blade servers:
v Machine type and model
v Firmware level
v Adapters and attachments, in the same locations
v Software versions and levels
v Diagnostic program type and version level
v Configuration option settings
v Operating-system control-file setup
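The comparison rule above can be sketched as a check over the listed factors. The following Python example is illustrative only: the factor keys, the function name differing_factors, and the sample values are hypothetical placeholders, not data collected from a real blade server.

```python
# One key per factor in the list above (key names are assumptions for this sketch).
FACTORS = (
    "machine_type_model",
    "firmware_level",
    "adapters_by_location",
    "software_levels",
    "diagnostics_level",
    "config_options",
    "os_control_files",
)


def differing_factors(blade_a, blade_b):
    """Return the factors that are missing or not exactly the same on both blade servers."""
    return [f for f in FACTORS if blade_a.get(f) != blade_b.get(f)]


working = dict.fromkeys(FACTORS, "same")
working["machine_type_model"] = "8406-70Y"
working["firmware_level"] = "level-A"

failing = dict(working)
failing["firmware_level"] = "level-B"

differences = differing_factors(working, failing)
print(differences or "blade servers can be treated as identical")
```

An empty result means that the two blade servers can be considered identical for diagnostic comparison; any factor in the result is a candidate cause to examine first.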
Chapter 3. Parts listing, Type 8406
The parts listing identifies each replaceable part and its part number.
Figure 7 shows replaceable components that are available for the PS700 blade server.
Figure 7. Parts illustration, Type 8406. PS700 base unit with cover.
© Copyright IBM Corp. 2010, 2011
Replaceable components are of three types:
v Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If IBM
installs a Tier 1 CRU at your request, you will be charged for the installation.
v Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request IBM to install it, at
no additional charge, under the type of warranty service that is designated for your blade server.
v Field replaceable unit (FRU): FRUs must be installed only by trained service technicians.
For information about the terms of the warranty and getting service and assistance, see the information
center or the Warranty and Support Information document on the IBM BladeCenter Documentation CD.
Table 36. Parts listing, Type 8406
CRU part number
Index Description
(Tier 1)
(Tier 2)
FRU part
number
Failing
function
code (FFC)
74Y1810
Various,
see
“Failing
function
codes 151
through
2E33” on
page 181
Base system-board and chassis assembly, with 1
POWER7 four core microprocessor
1
Cover
46C7341
2
4x InfiniBand DDR Expansion Card (CFFh) for
BladeCenter (option)
49Y9976
2
Combo 4Gb Fibre Channel Expansion Card, 1 GB
Ethernet Card, (CFFh/PCIe) (option)
39Y9304
2
QLogic 8 Gb Fibre Channel and 1 Gb Ethernet
expansion card (CFFh/PCIe) (option)
44X1943
2
QLogic 2 port 10 GB Ethernet FCoE Expansion Card,
(CFFh/PCIe) (option)
42C1832
3
3Gb SAS Passthrough Expansion Card (CIOv) (option)
46C4069
2506
3
QLogic 4Gb Fibre Channel 1Xe PCI-Express Expansion
Card (CIOv) (option)
49Y4237
2E13
3
QLogic 8Gb Fibre Channel 1Xe PCI-Express Expansion
Card (CIOv) (option)
44X1948
2E14
3
Emulex 8Gb Fibre Channel Expansion Card (CIOv)
(option)
46M6138
2607
4
Bezel assembly with control panel
46K5760
4
OEM Bezel assembly with control panel
46K4683
5
DIMM filler
60H2962
2E12
Memory, 4GB DDR3, 1066MHz VLP RDIMM (option)
6
219
77P8691
2C6
Memory, 8GB DDR3, 800MHz VLP RDIMM (option)
6
219
77P8692
2C6
7
Management card
8
Tray, SAS hard disk drive
31R2239
9
300GB 10K RPM SFF SAS HDD and screws (4) (option)
42D0628
9
600GB 10K RPM SFF SAS HDD and screws(4) (option)
49Y2023
74Y1978
Table 36. Parts listing, Type 8406 (continued)
CRU part number
Index Description
(Tier 1)
Hard drive filler
40K5928
Service Label
46K5891
IBM FRU/CRU Label
46K5893
OEM IBM FRU/CRU Label
46K5894
Cover warning label
90P4799
Miscellaneous parts kit
32R2451
3.0V Battery
33F8354
RFID Tag for North America, Latin America, Asia
Pacific
46K5362
RFID Tag for Europe
46K5363
RFID Tag for Japan
46K5364
(Tier 2)
FRU part
number
Failing
function
code (FFC)
Chapter 4. Removing and replacing blade server components
Use this information to remove and replace components of the PS700 blade server that are replaceable.
Replaceable components are of three types:
v Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If IBM
installs a Tier 1 CRU at your request, you will be charged for the installation.
v Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request IBM to install it, at
no additional charge, under the type of warranty service that is designated for your blade server.
v Field replaceable unit (FRU): FRUs must be installed only by trained service technicians.
See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine whether a part is a Tier 1 CRU, Tier 2
CRU, or FRU component.
For information about the terms of the warranty and getting service and assistance, see the Warranty and
Support Information document.
Installation guidelines
Follow these guidelines to remove and replace blade server components.
• Read the safety information in the Safety topic and the guidelines in “Handling static-sensitive
devices” on page 234. This information will help you work safely.
• When you install a new blade server, download and install the most recent device drivers and
PS700 firmware updates from the IBM Support site. Select your product, type, model, and operating
system, and then click Go. Click the Download tab, if necessary, for device driver and firmware
updates.
Note: Changes are made periodically to the IBM Web site. Procedures for locating firmware and
documentation might vary slightly from what is described in this documentation.
• Observe good housekeeping in the area where you are working. Place removed covers and other parts
in a safe place.
• Back up all important data before you make changes to disk drives.
• Before you remove a hot-swap blade server from the BladeCenter unit, you must shut down the
operating system and turn off the blade server. You do not have to shut down the BladeCenter unit
itself.
• Blue on a component indicates touch points, where you can grip the component to remove it from or
install it in the blade server, open or close a latch, and so on.
• Orange on a component or an orange label on or near a component indicates that the component can
be hot-swapped, which means that if the blade server and operating system support hot-swap
capability, you can remove or install the component while the blade server is running. (Orange can also
indicate touch points on hot-swap components.) See the instructions for removing or installing a
specific hot-swap component for any additional procedures that you might have to perform before you
remove or install the component.
• When you are finished working on the blade server, reinstall all safety shields, guards, labels, and
ground wires.
See the ServerProven Web site for information about supported operating-system versions and all PS700
blade server optional devices.
System reliability guidelines
Follow these guidelines to help ensure proper cooling and system reliability.
• Verify that the ventilation holes on the blade server are not blocked.
• Verify that you are maintaining proper system cooling in the unit.
  Do not operate the BladeCenter unit without a blade server, expansion unit, or filler blade installed in
  each blade bay. See the documentation for your BladeCenter unit for additional information.
• Verify that you have followed the reliability guidelines for the BladeCenter unit.
• Verify that the blade server battery is operational. If the battery becomes defective, replace it
immediately, as described in “Removing the battery” on page 250 and “Installing the battery” on page
251.
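The cooling rule above (every blade bay holds a blade server, an expansion unit, or a filler blade) can be checked mechanically if you keep an inventory of the chassis. The following Python sketch assumes a hypothetical inventory that maps bay numbers to a short description of what is installed. The function name and the data layout are illustrative and do not come from the management module.

```python
# Contents that satisfy the cooling rule in this guide.
ACCEPTABLE_CONTENTS = {"blade server", "expansion unit", "filler blade"}


def unfilled_bays(bays: dict) -> list:
    """Return the bay numbers that hold none of the acceptable contents.

    `bays` is a hypothetical inventory: {bay_number: description or None}.
    """
    return sorted(bay for bay, content in bays.items()
                  if content not in ACCEPTABLE_CONTENTS)
```

With the inventory {1: "blade server", 2: None, 3: "filler blade"}, the function reports bay 2 as the bay that must be filled before the BladeCenter unit is operated.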
Handling static-sensitive devices
Static electricity can damage the blade server and other electronic devices. To avoid damage, keep
static-sensitive devices in their static-protective packages until you are ready to install them.
Attention:
To reduce the possibility of damage from electrostatic discharge, observe the following precautions:
• Limit your movement. Movement can cause static electricity to build up around you.
• Handle the device carefully, holding it by its edges or its frame.
• Do not touch solder joints, pins, or exposed circuitry.
• Do not leave the device where others can handle and damage it.
• While the device is still in its static-protective package, touch the package for at least 2 seconds to
an unpainted metal part of the BladeCenter unit or to an unpainted metal surface on any other
grounded component in the rack where you are installing the device. This drains static electricity
from the package and from your body.
• Remove the device from its package and install it directly into the blade server without setting down
the device. If it is necessary to set down the device, put it back into its static-protective package. Do
not place the device on the blade server cover or on a metal surface.
• Take additional care when handling devices during cold weather. Heating dry winter air further
reduces its humidity and increases static electricity.
Returning a device or component
If you are instructed to return a device or component, follow all packaging instructions, and use any
packaging materials for shipping that are supplied to you.
Removing the blade server from a BladeCenter unit
Remove the blade server from the BladeCenter unit to access options, connectors, and system-board
indicators.
Figure 8. Removing the blade server from the BladeCenter unit
Attention:
• To maintain proper system cooling, do not operate the BladeCenter unit without a blade server,
expansion unit, or blade filler installed in each blade bay.
• When you remove the blade server, note the bay number. Reinstalling a blade server into a different
bay from the one where it was removed might have unintended consequences. Some configuration
information and update options are established according to bay numbers. If you reinstall the blade
server into a different bay, you might have to reconfigure the blade server.
Perform the following procedure to remove the blade server.
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. If the blade server is operating, shut down the operating system.
3. Press the power-control button (behind the control-panel door) to turn off the blade server. See
“Turning off the blade server” on page 7.
4. Wait at least 30 seconds for the hard disk drive to stop spinning.
5. Open the two release handles, as shown by ▌1▐ in Figure 8. The blade server moves out of the bay
approximately 0.6 cm (0.25 inch).
6. Pull the blade server out of the bay. Spring-loaded doors farther back in the bay move into place to
cover the bay temporarily.
7. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
8. Place either a blade filler or another blade server in the bay within 1 minute. The recessed
spring-loaded doors move out of the way as you insert the blade server or blade filler.
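The removal procedure contains two timing constraints: wait at least 30 seconds after power-off before you open the release handles, and refill the bay within 1 minute. The following Python sketch shows one way to check a set of recorded times against those two limits. It is a minimal illustration with hypothetical names, and it assumes that the 1-minute limit is measured from the moment the blade server is pulled out of the bay, which this guide does not state explicitly.

```python
SPIN_DOWN_SECONDS = 30  # step 4: minimum wait after turning off the blade server
REFILL_SECONDS = 60     # step 8: bay must be refilled within 1 minute


def check_removal_timing(power_off_at: float, handles_opened_at: float,
                         blade_pulled_at: float, bay_refilled_at: float) -> list:
    """Return a list of timing problems; an empty list means both limits were met.

    All arguments are times in seconds on the same clock (hypothetical log values).
    """
    problems = []
    if handles_opened_at - power_off_at < SPIN_DOWN_SECONDS:
        problems.append("handles opened before the 30-second wait elapsed")
    # Assumption: the 1-minute limit starts when the blade server leaves the bay.
    if bay_refilled_at - blade_pulled_at > REFILL_SECONDS:
        problems.append("bay left open for more than 1 minute")
    return problems
```

A removal logged as power-off at 0 seconds, handles opened at 45, blade pulled at 50, and bay refilled at 90 yields an empty list.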
Installing the blade server in a BladeCenter unit
Install the blade server in a BladeCenter unit to use the blade server.
Figure 9. Installing the blade server in a BladeCenter unit
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always replace
the blade server cover before installing the blade server.
Perform the following procedure to install a blade server in a BladeCenter unit.
1. Go to http://www.ibm.com/systems/support/ to download the latest firmware for the blade server.
Download the firmware so that you can use it later to update the blade server after you start it.
2. Read the Safety topic and the “Installation guidelines” on page 233.
3. If you have not done so already, install any optional devices that you want, such as a SAS drive or
memory modules.
4. Select the bay for the blade server.
• See the online information or the Installation and User's Guide that comes with your BladeCenter
unit to verify that the bay you choose is powered.
• Ensure proper cooling, performance, and system reliability by installing a blade server, expansion
unit, or blade filler in each blade bay.
• Reinstall a blade server in the same blade bay to preserve configuration information and update
options that are established by blade bay. Reinstalling into a different blade bay can have
unintended consequences, which might include having to reconfigure the blade server.
5. Verify that the release handles on the blade server are in the open position (perpendicular to the
blade server, as shown in ▌1▐ in Figure 9 on page 236).
6. If you installed a filler blade or another blade server in the bay from which you removed the blade
server, remove it from the bay.
7. Slide the blade server into the blade bay from which you removed it until the blade server stops.
The spring-loaded doors farther back in the bay that cover the bay opening move out of the way as
you insert the blade server.
8. Push the release handles on the front of the blade server to close and lock them.
Discovery and initialization can take up to three minutes. The process is complete when the
green LED stops flashing rapidly and begins to flash slowly. At this point, you can power on the
blade server.
9. Turn on the blade server. See “Turning on the blade server” on page 6.
10. Verify that the power-on LED on the blade server control panel is lit continuously. The continuous
light indicates that the blade server is receiving power and is turned on.
11. Optional: Write identifying information on one of the user labels that come with the blade servers
and place the label on the BladeCenter unit bezel.
Important: Do not place the label on the blade server or in any way block the ventilation holes on
the blade server. See the online information or the documentation that comes with your BladeCenter
unit for information about label placement.
12. Use the SMS Utility program to configure the blade server. See “Using the SMS utility” on page 265.
13. Also use the management module to configure the blade server. See the documentation for the
management module to understand the functions that the management module provides.
If you have changed the configuration of the blade server or if this is a different blade server than the
one you removed, you must configure the blade server. You might also have to install the blade server
operating system.
See “Installing the operating system” in the online information or the Installation and User's Guide PDF
for detailed information about these tasks.
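Steps 8 through 10 of the installation procedure describe three LED states: flashing rapidly during discovery and initialization, flashing slowly when discovery is complete, and lit continuously when the blade server is turned on. The following Python sketch maps those observations to a status. The function name and the pattern strings are hypothetical, the sketch treats the three states as observations of one indicator, and the "discovery overdue" result is an assumption, because this guide does not say what to do when discovery exceeds three minutes.

```python
DISCOVERY_LIMIT_SECONDS = 180  # "up to three minutes" in step 8


def interpret_led(pattern: str, seconds_since_latched: float = 0.0) -> str:
    """Map an observed LED pattern to a status string (illustrative only)."""
    if pattern == "flashing rapidly":
        if seconds_since_latched > DISCOVERY_LIMIT_SECONDS:
            # Assumption: the guide gives no action for this case.
            return "discovery overdue"
        return "discovery in progress"
    if pattern == "flashing slowly":
        # Discovery and initialization are complete; the blade can be turned on.
        return "ready to turn on"
    if pattern == "lit continuously":
        # The blade server is receiving power and is turned on.
        return "turned on"
    raise ValueError(f"unknown LED pattern: {pattern!r}")
```

For example, interpret_led("flashing slowly") returns "ready to turn on", which corresponds to the point at which step 9 can begin.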
Removing and replacing Tier 1 CRUs
Replacement of Tier 1 customer-replaceable units (CRUs) is your responsibility.
If IBM installs a Tier 1 CRU at your request, you will be charged for the installation.
The illustrations in this documentation might differ slightly from your hardware.
Removing the blade server cover
Remove the blade server from the chassis unit and press the blade server cover releases to open and
remove the blade server cover.
Figure 10. Removing the cover
Perform the following procedure to open and remove the blade server cover.
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
4. Press the blade-cover release (as shown by ▌1▐ for the base unit in Figure 10) on each side of the
blade server, rotate the cover on the cover pins (▌2▐) and lift the cover open.
5. Lay the cover flat, or lift it from the cover pins on the blade server and store the cover for future use.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
Installing and closing the blade server cover
Install and close the cover of the blade server before you insert the blade server into the BladeCenter
unit. Do not attempt to override this important protection.
Figure 11. Installing the cover
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always replace
the blade server cover before installing the blade server.
Perform the following procedure to replace and close the blade server cover.
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Lower the cover so that the slots at the rear slide down onto the pins (▌1▐ in Figure 11) at the rear of
the blade server. Before you close the cover, verify that all components are installed and seated
correctly and that you have not left loose tools or parts inside the blade server.
3. Pivot the cover to the closed position until the releases (as shown by ▌2▐) click into place in the cover.
4. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
Removing the bezel assembly
Remove the bezel assembly.
Figure 12. Removing the bezel assembly
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
4. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
5. Press the bezel-assembly release (as shown by ▌3▐ in Figure 12) on each side of the blade server and
pull the bezel assembly (▌4▐) away from the blade server approximately 1.2 cm (0.5 inch).
6. Disconnect the control-panel cable (▌1▐) from the control-panel connector (▌2▐).
7. Pull the bezel assembly away from the blade server.
8. If you are instructed to return the bezel assembly, follow all packaging instructions, and use any
packaging materials for shipping that are supplied to you.
Installing the bezel assembly
Install the bezel assembly.
Figure 13. Installing the bezel assembly
1. Connect the control-panel cable (▌1▐ in Figure 13) to the control-panel connector (▌2▐) on the system
board.
2. Carefully slide the bezel assembly (▌4▐) onto the blade server until the two bezel-assembly releases
(▌3▐) click into place in the bezel assembly.
3. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
4. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
Removing a drive
You can remove the SAS hard disk drive.
Figure 14. Removing a drive
Perform the following procedure to remove the drive.
1. Back up the data from the drive to another storage device.
2. Read the Safety topic and the “Installation guidelines” on page 233.
3. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
4. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
5. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
6. Remove the drive:
a. Pull and hold the blue release lever ▌1▐ at the front of the drive tray.
b. Slide the drive forward ▌2▐ to disengage the connector.
c. Lift the drive ▌3▐ out of the drive tray.
Installing a drive
You can install a hard disk drive in the drive tray.
Figure 15 on page 243 shows how to install the disk drive.
Figure 15. Installing a drive
All drive connectors are on the same bus. If the two drives are both SAS hard disk drives, you can use
them to implement and manage a redundant array of independent disks (RAID) level-1 array. See
“Configuring a RAID array” on page 269 for information about RAID configuration.
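As a general property of RAID level 1, the two drives mirror each other, so the usable capacity of the array is the capacity of the smaller drive. The following Python sketch works through that arithmetic. The function name and the drive sizes in the example are hypothetical and are not taken from the PS700 specifications.

```python
def raid1_usable_capacity_gb(drive_a_gb: float, drive_b_gb: float) -> float:
    """Usable capacity of a two-drive RAID level-1 (mirrored) array.

    Each block is written to both drives, so the smaller drive sets the limit.
    """
    if drive_a_gb <= 0 or drive_b_gb <= 0:
        raise ValueError("drive capacities must be positive")
    return min(drive_a_gb, drive_b_gb)
```

For two hypothetical drives of 300 GB and 600 GB, the mirrored array offers 300 GB of usable space.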
To install a drive, complete the following steps.
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
4. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
5. Locate the connector for the drive.
6. Place the drive ▌1▐ into the drive tray and push it toward the rear of the blade, into the connector
until the drive moves past the lever at the front of the tray.
Attention:
Do not press on the top of the drive. Pressing the top might damage the drive.
7. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
8. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
Removing a memory module
You can remove a very low profile (VLP) dual-inline memory module (DIMM).
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
4. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
5. Remove the bezel. See “Removing the bezel assembly” on page 240.
6. Locate the DIMM connector that contains the DIMM that is to be replaced.
Figure 16. DIMM connectors. Base unit connectors
Attention: To avoid breaking the DIMM retaining clips or damaging the DIMM connectors, open
and close the clips gently.
7. Carefully open the retaining clips on each end of the DIMM connector and remove the DIMM.
Note: Install a DIMM filler in any location where a DIMM is not present to avoid machine damage.
8. If you are instructed to return the DIMM, follow all packaging instructions, and use any packaging
materials for shipping that are supplied to you.
Installing a memory module
Install dual inline memory modules (DIMMs) in the blade server.
Table 37 shows allowable placement of DIMM modules:
Table 37. Memory module combinations
DIMM count   PS700 Base blade planar (P1) DIMM slots
             Populated among slots 1-4   Populated among slots 5-8
2            2                           0
4            2                           2
6            4                           2
8            4                           4
Figure 17. DIMM connectors. Base unit connectors
See “Supported DIMMs” on page 4 for additional information about the type of memory that is
compatible with the blade server.
To install a DIMM, complete the following steps:
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Read the documentation that comes with the DIMMs.
3. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
4. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
5. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
6. Remove the bezel. See “Removing the bezel assembly” on page 240.
7. Locate the DIMM connectors on the system board. See the illustration in “System-board connectors”
on page 8. Determine the connector into which you will install the DIMM.
8. Touch the static-protective package that contains the part to any unpainted metal surface on the
BladeCenter unit or any unpainted metal surface on any other grounded rack component; then,
remove the part from its package.
9. Verify that both of the connector retaining clips are in the fully open position.
10. Turn the DIMM so that the DIMM keys align correctly with the connector on the system board.
Attention: To avoid breaking the DIMM retaining clips or damaging the DIMM connectors, handle
the clips gently.
11. Insert the DIMM by pressing the DIMM along the guides into the connector.
Verify that each retaining clip snaps into the closed position.
Important: If there is a gap between the DIMM and the retaining clips, the DIMM is not correctly
installed. Open the retaining clips to remove and reinsert the DIMM. Install a DIMM filler in any
location where a DIMM is not present to avoid machine damage.
12. Attach the bezel. See “Installing the bezel assembly” on page 240.
13. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
14. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
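Two rules from this topic lend themselves to a quick check before you close the cover: Table 37 lists DIMM counts of 2, 4, 6, and 8 for the eight DIMM slots, and every slot without a DIMM needs a DIMM filler. The following Python sketch computes the number of fillers for a planned DIMM count. The names are hypothetical, and the sketch does not decide which slots to populate; use Table 37 for slot placement.

```python
TOTAL_DIMM_SLOTS = 8
ALLOWED_DIMM_COUNTS = (2, 4, 6, 8)  # DIMM counts listed in Table 37


def fillers_needed(dimm_count: int) -> int:
    """Return how many DIMM fillers are needed for a planned DIMM count."""
    if dimm_count not in ALLOWED_DIMM_COUNTS:
        raise ValueError(
            f"{dimm_count} DIMMs is not a combination listed in Table 37")
    # Every slot that does not hold a DIMM must hold a filler.
    return TOTAL_DIMM_SLOTS - dimm_count
```

For example, a blade server with 6 DIMMs needs 2 DIMM fillers, and a request for 3 DIMMs raises an error because that count is not listed.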
Removing and installing an I/O expansion card
Add an I/O expansion card to the blade server to provide additional connections for communicating on a
network.
The blade server supports various types of I/O expansion cards, including Gigabit Ethernet, Fibre
Channel, and Myrinet expansion cards.
Verify that any expansion card that you are using is listed on the ServerProven Web site in the list of
supported expansion cards for the PS700 blade server. For example, the following expansion cards are not
supported by the PS700 blade server:
• BladeCenter SFF Gb Ethernet
• Cisco 1X InfiniBand
• Qlogic iSCSI TOE Expansion Card (LFF)
• Broadcom 1Gb Ethernet (CIOv)
• SAS 3Gb Expansion Card (CIOv)
• Emulex 4Gb Fibre Channel Expansion Card (CIOv)
• Qlogic 4Gb SFF Fibre Channel Expansion card (CIOv)
See the ServerProven Web site for information about supported operating-system versions and all PS700
blade server optional devices.
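If you track expansion cards in an inventory script, the examples above can serve as a first filter. The following Python sketch checks a card name against the list in this topic. The function name is hypothetical, and the list contains only the examples given here, so a result of False does not mean that a card is supported; the ServerProven Web site remains the reference.

```python
# Expansion cards that this guide lists as examples of unsupported cards.
LISTED_UNSUPPORTED = (
    "BladeCenter SFF Gb Ethernet",
    "Cisco 1X InfiniBand",
    "Qlogic iSCSI TOE Expansion Card (LFF)",
    "Broadcom 1Gb Ethernet (CIOv)",
    "SAS 3Gb Expansion Card (CIOv)",
    "Emulex 4Gb Fibre Channel Expansion Card (CIOv)",
    "Qlogic 4Gb SFF Fibre Channel Expansion card (CIOv)",
)


def is_listed_unsupported(card_name: str) -> bool:
    """Return True if the name matches an entry in the list above.

    Matching ignores case and surrounding spaces.
    """
    wanted = card_name.strip().lower()
    return any(wanted == name.lower() for name in LISTED_UNSUPPORTED)
```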
Removing a CIOv form-factor expansion card
You can remove a CIOv form-factor expansion card from the 1Xe connector.
Figure 18. Removing a CIOv form factor expansion card from the 1Xe connector
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
4. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
5. Lift the expansion card ▌1▐ up and away from the 1Xe connector and out of the blade server.
6. If you are instructed to return the expansion card, follow all packaging instructions, and use any
packaging materials for shipping that are supplied to you.
Installing a CIOv form-factor expansion card
You can install a CIOv form-factor expansion card on the 1Xe connector to expand the I/O capabilities of
the blade server.
Figure 19. Installing a CIOv form-factor expansion card on the 1Xe connector
To install a CIOv form-factor expansion card, complete the following steps:
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
4. Touch the static-protective package that contains the part to any unpainted metal surface on the
BladeCenter unit or any unpainted metal surface on any other grounded rack component; then, remove
the part from its package.
5. Orient the expansion card ▌1▐ over the system board.
6. Lower the card to the system board, aligning the connectors on the card with the 1Xe connector on
the system board.
7. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
8. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
9. Use the documentation that comes with the expansion card to install device drivers and to perform
any configuration that the expansion card requires.
Removing a combination-form-factor expansion card
Complete this procedure to remove a combination-form-factor expansion card.
Figure 20. Removing a combination-form-factor expansion card
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
4. Remove the horizontal (CFFh) CFFe expansion card ▌2▐.
a. Pull up on the camming lever to disengage the card from the high-speed PCI-Express connector.
b. Gently pivot the card up and out of the expansion card standoff ▌3▐ on the system board.
c. Lift the card out of the blade server.
d. Optional: Reattach the plastic cover ▌1▐ for the PCI-Express connector, if it is available.
5. If you are instructed to return the expansion card, follow all packaging instructions, and use any
packaging materials for shipping that are supplied to you.
Installing a combination-form-factor expansion card
Install a combination-form-factor expansion card to expand the I/O capabilities of the blade server.
Figure 21. Installing a combination-form-factor expansion card
To install a combination-form-factor expansion card, complete the following steps:
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
4. Remove the plastic cover for the PCI-Express (PCI-e) connector ▌1▐.
5. Touch the static-protective package that contains the part to any unpainted metal surface on the
BladeCenter unit or any unpainted metal surface on any other grounded rack component; then, remove
the part from its package.
6. Install the expansion card ▌2▐.
a. Slide the card into the expansion card standoff ▌3▐ on the system board.
b. Gently pivot the card down and attach it to the high-speed PCI-Express connector.
7. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
8. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
9. Use the documentation that comes with the expansion card to install device drivers and to perform
any configuration that the expansion card requires.
Removing the battery
You can remove and replace the battery.
Figure 22. Removing the battery
Perform the following procedure to remove the battery.
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
4. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
5. Locate the battery on the system board. See “System-board connectors” on page 8 for the location of
the battery connector.
6. Use your finger to press down on one side of the battery; then, slide the battery out from its socket.
The spring mechanism will push the battery out toward you as you slide it from the socket.
Note: You might need to lift the battery edge slightly with a pen, fingernail, or other small object to
make it easier to release the battery.
7. Use your thumb and index finger to pull the battery from under the battery clip.
Note: After you remove the battery, press gently on the clip to make sure that the battery clip is
touching the base of the battery socket.
Installing the battery
You can install the battery.
Figure 23. Installing the battery
The following notes describe information that you must consider when replacing the battery in the blade
server.
• When replacing the battery, you must replace it with a lithium battery of the same type from the same
manufacturer.
• To order replacement batteries, call 1-800-426-7378 within the United States, and 1-800-465-7999 or
1-800-465-6666 within Canada. Outside the U.S. and Canada, call your IBM marketing representative or
authorized reseller.
• After you replace the battery:
1. Set the time and date.
2. Set the network IP addresses (for blade servers that start up from a network).
3. Reconfigure any other blade server settings.
• To avoid possible danger, read and follow this safety statement.
Statement 2:
CAUTION:
When replacing the lithium battery, use only IBM Part Number 33F8354 or an equivalent type battery
recommended by the manufacturer. If your system has a module containing a lithium battery, replace
it only with the same module type made by the same manufacturer. The battery contains lithium and
can explode if not properly used, handled, or disposed of.
Do not:
• Throw or immerse into water
• Heat to more than 100°C (212°F)
• Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Perform the following procedure to install the battery.
1. Follow any special handling and installation instructions that come with the battery.
2. Tilt the battery so that you can insert it into the socket, under the battery clip. Make sure that the side
with the positive (+) symbol is facing up.
3. As you slide it under the battery clip, press the battery down into the socket.
4. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
5. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
6. Turn on the blade server and reset the system date and time through the operating system that you
installed. For additional information, see your operating-system documentation.
7. Make sure that the boot list is correct by using the management module Web interface (see the
management module documentation for more information) or the SMS Utility (see “Using the SMS
utility” on page 265 for more information).
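The follow-up work after a battery replacement can be collected into one checklist: the three items from the notes at the start of this topic plus the boot-list check in step 7. The following Python sketch builds that checklist. The function name is hypothetical, and the only decision it makes is to include the network IP address item only for blade servers that start up from a network, as the notes describe.

```python
def post_battery_tasks(boots_from_network: bool) -> list:
    """Return the follow-up tasks after a battery replacement, in order."""
    tasks = ["Set the time and date"]
    if boots_from_network:
        # Applies only to blade servers that start up from a network.
        tasks.append("Set the network IP addresses")
    tasks.append("Reconfigure any other blade server settings")
    tasks.append("Verify the boot list (management module Web interface or SMS utility)")
    return tasks
```

For a blade server that starts from a local drive, post_battery_tasks(False) returns three tasks; for a network-booted blade server it returns four.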
Removing the disk drive tray
You can remove the disk drive tray.
Figure 24. Removing the disk drive tray
Perform the following procedure to remove the disk drive tray.
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
4. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
5. Remove the disk drive if one is installed. See “Removing a drive” on page 241.
6. Remove the four screws that secure the drive tray (▌1▐ in Figure 24) to the system board and remove
the drive tray.
Installing the disk drive tray
You can install the disk drive tray.
Figure 25. Installing the disk drive tray
To install the disk drive tray, complete the following steps:
1. Place the drive tray (▌1▐ in Figure 25) into position on the system board and install the four screws to
secure it.
2. Install the disk drive that was removed from the drive tray. See “Installing a drive” on page 242 for
instructions.
3. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
4. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
Removing the tier 2 management card
You can remove this tier 2 CRU yourself or request IBM to remove it, at no additional charge, under the
type of warranty service that is designated for the blade server. Remove the management card to replace
the card or to reuse the card in a new system board and chassis assembly.
Attention: Replacing the management card and the system board at the same time might result in the
loss of vital product data (VPD) and information concerning the number of active processor cores. If the
management card and system board must both be replaced, replace them one at a time. For further
assistance, contact your next level of support.
To remove the management card, which is shown by ▌1▐ in Figure 26, complete the following steps:
Note: See ▌3▐ in “System-board connectors” on page 8 for the location of the management card on the
system board.
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Turn off the blade server and remove the blade server from the BladeCenter unit. See “Removing the
blade server from a BladeCenter unit” on page 235.
3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
4. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
5. Grasp the management card and pull it vertically out of the unit to disengage the connectors.
Figure 26. Removing the management card
6. If you are instructed to return the management card, follow all packaging instructions, and use any
packaging materials for shipping that are supplied to you.
7. Replace the management card. See “Installing the tier 2 management card” on page 256.
Installing the tier 2 management card
You can install this tier 2 CRU yourself or request IBM to install it, at no additional charge, under the
type of warranty service that is designated for the blade server. Use this procedure to install the
management card into the currently installed system board. If you are also installing a new system board,
you must complete this procedure before installing the new system board.
Attention: Replacing the management card and the system board at the same time might result in the
loss of vital product data (VPD) and information concerning the number of active processor cores. If the
management card and system board must both be replaced, replace them one at a time. For further
assistance, contact your next level of support.
Figure 27. Installing the management card
To install the management card, which is shown by ▌1▐ in Figure 27, complete the following steps:
1. Read the documentation that comes with the management card, if you ordered a replacement card.
2. Locate the connector on the currently installed system board into which the management card will
be installed. See “System-board connectors” on page 8 for the location.
3. Touch the static-protective package that contains the management card to any unpainted metal
surface on the BladeCenter unit or any unpainted metal surface on any other grounded rack
component; then, remove the management card from its package.
4. Insert the management card (as shown by ▌1▐ in Figure 27) and verify that the card is securely on
the connector and pushed down all the way to the main board.
5. Were you sent to this procedure from the “Replacing the FRU system-board and chassis assembly”
on page 260 procedure?
Yes:
Return to the “Replacing the FRU system-board and chassis assembly” procedure.
No:
Continue with the next step.
6. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
7. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
Attention: If the management card was not properly installed, the power-on LED blinks rapidly
and a communication error is reported to the management module. If this occurs, remove the blade
server from the BladeCenter, as described in “Removing the tier 2 management card” on page 255.
Reseat the management card, then reinstall the blade server in the BladeCenter.
8. Power on the blade server. If a Virtual I/O Server (VIOS) partition is installed, power on only the
VIOS partition. If VIOS is not installed, power on one of the partitions. If the blade server is
managed by a management console, wait for the blade server to be discovered by the management
console before continuing with the next step.
9. A new Virtualization Engine technologies (VET) code must be generated and activated. Perform
“Obtaining a PowerVM Virtualization Engine system technologies activation code.” Wait at least five
minutes to ensure that the VET activation code is stored in the vital product data on the
management card and then power off the operating system and the blade server. Continue with the
next step of this procedure.
10. Power on the operating system of all the remaining partitions of the blade server.
11. If you replaced the part because of a service action, verify the repair by checking that the amber
enclosure fault LED is off. For more information, see “Blade server control panel buttons and LEDs”
on page 5.
Obtaining a PowerVM Virtualization Engine system technologies
activation code
After you replace the management card, you must reenter the activation code for the PowerVM function
to enable virtualization.
Before you complete this procedure, install the management card, as described in “Installing the tier 2
management card” on page 256.
PowerVM is one of the Capacity on Demand advanced functions. Capacity on Demand advanced
functions are also referred to as Virtualization Engine systems technologies or Virtualization Engine
technologies (VET).
To locate your VET code and then install the code on your blade server, complete the following steps:
1. Power on the blade server. If a Virtual I/O Server (VIOS) partition is installed, power on only the
VIOS partition. If VIOS is not installed, power on one of the partitions.
2. List the activation information that you must supply when the new VET activation code is ordered
through one of the following methods:
v By using Hardware Management Console (HMC):
a. In the navigation area, expand Systems Management.
b. Select Servers.
c. In the contents area, select the destination blade server.
d. Click Tasks > Capacity on Demand (CoD) > PowerVM > View Code Information.
The following is an example of the PowerVM output:
Note: When you request the activation code, you must supply the information that is
emphasized in the following example.
CoD VET Information
System type: 7895
System serial number: 12-34567
Anchor card CCIN: 52EF
Anchor card serial number: 01-231S000
Anchor card unique identifier: 30250812077C3228
Resource ID: CA1F
Activated Resources: 0000
Sequence number: 0040
Entry check: EC
Go to step 3.
v By using Integrated Virtualization Manager (IVM):
a. Start the IVM, if it is not running already.
b. In an IVM session, enter the lsvet -t code command.
The following is an example of the lsvet command output:
Note: When you request the activation code, you must supply the information that is
emphasized in the following example.
sys_type=7895,sys_serial_num=12-34567,anchor_card_ccin=52EF,
anchor_card_serial_num=01-231S000,anchor_card_unique_id=30250812077C3228,
resource_id=CA1F,activated_resources=0000,sequence_num=0040,entry_check=EC
Go to step 3.
v By using IBM Systems Director Management Console (SDMC):
a. On the Welcome page, click Resources.
b. Expand Hosts, and select the destination host.
c. Click Actions, and select System Configuration > Capacity on Demand (CoD).
d. In the Capacity on Demand window, select Advanced Functions from the Select On Demand
Type list, and then select PowerVM.
e. Click View Code Information.
The following is an example of the PowerVM output:
Note: When you request the activation code, you must supply the information that is
emphasized in the following example.
CoD VET Information
System type: 7895
System serial number: 12-34567
Anchor card CCIN: 52EF
Anchor card serial number: 01-231S000
Anchor card unique identifier: 30250812077C3228
Resource ID: CA1F
Activated Resources: 0000
Sequence number: 0040
Entry check: EC
Go to step 3.
3. Send a request for the VET activation code for your replacement management card to the System p
Capacity on Demand mailbox at pcod@us.ibm.com.
If the HMC or SDMC was used to get the VET information, include the following fields and their
values:
v System type
v System serial number
v Anchor card CCIN
v Anchor card serial number
v Anchor card unique identifier
If the IVM lsvet -t code command was used to get the VET information, include the following fields
and their values:
v sys_type
v sys_serial_num
v anchor_card_ccin
v anchor_card_serial_num
v anchor_card_unique_id
The following is an example of a statement that you can include in the email:
Provide a VET activation code for the new
management card (anchor card) in my blade server.
The System p Capacity on Demand site then generates the code and posts the VET activation code on
the website.
4. Go to the Capacity on Demand Activation code web page to retrieve your code.
a. Enter the system type, which is the value of the sys_type field from the IVM or the System type
field from the HMC or SDMC.
b. Enter the serial number, which is the value of the sys_serial_num from the IVM or the System
serial number field from the HMC or SDMC.
If the code has not yet been assigned, then the following note is displayed:
No matching activation code found.
Otherwise, look for a VET entry with a date that aligns with your request date. Then, record the
corresponding VET activation code.
5. Enter the VET activation code to activate PowerVM virtualization through one of the following
methods:
v For HMC or IVM graphical user interface (GUI), see Entering the PowerVM activation code.
v For SDMC GUI, see Entering the activation code for PowerVM Editions using the SDMC.
v For IVM command-line interface (CLI), complete the following steps:
a. In an IVM session, enter the following command to activate the 34-character VET activation
code.
chvet -o e -k <activation_code>
where <activation_code> is the VET activation code that you recorded in step 4.
For example, if the activation code is 4D8D6E7A81409365CA1F000028200041FD, enter the following
command:
chvet -o e -k 4D8D6E7A81409365CA1F000028200041FD
b. In an IVM session, validate that the code entry is successful by using the lsvet -t hist
command.
The following is an example of the command output:
time_stamp=03/06/2013 16:25:08,"entry=[VIOSI05000400-0331] CoD advanced
functions activation code entered, resource ID: CA1F, capabilities: 2820."
time_stamp=03/06/2013 16:25:08,entry=[VIOSI05000403-0332] Virtual I/O server
capability enabled.
time_stamp=03/06/2013 16:25:08,entry=[VIOSI05000405-0333] Micro-partitioning
capability enabled.
This ends the procedure.
The procedure does not require you to restart the blade server to activate the PowerVM virtualization
functions.
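The field selection in steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not an IBM tool: it assumes that you have already captured the text printed by the lsvet -t code command, and the names parse_lsvet_code, request_fields, and REQUEST_FIELDS are hypothetical names chosen for this example.

```python
# Sketch: extract the fields that step 3 asks for from `lsvet -t code` output.
# All function and constant names here are hypothetical, chosen for illustration.

REQUEST_FIELDS = (
    "sys_type",
    "sys_serial_num",
    "anchor_card_ccin",
    "anchor_card_serial_num",
    "anchor_card_unique_id",
)


def parse_lsvet_code(output: str) -> dict:
    """Split comma-separated key=value pairs into a dictionary."""
    fields = {}
    for pair in output.replace("\n", "").split(","):
        key, sep, value = pair.strip().partition("=")
        if sep:  # ignore fragments that carry no '='
            fields[key.strip()] = value.strip()
    return fields


def request_fields(output: str) -> dict:
    """Return only the fields to include in the activation-code request."""
    parsed = parse_lsvet_code(output)
    missing = [name for name in REQUEST_FIELDS if name not in parsed]
    if missing:
        raise ValueError("lsvet output is missing: " + ", ".join(missing))
    return {name: parsed[name] for name in REQUEST_FIELDS}


if __name__ == "__main__":
    # Sample values copied from the example output in step 2.
    sample = (
        "sys_type=7895,sys_serial_num=12-34567,anchor_card_ccin=52EF,"
        "anchor_card_serial_num=01-231S000,anchor_card_unique_id=30250812077C3228,"
        "resource_id=CA1F,activated_resources=0000,sequence_num=0040,entry_check=EC"
    )
    for name, value in request_fields(sample).items():
        print(f"{name}: {value}")
```

The sketch raises an error when a required field is absent, so an incomplete capture is noticed before the request is sent.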
Note: If you intend to restart the blade server after you complete this procedure, wait at least 5 minutes
before you restart the blade server, or before you power off and power on the blade server. This is to
ensure that the activation code is stored in the vital product data on the management card.
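Before you enter the code in step 5, it can help to check it for transcription errors. The following Python sketch builds the chvet command line shown in step 5 after a basic check. The 34-character length comes from this procedure; the check that the code contains only ASCII letters and digits is an assumption based on the example code, and build_chvet_command is a hypothetical name.

```python
# Sketch: validate a VET activation code and build the IVM command from step 5.
# build_chvet_command() is a hypothetical helper, not part of any IBM tool.
import shlex

VET_CODE_LENGTH = 34  # length stated in step 5 of the procedure


def build_chvet_command(activation_code: str) -> str:
    """Return the `chvet -o e -k <code>` command line for the given code."""
    code = activation_code.strip()
    if len(code) != VET_CODE_LENGTH:
        raise ValueError(
            f"expected {VET_CODE_LENGTH} characters, got {len(code)}"
        )
    # Assumption: codes contain only ASCII letters and digits, as in the example.
    if not (code.isascii() and code.isalnum()):
        raise ValueError("activation code contains unexpected characters")
    return "chvet -o e -k " + shlex.quote(code)


if __name__ == "__main__":
    print(build_chvet_command("4D8D6E7A81409365CA1F000028200041FD"))
```

The sketch only formats the command. Run the resulting command in an IVM session, and then confirm the result with the lsvet -t hist command as described in step 5.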
Replacing the FRU system-board and chassis assembly
FRUs must be replaced only by trained service technicians. Replace the system board and chassis
assembly. When replacing the system board, you will replace the system board, blade base (chassis),
microprocessors, and heat sinks as one assembly. After replacement, you must either update the system
with the latest firmware or restore the pre-existing firmware that the customer provides on a diskette or
CD image.
Note: See “System-board layouts” on page 8 for more information on the locations of the connectors and
LEDs on the system board.
Perform the following procedure to replace the system-board and chassis assembly.
1. Read the Safety topic and the “Installation guidelines” on page 233.
2. Is the blade server managed by a management console?
Yes
Continue with step 3.
No
Continue with step 5 on page 261.
3. If the blade server has the Virtual I/O Server (VIOS) installed or utilizes more than one partition,
back up partition profile data by using one of the following methods:
v If this system is managed by a Hardware Management Console (HMC), back up the partition
profile data by using the HMC. See Backing up the partition profile data
(http://pic.dhe.ibm.com/infocenter/powersys/v3r1m5/topic/p7hbm/backupprofdata.htm).
v If this system is managed by an Integrated Virtualization Manager (IVM), back up the partition
profile data. Using the IVM command line, enter the bkprofdata command. For more information
about the bkprofdata command, see IVM bkprofdata command
(http://pic.dhe.ibm.com/infocenter/powersys/v3r1m5/topic/p7hcg/bkprofdata.htm).
Note: Although the HMC management console automatically saves partition profile data, which is
used for recovery in step 18 on page 262, the manual backup completed in step 3 is recommended as
a precaution and best practice before system board replacement.
4. Does the blade server have Fibre Channel adapters?
Yes
Perform “Save vfchost map data” on page 218. Then, continue with the next step.
No
Continue with the next step.
5. Shut down the operating system, turn off the blade server, and remove the blade server from the
BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235.
6. Carefully lay the blade server on a flat, static-protective surface, with the cover side up.
7. Open and remove the blade server cover. See “Removing the blade server cover” on page 237.
8. Remove the blade server bezel assembly. See “Removing the bezel assembly” on page 240.
9. Remove any of the installed components listed below from the system board; then, place them on a
non-conductive surface or install them on the new system board and chassis assembly.
v I/O expansion card. See “Removing and installing an I/O expansion card” on page 246.
v Hard disk drives. See “Removing a drive” on page 241.
v DIMMs. See “Removing a memory module” on page 244.
v Management card. See “Removing the tier 2 management card” on page 255.
v Battery. See “Removing the battery” on page 250.
10. Touch the static-protective package that contains the system-board and chassis assembly to any
unpainted metal surface on the BladeCenter unit or any unpainted metal surface on any other
grounded rack component; then, remove the assembly from its package.
11. Install any of the components listed below that were removed from the old system-board and chassis
assembly.
v I/O expansion card. See “Removing and installing an I/O expansion card” on page 246.
v Hard disk drives. See “Installing a drive” on page 242.
v DIMMs. See “Installing a memory module” on page 245.
v Management card. See “Installing the tier 2 management card” on page 256.
v Battery. See “Installing the battery” on page 251.
Note: Install a DIMM filler or a hard drive filler in any location where a DIMM or hard drive is not
present to avoid machine damage.
12. Install the bezel assembly. See “Installing the bezel assembly” on page 240 for instructions.
13. Install and close the blade server cover. See “Installing and closing the blade server cover” on page
239.
Statement 21
CAUTION:
Hazardous energy is present when the blade server is connected to the power source. Always
replace the blade server cover before installing the blade server.
14. Write the machine type, model number, and serial number of the blade server on the repair
identification (RID) tag that comes with the replacement system-board and chassis assembly. This
information is on the identification label that is behind the control-panel door on the front of the
blade server.
Important: Completing the information on the RID tag ensures future entitlement for service.
15. Place the RID tag on the bottom of the blade server chassis.
16. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter
unit” on page 236.
17. Is the blade server managed by a management console?
Yes
Continue with step 18.
No
Turn on the blade server and then continue with step 19.
Note: Because you are using a management card that was initialized from your old system board
and chassis assembly, the firmware retrieves the vital product data (VPD) from the management card
and caches it in blade server memory and you do not see any prompts for data.
18. If the blade server has the VIOS installed or utilizes more than one partition, restore partition profile
data by using one of the following methods:
v If this system is managed by an HMC, recover the partition profile data because the blade server
is in Recovery state. For instructions, see Correcting a Recovery state for a managed system
(http://pic.dhe.ibm.com/infocenter/powersys/v3r1m5/topic/p7eav/aremanagedsystemstate_recovery.htm).
Note: The blade server is turned on as part of the recovery process.
v If this system is managed by an IVM, turn on the blade server, then restore the partition profile
data. Using the IVM command line, enter the rstprofdata command. For more information about
the rstprofdata command, see IVM rstprofdata command
(http://pic.dhe.ibm.com/infocenter/powersys/v3r1m5/topic/p7hcg/rstprofdata.htm).
Note: Because you are using a management card that was initialized from your old system board
and chassis assembly, the firmware retrieves the VPD from the management card and caches it in
blade server memory and you do not see any prompts for data.
19. Does the blade server have Fibre Channel adapters?
Yes
Perform “Restore vfchost map data” on page 219. Then, continue with the next step.
No
Continue with the next step.
20. Reset the system date and time through the operating system that you installed.
For additional information, see the documentation for your operating system.
Chapter 5. Configuring
Update the firmware and use the management module and the system management services (SMS) to
configure the PS700 blade server.
Updating the firmware
IBM periodically makes firmware updates available for you to install on the blade server, the
management module, or expansion cards in the blade server.
Important: To avoid problems and to maintain proper system performance, always verify that the blade
server BIOS, service processor, and diagnostic firmware levels are consistent for all blade servers within
the BladeCenter unit. See “Verifying the system firmware levels” on page 221 for more information.
Plan to use a method of applying blade server firmware updates other than the management module.
The enhanced service processor has a larger firmware image that makes it impractical to download and
install over the RS-485 bus of the management module. Therefore, a firmware update for the blade server
is not supported from the management module.
You can still use the other methods of performing firmware updates for the blade server:
v In-band operating system capabilities, such as the update_flash command for Linux and AIX or the
ldfware command for Virtual I/O Server
v The firmware update function of AIX diagnostics
v The firmware update function of the stand-alone diagnostics CD
Attention: Before the installation of the new firmware to the temporary side begins, the contents of the
temporary side are copied into the permanent side. After the firmware installation begins, the previous
level of firmware on the permanent side is no longer available.
Use the following procedure to install updated firmware.
1. Start the TEMP image, as described in “Starting the TEMP image” on page 221.
2. Download the PS700 firmware.
a. Go to the IBM Support site at http://www.ibm.com/systems/support/ to download the updates.
b. Select your product, type, model, and operating system, and then click Go.
c. Click the Download tab, if necessary, for device driver and firmware updates.
d. Download the firmware to the /tmp/fwupdate directory.
3. Log on to the AIX or Linux system as root, or log on to the Virtual I/O Server as padmin.
4. Type ls /tmp/fwupdate to identify the name of the firmware.
The result of the command lists any firmware updates that you downloaded to the directory, such as
the following update:
01AA7xx_yyy_zzz
5. Install the firmware update with one of the following methods:
v Install the firmware with the in-band diagnostics of your AIX system, as described in Using the AIX
diagnostics to install the server firmware update through AIX.
v Install the firmware with the update_flash command on AIX:
cd /tmp/fwupdate
/usr/lpp/diagnostics/bin/update_flash -f 01AA7xx_yyy_zzz
v Install the firmware with the update_flash command on Linux:
cd /tmp/fwupdate
/usr/sbin/update_flash -f 01AA7xx_yyy_zzz
v Install the firmware with the ldfware command on Virtual I/O Server:
cd /tmp/fwupdate
ldfware -file 01AA7xx_yyy_zzz
Reference codes CA2799FD and CA2799FF are displayed alternately on the control panel during the
server firmware installation process. The system automatically powers off and on when the
installation is complete.
To install firmware by using the SDMC, complete the following steps:
a. Log on to the SDMC console.
b. Under Welcome to IBM Systems Director, click on the Manage tab.
c. Click on Update Manager.
The Update Manager guides you through the steps to update the firmware. For more information, see
the SDMC User's Guide.
To install firmware by using the HMC, see Managed system updates.
6. Verify that the update installed correctly, as described in “Verifying the system firmware levels” on
page 221.
7. Optional: After testing the updated server, you might decide to install the firmware update
permanently, as described in “Committing the TEMP system firmware image” on page 222.
You can also install an update permanently on either AIX or Linux, as described in:
v Using AIX commands to install a firmware update permanently
v Using Linux commands to install a firmware update permanently
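The choice of in-band update command in step 5 depends only on the operating system. The following Python sketch shows that selection as a lookup table. It uses the command paths and flags exactly as they appear in step 5; the platform keys ("aix", "linux", "vios") and the names UPDATE_COMMANDS and firmware_update_command are hypothetical and exist only for this illustration.

```python
# Sketch: pick the in-band firmware update command from step 5 by platform.
# The commands and flags are those shown in the procedure; the dictionary keys
# and function name are hypothetical.

FIRMWARE_DIR = "/tmp/fwupdate"  # download directory used in step 2

UPDATE_COMMANDS = {
    "aix": ["/usr/lpp/diagnostics/bin/update_flash", "-f"],
    "linux": ["/usr/sbin/update_flash", "-f"],
    "vios": ["ldfware", "-file"],
}


def firmware_update_command(platform: str, image_name: str) -> list:
    """Return the argument list to run from FIRMWARE_DIR for the platform."""
    try:
        prefix = UPDATE_COMMANDS[platform.lower()]
    except KeyError:
        raise ValueError(f"no in-band update command for {platform!r}") from None
    # The procedure changes to FIRMWARE_DIR first and passes a bare file name.
    if not image_name or "/" in image_name:
        raise ValueError("expected a bare file name from " + FIRMWARE_DIR)
    return prefix + [image_name]


if __name__ == "__main__":
    for name in UPDATE_COMMANDS:
        print(name, "->", " ".join(firmware_update_command(name, "01AA7xx_yyy_zzz")))
```

The sketch returns the command without running it, because the update powers the system off and on when the installation is complete.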
Configuring the blade server
While the firmware is running POST and before the operating system starts, a POST menu with POST
indicators is displayed. The POST indicators are the words Memory, Keyboard, Network, SCSI, and Speaker
that are displayed as each component is tested. You can then select configuration utilities from the POST
menu.
v System management services (SMS)
Use the system management services (SMS) utility to view information about your system or partition
and to perform tasks such as setting up remote IPL, changing self configuring SCSI device (SCSD)
settings, and selecting boot options. The SMS utility can be used for AIX or Linux partitions. See
“Using the SMS utility” for more information.
v Default boot list
Use this utility to initiate a system boot in service mode through the default service mode boot list.
This mode attempts to boot from the first device of each type that is found in the list.
Note: This is the preferred method of starting the stand-alone AIX diagnostics from CD.
v Stored boot list
Use this utility to initiate a system boot in service mode, using the customized service mode boot list
that was set up by AIX when AIX was first booted, or manually using the AIX service aids.
v Open firmware prompt
This utility is for advanced users of the IEEE 1275 specifications only.
v Management module
Use the management module to change the boot list, determine which firmware image to boot, and
perform other configuration tasks.
Using the SMS utility
Use the System Management Services (SMS) utility to perform a variety of configuration tasks on the
PS700 blade server.
Starting the SMS utility
Start the SMS utility to configure the blade server.
1. Turn on or restart the blade server, and establish an SOL session with it.
See the BladeCenter Management Module Command-Line Interface Reference Guide or the BladeCenter
Serial-Over-LAN Setup Guide for more information.
2. When the POST menu and indicators are displayed, press the 1 key after the word Keyboard is
displayed and before the word Speaker is displayed.
3. Follow the instructions on the screen.
SMS utility menu choices
Select SMS tasks from the SMS utility main menu. Choices on the SMS utility main menu depend on the
version of the firmware in the blade server. Some menu choices might differ slightly from these
descriptions.
v Select Language
Select this choice to change the language that is used to display the SMS menus.
v Setup Remote IPL (Initial Program Load)
Select this choice to enable and set up the remote startup capability of the blade server or partition.
v Change SCSD Settings
Select this choice to view and change the addresses of the self configuring SCSI device (SCSD)
controllers that are attached to the blade server.
v Select Console
Select this choice to select the console on which the SMS menus are displayed.
v Select Boot Options
Select this choice to view and set various options regarding the installation devices and boot devices.
Note: If a device that you are trying to select (such as a USB CD drive in the BladeCenter media tray)
is not displayed in the Select Device Type menu, select List all Devices and select the device from that
menu.
Creating a CE login
If the blade server is running an AIX operating system, you can create a customer engineer (CE) login to
perform operating system commands that are required to service the system without being logged in as a
root user.
The CE login must have a role of Run Diagnostics and a primary group of System. This enables the
CE login to perform the following tasks:
v Run the diagnostics, including the service aids, certify, and format.
v Run all the operating-system commands that are run by system group users.
v Configure and unconfigure devices that are not in use.
In addition, this login can have Shutdown Group enabled to allow use of the Update System Microcode
service aid and the shutdown and reboot operations.
The recommended CE login user name is qserv.
Configuring the Gigabit Ethernet controllers
Two Ethernet controllers are integrated on the blade server system board. You must install a device driver
to enable the blade server operating system to address the Ethernet controllers.
Each controller provides a 1000 Mbps full-duplex interface for connecting to one of the
Ethernet-compatible I/O modules in I/O-module bays 1 and 2, which enables simultaneous transmission
and reception of data on the Ethernet local area network (LAN).
The routing from an Ethernet controller to an I/O-module bay varies, depending on the type of
BladeCenter that the blade is installed in. For example, each Ethernet controller on the system board is
routed to a different I/O module in I/O module bay 1 or module bay 2 of the BladeCenter H or HT.
See “Blade server Ethernet controller enumeration” for information about how to determine the routing
from an Ethernet controller to an I/O-module bay for the blade server.
Note: Other types of blade servers, such as the BladeCenter HS20 Type 8678 blade server, in the same
BladeCenter unit as the PS700 blade server might have different Ethernet controller routing. See the
documentation for each blade server for information.
You must install a device driver for the blade server operating system to address the Ethernet controllers.
For device drivers and information about configuring Ethernet controllers, see the Broadcom NetXtreme
Gigabit Ethernet Software CD that comes with the blade server. For updated information about configuring
the controllers, see http://www.ibm.com/systems/support/.
The Ethernet controllers in your blade server support failover, which provides automatic redundancy for
the Ethernet controllers. Failover capabilities vary per BladeCenter unit.
Without failover, only one Ethernet controller can be connected from each server to each virtual LAN or
subnet. With failover, you can configure more than one Ethernet controller from each server to attach to
the same virtual LAN or subnet. Either one of the integrated Ethernet controllers can be configured as the
primary Ethernet controller. If you have configured the controllers for failover and the primary link fails,
the secondary controller takes over. When the primary link is restored, the Ethernet traffic switches back
to the primary Ethernet controller. See the operating-system device-driver documentation for information
about configuring for failover.
Important: To support failover on the blade server Ethernet controllers, the Ethernet switch modules in
the BladeCenter unit must have identical configurations.
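The failover behavior described above (the secondary controller takes over when the primary link fails, and traffic switches back when the primary link is restored) can be modeled with a small Python sketch. This is a conceptual model only, not device-driver configuration; the FailoverPair class and its method names are hypothetical, and the controller names eth0 and eth1 are used only as labels.

```python
# Sketch: a conceptual model of primary/secondary Ethernet failover.
# FailoverPair is a hypothetical class; it does not configure any real driver.
from typing import Optional


class FailoverPair:
    def __init__(self, primary: str, secondary: str) -> None:
        self.primary = primary
        self.secondary = secondary
        # Both links are assumed to be up when the pair is created.
        self.link_up = {primary: True, secondary: True}

    def set_link(self, controller: str, up: bool) -> None:
        """Record a link state change for one of the two controllers."""
        if controller not in self.link_up:
            raise KeyError(controller)
        self.link_up[controller] = up

    @property
    def active(self) -> Optional[str]:
        """Controller that carries traffic; the primary is preferred when up."""
        if self.link_up[self.primary]:
            return self.primary
        if self.link_up[self.secondary]:
            return self.secondary
        return None  # both links are down


if __name__ == "__main__":
    pair = FailoverPair("eth0", "eth1")
    print(pair.active)            # primary carries traffic
    pair.set_link("eth0", False)
    print(pair.active)            # secondary takes over
    pair.set_link("eth0", True)
    print(pair.active)            # traffic switches back to the primary
```

Because the primary is always preferred when its link is up, the model switches back automatically, which matches the description above.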
Blade server Ethernet controller enumeration
The enumeration of the Ethernet controllers in a blade server is operating-system dependent. You can
verify the Ethernet controller designations that a blade server uses through the operating-system settings.
The routing of an Ethernet controller to a particular I/O-module bay depends on the type of blade server.
You can verify which Ethernet controller is routed to which I/O-module bay by using the following test:
1. Install only one Ethernet switch module or pass-thru module in I/O-module bay 1.
2. Make sure that the ports on the switch module or pass-thru module are enabled. Click I/O Module
Tasks > Admin/Power/Restart in the management-module Web interface.
3. Enable only one of the Ethernet controllers on the blade server. Note the designation that the blade
server operating system has for the controller.
4. Ping an external computer on the network that is connected to the switch module or pass-thru
module. If you can ping the external computer, the Ethernet controller that you enabled is associated
with the switch module or pass-thru module in I/O-module bay 1. The other Ethernet controller in
the blade server is associated with the switch module or pass-thru module in I/O-module bay 2.
If you have installed an I/O expansion card in the blade server, communication from the expansion card
should be routed to I/O-module bays 3 and 4, if these bays are supported by your BladeCenter unit. You
can verify which controller on the card is routed to which I/O-module bay by performing the same test
and using a controller on the expansion card and a compatible switch module or pass-thru module in
I/O-module bay 3 or 4.
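The ping test in steps 1 through 4 can be sketched as a short script. This is a minimal sketch under stated assumptions: the function names, the controller names ent0 and ent1, and the address 192.0.2.10 are hypothetical, and the source describes only the successful-ping case, so a failed ping is treated here as inconclusive.

```python
import platform
import subprocess
from typing import Callable, Optional


def ping_once(host: str) -> bool:
    """Send one echo request with the system ping command."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def bays_for_controllers(enabled: str, other: str, host: str,
                         ping: Callable[[str], bool] = ping_once
                         ) -> Optional[dict]:
    """Map the two integrated controllers to I/O-module bays 1 and 2.

    Assumes that only bay 1 holds a switch or pass-thru module and that
    only `enabled` is active, as in steps 1 to 3.
    """
    if ping(host):
        # Step 4: a successful ping ties the enabled controller to bay 1,
        # so the other controller is associated with bay 2.
        return {enabled: 1, other: 2}
    # Assumption: a failed ping is inconclusive (cabling, disabled ports).
    return None


def fake_ping(host: str) -> bool:
    """Stand-in for a reachable external computer (no network needed)."""
    return True


print(bays_for_controllers("ent0", "ent1", "192.0.2.10", ping=fake_ping))
```

Passing the ping function as a parameter keeps the decision logic testable without network access. On a real system, omit the `ping` argument so that the system ping command is used.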
MAC addresses for host Ethernet adapters
Two integrated Ethernet controllers in the PS700 blade server provide a Host Ethernet Adapter (HEA)
that, in turn, provides virtual logical host Ethernet adapters (LHEAs) to client logical partitions (LPARs).
The Virtual I/O Server software uses LHEAs as if they were real physical adapters.
The logical HEAs in the PS700 blade server bypass the need for further bridging from Virtual I/O Server,
because the LHEAs connect directly to the integrated Ethernet controllers in the blade server, and from
there to the I/O modules in the BladeCenter unit.
The PS700 blade servers use two physical HEA ports and 14 logical HEA ports to share the two
integrated physical Ethernet adapters on the blade server. The 14 logical HEA medium access control
(MAC) addresses are in the same range as the two integrated Ethernet controllers (eth0 and eth1) and the
two associated physical HEA ports on the blade server.
The MAC addresses of the two physical HEAs are displayed in the advanced management module. The
MAC address of the first integrated Ethernet controller (eth0) is listed on a label on the blade server. The
label also lists the last MAC address. Table 38 shows the relative addressing scheme.
Table 38. MAC addressing scheme for physical and logical host Ethernet adapters

Node                                  Name in management module   Relation to the MAC that is listed on the PS700 label   Example
Integrated Ethernet controller eth0   -                           Same as first MAC address                               00:1A:64:44:0e:c4
Integrated Ethernet controller eth1   -                           MAC + 1                                                 00:1A:64:44:0e:c5
HEA port 0                            MAC address 1               MAC + 2                                                 00:1A:64:44:0e:c6
HEA port 1                            MAC address 2               MAC + 3                                                 00:1A:64:44:0e:c7
Logical HEA ports                     -                           MAC + 4 to MAC + 16                                     00:1A:64:44:0e:c8 to 00:1A:64:44:0e:d4
Logical HEA port                      -                           MAC + 17 (same as last MAC address on the label)        00:1A:64:44:0e:d5
Power Systems: Problem Determination and Service Guide for the IBM Power PS700 (8406-70Y)
For more information about planning, deploying, and managing the use of host Ethernet adapters, see the
Configuring section of the PowerVM Information Roadmap.
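The relative addressing in Table 38 is plain arithmetic on the 48-bit MAC value. The following sketch computes each address from the first MAC address on the label. The function name mac_plus and the layout dictionary are hypothetical, and the starting address is the example value from the table, not a value to expect on any particular blade server.

```python
def mac_plus(label_mac: str, offset: int) -> str:
    """Return label_mac + offset as a colon-separated MAC address."""
    value = int(label_mac.replace(":", ""), 16) + offset
    if not 0 <= value < 1 << 48:
        raise ValueError("offset leaves the 48-bit MAC address range")
    digits = f"{value:012x}"
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


first = "00:1A:64:44:0e:c4"  # example label MAC from Table 38

layout = {
    "eth0": mac_plus(first, 0),
    "eth1": mac_plus(first, 1),
    "HEA port 0": mac_plus(first, 2),
    "HEA port 1": mac_plus(first, 3),
    # 14 logical HEA ports: MAC + 4 through MAC + 17
    "logical HEA ports": [mac_plus(first, n) for n in range(4, 18)],
}

for node, address in layout.items():
    print(node, address)
```

The function returns lowercase hexadecimal digits, so compare addresses without regard to case when checking them against the label or the advanced management module.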
Configuring a RAID array
Configuring a RAID array applies to a blade server in which two SAS hard disk drives are installed.
Two SAS disk drives in the PS700 blade server can be used to implement and manage RAID level-0 and
RAID level-1 arrays in operating systems that are on the ServerProven list. For the blade server, you must
configure the RAID array through "smit sasdam", which is the SAS RAID Disk Array Manager for AIX.
The AIX Disk Array Manager is packaged with the Diagnostics utilities on the Diagnostics CD.
Use "smit sasdam" to configure the disk drives for use with the SAS controller. For more information,
see “Using the Disk Array Manager” in the Systems Hardware Information Center at
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/arebj/sasusingthesasdiskarraymanager.htm.
Important: Depending on your RAID configuration, you might have to create the array before you install
the operating system in the blade server.
Before you can create a RAID array, you must reformat the drives so that the sector size of the drives
changes from 512 bytes to 528 bytes. If you later decide to remove the drives, delete the RAID array
before you remove the drives. If you decide to delete the RAID array and reuse the drives, you must
reformat the drives so that the sector size of the drives changes from 528 bytes to 512 bytes.
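The ordering rules in the preceding paragraph can be summarized as a small state model. This is a sketch only: the class and its methods are hypothetical and do not correspond to any Disk Array Manager interface. The rule that drives cannot be reformatted while they belong to an array is an assumption added for the illustration.

```python
class SasDrivePair:
    """Toy model of the ordering rules for the two SAS drives (hypothetical)."""

    def __init__(self) -> None:
        self.sector_size = 512  # bytes per sector before RAID formatting
        self.in_array = False

    def reformat(self, sector_size: int) -> None:
        if sector_size not in (512, 528):
            raise ValueError("sector size must be 512 or 528 bytes")
        if self.in_array:
            # Assumption: the array must be deleted before reformatting.
            raise RuntimeError("delete the RAID array before reformatting")
        self.sector_size = sector_size

    def create_array(self) -> None:
        # A RAID array requires drives formatted with 528-byte sectors.
        if self.sector_size != 528:
            raise RuntimeError("reformat the drives to 528-byte sectors first")
        self.in_array = True

    def delete_array(self) -> None:
        self.in_array = False

    def remove_drives(self) -> None:
        # Delete the array before physically removing the drives.
        if self.in_array:
            raise RuntimeError("delete the RAID array before removing drives")


drives = SasDrivePair()
drives.reformat(528)   # 512 -> 528 before creating the array
drives.create_array()
drives.delete_array()  # required before removal or reuse
drives.reformat(512)   # 528 -> 512 to reuse the drives without RAID
print(drives.sector_size, drives.in_array)
```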
Updating IBM Director
If you plan to use IBM Director to manage the blade server, you must check for the latest applicable IBM
Director updates and interim fixes.
To install the IBM Director updates and any other applicable updates and interim fixes, complete the
following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure might vary slightly from
what is described in this procedure.
1. Check for the latest version of IBM Director:
a. Go to the IBM Director download site at
http://www.ibm.com/systems/management/director/downloads.html.
b. If the drop-down list shows a newer version of IBM Director than the version that comes with the
blade server, follow the instructions on the Web page to download the latest version.
2. Install IBM Director.
3. Download and install any applicable updates or interim fixes for the blade server:
a. Go to the IBM Support site at http://www.ibm.com/systems/support/.
b. Under Product support, click BladeCenter.
c. Under Popular links, click Software and device drivers.
d. Click BladeCenter PS700 to display the list of downloadable files for the blade server.
Appendix. Notices
This information was developed for products and services offered in the U.S.A.
The manufacturer may not offer the products, services, or features discussed in this document in other
countries. Consult the manufacturer's representative for information on the products and services
currently available in your area. Any reference to the manufacturer's product, program, or service is not
intended to state or imply that only that product, program, or service may be used. Any functionally
equivalent product, program, or service that does not infringe any intellectual property right of the
manufacturer may be used instead. However, it is the user's responsibility to evaluate and verify the
operation of any product, program, or service.
The manufacturer may have patents or pending patent applications covering subject matter described in
this document. The furnishing of this document does not grant you any license to these patents. You can
send license inquiries, in writing, to the manufacturer.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: THIS INFORMATION IS PROVIDED “AS IS” WITHOUT
WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain
transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically
made to the information herein; these changes will be incorporated in new editions of the publication.
The manufacturer may make improvements and/or changes in the product(s) and/or the program(s)
described in this publication at any time without notice.
Any references in this information to Web sites not owned by the manufacturer are provided for
convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at
those Web sites are not part of the materials for this product and use of those Web sites is at your own
risk.
The manufacturer may use or distribute any of the information you supply in any way it believes
appropriate without incurring any obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the
results obtained in other operating environments may vary significantly. Some measurements may have
been made on development-level systems and there is no guarantee that these measurements will be the
same on generally available systems. Furthermore, some measurements may have been estimated through
extrapolation. Actual results may vary. Users of this document should verify the applicable data for their
specific environment.
Information concerning products not produced by this manufacturer was obtained from the suppliers of
those products, their published announcements or other publicly available sources. This manufacturer has
not tested those products and cannot confirm the accuracy of performance, compatibility or any other
claims related to products not produced by this manufacturer. Questions on the capabilities of products
not produced by this manufacturer should be addressed to the suppliers of those products.
All statements regarding the manufacturer's future direction or intent are subject to change or withdrawal
without notice, and represent goals and objectives only.
© Copyright IBM Corp. 2010, 2011
The manufacturer's prices shown are the manufacturer's suggested retail prices, are current and are
subject to change without notice. Dealer prices may vary.
This information is for planning purposes only. The information herein is subject to change before the
products described become available.
This information contains examples of data and reports used in daily business operations. To illustrate
them as completely as possible, the examples include the names of individuals, companies, brands, and
products. All of these names are fictitious and any similarity to the names and addresses used by an
actual business enterprise is entirely coincidental.
If you are viewing this information in softcopy, the photographs and color illustrations may not appear.
The drawings and specifications contained herein shall not be reproduced in whole or in part without the
written permission of the manufacturer.
The manufacturer has prepared this information for use with the specific machines indicated. The
manufacturer makes no representations that it is suitable for any other purpose.
The manufacturer's computer systems contain mechanisms designed to reduce the possibility of
undetected data corruption or loss. This risk, however, cannot be eliminated. Users who experience
unplanned outages, system failures, power fluctuations or outages, or component failures must verify the
accuracy of operations performed and data saved or transmitted by the system at or near the time of the
outage or failure. In addition, users must establish procedures to ensure that there is independent data
verification before relying on such data in sensitive or critical operations. Users should periodically check
the manufacturer's support websites for updated information and fixes applicable to the system and
related software.
Ethernet connection usage restriction
This product is not intended to be connected directly or indirectly by any means whatsoever to interfaces
of public telecommunications networks.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web under
“Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks
of Adobe Systems Incorporated in the United States, and/or other countries.
INFINIBAND, InfiniBand Trade Association, and the INFINIBAND design marks are trademarks and/or
service marks of the INFINIBAND Trade Association.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or
both.
Red Hat, the Red Hat "Shadow Man" logo, and all Red Hat-based trademarks and logos are trademarks
or registered trademarks of Red Hat, Inc., in the United States and other countries.
Other product or service names might be trademarks of IBM or other companies.
Electronic emission notices
When attaching a monitor to the equipment, you must use the designated monitor cable and any
interference suppression devices supplied with the monitor.
Class A Notices
The following Class A statements apply to the IBM servers that contain the POWER7 processor and its
features unless designated as electromagnetic compatibility (EMC) Class B in the feature information.
Federal Communications Commission (FCC) statement
Note: This equipment has been tested and found to comply with the limits for a Class A digital device,
pursuant to Part 15 of the FCC Rules. These limits are designed to provide reasonable protection against
harmful interference when the equipment is operated in a commercial environment. This equipment
generates, uses, and can radiate radio frequency energy and, if not installed and used in accordance with
the instruction manual, may cause harmful interference to radio communications. Operation of this
equipment in a residential area is likely to cause harmful interference, in which case the user will be
required to correct the interference at his own expense.
Properly shielded and grounded cables and connectors must be used in order to meet FCC emission
limits. IBM is not responsible for any radio or television interference caused by using other than
recommended cables and connectors or by unauthorized changes or modifications to this equipment.
Unauthorized changes or modifications could void the user's authority to operate the equipment.
This device complies with Part 15 of the FCC rules. Operation is subject to the following two conditions:
(1) this device may not cause harmful interference, and (2) this device must accept any interference
received, including interference that may cause undesired operation.
Industry Canada Compliance Statement
This Class A digital apparatus complies with Canadian ICES-003.
Avis de conformité à la réglementation d'Industrie Canada
Cet appareil numérique de la classe A est conforme à la norme NMB-003 du Canada.
European Community Compliance Statement
This product is in conformity with the protection requirements of EU Council Directive 2004/108/EC on
the approximation of the laws of the Member States relating to electromagnetic compatibility. IBM cannot
accept responsibility for any failure to satisfy the protection requirements resulting from a
non-recommended modification of the product, including the fitting of non-IBM option cards.
This product has been tested and found to comply with the limits for Class A Information Technology
Equipment according to European Standard EN 55022. The limits for Class A equipment were derived for
commercial and industrial environments to provide reasonable protection against interference with
licensed communication equipment.
European Community contact:
IBM Deutschland GmbH
Technical Regulations, Department M456
IBM-Allee 1, 71139 Ehningen, Germany
Tele: +49 7032 15-2937
email: tjahn@de.ibm.com
Warning: This is a Class A product. In a domestic environment, this product may cause radio
interference, in which case the user may be required to take adequate measures.
VCCI Statement - Japan
The following is a summary of the VCCI Japanese statement in the box above:
This is a Class A product based on the standard of the VCCI Council. If this equipment is used in a
domestic environment, radio interference may occur, in which case, the user may be required to take
corrective actions.
Japanese Electronics and Information Technology Industries Association (JEITA)
Confirmed Harmonics Guideline (products less than or equal to 20 A per phase)
Japanese Electronics and Information Technology Industries Association (JEITA)
Confirmed Harmonics Guideline with Modifications (products greater than 20 A per
phase)
Electromagnetic Interference (EMI) Statement - People's Republic of China
Declaration: This is a Class A product. In a domestic environment this product may cause radio
interference in which case the user may need to perform practical action.
Electromagnetic Interference (EMI) Statement - Taiwan
The following is a summary of the EMI Taiwan statement above.
Warning: This is a Class A product. In a domestic environment this product may cause radio interference
in which case the user will be required to take adequate measures.
IBM Taiwan Contact Information:
Electromagnetic Interference (EMI) Statement - Korea
Germany Compliance Statement
Deutschsprachiger EU Hinweis: Hinweis für Geräte der Klasse A EU-Richtlinie zur
Elektromagnetischen Verträglichkeit
Dieses Produkt entspricht den Schutzanforderungen der EU-Richtlinie 2004/108/EG zur Angleichung der
Rechtsvorschriften über die elektromagnetische Verträglichkeit in den EU-Mitgliedsstaaten und hält die
Grenzwerte der EN 55022 Klasse A ein.
Um dieses sicherzustellen, sind die Geräte wie in den Handbüchern beschrieben zu installieren und zu
betreiben. Des Weiteren dürfen auch nur von der IBM empfohlene Kabel angeschlossen werden. IBM
übernimmt keine Verantwortung für die Einhaltung der Schutzanforderungen, wenn das Produkt ohne
Zustimmung von IBM verändert bzw. wenn Erweiterungskomponenten von Fremdherstellern ohne
Empfehlung von IBM gesteckt/eingebaut werden.
EN 55022 Klasse A Geräte müssen mit folgendem Warnhinweis versehen werden:
"Warnung: Dieses ist eine Einrichtung der Klasse A. Diese Einrichtung kann im Wohnbereich
Funk-Störungen verursachen; in diesem Fall kann vom Betreiber verlangt werden, angemessene
Maßnahmen zu ergreifen und dafür aufzukommen."
Deutschland: Einhaltung des Gesetzes über die elektromagnetische Verträglichkeit von Geräten
Dieses Produkt entspricht dem “Gesetz über die elektromagnetische Verträglichkeit von Geräten
(EMVG)“. Dies ist die Umsetzung der EU-Richtlinie 2004/108/EG in der Bundesrepublik Deutschland.
Zulassungsbescheinigung laut dem Deutschen Gesetz über die elektromagnetische Verträglichkeit von
Geräten (EMVG) (bzw. der EMC EG Richtlinie 2004/108/EG) für Geräte der Klasse A
Dieses Gerät ist berechtigt, in Übereinstimmung mit dem Deutschen EMVG das EG-Konformitätszeichen
- CE - zu führen.
Verantwortlich für die Einhaltung der EMV Vorschriften ist der Hersteller:
International Business Machines Corp.
New Orchard Road
Armonk, New York 10504
Tel: 914-499-1900
Der verantwortliche Ansprechpartner des Herstellers in der EU ist:
IBM Deutschland GmbH
Technical Regulations, Abteilung M456
IBM-Allee 1, 71139 Ehningen, Germany
Tel: +49 7032 15-2937
email: tjahn@de.ibm.com
Generelle Informationen:
Das Gerät erfüllt die Schutzanforderungen nach EN 55024 und EN 55022 Klasse A.
Electromagnetic Interference (EMI) Statement - Russia
Class B Notices
The following Class B statements apply to features designated as electromagnetic compatibility (EMC)
Class B in the feature installation information.
Federal Communications Commission (FCC) statement
This equipment has been tested and found to comply with the limits for a Class B digital device,
pursuant to Part 15 of the FCC Rules. These limits are designed to provide reasonable protection against
harmful interference in a residential installation.
This equipment generates, uses, and can radiate radio frequency energy and, if not installed and used in
accordance with the instructions, may cause harmful interference to radio communications. However,
there is no guarantee that interference will not occur in a particular installation.
If this equipment does cause harmful interference to radio or television reception, which can be
determined by turning the equipment off and on, the user is encouraged to try to correct the interference
by one or more of the following measures:
• Reorient or relocate the receiving antenna.
• Increase the separation between the equipment and receiver.
• Connect the equipment into an outlet on a circuit different from that to which the receiver is
  connected.
• Consult an IBM-authorized dealer or service representative for help.
Properly shielded and grounded cables and connectors must be used in order to meet FCC emission
limits. Proper cables and connectors are available from IBM-authorized dealers. IBM is not responsible for
any radio or television interference caused by unauthorized changes or modifications to this equipment.
Unauthorized changes or modifications could void the user's authority to operate this equipment.
This device complies with Part 15 of the FCC rules. Operation is subject to the following two conditions:
(1) this device may not cause harmful interference, and (2) this device must accept any interference
received, including interference that may cause undesired operation.
Industry Canada Compliance Statement
This Class B digital apparatus complies with Canadian ICES-003.
Avis de conformité à la réglementation d'Industrie Canada
Cet appareil numérique de la classe B est conforme à la norme NMB-003 du Canada.
European Community Compliance Statement
This product is in conformity with the protection requirements of EU Council Directive 2004/108/EC on
the approximation of the laws of the Member States relating to electromagnetic compatibility. IBM cannot
accept responsibility for any failure to satisfy the protection requirements resulting from a
non-recommended modification of the product, including the fitting of non-IBM option cards.
This product has been tested and found to comply with the limits for Class B Information Technology
Equipment according to European Standard EN 55022. The limits for Class B equipment were derived for
typical residential environments to provide reasonable protection against interference with licensed
communication equipment.
European Community contact:
IBM Deutschland GmbH
Technical Regulations, Department M456
IBM-Allee 1, 71139 Ehningen, Germany
Tele: +49 7032 15-2937
email: tjahn@de.ibm.com
VCCI Statement - Japan
Japanese Electronics and Information Technology Industries Association (JEITA)
Confirmed Harmonics Guideline (products less than or equal to 20 A per phase)
Japanese Electronics and Information Technology Industries Association (JEITA)
Confirmed Harmonics Guideline with Modifications (products greater than 20 A per
phase)
IBM Taiwan Contact Information
Electromagnetic Interference (EMI) Statement - Korea
Germany Compliance Statement
Deutschsprachiger EU Hinweis: Hinweis für Geräte der Klasse B EU-Richtlinie zur
Elektromagnetischen Verträglichkeit
Dieses Produkt entspricht den Schutzanforderungen der EU-Richtlinie 2004/108/EG zur Angleichung der
Rechtsvorschriften über die elektromagnetische Verträglichkeit in den EU-Mitgliedsstaaten und hält die
Grenzwerte der EN 55022 Klasse B ein.
Um dieses sicherzustellen, sind die Geräte wie in den Handbüchern beschrieben zu installieren und zu
betreiben. Des Weiteren dürfen auch nur von der IBM empfohlene Kabel angeschlossen werden. IBM
übernimmt keine Verantwortung für die Einhaltung der Schutzanforderungen, wenn das Produkt ohne
Zustimmung von IBM verändert bzw. wenn Erweiterungskomponenten von Fremdherstellern ohne
Empfehlung von IBM gesteckt/eingebaut werden.
Deutschland: Einhaltung des Gesetzes über die elektromagnetische Verträglichkeit von Geräten
Dieses Produkt entspricht dem “Gesetz über die elektromagnetische Verträglichkeit von Geräten
(EMVG)“. Dies ist die Umsetzung der EU-Richtlinie 2004/108/EG in der Bundesrepublik Deutschland.
Zulassungsbescheinigung laut dem Deutschen Gesetz über die elektromagnetische Verträglichkeit von
Geräten (EMVG) (bzw. der EMC EG Richtlinie 2004/108/EG) für Geräte der Klasse B
Dieses Gerät ist berechtigt, in Übereinstimmung mit dem Deutschen EMVG das EG-Konformitätszeichen
- CE - zu führen.
Verantwortlich für die Einhaltung der EMV Vorschriften ist der Hersteller:
International Business Machines Corp.
New Orchard Road
Armonk, New York 10504
Tel: 914-499-1900
Der verantwortliche Ansprechpartner des Herstellers in der EU ist:
IBM Deutschland GmbH
Technical Regulations, Abteilung M456
IBM-Allee 1, 71139 Ehningen, Germany
Tel: +49 7032 15-2937
email: tjahn@de.ibm.com
Generelle Informationen:
Das Gerät erfüllt die Schutzanforderungen nach EN 55024 und EN 55022 Klasse B.
Terms and conditions
Permission for the use of these publications is granted subject to the following terms and conditions.
Personal Use: You may reproduce these publications for your personal, noncommercial use provided that
all proprietary notices are preserved. You may not distribute, display or make derivative works of these
publications, or any portion thereof, without the express consent of the manufacturer.
Commercial Use: You may reproduce, distribute and display these publications solely within your
enterprise provided that all proprietary notices are preserved. You may not make derivative works of
these publications, or reproduce, distribute or display these publications or any portion thereof outside
your enterprise, without the express consent of the manufacturer.
Except as expressly granted in this permission, no other permissions, licenses or rights are granted, either
express or implied, to the publications or any information, data, software or other intellectual property
contained therein.
The manufacturer reserves the right to withdraw the permissions granted herein whenever, in its
discretion, the use of the publications is detrimental to its interest or, as determined by the manufacturer,
the above instructions are not being properly followed.
You may not download, export or re-export this information except in full compliance with all applicable
laws and regulations, including all United States export laws and regulations.
THE MANUFACTURER MAKES NO GUARANTEE ABOUT THE CONTENT OF THESE
PUBLICATIONS. THESE PUBLICATIONS ARE PROVIDED "AS-IS" AND WITHOUT WARRANTY OF
ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED
WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, AND FITNESS FOR A PARTICULAR
PURPOSE.
IBM®
Printed in USA
GI11-9831-00