Enterprise Storage Server
Service Guide
2105 Models E10/E20, F10/F20, and
Expansion Enclosure
Volume 1
Chapters 1, 2, and 3
SY27-7605-06
Note
Before using this information and the product it supports, be sure to read the general information under “Notices” on
page xiii.
First Edition (December 2000)
This edition applies to the first release of the IBM 2105 Enterprise Storage Server and to all following releases and
changes until otherwise indicated in new editions.
Order publications through your IBM representative or the IBM branch office serving your locality. Publications are
not stocked at the address given below.
IBM welcomes your comments. A form for readers’ comments may be supplied at the back of this publication, or you
may mail your comments to the following address:
International Business Machines Corporation
Department G26
5600 Cottle Road
San Jose, CA 95193-0001
U.S.A.
When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any
way it believes suitable without incurring any obligation to you.
© Copyright International Business Machines Corporation 1999. All rights reserved.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents
Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Notices . . . xiii
Safety Notices . . . xiii
Translated Safety Notices . . . xiii
Environmental Notices . . . xiii
Product Recycling . . . xiii
Product Disposal . . . xiv
Electronic Emission Notices . . . xiv
Federal Communications Commission (FCC) Statement . . . xiv
Industry Canada Compliance Statement . . . xiv
European Community Compliance Statement . . . xiv
Japanese Voluntary Control Council for Interference (VCCI) Class A Statement . . . xv
Korean Government Ministry of Communication (MOC) Statement . . . xv
Taiwan Class A Compliance Statement . . . xvi
Trademarks . . . xvi
Using This Service Guide . . . xvii
Where to Start . . . xvii
Limited Vocabulary . . . xvii
Publications . . . xvii
ESS Product Library . . . xvii
Ordering Publications . . . xviii
Related Publications . . . xviii
Web Sites . . . xviii
Other Related Publications . . . xviii
Chapter 1: Reference Information . . . 1
2105 Model Exx/Fxx and Expansion Enclosure Overview . . . 1
Host Systems Supported by the IBM ESS . . . 3
Web Interfaces . . . 4
Web Connection Security . . . 4
IBM Enterprise Storage Server Network (ESSNet) . . . 4
Accessing ESS Specialist and Copy Services . . . 5
ESS Specialist . . . 5
Service Interface . . . 7
Fibre Channel Connection . . . 7
Fibre Channel Host Card Indicators . . . 8
DDM Bay and SSA DASD Drawer Reference Information . . . 8
SSA DASD Model 020 Drawer Indicators and Power Switch . . . 9
SSA DASD Model 040 Drawer Indicators and Switches . . . 10
DDM Bay Indicators and Switches . . . 12
Disk Drive Module Indicators . . . 13
Internal Connections (SSA DASD Model 020 and 040 Drawer) . . . 16
Internal Connections (DDM Bay) . . . 17
External SSA Connections (DDM Bay) . . . 17
External SSA Connections (SSA DASD Model 040 Drawer) . . . 21
Special Tools . . . 27
Chapter 2: Entry MAP for All Service Actions . . . . . . . . . . . . 29
SIM Generation and Usage . . . . . . . . . . . . . . . . . . . . 33
Repair Using a SIM Console Message . . . . . . . . . . . . . . . . 34
Customer Receives Sense Data Without a SIM . . . 34
Repair Using an EREP Report . . . 34
EREP Reports . . . 35
Decode a Refcode . . . 36
Generating a Refcode from Sense Bytes . . . 37
Media SIM Maintenance Procedures . . . 38
Customer Media Maintenance Procedure Examples . . . 38
Chapter 3: Problem Isolation Procedures . . . 41
MAPs 1XXX: General Isolation Procedures . . . 52
MAP 1200: Prioritizing Visual Symptoms and Problem Logs For Repair . . . 52
MAP 1210: Displaying and Repairing a Problem Record . . . 53
MAP 1300: Isolating Cluster to Modem Communication Problems . . . 54
MAP 1301: Isolating Call Home / Remote Services Failure . . . 58
MAP 1320: Isolating Problems Using Visual Symptoms . . . 58
MAP 1460: Isolating E-Mail Reported Errors . . . 67
MAP 1480: Replacing a FRU, Without Using a Problem Log . . . 67
MAP 1500: Ending a Service Action . . . 68
MAP 1600: ESSNet Console Problem . . . 69
MAPs 2XXX: Power and Cooling Isolation Procedures . . . 70
MAP 2000: Model 100 Power Problems . . . 70
MAP 2020: Isolating Power Symptoms . . . 71
MAP 20A0: Cluster Not Ready . . . 72
MAP 20B0: Cluster Did Not Power On, OK Displayed . . . 74
MAP 2210: Electronics Cage Power Supply Problem . . . 76
MAP 2320: Installed Unit Does Not Match Logical Unit . . . 77
MAP 2340: PPS Status Code 06 . . . 77
MAP 2350: Isolating PPS Status Indicator Codes . . . 80
MAP 2360: 2105 Model Exx/Fxx UEPO Problems . . . 82
MAP 2370: Automatic Power On Problem . . . 84
MAP 2380: Isolating 2105 Expansion Enclosure UEPO Problems . . . 86
MAP 2390: Remote Power On Not Working . . . 88
MAP 2400: 2105 Model Exx/Fxx Local Power On Problems . . . 91
MAP 2410: RPC Power Mode Switch Mismatch . . . 95
MAP 2420: 2105 Expansion Enclosure Power On Problem . . . 96
MAP 2430: One RPC Card Firmware Down Level . . . 99
MAP 2440: Isolating 2105 Model Exx/Fxx Power Off Problems . . . 99
MAP 2460: Battery Charge Low . . . 102
MAP 2470: Battery Set Detection Problem . . . 103
MAP 2490: PPS Input Phase Missing . . . 104
MAP 24A0: PPS Power On Problem . . . 104
MAP 24B0: Cannot Power Off, Pinned Data . . . 106
MAP 24F0: Both RPC Cards Firmware Down Level . . . 107
MAP 2520: PPS Output Circuit Breaker Tripped . . . 107
MAP 2540: Power Problem Detected By Cluster Bay . . . 108
MAPs 3XXX SSA DASD Drawer Isolation Procedures . . . 108
Using the SSA DASD drawer Maintenance Analysis Procedures (MAPs) . . . 108
MAP 3000: Isolating an SSA Link Error . . . 109
MAP 3010: Isolating a Degraded SSA Link . . . 111
MAP 3050: Isolating an SSA Link Error . . . 113
MAP 3060: Isolating a Degraded SSA Link . . . 117
MAP 3077: Isolating an SSA Link Error . . . 121
MAP 3078: Isolating a Degraded SSA Link . . . 126
MAP 3080: Isolating an SSA Link Error . . . 129
MAP 3081: Isolating a Degraded SSA Link . . . 133
MAP 3082: Isolating an SSA Link Error . . . 135
MAP 3083: Isolating a Degraded SSA Link Error . . . 140
MAP 3085: Isolating an SSA Link Error . . . 144
MAP 3086: Isolating a Degraded SSA Link . . . 148
MAP 3095: Isolating an SSA Link Error . . . 150
MAP 3096: Isolating a Degraded SSA Link . . . 155
MAP 3100: Isolating an SSA Link Error . . . 158
MAP 3101: Isolating a Degraded SSA Link . . . 168
MAP 3105: Isolating a Loss of Power to a SSA DASD Model 040 . . . 172
MAP 3120: Isolating an SSA Link Error . . . 173
MAP 3121: Isolating a Degraded SSA Link . . . 180
MAP 3123: Array Repair Required . . . 183
MAP 3124: Isolating Between DDM Hardware and Microcode Failures . . . 184
MAP 3125: Isolating an Unexpected SSA SRN . . . 184
MAP 3126: Isolating an Unexpected SSA Test Result . . . 185
MAP 3127: Formatting of a DDM Has Not Completed . . . 186
MAP 3128: Isolating an Unknown DDM Failure . . . 186
MAP 3129: Isolating an Array Repair Required Failure . . . 187
MAP 3142: Isolating Multiple DDMs on an SSA Loop Cannot be Accessed . . . 187
MAP 3150: Isolating an SSA DASD Drawer Power Problem . . . 188
MAP 3151: Isolating an SSA DASD Drawer Visual Power Problem . . . 192
MAP 3155: Isolating an SSA Link Error . . . 196
MAP 3158: Isolating an SSA Link Error . . . 198
MAP 3160: SSA DASD Drawer Isolating a Single DDM Redundant Power Fault . . . 201
MAP 3180: Controller Card Failed or Wrong Drawer Type Installed . . . 202
MAP 3190: Wrong Drawer Type Installed . . . 203
MAP 3200: Uninstalled SSA DDMs Connected to Loop A . . . 204
MAP 3210: Uninstalled SSA DDMs Connected to Loop B . . . 205
MAP 3220: Isolating too Few DDMs in an SSA DASD DDM Bay . . . 207
MAP 3280: Isolating too Few DDMs in an SSA Drawer . . . 208
MAP 3300: Repair Alternate Cluster to Run SSA Loop Test . . . 211
MAP 3350: Isolating SSA DASD Drawer Power Problems . . . 212
MAP 3351: Isolating SSA DASD Drawer Visual Power Problems . . . 216
MAP 3352: Isolating SSA DASD Drawer Power Problems . . . 219
MAP 3353: Isolating SSA DASD Drawer Visual Power Problems . . . 221
MAP 3354: Isolating an SSA DASD Drawer Multiple DDM Redundant Visual Power Fault . . . 223
MAP 3355: Isolating an SSA DASD Drawer Multiple DDM Redundant Power Fault . . . 225
MAP 3356: Isolating SSA DASD Drawer Power On Problems . . . 227
MAP 3360: Ending a DASD Service Action . . . 231
MAP 3375: Isolating a Storage Cage Fan/Power Sense Card Error . . . 232
MAP 3378: Isolating a Storage Cage Fan/Power Sense Card Error . . . 233
MAP 3379: Analyzing a Storage Cage Fan/Power Sense Card Check Summary Indicator On . . . 233
MAP 3380: Isolating 7133 Model 040 SSA DASD Drawer Power Problems . . . 234
MAP 3381: Isolating a Storage Cage Fan/Power Sense Card Error . . . 238
MAP 3384: Isolating a Storage Cage Fan Failure . . . 239
MAP 3387: Isolating a Storage Cage Power Supply Failure . . . 242
MAP 3390: Isolating SSA DASD Drawer Visual Power Problems, Model 040 Drawer . . . 247
MAP 3391: Isolating a Storage Cage Power System Problem . . . 253
MAP 3395: Isolating an SSA DASD DDM Bay Power Problem . . . 259
MAP 3397: Isolating an SSA DASD DDM Bay Controller Card Problem . . . 261
MAP 3398: Isolating a DDM bay Controller Card Communications Failure . . . 262
MAP 3400: Replacing an SSA DASD Drawer Backplane or Frame . . . 263
MAP 3421: Storage Cage Fan/Power Sense Card R2 Cable Problem . . . 264
MAP 3422: Storage Cage Fan/Power Sense Card R2 Jumper and Cable Problems . . . 266
MAP 3423: Isolating a Storage Cage Fan/Power Sense Card R1 Jumper Missing Error . . . 267
MAP 3424: Isolating a Storage Cage Fan/Power Sense Card R1 Jumper Failing Error . . . 269
MAP 3425: Isolating a Storage Cage Fan/Power Sense Card R2 Cable Error . . . 270
MAP 3426: Isolating a Storage Cage Fan/Power Sense Card Location Error . . . 271
MAP 3427: Isolating a Storage and DDM Bay Location Error . . . 273
MAP 3428: Isolating an SSA DASD Drawer Location Error . . . 275
MAP 3429: Isolating a DDM Location Error . . . 278
MAP 3500: Verifying an SSA DASD Drawer Repair . . . 279
MAP 3520: SSA DASD Drawer Verification for Possible Problems . . . 280
MAP 3540: Unrelated Occurrence, Retry Web Operation . . . 280
MAP 3560: Unrelated Occurrence, Retry Verification Test . . . 281
MAP 3570: Unrelated Event Caused Resume Fail . . . 282
MAP 3600: Multiple DDMs Isolated on an SSA Loop . . . 282
MAP 3605: Isolating an Unexpected Result . . . 285
MAP 3610: DDM Installation with New Rank Site Capacity . . . 285
MAP 3612: DDM Installation with Mixed Capacity Rank Site . . . 288
MAP 3614: DDM Installation Introduces Different RPM . . . 291
MAP 3616: No Intermix of Bus Speeds is Allowed . . . 293
MAP 3618: Replacement DDM Has Slower RPM Than Called For . . . 295
MAP 3619: This Repair Requires a Larger Capacity DDM . . . 296
MAP 3620: Multiple DDMs Isolated on an SSA Loop . . . 296
MAP 3621: New DDM Storage Capacity Smaller Than Original DDMs . . . 297
MAP 3623: New DDM Storage Capacity Less Than 4.5 GB . . . 298
MAP 3625: All DDMs on SSA Loop A Do Not Have the Same Characteristics . . . 298
MAP 3626: All DDMs on SSA Loop B Do Not Have the Same Characteristics . . . 300
MAP 3630: Isolating an SSA Device Card/DRAM Problem . . . 301
MAP 3640: Other Cluster Fenced - Unable to Verify SSA Loop . . . 302
MAP 3650: Wrong, Missing, or Failing Bypass Card . . . 304
MAP 3652: Wrong, Missing, or Failing Passthrough Card . . . 305
MAP 3654: Bypass Card Jumpers Wrong . . . 307
MAP 3656: 20 MB SSA Cable Installed Where 40 MB Cable Expected . . . 308
MAP 3680: Isolating a Two DDMs Detect Over-Temperature Problem . . . 309
MAP 3685: Isolating a Multiple DDMs Detect Over-Temperature Problem . . . 313
MAPs 4XXX: Cluster Bay Isolation Procedures . . . 316
MAP 4020: Performing the SCSI Hard Drive Build Process . . . 316
MAP 4030: CPI Hardware Version Mismatch . . . 320
MAP 4040: Entry MAP for CPI Problems . . . 321
MAP 4050: Isolating a CPI Problem . . . 322
MAP 4060: Replacement of Cluster FRUs for CPI Problems . . . 326
MAP 4070: Replacement of Host Bay FRUs for CPI Problems . . . 327
MAP 4080: Powering the 2105 Model Exx/Fxx Off to Replace CPI FRUs . . . 329
MAP 4090: CPI Address Mismatch . . . 329
MAP 4100: Isolating a LIC Process Read/Display Problem . . . 331
MAP 4120: Handling Unexpected Resources . . . 331
MAP 4130: Handling a Missing or Failing Resource . . . 332
MAP 4140: Isolating a LIC Activation Process Failure . . . 333
MAP 4240: Isolating a Blinking 888 Error on the Cluster Operator Panel . . . 334
MAP 4320: Isolating E1xx SCSI Hard Drive Code Boot Problems . . . 336
MAP 4340: Isolating a E3xx Memory Test Hang Problem . . . 339
MAP 4350: Isolating Cluster Code Load Counter=2 . . . 341
MAP 4360: Isolation Using Codes Displayed by the Cluster Operator Panel . . . 342
MAP 4370: Error Displaying Problems Needing Repair . . . 344
MAP 4380: Isolating a Customer LAN Connection Problem . . . 346
MAP 4390: Isolating a Cluster to Cluster Ethernet Problem . . . 347
MAP 4400: Displaying Cluster SMS Error Logs . . . 351
MAP 4420: Displaying I/O Planar UAA LAN Address . . . 351
MAP 4440: ESSNet Console to Cluster Bay Problem . . . 352
MAP 4450: ESSNet Cluster Bay to Customer Network Problem . . . 354
MAP 4480: Isolating a Cluster / RPC Problem . . . 357
MAP 44F0: Electronics Cage Cooling Problem . . . 360
MAP 4500: Isolating an ESC=5xxx . . . 361
MAP 4510: Isolating a Cluster to Cluster CPI Communication Failure . . . 362
MAP 4520: Pinned Data and/or Volume Status Unknown . . . 363
MAP 4540: Isolating Problems on a Minimum Configuration Cluster . . . 364
MAP 4550: NVS FRU Replacement . . . 370
MAP 4560: No Valid Subsystem Status Available . . . 370
MAP 4580: Pinned Data In Single Cluster NVS . . . 372
MAP 4600: Isolating a CD-ROM Test Failure . . . 373
MAP 4610: Cluster SP/System Firmware Down-level . . . 373
MAP 4620: Isolating a Diskette Drive Failure . . . 374
MAP 4630: Listed FRUs May Be Incomplete or Need Isolation . . . 374
MAP 4700: Replacing Cluster FRUs . . . 375
MAP 4710: Isolating a DDM LIC Update Problem . . . 384
MAP 4720: Cluster or Host Bay Fails to Power Off . . . 385
MAP 4730: Isolating a Cluster Power Off Request Problem . . . 387
MAP 4740: Fan Check Detected by I/O Planar, Model Exx Only . . . 387
MAP 4750: Cluster Bay Power is Off, Had to Force it Off . . . 388
MAP 4760: Recovering from Corrupted Files or Functions . . . 389
MAP 4770: Isolating a E152 Cluster Hang . . . 390
MAP 4780: Isolating a Functional Code Not Running Problem . . . 393
MAP 4790: Repairing the Electronics Cage . . . 395
MAP 4810: Unexpected Host Bay Power Off . . . 396
MAP 4820: Isolating a SCSI Card Configuration Timeout . . . 399
MAP 4840: CPI Diagnostic Communication Problem . . . 400
MAP 4970: Isolating a Software Problem . . . 401
MAP 4980: Customer Copy Services Problems . . . 402
MAP 4990: LIC Feature License Failure . . . 404
MAPs 5XXX: Host Interface Isolation Procedures . . . 405
MAP 5000: ESS Specialist Cannot Access Cluster . . . 405
MAP 5220: Isolating a SCSI Bus Error . . . 406
MAP 5230: Isolating a Fixed Block Read Data Failure . . . 409
MAP 5240: Isolating a Customer Data Check Failure . . . 410
MAP 5250: Isolating a Meta Data Check Failure . . . 413
MAP 5300: ESCON Link Fault . . . 414
MAP 5310: ESCON Bit Error Validation . . . 416
MAP 5320: ESCON Optical Power Measurement . . . 418
MAP 5340: CKD Read Data Failure . . . 421
MAP 5400: Fibre Channel Link Fault . . . 422
MAP 5410: Fibre Channel Bit Error Validation . . . 424
MAP 5420: Fibre Channel Optical Power Measurement . . . 425
MAP 5430: Host Fibre Channel Fails to Recognize ESS LUNs . . . 428
MAP 5440: Fibre Host Card Reports a Loss of Light . . . 430
MAPs 6XXX: Service Terminal Isolation Procedures . . . 430
MAP 6040: Isolating a Service Terminal Login Failure To Both Clusters . . . 431
MAP 6060: Isolating a Service Terminal Login Failure To One Cluster . . . 432
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435
Figures
1. 2105 Model Exx/Fxx Front and Rear Views (s007725m) . . . . . . . . . . . . . . . . . 2
2. 2105 Expansion Enclosure Front and Rear Views (S007726m) . . . . . . . . . . . . . . 2
3. SSA DASD Model 020 Drawer Indicators and Power Switch (t007290n) . . . . . . . . . . 10
4. SSA DASD Model 040 Drawer Indicators (t007661p) . . . . . . . . . . . . . . . . . 12
5. DDM Bay Indicators (S008108l) . . . . . . . . . . . . . . . . . . . . . . . . . 13
6. SSA DASD Model 020 Drawer Disk Drive Module Indicators (t007383m) . . . . . . . . . . 14
7. SSA DASD Model 040 Drawer Disk Drive Module Indicators (t007660m) . . . . . . . . . . 15
8. SSA DASD Model 020 and 040 Drawer Internal SSA Connections (t007304m) . . . . . . . . 16
9. DDM Bay Internal SSA Connections (S008107l) . . . . . . . . . . . . . . . . . . . 17
10. DDM Bay Diagram Explanation (S008122l) . . . . . . . . . . . . . . . . . . . . . 17
11. One DDM Bay External SSA Connections (S008129m) . . . . . . . . . . . . . . . . . 18
12. Two DDM Bay Initial External SSA Connections (S008128m) . . . . . . . . . . . . . . 18
13. Two DDM Bay Final External SSA Connections (S008127m). . . . . . . . . . . . . . . 19
14. Three DDM Bay External SSA Connections (S008126m) . . . . . . . . . . . . . . . . 19
15. Four DDM Bay External SSA Connections (S008125m) . . . . . . . . . . . . . . . . 20
16. Five DDM Bay External SSA Connections (S008124m) . . . . . . . . . . . . . . . . . 20
17. Six DDM Bay External SSA Connections (S008123m) . . . . . . . . . . . . . . . . . 21
18. SSA DASD Model 040 Drawer Diagram Explanation (S008134m) . . . . . . . . . . . . . 22
19. One SSA DASD Model 040 Drawer External SSA Connections (S008139m) . . . . . . . . . 22
20. Two SSA DASD Model 040 Drawer Initial External SSA Connections (S008137p) . . . . . . . 23
21. Two SSA DASD Model 040 Drawer Final External SSA Connections (S008138p) . . . . . . . 24
22. Three SSA DASD Model 040 Drawer External SSA Connections (S008136s) . . . . . . . . 25
23. Four SSA DASD Model 040 Drawer External SSA Connections (S008135s) . . . . . . . . . 26
24. Service Information Messages Report (S008595n) . . . . . . . . . . . . . . . . . . 35
25. Event History Report (S008596m) . . . . . . . . . . . . . . . . . . . . . . . . 36
26. Decoding the Refcode (s008597m) . . . . . . . . . . . . . . . . . . . . . . . . 37
27. Refcode in the 2105 SIM Sense Bytes (S008594n) . . . . . . . . . . . . . . . . . . 37
28. Example of ICKDSF Analyze Drivetest Output . . . . . . . . . . . . . . . . . . . . 39
29. 2105 Primary Power Supply Locations (s009048) . . . . . . . . . . . . . . . . . . . 83
30. 2105 Primary Power Supply Locations (s009048) . . . . . . . . . . . . . . . . . . . 87
31. 2105 Model Exx/Fxx RPC Local/Remote Switch Location (S008612m) . . . . . . . . . . . 92
32. 2105 Primary Power Supply Locations (s009048) . . . . . . . . . . . . . . . . . . . 93
33. 2105 Model Exx/Fxx Operator Panel Locations (S008811m) . . . . . . . . . . . . . . . 94
34. 2105 Primary Power Supply Locations (s009048) . . . . . . . . . . . . . . . . . . . 97
35. 2105 Model Exx/Fxx Operator Panel Locations (S008811m) . . . . . . . . . . . . . . . 98
36. 2105 Primary Power Supply Locations (s009048) . . . . . . . . . . . . . . . . . . 101
37. 2105 Primary Power Supply Locations (s009048) . . . . . . . . . . . . . . . . . . 105
38. SSA Link Failure, Two Adjoining DDMs (S007656l) . . . . . . . . . . . . . . . . . . 110
39. SSA Link Failure, Two Adjoining DDMs (S007656l) . . . . . . . . . . . . . . . . . . 112
40. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008041l) 113
41. DDM bay SSA Connectors (S007693l) . . . . . . . . . . . . . . . . . . . . . . 115
42. Cluster SSA Device Card Connector Locations (S008022m) . . . . . . . . . . . . . . 115
43. DDM bay DDM Indicator Locations (S008021l) . . . . . . . . . . . . . . . . . . . 116
44. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008041l) 118
45. DDM bay SSA Connectors (S007693l) . . . . . . . . . . . . . . . . . . . . . . 119
46. Cluster SSA Device Card Connector Locations (S008022m) . . . . . . . . . . . . . . 119
47. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008141l) 121
48. DDM bay SSA Connector Locations (S007693l) . . . . . . . . . . . . . . . . . . . 123
49. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . . . . . . . . . . 123
50. DDM bay DDM Indicator Locations (S008021l) . . . . . . . . . . . . . . . . . . . 124
51. SSA Link Failure, Passthrough and Bypass Card Link Between a DDM and SSA Device Card
(S008141l). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
52. DDM bay SSA Connector Locations (S007693l) . . . . . . . . . . . . . . . . . . . 127
53. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . 127
54. SSA Link Failure, Bypass Card and Two DDMs (S008144m) . . . 130
55. SSA Link Failure, Bypass Card and Two DDMs (S008143l) . . . 130
56. SSA Link Failure, Bypass Card and Two DDMs (S008144m) . . . 134
57. SSA Link Failure, Bypass Card and Two DDMs (S008143l) . . . 134
58. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008142l) . . . 136
59. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . 137
60. Drawer SSA Connector Locations (S008762p) . . . 138
61. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008142l) . . . 141
62. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . 141
63. Drawer SSA Connector Locations (S008762p) . . . 142
64. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S007649l) . . . 144
65. DDM bay SSA Connector Locations (S007693l) . . . 145
66. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . 146
67. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S007649l) . . . 148
68. DDM bay SSA Connector Locations (S007693l) . . . 149
69. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . 149
70. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008140l) . . . 151
71. DDM bay SSA Connector Locations (S007693l) . . . 152
72. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . 153
73. SSA Link Degraded, Two Passthrough and Bypass Card Link Between Two DDMs (S008384l) . . . 156
74. DDM bay SSA Connector Locations (S007693l) . . . 157
75. SSA Link Failure, Passthrough/Bypass Cards and Two DDMs (S007650l) . . . 159
76. SSA DASD Model 020 Power Control Panel Locations (S008020m) . . . 161
77. SSA DASD Model 040 Power Supply Assembly Indicator Locations (S008019m) . . . 162
78. DDM Bay DDM Indicator Locations (S008021l) . . . 163
79. SSA DASD Model 020 and 040 drawer SSA Connectors (S008762p) . . . 166
80. DDM Bay SSA Connectors (S007693l) . . . 166
81. SSA Link Failure, Passthrough/Bypass Cards and Two DDMs (S007650l) . . . 169
82. SSA DASD Model 020 and 040 Drawer SSA Connectors (S008762p) . . . 170
83. DDM bay SSA Connectors (S007693l) . . . 170
84. SSA DASD Model 040 Power Supply Locations (S008019m) . . . 173
85. SSA Link Failure, Passthrough or Bypass Card Link Between a DDM and SSA Device Card (S007652l) . . . 174
86. SSA DASD Model 020 Power Control Panel Locations (S008020m) . . . 175
87. SSA DASD Model 040 Power Supply Indicator Locations (S008019m) . . . 177
88. DDM bay DDM Indicator Locations (S008021l) . . . 178
89. DDM bay SSA Connector Locations (S007693l) . . . 178
90. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . 179
91. SSA Link Failure, Passthrough or Bypass Card Link Between a DDM and SSA Device Card (S007652l) . . . 181
92. DDM bay SSA Connector Locations (S007693l) . . . 182
93. Cluster SSA Device Card SSA Connector Locations (S008022m) . . . 182
94. SSA DASD Drawer Fan-and-Power-Supply Assembly Indicators (S008029l) . . . 189
95. 2105 Model Exx/Fxx Operator Panel Locations (S008810m) . . . 190
96. SSA DASD Drawer Fan-and-Power-Supply Assembly Indicators (S008029l) . . . 193
97. 2105 Model Exx/Fxx Operator Panel Locations (S008810m) . . . 194
98. SSA Link Failure, Two SSA DASD Drawers (S007653n) . . . 196
99. SSA DASD Model 020 Power Control Panel Locations (S008020m) . . . 197
100. SSA Link Failure, Two SSA DASD Drawers (S007654n) . . . 199
101. SSA DASD Model 040 Power Supply Assembly Locations (S008019m) . . . 200
102. Cluster SSA Device Card Locations (S008022m) . . . 205
103. Cluster SSA Device Card Locations (S008022m) . . . 206
104. Expected SSA DASD Drawer DDM Locations (S007657l) . . . 207
105. DDM bay Indicator Locations (S008018l) . . . 208
106. Expected SSA DASD Drawer DDM Locations (s007319l) . . . 209
107. SSA DASD Model 020 Power Control Panel Locations (S008020m) . . . . . . .
108. SSA DASD Model 040 Power Supply Assembly Indicators (S008019m) . . . . . .
109. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations (S008030p).
110. 2105 Model Exx/Fxx Operator Panel Locations (S008810m) . . . . . . . . . .
111. 2105 Primary Power Supply Connectors (S007380l) . . . . . . . . . . . . .
112. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations (S008030p).
113. 2105 Model Exx/Fxx Operator Panel Locations (S008810m) . . . . . . . . . .
114. SSA DASD Model 020 Fan-and-Power-Supply Assembly Indicators (S008029l) . . .
115. SSA DASD Model 020 Fan-and-Power-Supply Assembly Indicators (S008029l) . . .
116. 2105 Model Exx/Fxx Operator Panel Locations (S008810m) . . . . . . . . . .
117. SSA DASD drawer Power Card Indicators (s007227l) . . . . . . . . . . . . .
118. SSA DASD drawer Power Card Indicators (s007227l) . . . . . . . . . . . . .
119. SSA DASD Model 020 Power Control Panel Locations (S008020m) . . . . . . .
120. 2105 Model Exx/Fxx Operator Panel Locations (S008810m) . . . . . . . . . .
121. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations (S008019m)
122. 2105 Model E10/E20 Operator Panel Locations (S008810m) . . . . . . . . . .
123. 2105 Primary Power Supply Connectors (S007380l) . . . . . . . . . . . . .
124. Storage Cage Power Planar Fan Jumper Locations (S008352p) . . . . . . . . .
125. Storage Cage Power Supply Locations (S008495m) . . . . . . . . . . . . .
126. Primary Power Supply CB and Connector Locations (S008496l) . . . . . . . . .
127. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations (s007602l) .
128. Model 040 Drawer Indicators (S008416l) . . . . . . . . . . . . . . . . .
129. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations (s007604p) .
130. 2105 Model Exx/Fxx Operator Panel Locations (S008810m) . . . . . . . . . .
131. 2105 Primary Power Supply Connectors (S007380l) . . . . . . . . . . . . .
132. Storage Cage Power Supply Locations (S008495m) . . . . . . . . . . . . .
133. Storage Cage Power Supply Locations (S008495m) . . . . . . . . . . . . .
134. 2105 Primary Power Supply Connectors (5008774m) . . . . . . . . . . . . .
135. 2105 Primary Power Supply Connectors (5008774m) . . . . . . . . . . . . .
136. 2105 Primary Power Supply Connectors (5008774m) . . . . . . . . . . . . .
137. 2105 Primary Power Supply Connectors (5008774m) . . . . . . . . . . . . .
138. 2105 Primary Power Supply Connectors (5008774m) . . . . . . . . . . . . .
139. Fan Sense Card Jumper and Cable Locations (S008774m). . . . . . . . . . .
140. Fan Sense Card Jumper and Cable Locations (S008774m). . . . . . . . . . .
141. DDM Bay Front Power Cable Locations (S008812s) . . . . . . . . . . . . .
142. DDM Bay Rear Power Cable Locations (S008813s) . . . . . . . . . . . . .
143. SSA DASD Model 020 Power Control Panel Locations (S008020m) . . . . . . .
144. DDM bay Indicator Locations (S008018l) . . . . . . . . . . . . . . . . .
145. SSA DASD Model 040 Power Supply Assembly Indicators (S008019m) . . . . . .
146. CD-ROM Drive Jumpers (S008413l) . . . . . . . . . . . . . . . . . . .
147. 2105 Model Exx/Fxx ESD Discharge Pad Locations (S008339m) . . . . . . . .
148. Measuring Optical Transmit Power (S008185m) . . . . . . . . . . . . . . .
149. Measuring Optical Receive Power (s008186n) . . . . . . . . . . . . . . .
150. Measuring Fibre Channel Optical Transmit Power (S008840l) . . . . . . . . . .
151. Measuring Fibre Channel Optical Receive Power (S008841m) . . . . . . . . .
152. 2105 Model Exx/Fxx Host Bay Connector Locations (S008024r) . . . . . . . . .
153. Service Terminal Connections to Controllers and Power (S007525n) . . . . . . .
Notices
References in this book to IBM products, programs, or services do not imply that
IBM intends to make these available in all countries in which IBM operates. Any
reference to an IBM product, program, or service is not intended to state or imply
that only that IBM product, program, or service may be used. Subject to IBM’s valid
intellectual property or other legal protected rights, any functionally equivalent
product, program, or service may be used instead of the IBM product, program, or
service. The evaluation and verification of operation in conjunction with other
products, except those expressly designated by IBM, are the responsibility of the
user.
IBM may have patents or pending patent applications covering subject matter in this
document. The furnishing of this document does not give you any license to these
patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
USA
Safety Notices
Safety notices are printed throughout this book. Danger notices warn you of
conditions or procedures that can result in death or severe personal injury. Caution
notices warn you of conditions or procedures that can cause personal injury that is
neither lethal nor extremely hazardous. Attention notices warn you of conditions or
procedures that can cause damage to machines, equipment, or programs.
Translated Safety Notices
Several countries require that caution and danger safety notices be shown in their
national languages.
Translations of the caution and danger safety notices are provided in a separate
document, IBM Storage Solution Safety Notices book, form number GC26-7229.
Environmental Notices
This section contains information about:
v Product recycling for this product
v Environmental guidelines for this product
Product Recycling
This unit contains recyclable materials. These materials should be recycled where
processing sites are available and according to local regulations. In some areas,
IBM provides a product take-back program that ensures proper handling of the
product. Contact your IBM representative for more information.
Product Disposal
This unit contains several types of batteries. Return all Pb-acid (lead-acid) batteries
to IBM for proper recycling, according to the instructions received with the
replacement batteries.
Electronic Emission Notices
Federal Communications Commission (FCC) Statement
Note: This equipment has been tested and found to comply with the limits for a
Class A digital device, pursuant to Part 15 of the FCC Rules. These limits
are designed to provide reasonable protection against harmful interference
when the equipment is operated in a commercial environment. This
equipment generates, uses, and can radiate radio frequency energy and, if
not installed and used in accordance with the instruction manual, may cause
harmful interference to radio communications. Operation of this equipment in
a residential area is likely to cause harmful interference, in which case the
user will be required to correct the interference at his own expense.
Properly shielded and grounded cables and connectors must be used in order to
meet FCC emission limits. IBM is not responsible for any radio or television
interference caused by using other than recommended cables and connectors or by
unauthorized changes or modifications to this equipment. Unauthorized changes or
modifications could void the user’s authority to operate the equipment.
This device complies with Part 15 of the FCC Rules. Operation is subject to the
following two conditions: (1) this device may not cause harmful interference, and (2)
this device must accept any interference received, including interference that may
cause undesired operation.
Industry Canada Compliance Statement
This Class A digital apparatus complies with Canadian ICES-003.
Avis de conformité à la réglementation d’Industrie Canada
Cet appareil numérique de la classe A est conforme à la norme NMB-003 du
Canada.
European Community Compliance Statement
This product is in conformity with the protection requirements of EC Council
Directive 89/336/EEC on the approximation of the laws of the Member States
relating to electromagnetic compatibility. IBM cannot accept responsibility for any
failure to satisfy the protection requirements resulting from a non-recommended
modification of the product, including the fitting of non-IBM option cards.
Conformity with the Council Directive 73/23/EEC on the approximation of the laws
of the Member States relating to electrical equipment designed for use within
certain voltage limits is based on compliance with the following harmonized
standard: EN60950.
Germany Only
Zulassungsbescheinigung laut Gesetz ueber die elektromagnetische
Vertraeglichkeit von Geraeten (EMVG) vom 30. August 1995.
Dieses Geraet ist berechtigt, in Uebereinstimmung mit dem deutschen EMVG das
EG-Konformitaetszeichen - CE - zu fuehren.
Der Aussteller der Konformitaetserklaerung ist die IBM Deutschland.
Informationen in Hinsicht EMVG Paragraph 3 Abs. (2) 2: Das Geraet erfuellt die
Schutzanforderungen nach EN 50082-1 und EN 55022 Klasse A.
EN 55022 Klasse A Geraete beduerfen folgender Hinweise:
Nach dem EMVG: ″Geraete duerfen an Orten, fuer die sie nicht ausreichend
entstoert sind, nur mit besonderer Genehmigung des Bundesministeriums fuer Post
und Telekommunikation oder des Bundesamtes fuer Post und Telekommunikation
betrieben werden. Die Genehmigung wird erteilt, wenn keine elektromagnetischen
Stoerungen zu erwarten sind.″ (Auszug aus dem EMVG, Paragraph 3, Abs.4)
Dieses Genehmigungsverfahren ist nach Paragraph 9 EMVG in Verbindung mit der
entsprechenden Kostenverordnung (Amtsblatt 14/93) kostenpflichtig.
Nach der EN 55022: ″Dies ist eine Einrichtung der Klasse A. Diese Einrichtung
kann im Wohnbereich Funkstoerungen verursachen; in diesem Fall kann vom
Betreiber verlangt werden, angemessene Massnahmen durchzufuehren und dafuer
aufzukommen.″
Anmerkung: Um die Einhaltung des EMVG sicherzustellen, sind die Geraete wie in
den Handbuechern angegeben zu installieren und zu betreiben.
Japanese Voluntary Control Council for Interference (VCCI) Class A
Statement
Korean Government Ministry of Communication (MOC) Statement
Please note that this device has been approved for business purpose with regard to
electromagnetic interference. If you find this is not suitable for your use, you may
exchange it for a non-business purpose one.
Taiwan Class A Compliance Statement
Trademarks
The following terms are trademarks of the IBM Corporation in the United States or
other countries or both:
IBM
AIX
AS/400
IOPath Optimizer
OS/2
RETAIN
RISC System/6000
RISC System/6000 Series Parallel
RS/6000
RS/6000 SP
Enterprise
StorWatch
Versatile Storage Server
AViiON is a trademark of Data General.
HP-UX and Hewlett-Packard are trademarks of Hewlett-Packard Company.
Sun, SPARCS, SunOS, and Solaris are trademarks of Sun Microsystems, Inc.
Windows, Windows NT, and Alpha Windows NT are trademarks of Microsoft
Corporation.
UNIX is a registered trademark in the United States and other countries licensed
exclusively through X/Open Company Limited.
Other company, product, and service names may be trademarks or service marks
of others.
Using This Service Guide
This guide is for service representatives who are taught to install and repair the IBM
2105 Enterprise Storage Server. Internal components of this machine are designed
and certified to be serviced by trained personnel only.
Where to Start
Start all service actions at “Chapter 2: Entry MAP for All Service Actions” on
page 29.
Attention: When performing any service action on the IBM 2105 Enterprise Storage
Server, follow the directions given in “Chapter 2: Entry MAP for All Service Actions”
on page 29 or from the service terminal. This ensures that you use the correct
remove, replace, or repair procedure, including the correct power on/off procedure,
for this machine. Failure to follow these instructions can damage the machine and
might also cause an unexpected loss of access to customer data.
Limited Vocabulary
This manual uses a specific range of words so that the text can be understood by
IBM service representatives in countries where English is not the primary language.
Publications
This section describes the ESS library and publications for related products. It also
gives ordering information.
ESS Product Library
The ESS is an IBM Enterprise architecture-based product. See the following
publications for more information on the ESS:
v Enterprise Storage Server Service Guide 2105 Models E10/E20, F10/F20, and
Expansion Enclosure, Volume 2 book, GC27–7608
This is volume 2 of this book.
v Enterprise Storage Server Service Guide 2105 Models E10/E20, F10/F20, and
Expansion Enclosure, Volume 3 book, GC27–7609
This is volume 3 of this book.
v 2105 Model 100 Attachment to ESS Service Guide book, SY27-7615
This guide is for service representatives who are taught to install and repair a
VSS attached to an ESS.
v Enterprise Systems Link Fault Isolation book, form number SY22-9533
v Maintenance Information for S/390 Fiber Optic Links (ESCON, FICON, Coupling
Links, and Open System Adapters) book, form number SY27-2597.
v IBM Enterprise Storage Server Introduction and Planning Guide book,
GC26-7294
This book introduces the product and lists the features you can order. It also
provides guidelines on planning for installation and configuration of the ESS.
v IBM Enterprise Storage Server User’s Guide book, SC26-7295
This book provides instructions for setting up and operating the ESS.
v IBM Enterprise Storage Server SCSI Command Reference book, SC26-7297
This book describes the functions of the ESS and gives reference information
such as channel commands, sense bytes, and error recovery procedures.
v Enterprise Storage Server Parts Catalog book, S127-0974
v IBM Storage Solutions Safety Notices book, GC26-7229
This book provides translations of the Danger and Caution notices used in the
ESS publications.
v IBM Enterprise Storage Server Web Users Interface Guide book, SC26-7346
v IBM Enterprise Storage Server Host Systems Attachment Guide book,
SC26-7296
v IBM Enterprise Storage Server System/390 Command Reference book,
SC26-7298
v DFSMS/MVS Software Support for the IBM Enterprise Storage Server book,
SC26-7318
v IBM Enterprise Storage Server Quick Configuration Guide book, SC26-7354
v IBM Enterprise Storage Server Configuration Planner book, SC26-7353
This book provides work sheets for planning the logical configuration of ESS.
This book is only available on the product Web site:
http://www.ibm.com/storage/ess
Ordering Publications
All of the above publications are available on a CD-ROM that comes with the ESS.
You can also order a hard copy of each of the publications. For additional
CD-ROMs, order:
v ESS Service Documents CD-ROM, SK2T-8771
v ESS Customer Documents CD-ROM, SK2T-8770
Related Publications
The following publications provide information on software products that the IBM
Enterprise Storage Server supports:
v IBM Subsystem Device Driver book, SH26-7291
v IBM Storage Area Network Data Gateway Installation and User’s Guide book,
SC26-7304
v IBM Advanced Copy Services book, SC35-0355
v IBM S/360, S/370, and S/390 Channel to Control Unit Original Equipment
Manufacture’s Information book, SH26-7291
Web Sites
v IBM Storage home page:
http://www.storage.ibm.com/
v IBM Enterprise Storage Server home page:
http://www.ibm.com/storage/ess
http://www.storage.ibm.com/hardsoft/product/refinfo.htm
Other Related Publications
The following is a list of other related books.
7133 Model D40 Serial Disk Systems Service Guide book, GY33-0192
7133 Model D40 Serial Disk System Installation Guide book, GA33-3279
7133 SSA Disk Subsystem Service Guide book, SY33-0185
7133 Models 010 and 020 SSA Disk Subsystem Installation Guide book,
GA33-3260
IBM Versatile Storage Server Service Guide, 2105 Models B09 and 100 book,
SY27-7603
IBM Input/Output Equipment, Installation Manual–Physical Planning ,
GC22-7064
IBM Storage Solutions Safety Notices , GC26-7229
Electrical Safety for IBM Customer Engineers S229-8124
Chapter 1: Reference Information
2105 Model Exx/Fxx and Expansion Enclosure Overview
Host Systems Supported by the IBM ESS
SCSI Host Systems
Fibre Channel Host Systems
OS/390 Host Systems
Web Interfaces
Web Connection Security
IBM Enterprise Storage Server Network (ESSNet)
Accessing ESS Specialist and Copy Services
ESS Specialist
2105 Copy Services
ESS Expert
Service Interface
Remote Services Support
Fibre Channel Connection
Fibre Channel Host Card Indicators
DDM Bay and SSA DASD Drawer Reference Information
SSA DASD Model 020 Drawer Indicators and Power Switch
SSA DASD Model 040 Drawer Indicators and Switches
DDM Bay Indicators and Switches
Disk Drive Module Indicators
SSA DASD Model 020 Drawer Disk Drive Module Indicators
SSA DASD Model 040 Drawer and DDM Bay Disk Drive Module Indicators
Internal Connections (SSA DASD Model 020 and 040 Drawer)
SSA DASD Model 020 and 040 Drawer Internal Connections
Internal Connections (DDM Bay)
DDM Bay Internal Connections
External SSA Connections (DDM Bay)
External SSA Connections (SSA DASD Model 040 Drawer)
Special Tools
2105 Model Exx/Fxx and Expansion Enclosure Overview
This section gives an overview of the 2105 Model Exx/Fxx and Expansion
Enclosure and describes its interfaces and components. This product is also known
as the Enterprise Storage Server (ESS).
The 2105 Model Exx/Fxx and Expansion Enclosure is a member of the Seascape™
product family of storage servers and attached storage devices (disk drive
modules). The storage server provides integrated caching and RAID support for the
disk drive modules (DDMs). The DDMs are attached through a serial storage
architecture (SSA) interface.
The ESS provides:
v RAID or non-RAID
v Fast SSA disk drive modules (DDMs)
v Fast RISC processors
v Fault tolerant system
v Storage sharing for S/390 and open systems
v OS/390 parallel I/O
v Instant copy
v Disaster recovery
Each ESS rack has dual-line cords and redundant power. The redundant power
system allows both the storage controller and DDM to continue normal operation
when one of the line cords is inactive. Redundancy also ensures continuous data
availability.
The 2105 Model E20 or F20 with the expansion enclosure provides up to 11
terabytes (TB) of storage capacity, with a choice of 9.1, 18.2, or 36.4 gigabyte (GB)
DDMs. See Figure 1 and Figure 2 for illustrations of the 2105 models.
The 2105 Models E10 and F10 do not support an expansion enclosure.
Figure 1. 2105 Model Exx/Fxx Front and Rear Views (s007725m)
Figure 2. 2105 Expansion Enclosure Front and Rear Views (S007726m)
The 2105 subsystem supports a maximum of 384 DDMs:
v 64 DDMs in a 2105 Model E10/F10
v 128 DDMs in a 2105 Model E20/F20
v 256 DDMs in a 2105 Expansion Enclosure, which must be attached to a 2105
Model E20/F20
v 384 DDMs in a 2105 Model E20/F20 with a 2105 Expansion Enclosure
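The DDM limits above can be checked with simple arithmetic. The following Python sketch is illustrative only and is not part of any ESS tool. The names are hypothetical, and the capacity it computes is the raw total (DDM count multiplied by DDM size). This guide does not state how the 11 TB figure is derived, so the sketch does not try to model RAID, spare, or formatting overhead.

```python
# DDM limits quoted in this section (per machine type).
MAX_DDMS = {
    "E10/F10": 64,
    "E20/F20": 128,
    "Expansion Enclosure": 256,
}

# DDM sizes quoted in this section, held in tenths of a gigabyte
# (9.1, 18.2, and 36.4 GB) so that the multiplication stays in integers.
DDM_SIZES_TENTHS_GB = (91, 182, 364)


def max_subsystem_ddms():
    """Largest configuration: a Model E20/F20 plus an Expansion Enclosure."""
    return MAX_DDMS["E20/F20"] + MAX_DDMS["Expansion Enclosure"]


def raw_capacity_gb(ddm_count, size_tenths_gb):
    """Raw (unformatted, non-RAID) capacity in GB for ddm_count DDMs."""
    return ddm_count * size_tenths_gb / 10


if __name__ == "__main__":
    total = max_subsystem_ddms()
    print("Maximum DDMs:", total)
    for size in DDM_SIZES_TENTHS_GB:
        print(size / 10, "GB DDMs:", raw_capacity_gb(total, size), "GB raw")
```

The raw total for the largest DDM size is higher than the 11 TB quoted above, so treat the result only as an upper bound on installed disk space.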
Host Systems Supported by the IBM ESS
This section contains information about attaching the 2105 Model Exx/Fxx to the
host:
v Open systems, SCSI attachment
v Short wave Fibre channel attachment to a SCSI host system
v S/390
SCSI Host Systems
The 2105 Model Exx/Fxx and Expansion Enclosure provides heterogeneous data
storage that can be shared with Open System (SCSI and Fibre channel attachment)
and System/390 workloads. The 2105 Model Exx/Fxx and Expansion Enclosure
supports the following interfaces and host systems. With SCSI adapters the 2105
Model Exx/Fxx can connect to up to 128 host systems, four per SCSI interface.
Note: See Web site http://www.ibm.com/storage/ess/htm for details about the types,
models, adapters, and operating systems supported for SCSI host systems.
The following systems support SCSI attachment:
v Hewlett Packard (HP-UX operating system)
v IBM RISC System/6000® and RISC System/6000® SP (IBM AIX operating
system)
v IBM AS/400® (OS/400® operating system)
v IBM Netfinity and Intel-based PC servers (Microsoft® Windows NT® operating
systems)
v Sun™ (Solaris™ operating system)
v Data General (DG/UX operating system)
v Intel-based PC servers (Novell Netware™)
v Compaq™ AlphaServers (TRU64 UNIX and OpenVMS)
Fibre Channel Host Systems
This section lists the host systems that support short wave Fibre channel
attachment.
Note: See Web site http://www.ibm.com/storage/ess/htm for details about the
types, models, adapters, and operating systems supported for Fibre channel
host systems.
The following systems support Fibre channel attachment:
v Hewlett Packard (HP-UX operating system)
v IBM RISC System/6000® and RISC System/6000® SP (IBM AIX operating
system)
v IBM Netfinity and Intel-based PC servers (Microsoft® Windows NT® operating
systems)
v Sun™ (Solaris™ operating system)
v Data General (DG/UX operating system)
v Intel-based PC servers (Novell Netware™)
v Compaq™ AlphaServers (TRU64 UNIX and OpenVMS)
OS/390 Host Systems
With ESCON adapters, you can have up to 32 connections, each with up to 64
logical paths.
Note: See Web site http://www.ibm.com/storage/ess/htm for details about the
types, models, adapters, and operating systems supported for S/390.
The following IBM S/390® host systems are supported on the enterprise systems
connection (ESCON) interface:
v MVS
v VM
v VSE
v TPF
v ICKDSF
v EREP
v DFSORT
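The attachment limits quoted for SCSI (up to 128 host systems, four per SCSI interface) and ESCON (up to 32 connections, each with up to 64 logical paths) can be worked through as follows. This Python sketch is illustrative only. The function and constant names are hypothetical, and the count of 32 SCSI interfaces is derived from the two SCSI figures, not stated in this guide.

```python
# Limits quoted in this guide.
SCSI_MAX_HOSTS = 128
SCSI_HOSTS_PER_INTERFACE = 4
ESCON_MAX_CONNECTIONS = 32
ESCON_LOGICAL_PATHS_PER_CONNECTION = 64


def scsi_interfaces_needed(host_count):
    """Smallest number of SCSI interfaces that can attach host_count hosts."""
    if not 0 <= host_count <= SCSI_MAX_HOSTS:
        raise ValueError("host_count must be between 0 and 128")
    # Ceiling division: a partly used interface still counts as one interface.
    return -(-host_count // SCSI_HOSTS_PER_INTERFACE)


def escon_max_logical_paths():
    """Upper limit on logical paths when every ESCON connection is in use."""
    return ESCON_MAX_CONNECTIONS * ESCON_LOGICAL_PATHS_PER_CONNECTION


if __name__ == "__main__":
    print("SCSI interfaces for 128 hosts:", scsi_interfaces_needed(128))
    print("Maximum ESCON logical paths:", escon_max_logical_paths())
```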
Web Interfaces
This section describes Web security, the ESSNet, and the Web interfaces for ESS.
The Web interfaces include:
v StorWatch Enterprise Storage Server Specialist (ESS Specialist)
v StorWatch Enterprise Storage Server Copy Services (ESS Copy Services), an
optional feature
v StorWatch Enterprise Storage Server Expert (ESS Expert), an optional feature
See the IBM Enterprise Storage Server Web Users Interface Guide book for
detailed descriptions of the Web interfaces and instructions about how to use them.
Web Connection Security
The customer connects to the 2105 (ESS) via the ESSNet.
All data that is sent between the 2105 and the Web browser through the ESSNet is
encrypted to avoid unauthorized modification of configuration commands. Access to
the interface is protected by passwords and authorization levels.
The customer controls user access by assigning levels of access and passwords.
IBM Enterprise Storage Server Network (ESSNet)
The IBM ESSNet is a private network residing in an IBM workstation. It is a
required feature. You (the IBM service support representative) install the ESSNet
when you install the 2105 Model Exx/Fxx. The ESSNet hardware includes:
v The IBM workstation (a PC) and monitor
v An external Ethernet hub that provides cable connections from the ESSNet to the
2105 Model Exx/Fxx.
Note: The customer can attach their Ethernet LAN to the external hub. They
must provide any hardware needed for this connection.
v A modem and modem expander that allows communications between the 2105
and IBM for service.
Note: This equipment is included with Remote Services Support.
ESSNet software on the workstation includes:
v Windows NT 4.0 operating system
v Browser software (Microsoft Internet Explorer) that allows access to ESS
Specialist.
v The ESSNet application for installation and configuration. The ESSNet
workstation includes an application that provides links to the ESS. Clicking on
one of these links initiates ESS Specialist.
ESSNet provides:
v Support for multiple 2105s. A hub with 16 ports will support seven 2105 Model
Exx/Fxxs.
v Connections between the 2105 Model Exx/Fxx and the ESS Specialist web
interface. The ESSNet provides browser software at the correct level for the
connection.
v Improvements in web performance
v Faster network connections and elimination of network setup problems.
v Ethernet connection through an Ethernet hub to the ESSNet
v An independent platform that facilitates installation and configuration of the 2105.
v Software for maintenance and configuration.
v Server code that is controlled and released as part of the product.
IBM installs the ESSNet when the first 2105 Model Exx/Fxx is installed.
Accessing ESS Specialist and Copy Services
The customer accesses the StorWatch Enterprise Storage Server Specialist (ESS
Specialist) and StorWatch Enterprise Storage Server Copy Services (ESS Copy
Services) from the ESSNet. The ESSNet includes browser software for this access.
The customer accesses ESS Copy Services from ESS Specialist.
ESS Specialist
The 2105 includes the ESS Specialist. ESS Specialist is a Web-based interface that
allows the customer to configure the 2105.
From the Web interface the customer can perform the following tasks:
v Monitor problem logs
v View and modify the configuration
– Add or delete SCSI host systems
– Configure SCSI host ports on the 2105 Model Exx/Fxx
– Define controller images for System/390
– Define fixed block (FB) and count key data (CKD) disk groups
– Add FB and CKD volumes
– Assign volumes to be accessible to more than one host system
– Change volume assignments
v Change and view communication resource settings, such as E-mail addresses
and telephone numbers
v Authorize user access
v With ESS Specialist the customer can view the following information:
– The external connection between a host system and a 2105 Model Exx/Fxx
port
– The internal connection of SCSI ports to Cluster Bay 1 or Cluster Bay 2
– How storage space is allocated to FB and CKD volumes
2105 Copy Services
The Copy Services feature provides a Web-based interface for managing
Peer-to-Peer Remote Copy (PPRC) and Flash Copy commands. Copy Services
collects information from the IBM storage servers on a single Copy Services server.
Copy Services is part of the IBM ESS Copy Services Web interfaces. The customer
accesses Copy Services from the ESS Specialist main menu.
Use the Copy Services panels to view and define the following information:
v Volumes
The Volumes panel allows the customer to view volumes and define them as
source or target volumes for the PPRC program.
v Controller
The Controller panel allows the customer to work with logical controllers as
complete entities. The customer can build tasks to place all of the volumes of a
logical controller within a peer-to-peer relationship with all the volumes of another
logical controller.
The customer can also build a task to remove similar groups of volumes from an
existing peer-to-peer relationship.
v Paths
The Paths panel displays the current status of paths between one physical
controller and the controllers to which it is connected.
The customer can also use this panel to add or remove copy service paths.
v Tasks
The customer can use the Tasks panel to manage tasks they have defined. The
customer may run, remove, export, or import tasks.
v Configuration
The customer can use the Configuration panel to add to or save the existing
configuration. The customer can also use this panel to display the problem log.
ESS Expert
The StorWatch Enterprise Storage Server Expert (ESS Expert) is an optional
software product the customer can purchase to use with the ESS. The ESS Expert
Web interface provides storage resource management functions for the IBM storage
servers. The customer selects the storage servers.
v Asset management
ESS Expert collects and displays asset management data.
v Capacity management
The ESS Expert collects and displays capacity management data.
v Performance management
ESS Expert collects and displays performance management data, for example:
– Number of I/O requests
– Number of bytes transferred
– Read and write response time
– Cache use statistics.
ESS Expert allows the customer to schedule the information collection. With this
information, the customer can make informed decisions about volume placement
and capacity planning as well as isolate I/O performance bottlenecks.
Service Interface
The 2105 Model Exx/Fxx provides service interface ports for external connection of
a service terminal.
IBM or the customer's service provider can perform service on the 2105 using an
IBM mobile service terminal (MoST) or equivalent.
Remote Services Support
The 2105 service interface also provides remote service support, with call-home
capability and directed maintenance, for service support representatives.
The customer provides an analog telephone line to enable this support. The service
interface provides an RS232 connection via a modem switch and modem, to the
analog telephone line.
The customer must order a modem and modem switch. The first 2105 Model
Exx/Fxx ordered requires this equipment. The modem and modem switch support
up to seven 2105 Model Exx/Fxxs. The cable length from the 2105 Model Exx/Fxx
to the modem switch should be a maximum of 50 feet (15 meters).
The 2105 Model Exx/Fxx and Expansion Enclosure provides the following service
functions:
v Continuous self-monitoring that initiates a call (call home) to service personnel if
a failure has occurred. Because the service personnel who respond to the call know
which component is failing, repair time is reduced.
v Problem logs are available that service personnel can access remotely to
analyze potential failures.
v Remote support that allows the ESS to correct many types of problems. When
the ESS reports a problem, service personnel can often create a correction which
they can apply from the remote location.
You, the service support representative, logically configure the ESS during
installation. After the ESS is installed, the customer can perform additional
configuration using the ESS Web interfaces. This includes modifying the remote
service functions.
Fibre Channel Connection
The ESS provides Fibre channel connection to host systems that it supports. Fibre
channel interconnection architecture provides a variety of communication protocols
on the ESS. The units that are interconnected are referred to as nodes. Each node
has one or more ports.
An ESS is a node in a Fibre channel network. Each port on an ESS Fibre channel
host adapter is a Fibre channel port. A host is also a node in a Fibre channel
network. Each port on a host Fibre channel adapter is a Fibre channel port.
Each port attaches to a serial-transmission medium that provides full-duplex
communication with the node at the other end of the medium.
ESS architecture supports three basic interconnection topologies.
v Point-to-point allows you to interconnect ports directly.
v Fabric (the underlying structure)
To allow multiple nodes to be interconnected, you can use a fabric that provides
the necessary switching functions to support communication between multiple
nodes. You can implement a fabric using available vendor products.
v Arbitrated Loop
Arbitrated loop is a ring topology that allows you to interconnect a set of nodes.
The maximum number of ports you can have for a Fibre channel arbitrated loop
is 128.
Fibre Channel Host Card Indicators
Table 1. Fibre Host Card LED Indicators
Green LED Indicator | Yellow LED Indicator | Indicated Condition
Off | Off | Wake-up failure (card dead)
Off | On | Power on Self Test failure (card dead)
Off | Blinking slowly (1 blink per second) | Wake-up failure
Off | Blinking rapidly (4 blinks per second) | Power on Self Test failure
Off | Unsteady blinking (no pattern) | Power on Self Test in progress
On | Off | Failure while operating
On | On | Failure while operating
On | Blinking slowly (1 blink per second) | Normal, inactive
On | Unsteady blinking (no pattern) | Normal, active
On | Blinking rapidly (4 blinks per second) | Normal, busy
Blinking slowly (1 blink per second) | Off | Normal, link down or not yet started (loss of light)
Blinking slowly (1 blink per second) | Blinking slowly (1 blink per second) | Off-line for download
Blinking slowly (1 blink per second) | Blinking rapidly (4 blinks per second) | Restricted off-line mode (waiting for restart)
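The LED combinations in Table 1 can be treated as a simple lookup. The following Python sketch is illustrative only and is not part of any ESS service tool; the state names (off, on, slow, fast, unsteady) and the function name indicated_condition are assumptions made for this example.

```python
# Hypothetical lookup built from Table 1. The state names are assumptions:
# "slow" = 1 blink per second, "fast" = 4 blinks per second,
# "unsteady" = unsteady blinking with no pattern.
OFF, ON, SLOW, FAST, UNSTEADY = "off", "on", "slow", "fast", "unsteady"

# Key: (green LED state, yellow LED state). Value: indicated condition.
FIBRE_HOST_CARD_LEDS = {
    (OFF, OFF): "Wake-up failure (card dead)",
    (OFF, ON): "Power on Self Test failure (card dead)",
    (OFF, SLOW): "Wake-up failure",
    (OFF, FAST): "Power on Self Test failure",
    (OFF, UNSTEADY): "Power on Self Test in progress",
    (ON, OFF): "Failure while operating",
    (ON, ON): "Failure while operating",
    (ON, SLOW): "Normal, inactive",
    (ON, UNSTEADY): "Normal, active",
    (ON, FAST): "Normal, busy",
    (SLOW, OFF): "Normal, link down or not yet started (loss of light)",
    (SLOW, SLOW): "Off-line for download",
    (SLOW, FAST): "Restricted off-line mode (waiting for restart)",
}


def indicated_condition(green, yellow):
    """Return the Table 1 condition for a green/yellow LED combination."""
    return FIBRE_HOST_CARD_LEDS.get(
        (green, yellow), "Combination not listed in Table 1"
    )
```

For example, indicated_condition("on", "fast") returns "Normal, busy". Combinations that Table 1 does not list are reported as such rather than guessed.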
DDM Bay and SSA DASD Drawer Reference Information
The 7133 Serial Storage Architecture (SSA) DASD drawers are used in the 2105
product. The 7133 SSA DASD Model 020 and 040 drawers can be installed in the
attached 2105 Model 100 rack.
Each SSA DASD drawer can contain 16 SSA disk drive modules (DDMs), eight at
the front and eight at the rear of the drawer.
Each SSA DASD drawer has three fans and power supplies that provide all of the
power and cooling for the drawer.
The DDMs in a drawer are connected to each other in SSA strings of four DDMs,
two strings at the front and two strings at the rear. These strings can be connected
to strings in the same drawer, strings in other drawers, or SSA device cards.
An SSA DASD drawer can be disconnected from its SSA device cards while the
2105 is operating. Most of the SSA DASD drawer field replaceable units (FRUs)
can be replaced while the SSA DASD drawer and 2105 are running.
Use the following list to find a description of the SSA DASD drawer or DDM
indicators and switches:
v “SSA DASD Model 020 Drawer Indicators and Power Switch”
v “SSA DASD Model 040 Drawer Indicators and Switches” on page 10
v “SSA DASD Model 020 Drawer Disk Drive Module Indicators” on page 14
v “SSA DASD Model 040 Drawer and DDM Bay Disk Drive Module Indicators” on
page 15
SSA DASD Model 020 Drawer Indicators and Power Switch
The SSA DASD Model 020 drawer has indicators that show the status of the
drawer. It also has a power switch. Each DDM has indicators that show the status
of that DDM.
1 [Figure 3] Power Switch (On/Off) This switch controls the internal dc power
that is supplied to the SSA DASD drawer by the fan-and-power-supply
assemblies.
To power on the SSA DASD Model 020 drawer, press and release the switch.
Repeat the action to power off the dc power. When the dc power is off, rack
power is still present in the fan-and-power-supply assemblies if the SSA DASD
drawer is connected to the rack power supply.
2 [Figure 3] SSA DASD Drawer Power Indicator This green indicator is on
when the power switch has been pressed to power on the dc voltage, and the
dc voltage is present in the SSA DASD drawer.
3 [Figure 3] SSA DASD Drawer Check Indicator This amber indicator comes
on if a failure occurs in the SSA DASD drawer. The drawer might be able to
continue operating satisfactorily although the failure of a particular part has been
detected.
4 [Figure 3] Power Card Indicator This green indicator is on when electrical
power is present on the card.
5 [Figure 3] Fan-and-Power Check indicator This amber indicator comes on
and stays on if dc output from the power supply part of the
fan-and-power-supply assembly fails or is disabled.
If the power supply fails completely, the fan-and-power indicator is powered on
from one of the other fan-and-power-supply assemblies in the SSA DASD
drawer. The indicator blinks if the fan fails.
6 [Figure 3] Power Indicator This green indicator is on when rack electrical
power is present in the fan-and-power-supply assembly.
7 [Figure 3] Link Status (Ready) Indicator This green indicator shows the
status of the port (for example, port 1) through which the bypass card is
connected to the SSA device card:
– Indicator Permanently On The interface through the bypass card is fully
operational.
– Indicator Blinking (two seconds on, two seconds off) The interface through
the bypass card is not operational.
– Indicator Off The card is in Bypass state or in Forced Inline mode.
8 [Figure 3] Mode Indicator This indicator shows in which mode the bypass
card is operating.
– Indicator Permanently On (Amber) The bypass card is switched to Bypass
state.
– Indicator Permanently On (Green) The bypass card is jumpered for Forced
Inline mode.
– Indicator Off The bypass card is switched to Inline state.
9 [Figure 3] Link Status (Ready) Indicator This green indicator shows the
status of the port (for example, port 2) through which the bypass card is
connected to the SSA device card:
– Indicator Permanently On The interface through the bypass card is fully
operational.
– Indicator Blinking (two seconds on, two seconds off) The interface through
the bypass card is not operational.
– Indicator Off The card is in Bypass state or in Forced Inline mode.
Figure 3. SSA DASD Model 020 Drawer Indicators and Power Switch (t007290n)
SSA DASD Model 040 Drawer Indicators and Switches
The SSA DASD Model 040 drawer has indicators that show the status of the
drawer. Each DDM has indicators that show the status of that DDM.
1 [Figure 4] Controller Card Indicator This amber indicator is on when the
controller card fails.
2 [Figure 4] Fan Power Indicator This green indicator is on when dc voltage
is present at the fan.
3 [Figure 4] Fan Check Indicator This amber indicator comes on and remains
on when the fan fails.
7 [Figure 4] PWR/FAULT RESET Switch This switch switches off the dc
output voltage from the power supply. To switch off the dc voltage, pull the
switch out, then push it down. To switch on the dc voltage, pull the switch out,
then push it up.
If the SSA DASD Model 040 drawer has a serious power problem, the
power supply can become latched off. By switching this switch Off then On, you
can reset the power supply.
8 [Figure 4] PWR Indicator This green indicator comes on when rack power
is present in the power supply.
6 [Figure 4] CHK/PWR-GOOD Indicator This indicator has two colors that
show power supply status:
– This indicator shows green when the dc output from the power supply is
active (good).
– This indicator shows amber when the dc output from the power supply fails.
4 [Figure 4] Link Status (Ready) Indicator This green indicator shows the
status of the port (for example, port 1) through which the bypass card is
connected to another device:
– Permanently On: The path through this port is operational.
– Flashing: The path through this port is not operational.
– Off: One of the following conditions exists:
- The path through this port is not operational.
- The card is switched into Bypass state (mode light is on amber)
- The card is jumpered for Forced Inline mode (mode light is on green)
5 [Figure 4] Mode Indicator This indicator has two colors that show which
mode the bypass card is operating in:
– Permanently On (amber): The bypass card is switched to bypass state.
– Permanently On (green): The bypass card is jumpered for forced inline
mode.
– Off: The bypass card is switched to inline mode.
The following table summarizes the various states of the three bypass card
lights:
Table 2. Summary of Bypass Card Indicators
Operating Mode | Status | Link Status Light-1 | Mode Light | Link Status Light-2
Automatic | Inline | On | Off | On
Automatic | Bypass | Off | Amber | Off
Forced Inline | Inline | Off | Green | Off
Forced Bypass | Bypass | On | Amber | On
Forced Open | Open | Off | Off | Off
Jumpered Forced Inline | Inline | Off | Green | Off
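Some light patterns in the bypass card summary table correspond to more than one operating mode, so a reverse lookup is better written to return every matching row. The following Python sketch shows one way to do that; the names BYPASS_CARD_STATES and candidate_states are hypothetical, and the rows are copied from the table above.

```python
# Rows of the bypass card summary table, in table order:
# (operating mode, status, link status light 1, mode light, link status light 2)
BYPASS_CARD_STATES = [
    ("Automatic", "Inline", "on", "off", "on"),
    ("Automatic", "Bypass", "off", "amber", "off"),
    ("Forced Inline", "Inline", "off", "green", "off"),
    ("Forced Bypass", "Bypass", "on", "amber", "on"),
    ("Forced Open", "Open", "off", "off", "off"),
    ("Jumpered Forced Inline", "Inline", "off", "green", "off"),
]


def candidate_states(link1, mode_light, link2):
    """Return every (operating mode, status) row that matches the three lights."""
    return [
        (operating_mode, status)
        for operating_mode, status, l1, ml, l2 in BYPASS_CARD_STATES
        if (l1, ml, l2) == (link1, mode_light, link2)
    ]
```

With all link lights off and a green mode light, the function returns both Forced Inline and Jumpered Forced Inline, because the lights alone do not distinguish the two. A pattern that is not in the table returns an empty list.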
Figure 4. SSA DASD Model 040 Drawer Indicators (t007661p)
DDM Bay Indicators and Switches
The DDM bay has indicators that show the status of the DDM bay. Each DDM has
indicators that show the status of that DDM.
3 [Figure 5] Controller Card Power Check Indicator This green indicator is
on when controller card power is present.
4 [Figure 5] DDM Check Indicator This amber indicator is on when a DDM
fails.
5 [Figure 5] Controller Card Indicator This amber indicator is on when the
controller card fails.
1 [Figure 5] Link Status (Ready) Indicator This green indicator shows the
status of the port (for example, port 1) through which the bypass card is
connected to another device:
– Permanently On: The path through this port is operational.
– Flashing: The path through this port is not operational.
– Off: One of the following conditions exists:
- The path through this port is not operational.
- The card is switched into Bypass state (mode light is on amber)
- The card is jumpered for Forced Inline mode (mode light is on green)
2 [Figure 5] Mode Indicator This indicator has two colors that show which
mode the bypass card is operating in:
– Permanently On (amber): The bypass card is switched to bypass state.
– Permanently On (green): The bypass card is jumpered for forced inline
mode.
– Off: The bypass card is switched to inline mode.
The following table summarizes the various states of the three bypass card
lights:
Table 3. Summary of Bypass Card Indicators
Operating Mode | Status | Link Status Light-1 | Mode Light | Link Status Light-2
Automatic | Inline | On | Off | On
Automatic | Bypass | Off | Amber | Off
Forced Inline | Inline | Off | Green | Off
Forced Bypass | Bypass | On | Amber | On
Forced Open | Open | Off | Off | Off
Jumpered Forced Inline | Inline | Off | Green | Off
Figure 5. DDM Bay Indicators (S008108l)
Disk Drive Module Indicators
The DDM indicators at the front or rear of the SSA DASD Model 020 or 040
drawers are visible by opening the front or rear door of the 2105.
SSA DASD Model 020 Drawer Disk Drive Module Indicators
Figure 6. SSA DASD Model 020 Drawer Disk Drive Module Indicators (t007383m)
1 [Figure 6] Power Indicator This green indicator is on when dc voltage is
present and inside the specified limits.
2 [Figure 6] Ready Indicator This green indicator shows the following
conditions:
– Indicator Off Both SSA links are inactive because one of the following
conditions exists:
- The DDMs or DDM and bypass card that are logically on each side of, and
next to, this DDM are not connected or are missing.
- The DDMs or DDM and bypass card that are logically on each side of, and
next to, this DDM are inactive.
- A bypass card that is in the loop is inactive.
- A power-on self-test (POST) is running on this DDM.
– Indicator Permanently On Both SSA links are active, and the DDM is ready
to accept commands from the using system. The Ready indicator does not
show that the motor of the DDM is spinning. The DDM might be waiting for a
Motor Start command, or might have received a Motor Stop Command.
– Indicator Slowly Blinks (two seconds on, two seconds off) Only one SSA
link is active.
– Indicator Blinks Fast (five times per second) The DDM is active with a
command in progress.
3 [Figure 6] Check Indicator This amber indicator shows the following
conditions:
– Indicator Off Normal operating condition.
– Indicator Permanently On One of the following conditions exists:
- An unrecoverable error that prevents the normal operation of the SSA link
has been detected.
- The power-on self-tests (POSTs) are running or have failed. The indicator
comes on as soon as the DDM is powered on, and goes off when the
POSTs are complete. If the indicator remains on for longer than one
minute after the DDM is powered on, the POSTs have failed.
- Neither SSA link is active.
- The DDM is in Service mode, and can be removed from the SSA DASD
drawer.
– Indicator Blinking The Check indicator has been set by a service aid to
identify the position of a particular DDM.
SSA DASD Model 040 Drawer and DDM Bay Disk Drive Module Indicators
Figure 7. SSA DASD Model 040 Drawer Disk Drive Module Indicators (t007660m)
1 [Figure 7] Ready Indicator This green indicator shows the following
conditions:
– Indicator Off Both SSA links are inactive because one of the following
conditions exists:
- The DDMs or DDM and bypass card that are logically on each side of, and
next to, this DDM are not connected or are missing.
- The DDMs or DDM and bypass card that are logically on each side of, and
next to, this DDM are inactive.
- An SSA attachment that is in the loop is inactive.
- A power-on self-test (POST) is running on this DDM.
– Indicator Permanently On Both SSA links are active, and the DDM is ready
to accept commands from the using system. The Ready indicator does not
show that the motor of the DDM is spinning. The DDM might be waiting for a
Motor Start command, or might have received a Motor Stop Command.
– Indicator Slowly Blinks (two seconds on, two seconds off) Only one SSA
link is active.
– Indicator Blinks Fast (five times per second) The DDM is active with a
command in progress.
2 [Figure 7] Check Indicator This amber indicator shows the following
conditions:
– Indicator Off Normal operating condition.
– Indicator Permanently On One of the following conditions exists:
- An unrecoverable error that prevents the normal operation of the SSA link
has been detected.
- The power-on self-tests (POSTs) are running or have failed. The indicator
comes on as soon as the DDM is powered on, and goes off when the
POSTs are complete. If the indicator remains on for longer than one
minute after the DDM is powered on, the POSTs have failed.
- Neither SSA link is active.
- The DDM is in Service mode, and can be removed from the SSA DASD
drawer.
– Indicator Blinking The Check indicator has been set by a service aid to
identify the position of a particular DDM.
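The Ready and Check indicator descriptions above can be condensed into a small interpretation helper. This Python sketch is a reading aid under stated assumptions, not a diagnostic procedure: the state names, the dictionary names, and the function describe_ddm are hypothetical, and the one-minute limit is taken from the POST description in the text.

```python
# Hypothetical state names: "off", "on", "slow_blink" (two seconds on, two
# seconds off), "fast_blink" (five times per second), "blinking".
READY_MEANING = {
    "off": "Both SSA links are inactive",
    "on": "Both SSA links are active; the DDM is ready to accept commands",
    "slow_blink": "Only one SSA link is active",
    "fast_blink": "The DDM is active with a command in progress",
}

CHECK_MEANING = {
    "off": "Normal operating condition",
    "on": "Unrecoverable SSA link error, POSTs running or failed, "
          "neither SSA link active, or DDM in Service mode",
    "blinking": "A service aid is identifying the position of this DDM",
}

# The text allows one minute for the POSTs to complete after power-on.
POST_LIMIT_SECONDS = 60


def describe_ddm(ready, check, seconds_since_power_on=None):
    """Return a list of notes for the given Ready and Check indicator states."""
    notes = [READY_MEANING[ready], CHECK_MEANING[check]]
    if (
        check == "on"
        and seconds_since_power_on is not None
        and seconds_since_power_on > POST_LIMIT_SECONDS
    ):
        notes.append(
            "Check indicator still on more than one minute after power-on: "
            "the POSTs have failed"
        )
    return notes
```

A Ready indicator that is permanently on does not show that the DDM motor is spinning, so the helper deliberately reports link and command state only.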
Internal Connections (SSA DASD Model 020 and 040 Drawer)
Inside the SSA DASD drawer, the DDMs are connected in strings of four DDMs.
These strings are connected to the external SSA connectors at the back of the SSA
DASD drawer.
The following diagram shows the relationships between the DDM strings
and the external SSA connectors at the back of the SSA DASD Model 020 and 040 drawers.
SSA DASD Model 020 and 040 Drawer Internal Connections
Table 4 summarizes the relationship between the DDM strings and the external SSA
connectors.
Figure 8. SSA DASD Model 020 and 040 Drawer Internal SSA Connections (t007304m)
Table 4. Relationship between Strings and Connectors of SSA DASD Model 020 and 040 Drawer
Disk Drive Modules | SSA DASD Model 020 and 040 Drawer Connectors
Back DDMs 13 through 16 | J13 and J16
Back DDMs 9 through 12 | J9 and J12
Front DDMs 5 through 8 | J5 and J8
Front DDMs 1 through 4 | J1 and J4
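The string-to-connector relationship in Table 4 can be expressed as a short lookup from DDM position to the pair of external connectors at the ends of its string. The Python sketch below is illustrative; the names STRINGS and connectors_for_ddm are assumptions made for this example.

```python
# Each entry: (first DDM, last DDM, location in drawer, connector pair).
# Values are copied from Table 4.
STRINGS = [
    (1, 4, "front", ("J1", "J4")),
    (5, 8, "front", ("J5", "J8")),
    (9, 12, "back", ("J9", "J12")),
    (13, 16, "back", ("J13", "J16")),
]


def connectors_for_ddm(position):
    """Return (location, connector pair) for a DDM position from 1 to 16."""
    for first, last, location, connectors in STRINGS:
        if first <= position <= last:
            return location, connectors
    raise ValueError("DDM position must be in the range 1 through 16")
```

For example, connectors_for_ddm(6) returns ("front", ("J5", "J8")), which matches the row for front DDMs 5 through 8.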
Internal Connections (DDM Bay)
Inside the DDM bay, the DDMs are connected in a string of eight DDMs. The string
is connected to the external SSA connectors at the front of the DDM bay.
The following diagram shows the relationship between the DDM string
and the external SSA connectors at the front of the DDM bay.
DDM Bay Internal Connections
The diagram below shows the relationship between the DDM string and the external
SSA connectors.
Figure 9. DDM Bay Internal SSA Connections (S008107l)
External SSA Connections (DDM Bay)
From one to six DDM bays can be connected on two loops, each of which is
connected to a different SSA device card.
The following diagrams show the relationships between the SSA device card loops
and one to six DDM bays.
Note: Figure 12 on page 18 and Figure 13 on page 19 show the two stages
necessary to concurrently connect a second (E2) DDM bay.
Figure 10. DDM Bay Diagram Explanation (S008122l)
Figure 11. One DDM Bay External SSA Connections (S008129m)
Figure 12. Two DDM Bay Initial External SSA Connections (S008128m)
Figure 13. Two DDM Bay Final External SSA Connections (S008127m)
Figure 14. Three DDM Bay External SSA Connections (S008126m)
Figure 15. Four DDM Bay External SSA Connections (S008125m)
Figure 16. Five DDM Bay External SSA Connections (S008124m)
Figure 17. Six DDM Bay External SSA Connections (S008123m)
External SSA Connections (SSA DASD Model 040 Drawer)
From one to three SSA DASD Model 040 drawers can be connected on each of the
two loops, which are connected to two SSA device cards.
The following diagrams show the relationships between the SSA device cards and a
single loop with one, two, and three SSA DASD Model 040 drawers.
Note: Figure 20 on page 23 and Figure 21 on page 24 show the two stages
necessary to concurrently connect a second (D2) SSA DASD Model 040
drawer. Figure 22 on page 25 and Figure 23 on page 26 show the two stages
necessary to concurrently connect a third (D3) SSA DASD Model 040
drawer.
Note: The lines connecting the two terminals in a bypass card show that these two
terminals are automatically connected when no cable is installed. The
automatic connection occurs when no cable is connected between either of
the terminals and another powered up drawer or SSA device card.
Figure 18 legend:
D_ = 7133 Model 020/040 Drawer Number
BP = Bypass Card
Figure 18. SSA DASD Model 040 Drawer Diagram Explanation (S008134m)
Figure 19. One SSA DASD Model 040 Drawer External SSA Connections (S008139m)
[Diagram labels: SSA device card, cluster 1; SSA device card, cluster 2; 7133 drawers D1 and D2 with bypass cards (BP) and DDMs 1-4, 5-8, 9-12, and 13-16]
Figure 20. Two SSA DASD Model 040 Drawer Initial External SSA Connections (S008137p)
Figure 21. Two SSA DASD Model 040 Drawer Final External SSA Connections (S008138p)
Figure 22. Three SSA DASD Model 040 Drawer External SSA Connections (S008136s)
Figure 23. Four SSA DASD Model 040 Drawer External SSA Connections (S008135s)
Special Tools
v SSA screwdriver tool, P/N 32H7059
v ESCON wrap tool, P/N 5605670
v Fibre channel long wave (LW) wrap tool, P/N 78G9610
v Fibre channel short wave (SW) wrap tool, P/N 16G5609
Chapter 2: Entry MAP for All Service Actions
Start all service actions for the IBM 2105 subsystem, 2105 Model E10/E20 rack,
2105 Expansion Enclosure, DDM bay, or SSA DASD drawer here.
Select the type of action you want to perform from Table 5 below.
Table 5. Entry MAP for All Service Actions
If you are here to: | Go to:
SERVICE TERMINAL
Connect the service terminal to 2105 Model Exx/Fxx rack | ″Service Terminal Setup and 2105 Configuration Verification″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Repair service terminal connection problem to one cluster bay | “MAP 6060: Isolating a Service Terminal Login Failure To One Cluster” on page 432
Repair service terminal connection problem to both cluster bays | “MAP 6040: Isolating a Service Terminal Login Failure To Both Clusters” on page 431
INSTALL
2105 Model Exx/Fxx Subsystem | ″Installing and Testing the 2105 Model Exx/Fxx Unit″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2
2105 Model 100 Subsystem | ″Attaching the 2105 Model 100 to a 2105 Model Exx/Fxx Unit″ in chapter 5 of the 2105 Model 100 Attachment to ESS Server Service Guide
2105 Expansion Enclosure (Physically attached to a 2105 Model E20 or F20 only) | ″Installing and Testing the 2105 Expansion Enclosure″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2
DDM Bay (8 Pack) | Adding a DDM bay to an existing 2105 subsystem requires a separate MES.
7133 Drawer (Customer supplied and previously used) | Previously used 7133 device drawers must be checked for compatibility before they are added. Use the ″7133 Model 020 and D40 Requirements for 2105 Installations″ instruction list service offering in the IBM Enterprise Storage Server Introduction and Planning Guide book, form number GC26-7294. Adding a previously used 7133 SSA DASD drawer is a billable service.
Host Card | ″Installing a Host Card″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2
SSA Device Card | Adding an SSA device card to an existing 2105 subsystem requires a separate MES.
Modem or Modem Expander | ″Connecting the Modem and Modem Expansion Cables for Remote Support″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2
Attach ESSNet to Customer Network | ″Attaching the ESSNet to a Customer Network″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2. This is a billable service.
REMOVE
2105 Subsystem | ″Discontinue a 2105 Model Exx/Fxx Subsystem″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2
2105 Expansion Enclosure | Removing a 2105 Expansion Enclosure from an existing 2105 subsystem requires a separate RPQ.
2105 Model 100 | Removing a 2105 Model 100 from an existing 2105 Model Exx/Fxx subsystem requires a separate RPQ.
DDM Bay (8 Pack) | Removing a DDM bay from an existing 2105 subsystem requires a separate RPQ.
7133 Drawer | Removing a 7133 SSA DASD drawer from an existing 2105 subsystem requires a separate RPQ.
Host Card | ″Removing a Host Card″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2
SSA Device Card | Removing an SSA device card from an existing 2105 subsystem requires a separate RPQ.
Relocate 2105 Subsystem | ″Relocating a 2105 Model Exx/Fxx Subsystem″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2
IBM VERSATILE STORAGE SERVER ATTACHMENT
Attaching a 2105 Model 100 rack to a 2105 Model Exx/Fxx | Requires MES FC 1121 or 1122. Use the attachment procedure in chapter 5 of the 2105 Model 100 Attachment to ESS Service Guide book.
LOGICAL CONFIGURATION / ESS SPECIALIST
Change logical subsystem configuration | If additional configuration needs to be completed, use the ESS Specialist from the ESSNet console.
Customer cannot access the 2105 Model Exx/Fxx using the ESS Specialist | Go to the Analyze and Repair a Service Request section of this table.
Customer cannot access a SCSI LUN | Go to the Analyze and Repair a Service Request section of this table.
CHANGE COMMUNICATIONS CONFIGURATION
TCP/IP LAN, use only after 2105 initial installation | ″Changing TCP/IP Configuration″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
Enable/Disable ESS Specialist | ″Configure ESS Specialist″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
Regenerate the ESS Specialist Certificate | ″Regenerate ESS Specialist Certificate″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
EMail | ″Configure Email″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
Serial port / modem | ″Configure Call Home/Remote Services″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
SNMP | ″Configure SNMP″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
Call home/remote reporting options | ″Configure Call Home/Remote Services″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
Import/Export configuration data | “MAP 4020: Performing the SCSI Hard Drive Build Process” on page 316
Configure Copy Services, with DNS | ″Configure Copy Services, with DNS″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
Configure Copy Services, without DNS | ″Configure Copy Services, without DNS″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
Managing Copy Services | ″Copy Services Server Menu″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2. Refer to the ″Copy Services Server Menu″ options there.
ANALYZE and REPAIR a SERVICE REQUEST
Prioritize symptoms for repair | “MAP 1200: Prioritizing Visual Symptoms and Problem Logs For Repair” on page 52
Codes displayed by the Cluster Bay Operator Panel | “MAP 4360: Isolation Using Codes Displayed by the Cluster Operator Panel” on page 342
Cluster Bay Ready indicator LED Off | “MAP 20A0: Cluster Not Ready” on page 72
Display and repair a problem with the service terminal | “MAP 1210: Displaying and Repairing a Problem Record” on page 53
E-Mail reported problem | “MAP 1460: Isolating E-Mail Reported Errors” on page 67
SCSI-Host system receives command rejects and check condition of internal target failure | “MAP 4560: No Valid Subsystem Status Available” on page 370
SCSI-Host system detected | “MAP 5220: Isolating a SCSI Bus Error” on page 406
ESCON-Host system receives ’FC’ status, pinned data | “MAP 4560: No Valid Subsystem Status Available” on page 370
ESCON-Host system detected | “MAP 5300: ESCON Link Fault” on page 414
Fibre channel-host system detected | “MAP 5400: Fibre Channel Link Fault” on page 422
Customer reports a loss of line cord input power via email message | This should cause a visual symptom. Go to “MAP 1320: Isolating Problems Using Visual Symptoms” on page 58
Power on or off problems | “MAP 2020: Isolating Power Symptoms” on page 71
Modem call home | “MAP 1300: Isolating Cluster to Modem Communication Problems” on page 54
Visual symptom | “MAP 1320: Isolating Problems Using Visual Symptoms” on page 58
Power and cooling | “MAP 1320: Isolating Problems Using Visual Symptoms” on page 58
Cluster bay boot or down problem | “MAP 4360: Isolation Using Codes Displayed by the Cluster Operator Panel” on page 342
Customer LAN connection problem | “MAP 4450: ESSNet Cluster Bay to Customer Network Problem” on page 354
Replace a FRU without using a problem log | “MAP 1480: Replacing a FRU, Without Using a Problem Log” on page 67
Repair a service terminal connection problem to one cluster bay | “MAP 6060: Isolating a Service Terminal Login Failure To One Cluster” on page 432
Repair a service terminal connection problem to both cluster bays | “MAP 6040: Isolating a Service Terminal Login Failure To Both Clusters” on page 431
Customer cannot access a SCSI LUN | Normally this is due to a logical configuration problem or other customer related problem with the SCSI based host server. For this to be hardware based, there should be two problems on the same SSA loop which cause a RAID array to be off-line. Use the service terminal Repair Menu, Show / Repair Problems Needing Repair option. If related problem logs are not found, call the next level of support.
Customer cannot access a fibre channel LUN | “MAP 5430: Host Fibre Channel Fails to Recognize ESS LUNs” on page 428
Customer cannot access the 2105 Model Exx/Fxx using the ESS Specialist | “MAP 5000: ESS Specialist Cannot Access Cluster” on page 405
ESSNet Console Hardware Problem | “MAP 1600: ESSNet Console Problem” on page 69
ESSNet Console Software Problem | “MAP 1600: ESSNet Console Problem” on page 69
2105 Model 100, Visual Symptom | ″MAP 1320: Isolating Problems Using Visual Symptoms″ in chapter 3 of the 2105 Model 100 Attachment to ESS Server Service Guide
2105 Model 100, Power Problems | ″MAP 2020: Isolating Power Symptoms″ in chapter 3 of the 2105 Model 100 Attachment to ESS Server Service Guide
ESSNet CONSOLE
ESSNet Console Hardware Problem | “MAP 1600: ESSNet Console Problem” on page 69
ESSNet Console Software Problem | “MAP 1600: ESSNet Console Problem” on page 69
SYSTEM/390 REPAIRS
SIM Generation and Usage | “SIM Generation and Usage” on page 33
Repair Using a Hardware SIM ID | The SIM ID is the same as the Problem Number in the 2105 Problem Log. Use this number to begin the repair. Go to “MAP 1210: Displaying and Repairing a Problem Record” on page 53.
Repair Using an EREP Report | “Repair Using an EREP Report” on page 34
Repair Using a SIM Console Message | “Repair Using a SIM Console Message” on page 34
Media SIM Maintenance Procedures | “Media SIM Maintenance Procedures” on page 38
Decode a Refcode | “Decode a Refcode” on page 36
Change SIM Reporting Levels | ″Change SIM Reporting Options (System/390 Only)″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2
TEST a MACHINE FUNCTION
Cluster Bay | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Host Bay Planars | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Interface Cards | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
External Connections | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
SSA Devices | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
SSA Loops | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Rack Power Control (RPC) Cards | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
CD-ROM Drive | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Diskette Drive | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Send Test Notification | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Show Problem Log | ″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Safety inspection | ″Safety Inspection″ in chapter 12 of the Enterprise Storage Server Service Guide, Volume 3
LICENSED INTERNAL CODE (Microcode E/C)
Install/Activate LIC Feature | ″Activate LIC Feature″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
LIC Feature Control Record Extraction | ″LIC Feature Control Record Extraction″ in chapter 5 of the Enterprise Storage Server Service Guide, Volume 2
Display LIC Levels and Resource Requirements | ″Licensed Internal Code Maintenance Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Display LIC Installation Instructions | ″Licensed Internal Code Maintenance Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Copy a LIC Image to LIC Library | ″Licensed Internal Code Maintenance Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Activate a LIC Image | ″Licensed Internal Code Maintenance Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
Copy and Activate a LIC Image | ″Licensed Internal Code Maintenance Menu″ in chapter 8 of the Enterprise Storage Server Service Guide, Volume 3
INFORMATION
Machine overview | “DDM Bay and SSA DASD Drawer Reference Information” on page 8
Service interface | “Service Interface” on page 7
Locations and FRUs, 2105 Model Exx/Fxx, only | ″Locations″ in chapter 7 of the Enterprise Storage Server Service Guide, Volume 3
Locations and FRUs, 2105 Model 100, only | ″2105 Model 100 Locations″ in chapter 7 of the 2105 Model 100 Attachment to ESS Server Service Guide book.
Determine ESD procedures | ″Working with ESD-Sensitive Parts″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2
Determine standard tools needed | ″Standard Tools Needed″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2
Cluster Bay Operator Panel, status codes | “MAP 4360: Isolation Using Codes Displayed by the Cluster Operator Panel” on page 342
DDM Bay and SSA DASD Drawer indicators and switch | “DDM Bay and SSA DASD Drawer Reference Information” on page 8
2105 Model Exx/Fxx maintenance agreement qualification | ″Safety Check″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2
2105 Model 100 maintenance agreement qualification | ″Safety Check″ in chapter 12 of the 2105 Model 100 Attachment to ESS Server Service Guide
SIM Generation and Usage
SIM generation by the ESS family of products is not intended to be the primary
notification for service, as it was for the 3390, 3990, 9340, and 9390 product
families. SIM generation for ESS is a complement to the existing problem
notification process, and is used to support previous system attachments to S/390
hosts.
The strategy for SIM presentation differs from previous products. Instead of
directing a SIM to the failing device and system, hardware SIMs will be presented
to all S/390 hosts attached to the storage subsystem. Exception Class 0 and Media
SIMs will still be off-loaded against the failing device and system.
The SIM ID is the same as the Problem Number in the 2105 Problem Log and will
be used to repair the problem.
Repair Using a SIM Console Message
The SIM ID is the same as the Problem Number in the 2105 Problem Log.
When a SIM ID is available, start the repair by going to “MAP 1210: Displaying
and Repairing a Problem Record” on page 53.
The 2105 maintenance strategy does not rely on the analysis of data in
environmental recording, editing and printing (EREP) reports, or sense bytes on the
console. Sense data records for some 2105 temporary and all permanent errors are
sent from the 2105 to the system to give information necessary to perform needed
system error recovery procedures. The 2105 sense data is logged in the
error-recording data set (ERDS) in the system, but is not used for 2105 problem
determination. It is preferred that you start all service actions with a SIM. If the
customer receives sense data without a SIM, the following procedure can be used
to evaluate the error.
Customer Receives Sense Data Without a SIM
If you do not see a SIM in EREP or on the console, and the customer continues to
receive sense data on the console or console messages:
1. Use the service terminal to display all active problems associated with the failing
2105.
2. If the service terminal does not find any problems related to the console
message, run Machine Tests on the suspected failing machine function. See
″Machine Test Menu″ in chapter 8 of the Enterprise Storage Server Service
Guide, Volume 3. Repair any failure detected.
3. If the error continues, call your next level of support.
Repair Using an EREP Report
The SIM ID is the same as the Problem Number in the 2105 Problem Log.
When a SIM ID is available, start the repair by going to “MAP 1210: Displaying
and Repairing a Problem Record” on page 53.
The 2105 maintenance strategy does not rely on the analysis of data in
environmental recording, editing, and printing (EREP) reports. Sense data records
for some 2105 temporary and all permanent errors are sent from the 2105 to the
system to give information necessary to perform needed system error recovery
procedures. The 2105 sense data is logged in the error-recording data set (ERDS)
in the system, but it is not used for 2105 problem determination. Start a service
action with a SIM ID only.
All 2105 sense data, including the sense data sent to the system for error recovery,
is processed by the 2105 support facility (SF) which generates SIMs whenever
2105 service is needed. The SIMs summarize the service information necessary to
isolate and repair 2105 error conditions. SIMs are presented to the customer as
console messages. SIMs are also logged in the ERDS.
Do not attempt to off-load device statistics when running EREP (SYSEXN) if
devices or paths are failing. A device or path problem can prevent EREP from
successfully collecting statistics, and the EREP job will not complete successfully.
To prevent off-loading statistics, make a working data set from the ERDS and then
run EREP against the working data set.
For more information on EREP, see “EREP Reports”.
EREP Reports
For detailed information about EREP reports, see Environmental Recording, Editing,
and Printing Program User’s Guide book.
System Exception Reports
The customer should normally run the system exception reports daily. The best
report to use as a basis for servicing the 2105 is the Service Information Messages
report (see Figure 24).
Other system exception reports might contain 2105 information. The other reports
would only be used as a basis for 2105 service if there were no SIMs.
SERVICE INFORMATION MESSAGES                         REPORT DATE 024 99
                                                     PERIOD FROM 021 99
                                                            TO   022 99
COUNT   FIRST OCCURRENCE      LAST OCCURRENCE
****************************************************************************************************
    1   021/99 17:44:27:78    021/99 17:44:27:78
        MODERATE ALERT 2105-E20
        S/N 0113-10473 REFCODE C211-1060-A00A ID=03
        DASD EXCEPTION ON SSID 0011
        ADDITIONAL ANALYSIS REQUIRED TO DETERMINE REPAIR IMPACT.
        SEE PROBLEM NUMBER 03 FOR DETAILS
    2   021/99 19:24:19:56    021/99 19:24:19:56
        SERVICE ALERT 2105-E20
        S/N 0113-30224 REFCODE 4320-0000-5284 ID=06
        MEDIA EXCEPTION ON SSID 00D2, VOLSER 380050 DEV 0E12, 0D
        REFERENCE MEDIA MAINTENANCE PROCEDURE 2
    3   021/99 19:24:04:67    022/99 03:29:01:65
        SERIOUS ALERT 2105-E20
        S/N 0113-10473 REFCODE C211-1060-A00A ID=09
        DASD EXCEPTION ON SSID 00D2
        ADDITIONAL ANALYSIS REQUIRED TO DETERMINE REPAIR IMPACT.
        SEE PROBLEM NUMBER 09 FOR DETAILS
Figure 24. Service Information Messages Report (S008595n)
To run EREP for the system exception reports:
1. Make a working data set using the following parameters:
PRINT=NO
ACC=Y
ZERO=N
TYPE=O
TABSIZE=999K
2. Run EREP against the working data set and print using the following
parameters:
SYSEXN=Y
HIST
ACC=N
TABSIZE=999K
DEV=(33xx)
Event History Report
Note: The best EREP report to use is the Service Information Messages report.
See Figure 24 on page 35.
The Event History report gives a one-line summary of each entry in the system
error recording data set (ERDS). See Figure 25. Selection parameters can be used
to select records by device type, date, and time.
When an Event History report (EVENT) is needed, instruct the customer to select
the following parameters when running EREP against the working data set:
EVENT=Y
HIST
ACC=N
TABSIZE=999K
DEV=(2105)
CUA=(xxx-xxx)
where xxx-xxx is the device address (CUA) range of the string.
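The parameter list above can be assembled mechanically once the CUA range is known. The following is a minimal sketch in Python, not part of the EREP program: the function name event_history_parms and the sample range 200-20F are hypothetical, and the parameter values are copied from the list above.

```python
import re

def event_history_parms(cua_range: str, dev: str = "2105") -> list[str]:
    """Return the Event History parameters listed above for one CUA range."""
    # The range is two hexadecimal device addresses joined by a hyphen.
    if not re.fullmatch(r"[0-9A-Fa-f]+-[0-9A-Fa-f]+", cua_range):
        raise ValueError(f"expected a range such as 'xxx-xxx', got {cua_range!r}")
    return [
        "EVENT=Y",
        "HIST",
        "ACC=N",
        "TABSIZE=999K",
        f"DEV=({dev})",
        f"CUA=({cua_range.upper()})",
    ]

# Hypothetical CUA range, used only to show the substitution for xxx-xxx.
print("\n".join(event_history_parms("200-20F")))
```

The sketch only substitutes the range into CUA=(xxx-xxx); it does not run EREP or check that the addresses exist on the system.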
For details about the Event History report, see Environmental Recording, Editing,
and Printing User’s Guide book.
REPORT DATE 079/99
PERIOD FROM 052/99
PERIOD TO   076/99

TIME          JOBNAME   RECTYP   CP   CUA    DEVT       VOLUME
DATE 052 99
00 12 10 44   N/A       ASYNCH   00   0201   2105-E10   RAS201
   SENSE 00000500 0127CF1A 35000680 00410A00 00412000 00444100 05104501 FE000100
*****
00 19 22 92   N/A       ASYNCH   00   0201   2105-E10   RAS201
   SENSE 00000500 0127CF1A 35000680 00410A00 00412000 00444100 05104601 FE000100
*****
00 27 44 10   N/A       ASYNCH   00   0201   2105-E10   RAS201
   SENSE 00000500 0127CF1A 35000680 00410A00 00412000 00444100 05104601 FE000100
*****
00 28 41 75   N/A       ASYNCH   00   0201   2105-E10   RAS201
   SENSE 00000500 0127CF1A 35000680 00410A00 00012000 00444100 05104601 FE000100
Figure 25. Event History Report (S008596m)
To make a refcode from SIM sense bytes, see “Generating a Refcode from Sense
Bytes” on page 37.
Decode a Refcode
The refcode is a 6-byte field that contains information you can use to locate and
repair a 2105 error condition. This section explains how to decode the refcode and
find the probable failing FRUs (see Figure 26 on page 37).
KTGS-CCCC-IIPP
KTGS: ESC (Refcode Bytes 0 and 1)
   Exception Class
   Exception Type
   General Symptom
   MAP or SIM Symptom
CCCC: LIC Level Identifier (Refcode Byte 2)
II: Problem ID (SIM ID) (Refcode Byte 4)
PP: Repair Procedure (Refcode Byte 5)
   If PP=09, perform procedure for problem indicated in Refcode Byte 4.
   If PP=82, perform Media Maintenance 2.
Figure 26. Decoding the Refcode (s008597m)
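The field layout in Figure 26 can be sketched as a small decoder. This is an illustrative Python sketch, not a 2105 service tool: the names Refcode, decode_refcode, and repair_hint are hypothetical, and only the two PP rules stated in Figure 26 are covered. The sample refcode is the one printed in Figure 24.

```python
import string
from typing import NamedTuple

class Refcode(NamedTuple):
    esc: str               # KTGS, refcode bytes 0 and 1
    lic_level: str         # CCCC, LIC level identifier
    problem_id: str        # II, refcode byte 4
    repair_procedure: str  # PP, refcode byte 5

def decode_refcode(text: str) -> Refcode:
    """Split a refcode printed as KTGS-CCCC-IIPP into its fields."""
    groups = text.strip().upper().split("-")
    if len(groups) != 3 or any(len(g) != 4 for g in groups):
        raise ValueError(f"expected KTGS-CCCC-IIPP, got {text!r}")
    if not all(c in string.hexdigits for g in groups for c in g):
        raise ValueError(f"refcode contains non-hexadecimal digits: {text!r}")
    ktgs, cccc, iipp = groups
    return Refcode(ktgs, cccc, iipp[:2], iipp[2:])

def repair_hint(rc: Refcode) -> str:
    """Apply the two PP rules given in Figure 26; other values are not covered."""
    if rc.repair_procedure == "09":
        return f"Perform procedure for problem {rc.problem_id}"
    if rc.repair_procedure == "82":
        return "Perform Media Maintenance 2"
    return "No rule for this PP value in Figure 26"

print(decode_refcode("C211-1060-A00A"))
# Refcode(esc='C211', lic_level='1060', problem_id='A0', repair_procedure='0A')
```

The decoder assumes the DASD SIM form KTGS-CCCC-IIPP. A media SIM refcode has the form KTGS-0000-SSQM, so its last group should not be read as II and PP.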
Generating a Refcode from Sense Bytes
The refcode is a 6-byte field that contains information the service representative can
use to locate and repair a 2105 error condition.
The refcode is created from SIM sense byte data as shown in Figure 27 below. For
details about the refcode, see “Decode a Refcode” on page 36.
2105 SIM Sense Byte Fields: DASD SIM
Bytes 00-03   xxxxxxxx
Bytes 04-07   xxxxxFYY   Byte 06 = xF: Needed for SIM sense bytes
                         YY: SIM ID field
Bytes 08-11   xxxxxxCC
Bytes 12-15   CCIIPPxx
Bytes 16-19   xxxxxxxx
Bytes 20-23   xxxxKTGS
Bytes 24-27   xxxxxxxx
Bytes 28-31   FExxxxxx   Byte 28 = FE: 2105 DASD SIM
refcode: KTGS-CCCC-IIPP

2105 SIM Sense Byte Fields: MEDIA SIM
Bytes 00-03   xxxxxxxx
Bytes 04-07   xxxxxFYY   Byte 06 = xF: Needed for SIM sense bytes
                         YY: SIM ID field
Bytes 08-11   xxxxxx00
Bytes 12-15   00SSQMxx
Bytes 16-19   xxxxxxxx
Bytes 20-23   xxxxKTGS
Bytes 24-27   xxxxxxxx
Bytes 28-31   FEcccchh   Byte 28 = FE: 2105 DASD SIM
                         cccc: Failing cylinder
                         hh: Failing head
refcode: KTGS-0000-SSQM
Figure 27. Refcode in the 2105 SIM Sense Bytes (S008594n)
Use the information in Figure 27 on page 37 to determine the refcode if the EREP
or similar function is not available. See “EREP Reports” on page 35 for more
information. A record type of ASYNCH in the Event History report indicates that
the record contains SIM sense bytes. A record type of OBRxxx indicates a unit
check sense record, which does not contain SIM sense bytes.
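The byte positions in Figure 27 and the record-type rule above can be sketched as follows. This is a minimal Python sketch under the assumption that the sense data is available as eight 8-digit hexadecimal words, as printed in the Event History report; the function names are hypothetical. The sample sense data is the first record in Figure 25.

```python
def parse_sense(words: str) -> bytes:
    """Convert eight space-separated 8-digit hex words into 32 sense bytes."""
    data = bytes.fromhex(words.replace(" ", ""))
    if len(data) != 32:
        raise ValueError(f"expected 32 sense bytes, got {len(data)}")
    return data

def has_sim_sense(rectyp: str) -> bool:
    """ASYNCH records carry SIM sense bytes; OBRxxx records are unit check sense."""
    return rectyp.strip().upper() == "ASYNCH"

def dasd_sim_refcode(sense: bytes) -> str:
    """Build KTGS-CCCC-IIPP from DASD SIM sense bytes, following Figure 27."""
    if sense[6] & 0x0F != 0x0F:
        raise ValueError("byte 06 is not xF: not SIM sense bytes")
    if sense[28] != 0xFE:
        raise ValueError("byte 28 is not FE: not a 2105 SIM")
    ktgs = sense[22:24].hex().upper()   # bytes 22-23
    cccc = sense[11:13].hex().upper()   # bytes 11-12
    iipp = sense[13:15].hex().upper()   # byte 13 (II) and byte 14 (PP)
    return f"{ktgs}-{cccc}-{iipp}"

record = parse_sense(
    "00000500 0127CF1A 35000680 00410A00 00412000 00444100 05104501 FE000100"
)
if has_sim_sense("ASYNCH"):
    print(dasd_sim_refcode(record))  # 4100-8000-410A
```

For a media SIM the same bytes 22-23 give KTGS, but bytes 11-12 are 0000 and bytes 13-14 hold SSQM instead of II and PP.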
Media SIM Maintenance Procedures
Instruct the customer to perform the media maintenance procedure indicated in
Table 6. Also, look at the examples shown in “Customer Media Maintenance
Procedure Examples”.
Table 6. 2105 Media Maintenance Procedures
Procedure Number: 2
Description:
The first part of this procedure finds all tracks with unrecoverable data and
supplies information on the allocation of the user data (for example, dataset
names).
The second part of this procedure returns the indicated track to a usable
condition. Data on this track has been lost. All subsystem attempts at media
maintenance have been unsuccessful. All attempts to recover the data have been
unsuccessful.
ICKDSF Commands:
Use ICKDSF Release 16 or higher. Enter the following commands:
IODELAY SET MSEC(100)
See Note 1 below.
ANALYZE <UNIT() | DDNAME()> NODRIVE SCAN
See Note 2 below.
See Figure 28 on page 39 for the location of the ESC and addresses of
the failing track and head (cccchh) in the Analyze sense information.
For each track that reports an ESC of 4xC0 or 0F0B, issue the following
command (all on the same line):
INSPECT <UNIT() | DDNAME()> <VFY() | NOVFY> ASSIGN NOCHECK NOPRESERVE TRACK(cccc,hh)
See Note 3 below.
Note: The above ICKDSF INSPECT command will result in the loss of all
customer data on that track.
Notes:
1. IODELAY adjusts ICKDSF to run concurrently with customer operations.
2. ANALYZE scans the volume for data that is not readable or not usable.
3. The NOPRESERVE parameter must be specified for the 2105. The PRESERVE
parameter is not valid for the 2105. All previous attempts by the subsystem to
recover the data have not been successful. Although the track will be returned
to a usable state, all customer data on the specified track will be lost when the
INSPECT command is run.
Customer Media Maintenance Procedure Examples
Example of Procedure 2
To locate all tracks with unrecoverable data, obtain information on the allocation of
the user data, and restore such tracks to a usable condition, run the ICKDSF
command sequence below. ICKDSF must be at level 16 or higher.
ENTER INPUT COMMAND:
analyze unit(1290) nodrive scan
ANALYZE UNIT(1290) NODRIVE SCAN
ICK00700I DEVICE INFORMATION FOR 1290 IS CURRENTLY AS FOLLOWS:
PHYSICAL DEVICE = 2105
STORAGE CONTROLLER = 2105
STORAGE CONTROL DESCRIPTOR = CC
DEVICE DESCRIPTOR = 06
ICK04000I DEVICE IS IN SIMPLEX STATE
ICK01400I 1290 ANALYZE STARTED
ICK01408I 1290 DATA VERIFICATION TEST STARTED
ICK21776I DATAVER TEST: ERROR DURING DATA VERIFICATION
CSW = D07C88 0200FFFF CCW = DE000000 3000FFFF FILEMASK = 1E
SENSE = 80000000 9000010B 00000034 80000004 02007667 FFB20F0B 000040E2 0003A401
                                                         |               |
                                                         ESC 1           cccchh 2
ICK21401I 1290 SUSPECTED DRIVE PROBLEM
ICK01406I 1290 ANALYZE ENDED
ICK00001I FUNCTION COMPLETED, HIGHEST CONDITION CODE WAS 8
Figure 28. Example of ICKDSF Analyze Drivetest Output
Sense Information Key Description:
ESC 1
ESC = 0F0B in this example
cccchh 2
Failing track and head address (cccchh)
• Failing track address (cccc = track 03A4 in this example)
• Failing head address (hh = head 01 in this example)
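The key above can be applied to the SENSE line in Figure 28 with a short sketch. This Python example is illustrative only: the function names are hypothetical, and the byte positions (ESC in bytes 22-23, cccchh in bytes 29-31) are taken from Figure 27 and agree with the values called out in Figure 28.

```python
import re

def analyze_sense_fields(words: str) -> tuple[str, str, str]:
    """Return (ESC, cccc, hh) from eight 8-digit hex words of ANALYZE sense data."""
    data = bytes.fromhex(words.replace(" ", ""))
    if len(data) != 32:
        raise ValueError(f"expected 32 sense bytes, got {len(data)}")
    esc = data[22:24].hex().upper()    # bytes 22-23
    cccc = data[29:31].hex().upper()   # bytes 29-30, failing track address
    hh = data[31:32].hex().upper()     # byte 31, failing head address
    return esc, cccc, hh

def needs_inspect(esc: str) -> bool:
    """True for an ESC of 4xC0 (x is any hex digit) or 0F0B, as stated in Table 6."""
    return re.fullmatch(r"4[0-9A-F]C0|0F0B", esc.upper()) is not None

esc, cccc, hh = analyze_sense_fields(
    "80000000 9000010B 00000034 80000004 02007667 FFB20F0B 000040E2 0003A401"
)
print(esc, cccc, hh, needs_inspect(esc))  # 0F0B 03A4 01 True
```

The sketch stops at identifying the track. Because the INSPECT command destroys all customer data on the track, the command itself should still be entered by the customer as described in Table 6.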
Chapter 3: Problem Isolation Procedures
MAPs 1XXX: General Isolation Procedures . . . 52
  MAP 1200: Prioritizing Visual Symptoms and Problem Logs For Repair . . . 52
    Description . . . 52
    Procedure . . . 52
  MAP 1210: Displaying and Repairing a Problem Record . . . 53
    Description . . . 53
    Procedure . . . 53
  MAP 1300: Isolating Cluster to Modem Communication Problems . . . 54
    Description . . . 54
    Isolation . . . 55
  MAP 1301: Isolating Call Home / Remote Services Failure . . . 58
    Description . . . 58
    Isolation . . . 58
  MAP 1320: Isolating Problems Using Visual Symptoms . . . 58
    Description . . . 58
    Isolation . . . 58
  MAP 1460: Isolating E-Mail Reported Errors . . . 67
    Description . . . 67
    Procedure . . . 67
  MAP 1480: Replacing a FRU, Without Using a Problem Log . . . 67
    Description . . . 67
    Procedure . . . 67
  MAP 1500: Ending a Service Action . . . 68
    Description . . . 68
    Procedure . . . 68
  MAP 1600: ESSNet Console Problem . . . 69
    Description . . . 69
    ESSNET Console Repair Process . . . 69
    Isolation . . . 69
MAPs 2XXX: Power and Cooling Isolation Procedures . . . 70
  MAP 2000: Model 100 Power Problems . . . 70
    Description . . . 70
    Isolation . . . 70
  MAP 2020: Isolating Power Symptoms . . . 71
    Description . . . 71
    Isolation . . . 71
  MAP 20A0: Cluster Not Ready . . . 72
    Description . . . 72
    Isolation . . . 72
  MAP 20B0: Cluster Did Not Power On, OK Displayed . . . 74
    Description . . . 74
    Isolation . . . 74
  MAP 2210: Electronics Cage Power Supply Problem . . . 76
    Description . . . 76
    Isolation . . . 76
  MAP 2320: Installed Unit Does Not Match Logical Unit . . . 77
    Description . . . 77
    Isolation . . . 77
  MAP 2340: PPS Status Code 06 . . . 77
    Description . . . 78
    Isolation . . . 78
  MAP 2350: Isolating PPS Status Indicator Codes . . . 80
    Description . . . 80
    Isolation . . . 80
  MAP 2360: 2105 Model Exx/Fxx UEPO Problems . . . 82
    Description . . . 82
    Isolation . . . 83
  MAP 2370: Automatic Power On Problem . . . 84
    Description . . . 84
    Isolation . . . 85
  MAP 2380: Isolating 2105 Expansion Enclosure UEPO Problems . . . 86
    Description . . . 86
    Isolation . . . 87
  MAP 2390: Remote Power On Not Working . . . 88
    Description . . . 88
    Isolation . . . 89
  MAP 2400: 2105 Model Exx/Fxx Local Power On Problems . . . 91
    Description . . . 91
    Isolation . . . 91
  MAP 2410: RPC Power Mode Switch Mismatch . . . 95
    Description . . . 95
    Isolation . . . 95
  MAP 2420: 2105 Expansion Enclosure Power On Problem . . . 96
    Description . . . 96
    Isolation . . . 96
  MAP 2430: One RPC Card Firmware Down Level . . . 99
    Description . . . 99
    Isolation . . . 99
  MAP 2440: Isolating 2105 Model Exx/Fxx Power Off Problems . . . 99
    Description . . . 99
    Isolation . . . 99
  MAP 2460: Battery Charge Low . . . 102
    Description . . . 102
    Isolation . . . 103
  MAP 2470: Battery Set Detection Problem . . . 103
    Description . . . 103
    Isolation . . . 103
  MAP 2490: PPS Input Phase Missing . . . 104
    Description . . . 104
    Isolation . . . 104
  MAP 24A0: PPS Power On Problem . . . 104
    Description . . . 104
    Isolation . . . 105
  MAP 24B0: Cannot Power Off, Pinned Data . . . 106
    Description . . . 106
    Isolation . . . 106
  MAP 24F0: Both RPC Cards Firmware Down Level . . . 107
    Description . . . 107
    Isolation . . . 107
  MAP 2520: PPS Output Circuit Breaker Tripped . . . 107
    Description . . . 107
    Isolation . . . 107
  MAP 2540: Power Problem Detected By Cluster Bay . . . 108
    Description . . . 108
    Isolation . . . 108
MAPs 3XXX SSA DASD Drawer Isolation Procedures . . . 108
  Using the SSA DASD drawer Maintenance Analysis Procedures (MAPs) . . . 108
  MAP 3000: Isolating an SSA Link Error . . . 109
    Description . . . 109
    Isolation . . . 110
  MAP 3010: Isolating a Degraded SSA Link . . . 111
    Description . . . 111
    Isolation . . . 112
  MAP 3050: Isolating an SSA Link Error . . . 113
    Description . . . 113
    Isolation . . . 113
  MAP 3060: Isolating a Degraded SSA Link . . . 117
    Description . . . 118
    Isolation . . . 118
  MAP 3077: Isolating an SSA Link Error . . . 121
    Description . . . 121
    Isolation . . . 122
  MAP 3078: Isolating a Degraded SSA Link . . . 126
    Description . . . 126
    Isolation . . . 126
  MAP 3080: Isolating an SSA Link Error . . . 129
    Description . . . 130
    Isolation . . . 130
  MAP 3081: Isolating a Degraded SSA Link . . . 133
    Description . . . 133
    Isolation . . . 134
  MAP 3082: Isolating an SSA Link Error . . . 135
    Description . . . 135
    Isolation . . . 136
  MAP 3083: Isolating a Degraded SSA Link Error . . . 140
    Description . . . 140
    Isolation . . . 141
  MAP 3085: Isolating an SSA Link Error . . . 144
    Description . . . 144
    Isolation . . . 144
  MAP 3086: Isolating a Degraded SSA Link . . . 148
    Description . . . 148
    Isolation . . . 148
  MAP 3095: Isolating an SSA Link Error . . . 150
    Description . . . 151
    Isolation . . . 151
  MAP 3096: Isolating a Degraded SSA Link . . . 155
    Description . . . 156
    Isolation . . . 156
  MAP 3100: Isolating an SSA Link Error . . . 158
    Description . . . 159
    Isolation . . . 159
  MAP 3101: Isolating a Degraded SSA Link . . . 168
    Description . . . 168
    Isolation . . . 169
  MAP 3105: Isolating a Loss of Power to a SSA DASD Model 040 . . . 172
    Description . . . 172
    Isolation . . . 173
  MAP 3120: Isolating an SSA Link Error . . . 173
    Description . . . 173
    Isolation . . . 174
  MAP 3121: Isolating a Degraded SSA Link . . . 180
    Description . . . 181
    Isolation . . . 181
  MAP 3123: Array Repair Required . . . 183
    Description . . . 184
    Isolation . . . 184
  MAP 3124: Isolating Between DDM Hardware and Microcode Failures . . . 184
    Description . . . 184
    Isolation . . . 184
  MAP 3125: Isolating an Unexpected SSA SRN . . . 184
    Description . . . 185
    Isolation . . . 185
  MAP 3126: Isolating an Unexpected SSA Test Result . . . 185
    Description . . . 185
    Isolation . . . 185
  MAP 3127: Formatting of a DDM Has Not Completed . . . 186
    Description . . . 186
    Isolation . . . 186
  MAP 3128: Isolating an Unknown DDM Failure . . . 186
    Description . . . 186
    Isolation . . . 186
  MAP 3129: Isolating an Array Repair Required Failure . . . 187
    Description . . . 187
    Isolation . . . 187
  MAP 3142: Isolating Multiple DDMs on an SSA Loop Cannot be Accessed . . . 187
    Description . . . 187
    Isolation . . . 187
  MAP 3150: Isolating an SSA DASD Drawer Power Problem . . . 188
    Description . . . 188
    Isolation . . . 188
  MAP 3151: Isolating an SSA DASD Drawer Visual Power Problem . . . 192
    Description . . . 192
    Isolation . . . 192
  MAP 3155: Isolating an SSA Link Error . . . 196
    Description . . . 196
    Isolation . . . 196
  MAP 3158: Isolating an SSA Link Error . . . 198
    Description . . . 198
    Isolation . . . 199
  MAP 3160: SSA DASD Drawer Isolating a Single DDM Redundant Power Fault . . . 201
    Description . . . 201
    Isolation . . . 201
  MAP 3180: Controller Card Failed or Wrong Drawer Type Installed . . . 202
    Description . . . 202
    Isolation . . . 202
  MAP 3190: Wrong Drawer Type Installed . . . 203
    Description . . . 203
    Isolation . . . 204
  MAP 3200: Uninstalled SSA DDMs Connected to Loop A . . . 204
    Description . . . 204
    Isolation . . . 205
  MAP 3210: Uninstalled SSA DDMs Connected to Loop B . . . 205
    Description . . . 206
    Isolation . . . 206
  MAP 3220: Isolating too Few DDMs in an SSA DASD DDM Bay . . . 207
    Description . . . 207
    Isolation . . . 207
  MAP 3280: Isolating too Few DDMs in an SSA Drawer . . . 208
    Description . . . 208
    Isolation . . . 209
  MAP 3300: Repair Alternate Cluster to Run SSA Loop Test . . . 211
    Description . . . 211
    Isolation . . . 211
  MAP 3350: Isolating SSA DASD Drawer Power Problems . . . 212
    Description . . . 212
    Isolation . . . 212
  MAP 3351: Isolating SSA DASD Drawer Visual Power Problems . . . 216
    Description . . . 216
    Isolation . . . 217
  MAP 3352: Isolating SSA DASD Drawer Power Problems . . . 219
    Description . . . 220
    Isolation . . . 220
  MAP 3353: Isolating SSA DASD Drawer Visual Power Problems . . . 221
    Description . . . 222
    Isolation . . . 222
  MAP 3354: Isolating an SSA DASD Drawer Multiple DDM Redundant Visual Power Fault . . . 223
    Description . . . 224
    Isolation . . . 224
  MAP 3355: Isolating an SSA DASD Drawer Multiple DDM Redundant Power Fault . . . 225
    Description . . . 226
    Isolation . . . 226
  MAP 3356: Isolating SSA DASD Drawer Power On Problems . . . 227
    Description . . . 228
    Isolation . . . 228
  MAP 3360: Ending a DASD Service Action . . . 231
    Description . . . 231
    Procedure . . . 231
  MAP 3375: Isolating a Storage Cage Fan/Power Sense Card Error . . . 232
    Description . . . 232
    Isolation . . . 232
  MAP 3378: Isolating a Storage Cage Fan/Power Sense Card Error . . . 233
    Description . . . 233
    Isolation . . . 233
  MAP 3379: Analyzing a Storage Cage Fan/Power Sense Card Check Summary Indicator On . . . 233
    Description . . . 233
    Isolation . . . 234
  MAP 3380: Isolating 7133 Model 040 SSA DASD Drawer Power Problems . . . 234
    Description . . . 234
    Isolation . . . 235
  MAP 3381: Isolating a Storage Cage Fan/Power Sense Card Error . . . 238
    Description . . . 239
    Isolation . . . 239
  MAP 3384: Isolating a Storage Cage Fan Failure . . . 239
    Description . . . 240
    Isolation . . . 240
  MAP 3387: Isolating a Storage Cage Power Supply Failure . . . 242
    Description . . . 242
    Isolation . . . 242
  MAP 3390: Isolating SSA DASD Drawer Visual Power Problems, Model 040 Drawer . . . 247
    Description . . . 247
    Isolation . . . 247
MAP 3391: Isolating a Storage Cage Power System Problem . . . . 253
Description . . . . 253
Isolation . . . . 253
MAP 3395: Isolating an SSA DASD DDM Bay Power Problem . . . . 259
Description . . . . 259
Isolation . . . . 259
MAP 3397: Isolating an SSA DASD DDM Bay Controller Card Problem . . . . 261
Description . . . . 261
Isolation . . . . 261
MAP 3398: Isolating a DDM bay Controller Card Communications Failure . . . . 262
Description . . . . 262
Isolation . . . . 262
MAP 3400: Replacing an SSA DASD Drawer Backplane or Frame . . . . 263
Description . . . . 264
Procedure . . . . 264
MAP 3421: Storage Cage Fan/Power Sense Card R2 Cable Problem . . . . 264
Description . . . . 264
Isolation . . . . 265
MAP 3422: Storage Cage Fan/Power Sense Card R2 Jumper and Cable Problems . . . . 266
Description . . . . 266
Isolation . . . . 266
MAP 3423: Isolating a Storage Cage Fan/Power Sense Card R1 Jumper Missing Error . . . . 267
Description . . . . 268
Isolation . . . . 268
MAP 3424: Isolating a Storage Cage Fan/Power Sense Card R1 Jumper Failing Error . . . . 269
Description . . . . 269
Isolation . . . . 269
MAP 3425: Isolating a Storage Cage Fan/Power Sense Card R2 Cable Error . . . . 270
Description . . . . 270
Isolation . . . . 271
MAP 3426: Isolating a Storage Cage Fan/Power Sense Card Location Error . . . . 271
Description . . . . 271
Isolation . . . . 272
MAP 3427: Isolating a Storage and DDM Bay Location Error . . . . 273
Description . . . . 273
Isolation . . . . 274
MAP 3428: Isolating an SSA DASD Drawer Location Error . . . . 275
Description . . . . 275
Isolation . . . . 275
MAP 3429: Isolating a DDM Location Error . . . . 278
Description . . . . 278
Isolation . . . . 279
MAP 3500: Verifying an SSA DASD Drawer Repair . . . . 279
Description . . . . 279
Isolation . . . . 279
MAP 3520: SSA DASD Drawer Verification for Possible Problems . . . . 280
Description . . . . 280
Isolation . . . . 280
MAP 3540: Unrelated Occurrence, Retry Web Operation . . . . 280
Description . . . . 281
Isolation . . . . 281
MAP 3560: Unrelated Occurrence, Retry Verification Test . . . . 281
Description . . . . 281
Isolation . . . . 282
MAP 3570: Unrelated Event Caused Resume Fail . . . . 282
Description . . . . 282
Isolation . . . . 282
MAP 3600: Multiple DDMs Isolated on an SSA Loop . . . . 282
Description . . . . 283
Isolation . . . . 283
MAP 3605: Isolating an Unexpected Result . . . . 285
Description . . . . 285
Isolation . . . . 285
MAP 3610: DDM Installation with New Rank Site Capacity . . . . 285
Description . . . . 285
Detailed Description . . . . 286
Isolation . . . . 287
MAP 3612: DDM Installation with Mixed Capacity Rank Site . . . . 288
Description . . . . 289
Detailed Description . . . . 289
Isolation . . . . 290
MAP 3614: DDM Installation Introduces Different RPM . . . . 291
Description . . . . 292
Isolation . . . . 292
MAP 3616: No Intermix of Bus Speeds is Allowed . . . . 293
Description . . . . 294
Isolation . . . . 294
MAP 3618: Replacement DDM Has Slower RPM Than Called For . . . . 295
Description . . . . 295
Isolation . . . . 295
MAP 3619: This Repair Requires a Larger Capacity DDM . . . . 296
Description . . . . 296
Isolation . . . . 296
MAP 3620: Multiple DDMs Isolated on an SSA Loop . . . . 296
Description . . . . 297
Isolation . . . . 297
MAP 3621: New DDM Storage Capacity Smaller Than Original DDMs . . . . 297
Description . . . . 298
Isolation . . . . 298
MAP 3623: New DDM Storage Capacity Less Than 4.5 GB . . . . 298
Description . . . . 298
Isolation . . . . 298
MAP 3625: All DDMs on SSA Loop A Do Not Have the Same Characteristics . . . . 298
Description . . . . 299
Isolation . . . . 299
MAP 3626: All DDMs on SSA Loop B Do Not Have the Same Characteristics . . . . 300
Description . . . . 300
Isolation . . . . 300
MAP 3630: Isolating an SSA Device Card/DRAM Problem . . . . 301
Description . . . . 301
Isolation . . . . 301
MAP 3640: Other Cluster Fenced - Unable to Verify SSA Loop . . . . 302
Description . . . . 302
Isolation . . . . 302
MAP 3650: Wrong, Missing, or Failing Bypass Card . . . . 304
Description . . . . 304
Isolation . . . . 304
MAP 3652: Wrong, Missing, or Failing Passthrough Card . . . . 305
Description . . . . 306
Isolation . . . . 306
MAP 3654: Bypass Card Jumpers Wrong . . . . 307
Description . . . . 307
Isolation . . . . 307
MAP 3656: 20 MB SSA Cable Installed Where 40 MB Cable Expected . . . . 308
Description . . . . 308
Isolation . . . . 308
MAP 3680: Isolating a Two DDMs Detect Over-Temperature Problem . . . . 309
Description . . . . 310
Isolation . . . . 310
MAP 3685: Isolating a Multiple DDMs Detect Over-Temperature Problem . . . . 313
Description . . . . 313
Isolation . . . . 313
MAPs 4XXX: Cluster Bay Isolation Procedures . . . . 316
MAP 4020: Performing the SCSI Hard Drive Build Process . . . . 316
Description . . . . 316
Procedure . . . . 316
MAP 4030: CPI Hardware Version Mismatch . . . . 320
Description . . . . 321
Isolation . . . . 321
MAP 4040: Entry MAP for CPI Problems . . . . 321
Description . . . . 321
Isolation . . . . 322
MAP 4050: Isolating a CPI Problem . . . . 322
Description . . . . 323
Isolation . . . . 323
MAP 4060: Replacement of Cluster FRUs for CPI Problems . . . . 326
Description . . . . 326
Isolation . . . . 327
MAP 4070: Replacement of Host Bay FRUs for CPI Problems . . . . 327
Description . . . . 328
Isolation . . . . 328
MAP 4080: Powering the 2105 Model Exx/Fxx Off to Replace CPI FRUs . . . . 329
Description . . . . 329
Isolation . . . . 329
MAP 4090: CPI Address Mismatch . . . . 329
Description . . . . 330
Isolation . . . . 330
MAP 4100: Isolating a LIC Process Read/Display Problem . . . . 331
Isolation . . . . 331
MAP 4120: Handling Unexpected Resources . . . . 331
Description . . . . 331
Isolation . . . . 331
MAP 4130: Handling a Missing or Failing Resource . . . . 332
Description . . . . 333
Isolation . . . . 333
MAP 4140: Isolating a LIC Activation Process Failure . . . . 333
Description . . . . 334
Isolation . . . . 334
MAP 4240: Isolating a Blinking 888 Error on the Cluster Operator Panel . . . . 334
Description . . . . 334
Isolation . . . . 334
MAP 4320: Isolating E1xx SCSI Hard Drive Code Boot Problems . . . . 336
Description . . . . 336
Isolation . . . . 337
MAP 4340: Isolating a E3xx Memory Test Hang Problem . . . . 339
Description . . . . 339
Isolation . . . . 339
MAP 4350: Isolating Cluster Code Load Counter=2 . . . . 341
Description . . . . 341
Isolation . . . . 341
MAP 4360: Isolation Using Codes Displayed by the Cluster Operator Panel . . . . 342
Description . . . . 342
Isolation . . . . 342
MAP 4370: Error Displaying Problems Needing Repair . . . . 344
Description . . . . 344
Isolation . . . . 345
MAP 4380: Isolating a Customer LAN Connection Problem . . . . 346
Description . . . . 346
Isolation . . . . 346
MAP 4390: Isolating a Cluster to Cluster Ethernet Problem . . . . 347
Description . . . . 348
Isolation . . . . 348
MAP 4400: Displaying Cluster SMS Error Logs . . . . 351
Description . . . . 351
Procedure . . . . 351
MAP 4420: Displaying I/O Planar UAA LAN Address . . . . 351
Description . . . . 351
Procedure . . . . 352
MAP 4440: ESSNet Console to Cluster Bay Problem . . . . 352
Description . . . . 352
Procedure . . . . 352
MAP 4450: ESSNet Cluster Bay to Customer Network Problem . . . . 354
Description . . . . 354
Isolation . . . . 354
MAP 4480: Isolating a Cluster / RPC Problem . . . . 357
Description . . . . 357
Isolation . . . . 358
MAP 44F0: Electronics Cage Cooling Problem . . . . 360
Description . . . . 360
Isolation . . . . 361
MAP 4500: Isolating an ESC=5xxx . . . . 361
Description . . . . 361
Isolation . . . . 361
MAP 4510: Isolating a Cluster to Cluster CPI Communication Failure . . . . 362
Description . . . . 362
Isolation . . . . 362
MAP 4520: Pinned Data and/or Volume Status Unknown . . . . 363
Description . . . . 363
Isolation . . . . 363
MAP 4540: Isolating Problems on a Minimum Configuration Cluster . . . . 364
Description . . . . 364
MAP Step 4540-1 . . . . 365
MAP Step 4540-2 . . . . 366
MAP Step 4540-3 . . . . 367
MAP Step 4540-4 . . . . 368
MAP Step 4540-5 . . . . 368
MAP Step 4540-6 . . . . 369
MAP Step 4540-7 . . . . 369
MAP Step 4540-8 . . . . 369
MAP Step 4540-9 . . . . 369
MAP 4550: NVS FRU Replacement . . . . 370
Description . . . . 370
Isolation . . . . 370
MAP 4560: No Valid Subsystem Status Available . . . . 370
Description . . . . 370
Isolation . . . . 371
MAP 4580: Pinned Data In Single Cluster NVS . . . . 372
Description . . . . 372
Isolation . . . . 372
MAP 4600: Isolating a CD-ROM Test Failure . . . . 373
Description . . . . 373
Isolation . . . . 373
MAP 4610: Cluster SP/System Firmware Down-level . . . . 373
Description . . . . 373
Isolation . . . . 374
MAP 4620: Isolating a Diskette Drive Failure . . . . 374
Description . . . . 374
Isolation . . . . 374
MAP 4630: Listed FRUs May Be Incomplete or Need Isolation . . . . 374
Description . . . . 375
Isolation . . . . 375
MAP 4700: Replacing Cluster FRUs . . . . 375
Description . . . . 375
Procedure . . . . 375
MAP 4710: Isolating a DDM LIC Update Problem . . . . 384
Description . . . . 384
Isolation . . . . 384
MAP 4720: Cluster or Host Bay Fails to Power Off . . . . 385
Description . . . . 385
Isolation . . . . 385
MAP 4730: Isolating a Cluster Power Off Request Problem . . . . 387
Description . . . . 387
Isolation . . . . 387
MAP 4740: Fan Check Detected by I/O Planar, Model Exx Only . . . . 387
Description . . . . 387
Isolation . . . . 387
MAP 4750: Cluster Bay Power is Off, Had to Force it Off . . . . 388
Description . . . . 388
Isolation . . . . 388
MAP 4760: Recovering from Corrupted Files or Functions . . . . 389
Description . . . . 389
Isolation . . . . 389
MAP 4770: Isolating a E152 Cluster Hang . . . . 390
Description . . . . 390
Isolation . . . . 390
MAP 4780: Isolating a Functional Code Not Running Problem . . . . 393
Description . . . . 393
Isolation . . . . 393
MAP 4790: Repairing the Electronics Cage . . . . 395
Description . . . . 395
Isolation . . . . 395
MAP 4810: Unexpected Host Bay Power Off . . . . 396
Description . . . . 396
Isolation . . . . 396
MAP 4820: Isolating a SCSI Card Configuration Timeout . . . . 399
Description . . . . 399
Isolation . . . . 400
MAP 4840: CPI Diagnostic Communication Problem . . . . 400
Description . . . . 400
Isolation . . . . 400
MAP 4970: Isolating a Software Problem . . . . 401
Description . . . . 401
Procedure . . . . 401
MAP 4980: Customer Copy Services Problems . . . . 402
Description . . . . 402
Procedure . . . . 402
MAP 4990: LIC Feature License Failure . . . . 404
Description . . . . 405
Procedure . . . . 405
MAPs 5XXX: Host Interface Isolation Procedures . . . . 405
MAP 5000: ESS Specialist Cannot Access Cluster . . . . 405
Description . . . . 406
Isolation . . . . 406
MAP 5220: Isolating a SCSI Bus Error . . . . 406
Description . . . . 407
Isolation . . . . 407
MAP 5230: Isolating a Fixed Block Read Data Failure . . . . 409
Description . . . . 409
Isolation . . . . 409
MAP 5240: Isolating a Customer Data Check Failure . . . . 410
Description . . . . 410
Isolation . . . . 411
Analyzing a Media SIM . . . . 411
MAP 5250: Isolating a Meta Data Check Failure . . . . 413
Description . . . . 413
Isolation . . . . 413
MAP 5300: ESCON Link Fault . . . . 414
Fiber Optic Cable Handling Precautions . . . . 414
Description . . . . 415
Isolation . . . . 415
MAP 5310: ESCON Bit Error Validation . . . . 416
Description . . . . 416
Isolation . . . . 417
MAP 5320: ESCON Optical Power Measurement . . . . 418
Description . . . . 418
MAP 5340: CKD Read Data Failure . . . . 421
Description . . . . 421
Isolation . . . . 422
MAP 5400: Fibre Channel Link Fault . . . . 422
Description . . . . 422
Isolation . . . . 422
MAP 5410: Fibre Channel Bit Error Validation . . . . 424
Description . . . . 424
Isolation . . . . 424
MAP 5420: Fibre Channel Optical Power Measurement . . . . 425
Description . . . . 426
Isolation Procedure 1: . . . . 426
Isolation Procedure 2: . . . . 427
MAP 5430: Host Fibre Channel Fails to Recognize ESS LUNs . . . . 428
Description . . . . 428
Isolation . . . . 428
MAP 5440: Fibre Host Card Reports a Loss of Light . . . . 430
Description . . . . 430
Isolation . . . . 430
MAPs 6XXX: Service Terminal Isolation Procedures . . . . 430
MAP 6040: Isolating a Service Terminal Login Failure To Both Clusters . . . . 431
Description . . . . 431
Isolation . . . . 431
MAP 6060: Isolating a Service Terminal Login Failure To One Cluster . . . . 432
Description . . . . 432
Isolation . . . . 432
Service Terminal Connection Diagram . . . . 433
MAPs 1XXX: General Isolation Procedures
The isolate procedures in the MAP 1XXX group of the Isolate chapter are general
MAPs that deal with reported errors and error logs.
MAP 1200: Prioritizing Visual Symptoms and Problem Logs For Repair
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Use this procedure if there is more than one visual symptom and/or problem log
needing repair.
Procedure
v Display the details of each problem log and then use the table below to prioritize
their repair.
Table 7. Prioritize Repairs Table

Condition                             Description
------------------------------------  ------------------------------------------
Visual symptoms                       Visual symptoms should create a problem
                                      log. Repair related problem logs before
                                      using visual symptoms.

Multiple problem logs for one fault.  A single fault may create more than one
                                      related problem log. The successful repair
                                      of one problem log will automatically
                                      close the other related problem logs for
                                      the same resource.

Power problem logs                    Power problem logs can normally be
                                      repaired after logic problem logs because
                                      of the fault tolerant power system design.

Cluster bay problem logs              Cluster bay problem logs should be
                                      repaired before SSA loop or DDM problem
                                      logs. Both fault free cluster bays are
                                      needed to verify the repair of an SSA loop
                                      or DDM problem log.

SSA loop or DDM problem logs          Both cluster bays must be fault free to
                                      verify the repair of an SSA loop or DDM
                                      problem log.

An SSA loop has two or more problem   Repair this SSA loop before repairing an
logs.                                 SSA loop with only one problem log.

CPI interface problem logs for a      All CPI interface problem logs needing
cluster bay and host bay.             isolation use the same isolation MAP, so
                                      either problem log can be used.

Cluster bay hung with a code          Use the other cluster to show and repair
displayed in its operator panel, the  any problem logs for it. If there are
other cluster bay Ready LED           none, go to “MAP 4360: Isolation Using
indicator is on.                      Codes Displayed by the Cluster Operator
                                      Panel” on page 342.

Each cluster bay has a problem log,   The service terminal must be connected to
at least one cluster bay Ready LED    a cluster bay with the Ready LED indicator
indicator is on.                      on. The problem log for the other cluster
                                      bay is then repaired first.

A cluster bay has more than one       If one of the problem logs has an ESC of
problem log.                          5xxx (SRN based repair), repair the other
                                      problem log for the cluster bay first.

Both cluster bays are hung with a     Repair either cluster bay first using a
code in their operator panels.        visual symptom of the code.
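
The ordering rules in Table 7 can be pictured as a sort over the open problem logs. The following Python sketch is illustrative only and is not part of the service terminal code. The category names, the ProblemLog fields, and the relative ranking between rules (cluster bay, then SSA loop or DDM, then power, then visual symptoms) are assumptions made for the example; Table 7 itself does not define a single total order.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical category names and ranks; lower rank is repaired earlier.
CATEGORY_RANK = {
    "cluster_bay": 0,       # before SSA loop or DDM logs
    "ssa_loop_or_ddm": 1,   # needs both cluster bays fault free to verify
    "power": 2,             # normally after logic problem logs
    "visual_symptom": 3,    # use only after related problem logs
}


@dataclass(frozen=True)
class ProblemLog:
    log_id: int
    category: str
    resource: str   # hypothetical resource label, e.g. an SSA loop name
    esc: str = ""   # error symptom code, if any


def repair_order(logs):
    """Return the logs sorted into an assumed repair order based on Table 7."""
    logs_per_resource = Counter(log.resource for log in logs)

    def sort_key(log):
        rank = CATEGORY_RANK[log.category]
        # An SSA loop with two or more logs goes before a loop with only one.
        single_loop_log = 0
        if log.category == "ssa_loop_or_ddm" and logs_per_resource[log.resource] < 2:
            single_loop_log = 1
        # For a cluster bay, a log with an ESC of 5xxx is repaired after the others.
        srn_based = 1 if log.category == "cluster_bay" and log.esc.startswith("5") else 0
        return (rank, single_loop_log, srn_based, log.log_id)

    return sorted(logs, key=sort_key)
```

The sketch only suggests an order in which to review the logs. The problem log details shown on the service terminal remain the authority for each repair.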
MAP 1210: Displaying and Repairing a Problem Record
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A problem record was created by a cluster and stored in the problem log. A 2105
Model Exx/Fxx operator panel Message indicator was turned on to show which
cluster reported the problem. The problem may be in the cluster indicated, the other
cluster, or somewhere else in the 2105 Model Exx/Fxx. If the clusters can
communicate with each other, the service terminal can display problems from both
clusters while attached to either cluster. If the clusters cannot communicate, a message
is displayed that directs you to connect the service terminal to the other cluster.
Problem records from that cluster can then be displayed. A failing cluster may be
able to communicate with the service terminal even when it cannot communicate
with the other cluster. The Message indicator turns off when the service terminal
connects to that cluster. If e-mail is enabled, a copy of the problem log will be sent
to the defined customer destinations.
The service terminal will be used to display the problem or problems needing repair.
The problem records show FRUs and/or isolation procedures needed to repair the
problem. The service terminal and service guide will work together to guide you
through the repair process.
Procedure
Use the following steps to display and repair the problem:
1. Ensure the 2105 Model Exx/Fxx is powered on.
2. Observe the 2105 Model Exx/Fxx operator panel Message indicators:
v If both cluster message indicators are on, connect the service terminal to
either cluster bay.
v If only one cluster message indicator is on, connect the service terminal to
that cluster bay.
v If both cluster message indicators are off, connect the service terminal to
cluster bay 1.
3. Look at the service terminal screen.
Is the service terminal displaying the copyright and login screen?
v Yes, go to step 5.
v No, continue with the next step.
4. Connect the service terminal to the other cluster and try again.
Is the service terminal displaying the copyright and login screen?
v Yes, go to step 5.
v No, go to “MAP 6040: Isolating a Service Terminal Login Failure To Both
Clusters” on page 431.
5. Display the problem logs. From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
Note: Each cluster will display its problem logs and the problem logs
from the other cluster. If the cluster cannot communicate with the
other cluster, an informational error message will be displayed.
In this case, display the available problem logs, then
connect to the other cluster and display its problem logs. If this
fails, a problem log for the cluster to cluster communication
problem should be available on the cluster that does display logs.
6. Display the one line description of each problem. Select the problem summary
line to display problem details such as: the reporting cluster, failing cluster,
FRUs, isolation procedures, and other information. Review all problems needing
repair before selecting one to repair.
7. Follow the service terminal instructions to select and repair a problem.
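
Steps 2 through 4 above amount to a small decision: pick a cluster bay from the Message indicators, try the other cluster bay if the login screen does not appear, and go to MAP 6040 if neither works. The Python sketch below restates that logic. The function names and the returned strings are hypothetical and exist only for illustration; where the procedure allows either cluster bay, the sketch arbitrarily picks cluster bay 1.

```python
def cluster_bays_to_try(message_1_on, message_2_on):
    """Return the cluster bays in the order the service terminal should try them."""
    if message_1_on and not message_2_on:
        first = 1
    elif message_2_on and not message_1_on:
        first = 2
    else:
        # Both indicators on: either bay is acceptable (1 chosen here).
        # Both indicators off: the procedure says to use cluster bay 1.
        first = 1
    other = 2 if first == 1 else 1
    return [first, other]


def next_action(login_screen_shown, message_1_on, message_2_on):
    """login_screen_shown maps a cluster bay number to True if the
    copyright and login screen is displayed when connected to that bay."""
    for bay in cluster_bays_to_try(message_1_on, message_2_on):
        if login_screen_shown.get(bay, False):
            return f"display problem logs from cluster bay {bay}"
    return "go to MAP 6040"
```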
MAP 1300: Isolating Cluster to Modem Communication Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The cluster is not able to communicate with the modem expander or the modem.
This error can occur for the following reasons:
v The modem expander or modem is powered off.
v The modem expander needs to be reset. Powering the modem expander off and
on will not reset it. The SET and CLEAR buttons must be used to reset the
modem expander. The service terminal configuration screens are used to reload
the initialization strings. This can only be done through cluster 1 in the 2105
Model Exx/Fxx. Modem expander port 1 is always cabled to cluster 1 in one of
the attached 2105 Model Exx/Fxx units. The other modem expander ports do not
have authority to accept the initialization string.
v The modem is hung and needs to be reset. Powering the modem off and on
should clear the hang. To ensure the modem is set correctly, use the service
terminal configuration screens to reload the initialization strings.
v The cable between the modem expander and modem, or the cluster and modem
expander, is disconnected or damaged.
v One or more of the modem configuration settings in the cluster is not configured
correctly.
The possible FRUs are:
v Modem
v Modem expander to modem cable, packaged with the modem expander
v Modem expander
v Cluster to modem expander cable (null modem cable)
v Cluster I/O planar
The service terminal Change / Show Modem Configuration option has two
different uses:
1. It displays the modem configuration settings. These can be compared to the
values listed on the Communications Resources Work Sheet provided by the
customer.
2. It will attempt to initialize the modem expander and then the modem when the
Enter key is pressed. This occurs even if none of the displayed values have
been updated. This is a pass/fail test. If the test fails, no reason for the failure is
indicated.
Note: Any problems that were created while the modem was unavailable will still
be queued to be sent to the call home destination. If e-mail notification is
enabled, these problems will be sent to the customer by e-mail.
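
The first use of the Change / Show Modem Configuration option, comparing the displayed settings with the customer work sheet, is a field-by-field comparison. The Python sketch below shows one way to list the differences. It is an illustration only: the function name is hypothetical, and the field names a caller passes in are placeholders, not the actual labels on the service terminal screen or the work sheet.

```python
def mismatched_fields(displayed, worksheet):
    """Return {field: (displayed_value, worksheet_value)} for every field
    whose displayed value differs from the work sheet value. A field that
    is missing on one side is reported with None for that side."""
    differences = {}
    for field in sorted(set(displayed) | set(worksheet)):
        shown = displayed.get(field)
        expected = worksheet.get(field)
        if shown != expected:
            differences[field] = (shown, expected)
    return differences
```

An empty result means the displayed settings match the work sheet. It says nothing about whether the modem expander and modem initialize correctly, which is what the pass/fail test in the second use checks.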
Isolation
1. Ensure the modem expander and modem are powered on by observing their
ON indicators.
2. Determine if the cluster to modem communication error is still present. Use the
following procedure as a cluster to modem communication test. Display the
Change / Show Modem Configuration screen.
From the service terminal Main Service Menu, select:
Configurations Options Menu
Configure Communications Resources Menu
Configure Call Home / Remote Services Menu
Change / Show Modem Configuration
Pressing Enter attempts to initialize the modem expander and modem. If it
is not successful, an error message is displayed. The error message does
not isolate the type of failure; this is a pass/fail test.
For an explanation of Call Home return codes, see Table 8 on page 57.
3. Determine if the test passed or failed:
v If the test failed (stopped with an error), go to step 4 on page 56.
v If the test was successful (completed OK), check that the modem can call the
defined remote telephone numbers.
From the service terminal Main Service Menu, select:
Machine Test Menu
Send Test Notification Menu
Service Notification (via modem)
v If the modem call is successful go to “MAP 1500: Ending a Service Action”
on page 68.
v If the modem call fails, go to “MAP 1301: Isolating Call Home / Remote
Services Failure” on page 58.
For an explanation of Call Home return codes, see Table 8 on page 57.
4. If a problem is found and corrected in any of the following steps, you should
jump to step 14 on page 57.
5. Get a copy of the Communication Resources Work Sheet that the customer
provided when this 2105 Model Exx/Fxx was installed. Refer to work sheet
section 6. Modem Configuration fields. Use the service terminal to display and
correct these fields as needed.
From the service terminal Main Service Menu, select:
Configurations Options Menu
Configure Communications Resources Menu
Configure Call Home / Remote Services Menu
Change / Show Modem Configuration
As required, update the modem configuration to match the
worksheets.
6. Verify that the modem expander to modem cable is plugged into modem
expander Port 16 and the modem serial port.
7. Verify that the cluster to modem expander cable has the proper connectors
installed at each end. There must be a null modem connector (labeled null) on
one end, and a standard connector (not labeled) on the other end. The null
modem connector crosses signals so that the serial ports in the expander and
cluster can be connected directly together without a set of modems in
between.
8. Check that the cluster to modem expander cable is plugged into cluster serial
port S3 and the proper port in the modem expander. Refer to the
Communication Resources Worksheet section 6. Modem Configuration fields.
9. Power the modem off and then on.
10. Power the modem expander off and then on.
11. Determine if the other cluster in this 2105 Model Exx/Fxx is also failing.
Connect the service terminal to the other cluster and run the cluster to modem
communications test again.
v If only one cluster fails, call the next level of support.
v If both clusters fail, continue with the next step.
12. Read the note below then reset the modem expander.
Note: Resetting the modem expander will load factory default settings. These
settings will not work with the 2105 Model Exx/Fxx. The modem
expander must be initialized through port 1 after the reset. You must
locate the 2105 Model Exx/Fxx with the cluster 1 that is cabled to
modem expander port 1. Ensure that the customer will let you have
access to it. The modem expander can attach up to seven 2105 Model
Exx/Fxxs.
a. Press and hold both the SET and CLEAR buttons.
b. Release only the CLEAR button.
c. Release the SET button.
13. Initialize the modem expander. Connect the service terminal to the cluster 1
that is cabled to modem expander port 1. Use the cluster to modem
communication test to test and initialize the modem expander.
v If the test fails, call the next level of support.
v If the test is successful, continue with the next step.
14. Connect the service terminal to the original cluster that was failing and repeat
the cluster to modem communication test.
v If the test is successful, then go to “MAP 1500: Ending a Service Action” on
page 68.
v If the problem has not been fixed, and the cluster to modem communication
test still fails, call the next level of support.
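The branch points in the isolation steps above (steps 3, 11, and 14) can be summarized in a short sketch. This Python is illustrative only and is not part of the 2105 service terminal code. The function and parameter names are hypothetical; the returned text paraphrases the steps above.

```python
def after_modem_test(test_passed, modem_call_ok=None):
    """Step 3: route on the cluster to modem communication test result.

    modem_call_ok stays None until the test Service Notification has been sent.
    """
    if not test_passed:
        return "Go to step 4"
    if modem_call_ok is None:
        return "Send a test Service Notification (via modem)"
    if modem_call_ok:
        return "Go to MAP 1500: Ending a Service Action"
    return "Go to MAP 1301: Isolating Call Home / Remote Services Failure"


def after_other_cluster_test(other_cluster_fails):
    """Step 11: the original cluster fails; compare it with the other cluster."""
    if other_cluster_fails:
        return "Continue with step 12: reset the modem expander"
    return "Call the next level of support"


def after_retest_on_original_cluster(test_passed):
    """Step 14: repeat the test on the cluster that was originally failing."""
    if test_passed:
        return "Go to MAP 1500: Ending a Service Action"
    return "Call the next level of support"
```

The sketch covers only the decisions; the cabling checks and power cycles in steps 5 through 10 are manual actions and are not modeled.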
Table 8. Call Home Return Codes
Return Code  Description and Information
00           INITIALIZE_SUCCESSFUL
             Note: No errors detected.
48           INITIALIZE_PARM_ERROR
             Note: This is essentially an MLE error.
49           CLUSTER/EXPANDER/MODEM_ODM_ERROR
             Note: This is a failure to access Call Home or other RAS ODM.
50           MODEM_DIAL_ERROR
             Note: The same as return code 52 for 2nd number, check
             configuration.
51           TTY/EXPANDER/MODEM_CONNECT_TIMEOUT
             Note: Actually a failure to connect or lock the tty, not
             necessarily a hardware failure.
52           MODEM_FAILED_TO_CONNECT
             Note: Phone being called was either busy or doesn’t answer,
             check configuration.
53           TTY_MODEM_EXPANDER_BUSY
             Note: NOT an error condition, some other cluster is using the
             expander/modem.
54           MODEM_EXPANDER_CONFIG_ERROR
             Note: Call Home not configured correctly.
55           MODEM_WRITE_ERROR
             Note: Bad response from Call Home Catcher, may be bad phone
             lines or Catcher failure.
56           MODEM_EXPANDER_TTY_ERROR
             Note: Failure to connect tty to modem, not necessarily a
             hardware failure.
57           MODEM_RESET_ERROR
             Note: Bad return from resetting the modem, may be a hardware
             problem but no 1220 ESC issued.
58           MODEM_EXPANDER_INIT_ERROR
             Note: Failure to initialize modem, can result in a 1220 ESC.
59           MODEM_EXPANDER_RESPONSE_ERROR
             Note: Failure to receive a response from the Call Home Catcher
             OR response was invalid.
60           MODEM_EXPANDER_NO_REPONSE
             Note: Also known as a MODEM_HANG_ERROR, can result in a 1220
             ESC.
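If the return codes need to be looked up outside the service terminal, for example while reviewing notes from a call, Table 8 can be held as a simple lookup. The following is a minimal sketch and not IBM code. The code numbers and names are copied from Table 8 as printed; the function name and the two helper sets are hypothetical.

```python
# Return codes and names as listed in Table 8 (names spelled as printed there).
CALL_HOME_RETURN_CODES = {
    0: "INITIALIZE_SUCCESSFUL",
    48: "INITIALIZE_PARM_ERROR",
    49: "CLUSTER/EXPANDER/MODEM_ODM_ERROR",
    50: "MODEM_DIAL_ERROR",
    51: "TTY/EXPANDER/MODEM_CONNECT_TIMEOUT",
    52: "MODEM_FAILED_TO_CONNECT",
    53: "TTY_MODEM_EXPANDER_BUSY",
    54: "MODEM_EXPANDER_CONFIG_ERROR",
    55: "MODEM_WRITE_ERROR",
    56: "MODEM_EXPANDER_TTY_ERROR",
    57: "MODEM_RESET_ERROR",
    58: "MODEM_EXPANDER_INIT_ERROR",
    59: "MODEM_EXPANDER_RESPONSE_ERROR",
    60: "MODEM_EXPANDER_NO_REPONSE",
}

# Hypothetical helper sets, derived from the notes in Table 8.
NOT_ERRORS = {0, 53}            # "No errors detected" / "NOT an error condition"
MAY_RAISE_ESC_1220 = {58, 60}   # notes say these can result in a 1220 ESC


def describe_return_code(code):
    """Return a one-line description of a Call Home return code."""
    name = CALL_HOME_RETURN_CODES.get(code)
    if name is None:
        return f"{code:02d} is not listed in Table 8"
    notes = []
    if code in NOT_ERRORS:
        notes.append("not an error condition")
    if code in MAY_RAISE_ESC_1220:
        notes.append("can result in a 1220 ESC")
    suffix = f" ({'; '.join(notes)})" if notes else ""
    return f"{code:02d} {name}{suffix}"
```

For example, `describe_return_code(53)` reports that the expander or modem is busy and that this is not an error condition, which matches the note for that code.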
MAP 1301: Isolating Call Home / Remote Services Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Call Home / Remote Services has been configured on the storage facility (cluster)
but the cluster cannot communicate with IBM.
Description
This failure can occur for the following reasons:
v The customer’s analog phone line is not functional.
v The phone numbers and protocols defined to those phone numbers do not
match.
v A cabling problem exists between the cluster and the customer’s phone line.
Isolation
1. Verify that the phone number or phone numbers being used are valid and that
the customer phone line is functional:
a. Connect the customer’s analog phone line to a phone receiver set.
b. Call the phone number or phone numbers defined for use by the Configure
Call Home / Remote Services Menu.
If a modem answers, hang up and reconnect the customer’s phone line to
the modem. Continue with step 2.
2. Verify that the cabling between the cluster and the modem is functional.
a. Review “MAP 1300: Isolating Cluster to Modem Communication Problems”
on page 54.
b. Repair any problems found. If no problem is found, go to step 3.
3. Determine if the protocol for a phone number is correct:
a. Call the next level of support. Have them confirm that the required PE
protocol or RETAIN protocol matches the phone number or phone numbers
being used.
b. If the problem is not resolved, call the next level of support again.
MAP 1320: Isolating Problems Using Visual Symptoms
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Most visual symptoms create a related problem log which should be used to start
the problem repair. If a related problem log was not created, the table below can be
used to start the repair.
Isolation
v Locate your visual symptom in the following table then follow the description and
actions.
– 2105 Model Exx/Fxx operator panel, use Table 9
– 2105 Model Exx/Fxx rack, cluster, and storage bay, use Table 10 on
page 60
– Model 020 drawer and DDMs, use Table 11 on page 64
– Model 040 drawer and DDMs, use Table 12 on page 65
– DDM bay and DDMs, use Table 13 on page 66
Table 9. 2105 Model Exx/Fxx Operator Panel Visual Symptom Table
Visual Symptom
Description and Action
2105 Model Exx/Fxx operator
panel cluster Message
indicator is on.
Description: A problem record has been logged in that cluster.
The indicator will go off when a service terminal login to that
cluster occurs.
Action: Use the service terminal Repair Menu, Show / Repair
Problems Needing Repair option to begin the repair.
2105 Model Exx/Fxx operator
panel cluster Ready indicator
is off.
Description: During cluster power on and code load, status
codes are displayed on the cluster bay operator panel. When
the code load is complete the cluster bay operator panel Ready
indicator LED will be lit. The LED is set to off when a cluster is
fenced and a problem log is created.
Note: It is possible for the code to switch off the cluster
Ready indicator, even when the cluster is still ready. The
cluster will allow a service terminal login. The Repair Menu,
End Of Call Status option will show no related problem and the
cluster will not be fenced or quiesced. The Ready indicator will
return to normal operation when the cluster code is loaded
again.
Action:
v Use the service terminal Repair Menu, Display and Repair
Problems Needing Repair option to repair any related cluster
bay problem logs. If there are none, continue.
v Observe the cluster bay operator panel. If it is displaying a
code, go to “MAP 4360: Isolation Using Codes Displayed by
the Cluster Operator Panel” on page 342. If it is not,
continue.
v There is no single point of hardware failure that can cause
the operator panel Ready indicator to fail. (Behind each
cluster Ready indicator are two LEDs, each controlled by a
single RPC card.) Call the next level of support.
Both 2105 Model Exx/Fxx
operator panel Line Cord
indicators off.
Description: This is a normal condition when the 2105 Model Exx/Fxx
is powered off. In that case, both primary power supplies (PPS) will
have some indicators on.
This symptom also occurs if both customer line cords lose power, or
both PPS input circuit breakers are in the off position. In that case,
both PPS will have all indicators off.
Action: If both PPS have some indicators on, no action
needed. If both PPS have all indicators off, ensure PPS input
circuit breakers are on and have customer restore line cord
power.
One 2105 Model Exx/Fxx
operator panel Line Cord
indicator off, the other Line
Cord indicator on.
Description: There is a problem in the primary power supply (PPS) input
power section.
Action:
1. Use the service terminal to display and repair any related
power problems.
2. Observe the PPS front status display. If any codes are
displayed, go to “MAP 4360: Isolation Using Codes
Displayed by the Cluster Operator Panel” on page 342.
3. Observe the PPS front LED indicators. If the PPS Good
LED (middle) indicator is on, the operator panel Line cord
indicator should also be on. The indicator circuit is either
not active, is broken or the indicator is bad. One of the
following FRUs is failing:
v PPS
v PPS to RPC cable (PPS connector J4)(2105 Model
Exx/Fxx only)
v RPC card (for that PPS)(2105 Model Exx/Fxx only)
v RPC to Operator Panel cable (RPC connector J2)(2105
Model Exx/Fxx only)
v PPS to Operator Panel cable (PPS connector J2)(2105
Expansion Enclosure only)
Operator panel Line Cord
indicator slow blinking.
Description: The indicator slow blinks if a problem has been
detected. A code is displayed in the primary power supply
(PPS) status display.
Action: Go to “MAP 2350: Isolating PPS Status Indicator
Codes” on page 80.
Operator panel Line Cord
indicator fast blinking.
Description: The indicator fast blinks while the cluster is
powering on.
Action: None. Wait up to three minutes for the cluster power on
to complete.
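The Line Cord indicator rows of Table 9 amount to a small decision table. The sketch below restates them in Python for illustration only. The function name, the state strings, and the `pps_have_some_indicators_on` parameter are hypothetical, and checking the blinking states before the on/off states is an assumption, because the table does not say which symptom takes priority.

```python
ALLOWED_STATES = {"on", "off", "slow_blink", "fast_blink"}


def line_cord_action(first, second, pps_have_some_indicators_on=False):
    """Map the states of the two operator panel Line Cord indicators to an action."""
    states = {first, second}
    if not states <= ALLOWED_STATES:
        raise ValueError(f"unknown indicator state in {sorted(states)}")
    # Assumption: a blinking indicator is evaluated before the on/off patterns.
    if "slow_blink" in states:
        return "Go to MAP 2350: Isolating PPS Status Indicator Codes"
    if "fast_blink" in states:
        return "None. Wait up to three minutes for the cluster power on to complete"
    if states == {"off"}:
        if pps_have_some_indicators_on:
            return "No action needed"
        return ("Ensure PPS input circuit breakers are on and have the "
                "customer restore line cord power")
    if states == {"on", "off"}:
        return "Use the service terminal to display and repair any related power problems"
    return "No Line Cord symptom"
```

For the one-off, one-on case, the returned text is only the first action in the table row; the remaining numbered actions in that row still apply.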
Table 10. 2105 Model Exx/Fxx Rack, Cluster, and Storage Bay Visual Symptom Table
Visual Symptom
Description and Action
Cluster operator panel is
blank or stopped with a
progress code displayed.
Description: During cluster power on and code load, status
codes are displayed. They may display for seconds or minutes.
The 2105 Model Exx/Fxx operator panel cluster Ready
indicator will be on when the code load is complete.
An error condition is occurring if a code displays for more than
10 minutes. The alternate cluster may have created a problem
log. It may have specific problem information or may just report
no communication with the failing cluster.
Action: Go to “MAP 4360: Isolation Using Codes Displayed by
the Cluster Operator Panel” on page 342.
Both primary power supplies
(PPS) have all indicators off.
Description: This occurs when both customer line cords lose
power, or both PPS input circuit breakers are in the off position.
Action: Ensure PPS input circuit breakers are on and have
customer restore line cord power.
A code is displayed in the
primary power supply (PPS)
status display.
Description: The PPS has detected an error condition.
Action: Go to “MAP 2350: Isolating PPS Status Indicator
Codes” on page 80.
Primary power supply (PPS)
indicators
Description: There are five PPS indicators, which operate as
listed here:
1. UEPO PWR/STBY indicator is lit when customer line
voltage input is available to the PPS. A code is displayed in
the PPS status display.
2. UEPO Loop CMPLT indicator is lit when customer line
voltage input is available to the PPS and the UEPO Switch
is in the normal position. A code is displayed in the PPS
status display.
3. PPS Good indicator slow blinks in standby mode when the
2105 Model Exx/Fxx is off. The indicator is on when the
2105 Model E10/E20 is powered on.
4. PPS Fault indicator slow blinks when a fault has been
detected. A code is displayed in the PPS status display.
5. On Batt indicator is only lit when customer power to both
line cords has been lost. The 2105 Model Exx/Fxx will
complete writing the customer data in cache to DDMs and
will then power off within 5 minutes.
Action: Use the other visual symptoms in this table to correct any
problems.
Primary power supply (PPS)
status display and indicators
are off. The other PPS has
a status display code of 06.
Description: The PPS has no customer line cord power and the
PPS to PPS communication is failing.
Action: Ensure the communication cable is connected to PPS
connector J3 at both ends. If it is, replace it. The status code
06 will automatically reset when communication is again
successful. See “MAP 2340: PPS Status Code 06” on page 77.
Only one electronics cage
power supply has one or
more indicator LEDs off (front
or rear).
Description: The front indicator LEDs (HA1, SNMP, HA2) show
the state of the three separate outputs. HA1 is for the left host
bay in the electronics cage. SNMP is for the cluster bay. HA2
is for the right host bay in the electronics cage. The outputs are
individually controlled by the functional code in both clusters.
The rear LEDs show the state of the input power from each
PPS.
Action: Any two of the three storage cage power supplies will
supply all needed power. A failing power supply will create a
problem log. Use the service terminal to display and repair the
problem log.
All three electronics cage
power supplies have their
HA1 or SNMP or HA2
indicator LEDs off. (A host
bay or cluster bay is powered
off.)
Description: The indicator LEDs show the state of the three
outputs. HA1 is for the left host bay in the electronics bay.
SNMP is for the cluster bay. HA2 is for the right host bay in the
electronics bay. Each output is controlled by the RPC cards
and cluster microcode. It is normal for the indicator LEDs to be
off when the service terminal repair option switches off the
power for that FRU resource.
Action:
v If the indicator LEDs switched off on their own, use the
service terminal to display and repair any related power
problem logs.
v If there are no related power problem logs, use the service
terminal Repair Menu, Replace a FRU option to simulate
replacing a FRU in the host bay or cluster bay with power
off. The option will quiesce, power off, power on and then
resume the resource. This will protect the customer from an
unexpected outage. This will display any remaining power
problems for this resource.
All three electronics cage
power supplies have one of
the input indicator LEDs off
(in the rear).
Description: The electronics cage power supplies each have
two inputs, one from each primary power supply (PPS). Either
input can supply all power needed. This allows one PPS to fail
or be powered off concurrently.
Action:
v Most likely one of the two PPS is not supplying power.
Ensure that the power cable from the PPS to the three
electronics cage power supplies is properly plugged at both
ends.
v Use the service terminal to display and repair any related
power problem logs.
v Observe each PPS status display. The failing PPS should
display a code. Go to “MAP 2350: Isolating PPS Status
Indicator Codes” on page 80.
v If the PPS is not failing, replace either the power cable from
the PPS to the three electronics cage power supplies or the
power supply itself.
RPC card indicator is off.
Description: The RPC card indicator is lit when the primary
power supply (PPS) that supplies power to it has customer line
cord power.
Action: Use the PPS status display code to begin the repair.
Primary power supply (PPS)
input circuit breaker is
tripped.
Description: An over-current condition in the PPS has occurred.
Action: Do not reset the input circuit breaker. Replace the PPS.
Primary power supply (PPS)
output circuit breaker is
tripped.
Description: An over-current condition outside the PPS has
occurred. A PPS status code 13 should be displayed.
Action: Go to “MAP 2350: Isolating PPS Status Indicator
Codes” on page 80.
Storage bay power supply
indicators:
v PWR, J1 and J2 indicators
are not both on (green)
Description during normal operation:
v PWR, J1 and J2 indicators are both on (green)
These two indicators monitor the DC input voltage to the power
supply. They are green when the DC input voltage from the
primary power supply is present.
Action: Go to “MAP 3387: Isolating a Storage Cage Power
Supply Failure” on page 242.
Storage bay power supply
indicators:
v CHK/POWER GOOD
indicator is not on (green),
or
v CHK/POWER GOOD
indicator is on (amber)
Description during normal operation:
v CHK/POWER GOOD indicator is on (green)
This indicator is green with normal power on.
v If it is off, the power supply is not operating at all.
v If it is on amber, the power supply has detected a fault and
has partly or completely powered off.
Action: Go to “MAP 3387: Isolating a Storage Cage Power
Supply Failure” on page 242.
Storage bay FAN/POWER
SUPPLY CHECK summary
indicator:
v CHECK indicator is on
(amber)
Description during normal operation:
v CHECK indicator is normally off.
This indicator is off during normal operations. If it is on, the
fan/power sense card has detected a storage bay or power
supply failure.
Action: Go to “MAP 3379: Analyzing a Storage Cage
Fan/Power Sense Card Check Summary Indicator On” on
page 233.
Storage bay FAN POWER
SENSE CARD CHECK
indicator:
v CARD CHECK indicator is
on (amber)
Description during normal operation:
v CARD CHECK indicator is normally off.
This indicator is off during normal operations. If it is on, the
fan/power sense card is failing.
Action: Go to “MAP 3378: Isolating a Storage Cage Fan/Power
Sense Card Error” on page 233.
A cooling fan is not turning:
Description during normal operation:
v All cooling fans should be turning.
A problem log should have been created for this.
Action: Use the service terminal Repair Menu, Show / Repair
Problems Needing Repair option to repair the fan.
Note: If no problem log exists, there are two problems: the fan
and the fan detection circuitry. Call the next level of support
before replacing the fan.
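The three electronics cage power supply rows in Table 10 differ only in how many supplies show an LED off and whether the LED is an output or an input indicator. The following sketch shows one way to tell the rows apart; it is an illustration and not a diagnostic tool. The output LED names HA1, SNMP, and HA2 come from the table. The input LED names `IN1` and `IN2`, the function name, and the data layout are hypothetical.

```python
OUTPUT_LEDS = {"HA1", "SNMP", "HA2"}  # front LEDs, named in Table 10
INPUT_LEDS = {"IN1", "IN2"}           # rear LEDs; these two names are hypothetical


def classify_cage_supplies(supplies):
    """Classify the LED pattern of the three electronics cage power supplies.

    supplies: list of three dicts mapping LED name -> True if the LED is lit.
    """
    if len(supplies) != 3:
        raise ValueError("expected exactly three power supplies")
    # For each supply, collect the names of the LEDs that are off.
    off = [{name for name, lit in s.items() if not lit} for s in supplies]
    affected = [leds for leds in off if leds]
    if not affected:
        return "All indicator LEDs on"
    if len(affected) == 1:
        return "One supply affected: display and repair the problem log"
    off_on_all = set.intersection(*off)  # LEDs that are off on every supply
    if off_on_all & INPUT_LEDS:
        return "Same input LED off on all three: check the PPS and its power cable"
    if off_on_all & OUTPUT_LEDS:
        return "Same output LED off on all three: a host bay or cluster bay is powered off"
    return "Pattern not listed in Table 10"
```

Patterns that the table does not describe, such as two of the three supplies showing an LED off, are reported as not listed rather than guessed at.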
Table 11. Model 020 Drawer, and DDMs Visual Symptom Table
Visual Symptom
Description and Action
At the front of the SSA DASD drawer, Model 020 drawer:
v Green SSA DASD drawer power indicator is off, or
v Amber SSA DASD drawer check indicator is on or blinking.
For indicator locations, see “SSA DASD Model 020 Drawer Indicators and
Power Switch” on page 9.
Description during normal operation:
v Green drawer power indicator is on, and
v Amber drawer check indicator is off.
Action: If not as described above, go to “MAP 3151: Isolating
an SSA DASD Drawer Visual Power Problem” on page 192.
At the front of the SSA DASD drawer, Model 020 drawer:
v Any green power indicator is off.
Note: Indicators may be obscured by internal cabling.
For indicator locations, see “SSA DASD Model 020 Drawer Indicators and
Power Switch” on page 9.
Description during normal operation:
v All green power card indicators are on.
Action: If not as described above, go to “MAP 3354: Isolating
an SSA DASD Drawer Multiple DDM Redundant Visual
Power Fault” on page 223.
At the rear of the SSA DASD drawer, Model 020 drawer:
v Green power indicator is off, or
v Amber fan-and-power CHK (check) indicator is on or blinking.
For indicator locations, see “SSA DASD Model 020 Drawer Indicators and
Power Switch” on page 9.
Description during normal operation:
v Green power indicator is on, and
v Amber check indicator is off.
Action: If not as described above, go to “MAP 3151: Isolating
an SSA DASD Drawer Visual Power Problem” on page 192.
At the rear of the SSA DASD drawer, Model 020 drawer:
Lights on the bypass card:
v Link status (ready) indicators (green)
v Mode indicator (amber/green)
For indicator locations, see “SSA DASD Model 020 Drawer Indicators and
Power Switch” on page 9.
Description during normal operation:
v If there are two SSA cables connected adjacent to the indicators,
– Link status (ready) indicators are always on (green)
– Mode indicator (amber) is always on.
v If there are no SSA cables connected adjacent to the indicators,
– Link status (ready) indicators are off
– Mode indicator (amber) is on.
Action: If not as described above, go to “MAP 3520: SSA DASD
Drawer Verification for Possible Problems” on page 280.
Lights on disk drive modules, Model 020 drawer:
v Green DDM power indicator is off, or
v Green DDM ready indicator is off, or
v Amber DDM check indicator is on.
For indicator locations, see “SSA DASD Model 020 Drawer Disk Drive Module
Indicators” on page 14.
Description during normal operation:
v Green DDM power indicator is on, and
v Green DDM ready indicator is on, and
v Amber DDM check indicator is off.
Action: If not as described above, go to “MAP 3520: SSA DASD
Drawer Verification for Possible Problems” on page 280.
At the rear of the SSA DASD drawer, Model 020 drawer:
Lights on the bypass card:
v Green link status indicators
For indicator locations, see “SSA DASD Model 020 Drawer Indicators and
Power Switch” on page 9.
Description during normal operation:
v If there are two SSA cables connected adjacent to an indicator,
– Link status indicators are always on (green)
– Mode indicator (amber) is always on.
v If there are no SSA cables connected adjacent to an indicator,
– Link status (ready) indicators are off
Action: If not as described above, go to “MAP 3520: SSA DASD
Drawer Verification for Possible Problems” on page 280.
Table 12. Model 040 drawer, and DDMs Visual Symptom Table
Visual Symptom
Description and Action
At the front of the SSA DASD drawer, Model 040 drawer:
v Controller check indicator (amber) is on, or
v Fan check indicator (amber) is on, or
v Fan power indicator (green) is off.
For indicator locations, see “SSA DASD Model 040 Drawer Indicators and
Switches” on page 10.
Description during normal operation:
v Controller check indicator (amber) is off, and
v Fan check indicator (amber) is off, and
v Fan power indicator (green) is on.
Action: If not as described above, go to “MAP 3390: Isolating SSA
DASD Drawer Visual Power Problems, Model 040 Drawer” on
page 247.
At the rear of the SSA DASD drawer, Model 040 drawer:
v Power supply CHK/PWR Good indicator is on (amber), or
v Power supply CHK/PWR Good indicator is off, or
v Power supply PWR indicator (green) is off.
For indicator locations, see “SSA DASD Model 040 Drawer Indicators and
Switches” on page 10.
Description during normal operation:
v Power supply CHK/PWR Good indicator is on (green), and
v Power supply PWR indicator is on (green).
Action: If not as described above, go to “MAP 3390: Isolating SSA
DASD Drawer Visual Power Problems, Model 040 Drawer” on
page 247.
Lights on disk drive modules, Model 040 drawer:
v Green DDM ready indicator is off, or
v Amber DDM check indicator is on.
For indicator locations, see “SSA DASD Model 040 Drawer and DDM Bay Disk
Drive Module Indicators” on page 15.
Description during normal operation:
v Green DDM ready indicator is on, and
v Amber DDM check indicator is off.
Action: If not as described above, go to “MAP 3520: SSA DASD
Drawer Verification for Possible Problems” on page 280.
Table 13. DDM Bay, and DDMs Visual Symptom Table
Visual Symptom
Description and Action
Lights on disk drive modules, DDM bay:
v Green DDM ready indicator is off, or
v Amber DDM check indicator is on.
For indicator locations, see “SSA DASD Model 040 Drawer and DDM Bay Disk
Drive Module Indicators” on page 15.
Description during normal operation:
v Green DDM ready indicator is on, and
v Amber DDM check indicator is off.
Action: Look at all of the above indicators on all of the DDMs in
the DDM bay.
v If all of the indicators on all of the DDMs in the DDM bay are
off, go to “MAP 3395: Isolating an SSA DASD DDM Bay
Power Problem” on page 259.
v If the DDM indicators are not as described above, go to
“MAP 3520: SSA DASD Drawer Verification for Possible Problems”
on page 280.
Controller card DDM Check
indicator, DDM bay:
v Check indicator is on
(amber)
Description during normal operation:
v Check indicator is normally off.
This indicator is off during normal operations. If it is on, the
DDM bay controller card has detected a failure in the DDM
bay.
Action: Go to “MAP 3520: SSA DASD Drawer Verification for
Possible Problems” on page 280.
Controller Card CHECK
indicator, DDM bay:
v Card Check indicator is on
(amber)
Description during normal operation:
v Card Check indicator is normally off.
This indicator is off during normal operations. If it is on, the
DDM bay controller card is failing.
Action: Go to “MAP 3397: Isolating an SSA DASD DDM Bay
Controller Card Problem” on page 261.
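The DDM indicator check in Table 13 reduces to two tests: are all indicators in the bay off, and does any DDM differ from the normal pattern (ready on, check off). The following Python sketch restates that rule for illustration. The function name and the tuple layout are hypothetical; the MAP names are taken from the table.

```python
def ddm_bay_action(ddms):
    """Pick the next MAP from the DDM indicators in one DDM bay.

    ddms: list of (ready_on, check_on) pairs, one pair per DDM in the bay.
    Returns None when every DDM shows the normal pattern.
    """
    if not ddms:
        raise ValueError("no DDM indicators supplied")
    # All indicators on all DDMs are off: treat it as a bay power problem.
    if all(not ready and not check for ready, check in ddms):
        return "Go to MAP 3395: Isolating an SSA DASD DDM Bay Power Problem"
    # Any DDM with ready off or check on is not as described for normal operation.
    if any(not ready or check for ready, check in ddms):
        return "Go to MAP 3520: SSA DASD Drawer Verification for Possible Problems"
    return None
```

The all-off test comes first because it is the more specific condition; a bay with every indicator off would otherwise also match the second test.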
MAP 1460: Isolating E-Mail Reported Errors
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A problem record was created by one of the 2105 clusters. It was stored in the
problem log and an e-mail copy of it was sent to the e-mail destination(s). The 2105
operator panel Message indicator for the reporting cluster should be on steady (not
blinking). The customer may have given you a copy of the e-mail or may just have
told you that an e-mail occurred. The service terminal will be used to display and
then repair the problem log.
Procedure
Use the following to begin the problem repair. If you have a copy of the e-mail
problem record and this Service Guide, you may be able to plan the service action
prior to arriving at the 2105 Model E10/E20. The problem record displays the FRUs
and/or isolation procedures used to determine the FRUs.
Go to “MAP 1210: Displaying and Repairing a Problem Record” on page 53.
MAP 1480: Replacing a FRU, Without Using a Problem Log
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Occasionally you may need to replace a FRU that is not failing and has not
generated a problem log. The following procedure uses the service terminal
functions to replace a FRU for which no problem has been logged.
Procedure
1. Select a FRU for replacement. From the service terminal Main Service Menu,
select:
Repair Menu
Replace a FRU
Cluster Bay FRUs
Host Bay FRUs
DDM Bay or 7133 Drawer FRUs
Rack Power Cooling FRUs
Device Power Cooling FRUs
Electronics Cage Power Cooling FRUs
Select the FRU area and press enter. Select the FRU in the FRU area
and press enter.
2. Follow the service terminal instructions.
MAP 1500: Ending a Service Action
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Before leaving the customer account the following actions are needed:
v Ensure that the problem just repaired had its problem log closed. If not, use the
menu option to close it.
Note: Closing or cancelling a problem log will attempt to return to customer use
any fenced or quiesced resources.
v Ensure that any resources associated with the repair have been returned to
customer use.
v Ensure that any other resources not available for customer use are associated
with problem log(s) still needing repair. Plan to repair those problems.
Procedure
1. If the service terminal repair process did not automatically close the problem
log, then use this step to close it now.
Press F3 on the service terminal until the Main Service Menu is displayed, then
select:
Repair Menu
Close a Previously Repaired Problem.
Note: Closing or cancelling a problem log will attempt to return to customer use
any fenced or quiesced resources. If the problem was not fully repaired,
the existing problem log may be updated or a new problem log created.
2. Use the service terminal options listed below to ensure all resources for this
repair have been returned to customer use (they will not be listed). Any listed
resources are not available for customer use and will still be ’quiesced’ or
’fenced’. Those resources should have a related problem log listed that still
needs repair. If resources are listed and there are no problem logs listed, call
the next level of support.
Press F3 on the service terminal until the Main Service Menu is displayed, then
select:
Repair Menu
End of Call Status
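The end-of-call check in step 2 reduces to a three-way decision. The following Python sketch is illustrative only. The function name and its arguments are hypothetical and do not correspond to any service terminal interface; the lists are assumed to be transcribed by hand from the End of Call Status display.

```python
def end_of_call_action(unavailable_resources, open_problem_logs):
    """Return the follow-up suggested by step 2 of MAP 1500.

    Both arguments are plain lists (a hypothetical representation of
    what the End of Call Status display shows).
    """
    if not unavailable_resources:
        # Nothing is still fenced or quiesced.
        return "All resources returned to customer use"
    if open_problem_logs:
        # Listed resources should be tied to problem logs still needing repair.
        return "Plan to repair the remaining problem logs"
    # Resources are listed but no problem log explains them.
    return "Call the next level of support"
```

For example, a listed resource with no problem log listed yields "Call the next level of support", which matches the last sentence of step 2.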
MAP 1600: ESSNet Console Problem
Description
The ESSNet Console platform has a software or hardware problem.
ESSNet Console Repair Process
The ESSNet Console is an off-the-shelf personal computer (PC) that has been
converted into an ESSNet Console.
The repair process for the ESSNet Console is:
1. Repair the personal computer
2. Restore the personal computer’s software
3. Convert the personal computer to an ESSNet Console
Repairing the Personal Computer: Since the ESSNet Console is an off-the-shelf
personal computer, it should be repaired by a person who is trained on repairing
PCs. Several levels of repair assistance are available. The following list gives the
preferred order of service:
1. A person trained on repairing personal computers.
2. The IBM Technical Support Line at 1-800-IBM-2472.
3. The IBM Personal Systems Help Center at 1-800-772-2227.
4. The personal computer Hardware Maintenance Manuals on the Service
Document CD-ROM (SK2T-8771) shipped with each 2105 Model Exx/Fxx.
5. IBM Personal Computing Support on the Internet at
http://www.ibm.com/pc/support.
6. In emergency situations, the IBM 2105 Field Support Center can authorize
shipment of a replacement ESSNet Console.
Restoring the Personal Computer’s Software: This step is only required if the
PC’s hard drive has been replaced or its software has become damaged.
v IBM PC 300s: The personal computer’s software is restored to its off-the-shelf
state using the IBM PC 300’s Hardware Rebuild procedure. This procedure is in
the About Your Software Windows NT Workstation 4.0, Applications and Support
Software pamphlet. This pamphlet is shipped with the IBM PC 300, and uses the
IBM PC 300’s Product Recovery CD-ROM to restore the hard drive to its
off-the-shelf state.
v IBM NetVista PCs: The personal computer’s software can be restored to its
off-the-shelf state using procedure ″Converting Windows NT 2000 to Windows
NT 4.0, NetVista″ in chapter 5 of the Enterprise Storage Server Service Guide,
Volume 2.
Converting the Personal Computer to an ESSNet Console: This step is only
required if the ESSNet PC’s software was restored to its off-the-shelf state or a
replacement ESSNet PC is being installed. If either of the above was used, the
ESSNet Console software must also be reinstalled. Reinstall the Console using the
ESSNet Installation Diskette and ″Installing and Connecting the ESSNet Console to
the ESSNet Hub″ in chapter 5 of the Enterprise Storage Server Service Guide,
Volume 2.
Isolation
Note: If you are not trained on repairing personal computers, have the ESSNet PC
repaired by a qualified technician.
1. A problem with the ESSNet Console is occurring. Find the description that
applies:
v Hardware problem other than hard drive, go to step 2.
v Hardware problem with hard drive, go to step 3.
v Software problem with the Windows NT operating system, go to step 4.
v Software problem with the ESSNet Console application software, go to step
5.
2. Any hardware problem with the keyboard, mouse, display or server platform
should be repaired using the repair procedures in “Repairing the Personal
Computer” on page 69.
After the repair is complete, ensure that the ESS Specialist application functions
properly.
If the application does not function properly, continue to the next step as
though the hard drive had been replaced.
3. The ESSNet Console’s hard drive is being replaced. Use the standard repair
procedures to replace the hard drive.
After the hard drive is replaced, perform the procedures in “Restoring the
Personal Computer’s Software” on page 69 and “Converting the Personal
Computer to an ESSNet Console” on page 69.
4. The Windows NT operating system needs to be reloaded. The hard drive has
been replaced or there was an operating system problem that could not be
recovered.
Perform the procedures in “Restoring the Personal Computer’s Software” on
page 69.
After Windows NT is fully installed, continue with the next step.
5. The ESSNet Console application software must be installed and configured.
Perform the procedures in “Converting the Personal Computer to an ESSNet
Console” on page 69.
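The isolation steps above amount to a small routing table from problem type to the procedures that must be performed. The Python sketch below is only an illustration of that routing. The category keys and the function `procedures_for` are hypothetical names and are not part of any ESS or ESSNet software.

```python
# Hypothetical sketch of the MAP 1600 isolation flow (steps 1 through 5).
REPAIR = "Repairing the Personal Computer"
REPLACE_DRIVE = "Replace the hard drive (standard repair procedures)"
RESTORE = "Restoring the Personal Computer's Software"
CONVERT = "Converting the Personal Computer to an ESSNet Console"

# Problem type (illustrative key) -> procedures to perform, in order.
PROCEDURES = {
    "hardware_other": [REPAIR],                          # step 2
    "hardware_hard_drive": [REPLACE_DRIVE, RESTORE, CONVERT],  # step 3
    "software_windows_nt": [RESTORE, CONVERT],           # step 4, then step 5
    "software_essnet_application": [CONVERT],            # step 5
}


def procedures_for(problem_type, specialist_ok_after_repair=True):
    """List the procedures for a problem type.

    For a non-drive hardware repair, step 2 says to continue as though
    the hard drive had been replaced if the ESS Specialist application
    does not function properly afterwards.
    """
    steps = list(PROCEDURES[problem_type])
    if problem_type == "hardware_other" and not specialist_ok_after_repair:
        steps += [RESTORE, CONVERT]
    return steps
```

Calling `procedures_for("hardware_other", specialist_ok_after_repair=False)` returns the repair procedure followed by the restore and convert procedures.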
MAPs 2XXX: Power and Cooling Isolation Procedures
Procedures in the MAP 2XXX group of the Isolate chapter cover the power and
cooling areas of the 2105 Model 100 units.
MAP 2000: Model 100 Power Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Model 100 power problems are repaired using the Model 100 service guide.
Isolation
Go to ″Entry MAP for All Service Actions″ in chapter 2 of the 2105 Model 100
Attachment to ESS Server Service Guide. Use the Power entry under the
ANALYZE and REPAIR a SERVICE REQUEST section of the table.
MAP 2020: Isolating Power Symptoms
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Most power symptoms create a related problem log which should be used to start
the problem repair. If a related problem log was not created, the table below can be
used to start the repair.
Isolation
Use the table below to find and repair your power symptom:
Table 14. 2105 Model Exx/Fxx Power Symptom Table
Power Symptom
Description and Action
Visual power symptoms.
Description: A problem record is created for most power
problems.
Action: Use the service terminal Repair Menu, Show / Repair
Problems Needing Repair option to begin the repair. If no
related problem records are found, go to “MAP 1320:
Isolating Problems Using Visual Symptoms” on page 58.
2105 Model Exx/Fxx will not
power on in local mode.
Description: If the RPC card switches are set for local mode,
the 2105 Model Exx/Fxx Local power switch should be able
to power it on.
Action: Go to “MAP 2400: 2105 Model Exx/Fxx Local Power
On Problems” on page 91.
2105 Model Exx/Fxx will not
power off in local mode.
Description: If the RPC card switches are set for local mode,
the 2105 Model Exx/Fxx Local power switch should be able
to power it off.
If a pinned data condition exists, a problem record will have
been created and the 2105 Model Exx/Fxx will not power off
until that condition is repaired.
Action: Go to “MAP 2440: Isolating 2105 Model Exx/Fxx
Power Off Problems” on page 99.
2105 Model Exx/Fxx will not
power on in remote mode.
Description: If the RPC card switches are set for remote
mode, a 2105 Model Exx/Fxx remote system should be able
to power it on.
Action: Go to “MAP 2390: Remote Power On Not Working”
on page 88.
2105 Model Exx/Fxx will not
power off in remote mode.
Description: If the RPC card switches are set for remote
mode, a 2105 Model Exx/Fxx remote system should be able
to power it off.
If a pinned data condition exists, a problem record will have
been created and the 2105 Model Exx/Fxx will not power off
until that condition is repaired.
Action: Go to “MAP 2390: Remote Power On Not Working”
on page 88.
2105 Model Exx/Fxx will not
power on or off in automatic
mode.
Description: If the RPC card switches are set for automatic
mode, the 2105 Model Exx/Fxx should power on the first
time line cord power returns after both line cords lost power.
Action: Go to “MAP 2370: Automatic Power On Problem” on
page 84.
2105 Model Exx/Fxx UEPO
problems.
Description: The UEPO switch on the operator panel should
prevent the 2105 Model Exx/Fxx power on when in the off
position and should allow the 2105 Model Exx/Fxx power on
when in the on position.
Action: Go to “MAP 2360: 2105 Model Exx/Fxx UEPO
Problems” on page 82.
2105 Model 100 will not power
on.
Description: The power on control for the 2105 Model 100
comes from the 2105 Model Exx/Fxx RPC cards.
Action: Go to “MAP 2420: 2105 Expansion Enclosure Power
On Problem” on page 96.
2105 Model 100 UEPO
problems.
Description: The UEPO switch on the operator panel should
prevent the 2105 Model 100 from powering on when in the
off position. It should allow the 2105 Model 100 to power on
when in the on position.
Action: Go to “MAP 2380: Isolating 2105 Expansion
Enclosure UEPO Problems” on page 86.
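Table 14 can be read as a lookup from a power symptom to the MAP that handles it. The sketch below restates the table rows as a Python dictionary. The key layout (unit, mode, action) and the function name `map_for_symptom` are assumptions made for illustration; only the MAP numbers come from the table. The "Visual power symptoms" row is not included because its action starts from the service terminal Repair Menu rather than from a MAP.

```python
# (unit, power mode or "uepo", action) -> MAP named in Table 14.
# The tuple layout is a hypothetical encoding of the table rows.
POWER_SYMPTOM_MAPS = {
    ("Exx/Fxx", "local", "on"): "MAP 2400",
    ("Exx/Fxx", "local", "off"): "MAP 2440",
    ("Exx/Fxx", "remote", "on"): "MAP 2390",
    ("Exx/Fxx", "remote", "off"): "MAP 2390",
    ("Exx/Fxx", "automatic", "on"): "MAP 2370",
    ("Exx/Fxx", "automatic", "off"): "MAP 2370",
    ("Exx/Fxx", "uepo", None): "MAP 2360",
    ("Model 100", None, "on"): "MAP 2420",
    ("Model 100", "uepo", None): "MAP 2380",
}


def map_for_symptom(unit, mode=None, action=None):
    """Return the MAP for a symptom, or None if the table has no row for it."""
    return POWER_SYMPTOM_MAPS.get((unit, mode, action))
```

A return value of None means the symptom is not in the table; in that case the table itself gives no direction.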
MAP 20A0: Cluster Not Ready
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The 2105 Model Exx/Fxx is powered on. The cluster should be powered on and the
2105 Model Exx/Fxx operator panel cluster Ready indicator should be on.
Isolation
1. Ensure the 2105 Model Exx/Fxx is powered on.
2. Observe the 2105 Model Exx/Fxx operator panel cluster Ready indicator.
Is the cluster Ready indicator on?
v Yes, the cluster is not failing. Go to “MAP 1500: Ending a Service Action” on
page 68.
v No, continue with the next step.
3. Connect the service terminal to the failing cluster and attempt to login.
Did the service terminal login and display the main menu?
v Yes, continue with the next step.
v No, go to step 5 on page 73.
4. The cluster Ready indicator LED will be off.
The cluster was fenced due to a problem. A problem log will have been created
as part of the fencing. Connect the service terminal to the working cluster and
use the Main Menu, Repair Menu, Display and Repair Problems Needing Repair
option.
Is there a related problem log?
v Yes, exit this MAP and repair the problem.
v No, there is no single point of hardware failure that should cause this. Each
cluster Ready indicator is really two LEDs behind the green lens. One is
driven from one RPC card, and the other is driven from the other RPC card.
Each RPC card receives the same cluster software command for the LED.
Call the next level of support.
5. Connect the service terminal to the working cluster and login. Use the Main
Menu, Repair Menu, Display and Repair Problems Needing Repair option to test
the cluster to cluster communication through the ethernet cable. The problem
log status for the login cluster will be displayed. The problem log status for the
failing cluster will also be displayed. It is either good status or a message that
the cluster is not responding.
Did the failing cluster successfully give the login cluster its problem log status?
v Yes, continue with the next step.
v No, go to step 7.
6. The failing cluster is able to communicate with the other cluster, but it is not
accepting logins and the Ready indicator is off. Call the next level of support.
7. Press the eject button on the CD-ROM drive in the failing cluster.
Does the CD tray open?
v Yes, the cluster is powered on.
– If the cluster operator panel is hung with a progress code, go to “MAP
4360: Isolation Using Codes Displayed by the Cluster Operator Panel” on
page 342.
– If the cluster operator panel is displaying progress codes, wait up to 20
minutes for the cluster to come Ready.
– If the cluster operator panel is blank, connect the service terminal to the
failing cluster and attempt to login and display problems needing repair. If
the login is successful and there are no related problems, the cluster is
ready, but the 2105 Model Exx/Fxx operator panel cluster ready indicator
is not working.
– To test the operator panel and cluster ready indicator, the cluster will need
to be quiesced, powered off, and powered on. Connect the service
terminal to the working cluster and use the Repair Menu, Alternate Cluster
Repair Menu options.
– If the operator panel does not display progress codes, ensure the operator
panel to I/O planar cable is connected at both ends. Then replace the
operator panel (EEPROM must be moved to new panel), I/O planar and
cable in that order until the progress codes are displayed. Go to “MAP
4700: Replacing Cluster FRUs” on page 375.
If progress codes are displayed, wait for the cluster bay to come ready,
then resume the cluster bay. Then go to “MAP 1500: Ending a Service
Action” on page 68.
v No, do the following steps:
a. Ensure both RPC to Electronics Cage Cables are connected.
b. Replace the following FRUs in the listed order: SP Card (2105 Model
E10/E20 only), I/O Planar, Electronics Cage Sense Card, RPC1 Card,
RPC2 Card.
Connect the service terminal to the working cluster and use the Repair
Menu, Replace a FRU options. After the repair is complete go to “MAP
1500: Ending a Service Action” on page 68.
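The top-level branches of this MAP (steps 2, 3, 4, 5, and 7) can be summarized as a decision function. This is a rough Python sketch under the assumption that each observation is recorded as a boolean. The function and parameter names are hypothetical, and the returned strings only paraphrase the MAP; the sub-bullets of step 7 are not modeled.

```python
def cluster_not_ready(*, ready_indicator_on,
                      failing_cluster_login_ok=False,
                      related_problem_log=False,
                      failing_cluster_reports_status=False,
                      cd_tray_opens=False):
    """Paraphrase the outcome of MAP 20A0 for a set of observations."""
    if ready_indicator_on:                      # step 2
        return "Go to MAP 1500"
    if failing_cluster_login_ok:                # step 3 -> step 4
        if related_problem_log:
            return "Exit this MAP and repair the problem"
        return "Call the next level of support"
    if failing_cluster_reports_status:          # step 5 -> step 6
        return "Call the next level of support"
    if cd_tray_opens:                           # step 7, Yes branch
        return "Cluster is powered on: check the cluster operator panel"
    # Step 7, No branch.
    return "Check RPC to electronics cage cables, then replace FRUs in order"
```

Keyword-only arguments are used so that a call site reads like the questions in the MAP rather than a row of unlabeled booleans.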
MAP 20B0: Cluster Did Not Power On, OK Displayed
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
The 2105 Model Exx/Fxx is powered on. One cluster will not power on properly and
displays OK.
Isolation
1. Ensure the cluster bay is fully seated and the two thumbscrews are screwed
in.
Attention: Do not pull the cluster bay out to verify that it is seated. Verify the
cluster bay is fully seated by pushing it in, then tightening the two
thumbscrews.
2. Ensure both RPC to electronics cage fan sense card cables are fully seated
(at back of electronics cage between four fans).
3. Observe the indicators on the front of the three electronics cage power
supplies.
Are the cluster bay indicators (center LEDs) for all three power supplies on?
v Yes, continue at the next step.
v No, press the 2105 Model Exx/Fxx operator panel Local power switch
momentarily to on (|, up):
– If the center indicators for all three power supplies are now on, continue
with the next step.
– If one or more center indicators are still not on go to “MAP 2210:
Electronics Cage Power Supply Problem” on page 76.
Note: Normally, a cluster bay will power on even if one electronics cage
power supply is not working. For this MAP, it is required to have all
three power supplies working properly before continuing.
4. Show and repair any related power problems or cluster bay problems. If there
are none, go to the next step.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
5. Press the operator panel local power switch momentarily to on (|, up).
Does the cluster power on? (OK is no longer displayed and progress codes
are displayed instead.)
v Yes, the cluster power on is working. Go to “MAP 1500: Ending a Service
Action” on page 68.
v No, go to the next step.
6. Quiesce the cluster and then power it off and then on. Use the Repair Menu,
Alternate Cluster Repair Menu options.
Does the cluster power on? (OK is no longer displayed and progress codes
are displayed instead.)
v Yes, the cluster power on is working. Wait for the cluster to come ready and
then go to “MAP 1500: Ending a Service Action” on page 68.
v No, go to the next step.
7. Read the following description to understand how cluster power control
operates.
Pressing the Local power switch on the 2105 Model Exx/Fxx operator panel
momentarily to on (|, up) sends the power on signal to both RPC cards. Each
RPC card then sends the power on signal to the electronics cage power
supplies. The power supplies provide voltage to the cluster service processor
(SP), which comes ready and displays OK on the cluster operator panel. The
SP signals the RPC cards that it is ready to power on the cluster. The RPC
card(s) then respond with a signal to the electronics cage power supplies to
switch on power to the cluster bay. The cluster logic powers on and
displays four digit progress codes on the cluster operator panel. Both RPC
cards 0 and 1 can control the SP in cluster bay 1 or 2. If one RPC is failing,
the other RPC should be able to power on both clusters. The exception is that a
stuck fault on one RPC card could hold the cluster power on signal to Off. This
would prevent the cluster bay from powering on.
Go to the next step.
8. Use this step if cluster bay 1 is failing. For cluster bay 2 go to step 10.
Determine if the problem is a stuck fault failure on the shared cluster power on
line from both RPC cards to cluster bay 1.
a. Unplug the J3 cable from RPC card 1.
b. Try to power the cluster on using the service terminal.
Did the cluster bay power on?
v Yes, replace RPC card 1. Use the Repair Menu, Replace FRU Menu
options. Then return here and plug the J3 cable back in. Go to “MAP
1500: Ending a Service Action” on page 68.
v No, plug the J3 cable back in and go to the next step.
9. Use this step if cluster bay 1 is still failing.
a. Unplug the J3 cable from RPC card 2.
b. Try to power the cluster bay on using the service terminal.
Did the cluster bay power on?
v Yes, replace RPC card 2. Use the Repair Menu, FRU Replace Menu
options. Then return here and plug the J3 cable back in. Go to “MAP
1500: Ending a Service Action” on page 68.
v No, plug the J3 cable back in and go to step 12 on page 76.
10. Use this step if cluster bay 2 is failing. Determine if the problem is a stuck fault
failure on the shared cluster power on line from both RPC cards to cluster bay
2.
a. Unplug the J4 cable from RPC card 1.
b. Try to power the cluster bay on using the service terminal.
Did the cluster bay power on?
v Yes, replace RPC card 1. Use the Repair Menu, FRU Replace Menu
options. Then return here and plug the J4 cable back in. Go to “MAP
1500: Ending a Service Action” on page 68.
v No, plug the J4 cable back in and go to the next step.
11. Use this step if cluster bay 2 is still failing.
a. Unplug the J4 cable from RPC card 2.
b. Try to power the cluster bay on using the service terminal.
Did the cluster bay power on?
v Yes, replace RPC card 2. Use the Repair Menu, FRU Replace Menu
options. Then return here and plug the J4 cable back in. Go to “MAP 1500:
Ending a Service Action” on page 68.
v No, plug the J4 cable back in and go to the next step.
12. Connect the service terminal to the cluster bay not being repaired. Use the
alternate cluster repair menu option to power off the cluster bay. The FRUs to
replace are the SP card (2105 Model E10/E20 only) and cluster I/O planar.
(The 2105 Model F10/F20 I/O planar has the SP integrated on it.) Use the
service terminal Repair Menu, Replace FRUs Menu options.
Ensure that none of the pins on both parts of the cluster bay docking connector
at the rear of the cluster bay are bent or broken.
2105 Model E10/E20 only. Once the cluster bay is open with access to the
FRUs, ensure that the SP card is properly seated on the I/O planar. Ensure
that the cable to the SP card is properly seated. Ensure that all the cables to
the cluster power planar are properly seated.
13. If the cluster bay still fails, the remaining FRUs are the cluster power planar
cable, the cluster power planar, and the cluster power planar to docking connector
cable. If the cluster still fails, call the next level of support.
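Steps 8 through 11 apply one pattern twice: for the failing cluster bay, unplug its cable (J3 for cluster bay 1, J4 for cluster bay 2) from one RPC card at a time and retry the power on. The Python sketch below shows that pattern. The function name and the `powers_on_with_unplugged` callback are hypothetical; the callback stands in for the technician's observation and is not a real interface.

```python
# Cable unplugged from each RPC card, by failing cluster bay (steps 8-11).
CABLE_FOR_BAY = {1: "J3", 2: "J4"}


def isolate_stuck_rpc(bay, powers_on_with_unplugged):
    """Return the RPC card (1 or 2) to replace, or None to go to step 12.

    powers_on_with_unplugged(rpc_card, cable) -> bool is a stand-in for
    unplugging the cable from that RPC card, retrying the power on from
    the service terminal, and plugging the cable back in afterwards.
    """
    cable = CABLE_FOR_BAY[bay]
    for rpc_card in (1, 2):
        if powers_on_with_unplugged(rpc_card, cable):
            # The bay powers on once this card is disconnected, so this
            # card was holding the shared power on line.
            return rpc_card
    return None
```

The order matters only in that RPC card 1 is tried before RPC card 2, as in the steps above.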
MAP 2210: Electronics Cage Power Supply Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
One or more electronics cage power supplies are not powering on or off the cluster
or host bay power boundaries. The power request may be from the service terminal
or the 2105 Model E10/E20 operator panel Local power switch.
The electronics cage power supplies only need to receive input power from one of
the two PPS power supplies to function and be able to provide power to the cluster
bay or host bays when requested to. The power input from the second PPS power
supply makes the power system fault tolerant.
Isolation
1. Observe the INPUT PRESENT indicators on the front of all three electronics
cage power supplies. Ensure the input power switch for each electronics cage
power supply is set to on (|, up).
Are both INPUT PRESENT indicators for all three electronics cage power
supplies off?
v Yes, input power is not present from either PPS. Ensure the power input
cables are connected at the PPS. Observe the PPS status code display and
then go to “MAP 2350: Isolating PPS Status Indicator Codes” on page 80.
v No, continue with the next step.
2. Repeat the power on or off procedure that sent you here before. If the
procedure still fails, return here and continue with the next step.
3. Observe the three electronics cage power supplies POWER ON indicators (front
of power supply). Observe the POWER ON indicator for the bay that is failing to
power on or off.
Do all three power supplies have the same POWER ON indicator either on or
off?
v Yes, ensure the failing bay is fully seated. Do not pull it out, only ensure it is
in and the release screws are secured.
v Replace the RPC cards one at a time. Use the Repair Menu, Replace a FRU,
Rack Power Cooling FRUs option.
If it still fails, replace the following FRUs until the problem is repaired. Use
“MAP 4790: Repairing the Electronics Cage” on page 395 to replace the
FRUs.
– Electronics cage sense card, see ″Rack, Electronics Cage Sense Card,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
– Electronics cage power planar, see ″Electronics Cage Power Planars and
Cables, 2105 Model Exx/Fxx″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
– Electronics cage power planar to sense card cable, see ″Cables, 2105
Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
v No, replace the failing power supply. Use the Repair Menu, Replace a FRU,
Rack Power Cooling FRUs option.
If it still fails, replace the electronics cage power backplane. Use “MAP 4790:
Repairing the Electronics Cage” on page 395 to replace the FRU.
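The question in step 3 is whether the three power supplies agree on the POWER ON indicator for the failing bay. A minimal Python sketch of that comparison follows. It assumes, which the MAP does not state explicitly, that when the supplies disagree the one whose indicator differs from the other two is the failing supply. The function name and the supply labels are hypothetical.

```python
def supply_to_suspect(power_on_indicators):
    """Return the supply whose POWER ON indicator disagrees, or None.

    power_on_indicators maps a supply label to True (indicator on) or
    False (indicator off) for the failing bay. Exactly three supplies
    are expected. None means all three agree (the Yes branch of step 3).
    ASSUMPTION: the odd one out is treated as the failing supply.
    """
    if len(power_on_indicators) != 3:
        raise ValueError("expected exactly three power supplies")
    lit = [name for name, on in power_on_indicators.items() if on]
    dark = [name for name, on in power_on_indicators.items() if not on]
    if not lit or not dark:
        return None
    minority = lit if len(lit) < len(dark) else dark
    return minority[0]
```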
MAP 2320: Installed Unit Does Not Match Logical Unit
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A mismatch has been found between the type of unit physically attached to the
2105 Model Exx/Fxx and the type logically defined using the service terminal.
Isolation
1. The 2105 Model physically attached to the 2105 Model Exx/Fxx is different from
the model logically defined using the service terminal. The rack will need to be
logically removed.
v Use the service terminal Install/Remove Menu, Rack Menu, Remove an
Additional Rack option.
2. Use the Install an Additional Rack option to attempt the install again,
selecting the proper rack type. If it fails again, call the next level of support.
MAP 2340: PPS Status Code 06
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The PPS status code of 06 is a communication failure between PPS-1 and PPS-2.
This communication failure can be caused by two different conditions:
1. A hardware communication fault between PPS-1 and PPS-2. Because PPS-1
and PPS-2 communicate in both directions, the failure could be in either PPS or
the communication cable.
2. A mismatch of the PPS identifications. When a PPS is installed in the PPS-2
position, which never has a battery signal cable connection, the PPS
identification status code should be a 92. When a PPS is installed in the PPS-1
position, which always has a battery signal cable connection, the PPS
identification status code should be a 91. If both PPS have the same
identification status code, they will display an 06 status code.
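The identification rule in condition 2 can be sketched as a small check. The expected codes (91 for the PPS-1 position, 92 for the PPS-2 position) come from the description above; the function names and the dictionary layout are hypothetical and only illustrate the rule.

```python
# Expected identification status code for each PPS position.
EXPECTED_ID = {"PPS-1": 91, "PPS-2": 92}


def ids_conflict(pps1_id, pps2_id):
    """True when both PPS report the same identification code,
    the condition under which an 06 status code is displayed."""
    return pps1_id == pps2_id


def positions_with_wrong_id(observed):
    """Return the positions whose observed code differs from the expected one.

    observed maps "PPS-1" / "PPS-2" to the yy value read from the display.
    """
    return [pos for pos, expected in EXPECTED_ID.items()
            if observed.get(pos) != expected]
```

For instance, if both positions report 92, `ids_conflict` is true and `positions_with_wrong_id` returns only "PPS-1", the position that should have seen the battery signal cable.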
Isolation
1. Verify that the PPS to PPS Cable is properly plugged into the J3 connector on
PPS-1 and PPS-2.
Is the cable connected correctly?
v Yes, continue with the next step.
v No, connect the cable and then press the 2105 Model Exx/Fxx operator
panel Local power on switch momentarily to on (up). If the status code 06 is
no longer displayed, go to “MAP 1500: Ending a Service Action” on page 68.
If the status code 06 is still displayed, continue with the next step.
2. Ensure both PPS have the same code level.
Display the code level. Press the 2105 Model Exx/Fxx operator panel Local
power on switch momentarily to on (up). Observe the status code display on
each PPS. A sequence of 00, then xx (the code level number, 30-89), and then
yy (either 91 or 92) is displayed. Are the code levels the same?
v Yes, continue with the next step.
v No, call the next level of support to determine the proper code level.
Replace the PPS with the improper code level. Use the service
terminal Repair Menu, Replace a FRU, Rack Power Cooling FRUs options for
the Primary Power Supply. After the PPS is replaced, ensure both PPS have
the same code level. See the description for status code 00-xx-yy.
Note: Ensure both PPS in the rack are the same type. The new type has an
additional connector J5C that is not present on the old type. (The
exception to this is while upgrading a rack from the old to new type of
PPS concurrently.) For further information see ″Primary Power Supply,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
3. Determine if the communication failure is caused by a PPS identification
mismatch.
Display the PPS-2 identification. Press the 2105 Model Exx/Fxx operator panel
Local power on switch momentarily to on (up). Observe the PPS-2 status code
display. A sequence of 00, then xx (any number between 30-89), and then yy is
repeated for about 10 seconds. Does yy = 92?
v Yes, continue with the next step.
v No, replace PPS-2. Use the service terminal Repair Menu, Replace a FRU,
Rack Power Cooling FRUs options for the Primary Power Supply. After the
PPS is replaced, ensure both PPS have the same code level. See the
description for status code 00-xx-yy.
4. Display the PPS-1 identification. Press the 2105 Model Exx/Fxx operator panel
Local power on switch momentarily to on (up). Observe the PPS-1 status code
display. A sequence of 00, then xx (any number between 30-89), and then yy is
repeated for about 10 seconds. Does yy = 91?
v Yes, go to step 9.
v No, continue with the next step.
5. Verify that the PPS-1 to Battery Signal cable is properly plugged into the PPS-1
J5B connector and the Battery J1B connector.
Is the cable connected correctly?
v Yes, continue with the next step.
v No, connect the cable correctly, then return to the top of this MAP.
6. Switch the battery circuit breaker to the off position (down).
7. Unplug both ends of the PPS-1 to Battery Signal Cable (PPS-1 J5B and Battery
J1B). Use a meter to measure continuity of each of the four wires in the cable.
Does the continuity indicate any wire as an open circuit?
v Yes, replace the PPS-1 to Battery Signal Cable, switch the Battery circuit
breaker to the on position (up) and then return to the top of this MAP.
v No, continue with the next step.
8. Measure the continuity between the upper two pins of the 390V Battery J1B
connector. Measure the continuity between the lower two pins of the 390V
Battery J1B connector.
Do both pairs of pins indicate a closed circuit?
v Yes, replace PPS-1. Use the service terminal Repair Menu, Replace a FRU,
Rack Power Cooling FRUs options for the Primary Power Supply. After the
PPS is replaced, ensure both PPS have the same code level. See the
description for status code 00-xx-yy.
v No, replace the 390V Battery Set. Use the service terminal Repair Menu,
Replace a FRU, Rack Power Cooling FRUs options for the Primary Power
Supply.
9. A communication problem exists between the PPS.
Note: A bent pin in the J3 connector in either PPS can cause this failure.
Do both PPS display an 06 status code?
v Yes, replace each PPS and the PPS to PPS Cable until the problem is fixed.
Use the service terminal Repair Menu, Replace a FRU, Rack Power Cooling
FRUs options for the Primary Power Supply.
v No, the PPS displaying the 06 status code is receiving bad parity from the
sending PPS. Replace the sending PPS, the PPS to PPS Cable, the
receiving PPS in that order until the problem is fixed. Use the service terminal
Repair Menu, Replace a FRU, Rack Power Cooling FRUs options for the
Primary Power Supply. If the PPS is replaced, ensure both PPS have the
same code level. See the description for status code 00-xx-yy.
MAP 2350: Isolating PPS Status Indicator Codes
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
The PPS Status display is normally off. If a power fault is detected, the status
display will display a two digit code. If more than one fault is present, the first status
code will display followed by the next codes. If a status code is displayed, the
operator panel Line Cord indicator for this PPS should be blinking slowly. Pressing
the 2105 Model Exx/Fxx operator panel Local power switch momentarily to the on
position will display the PPS code level, the PPS I.D. and any status codes that are
active.
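As an illustration of how one pass of the display can be read, the Python sketch below decodes a sequence of two digit values into the code level, the PPS I.D., and any active status codes that follow. The value ranges (code level 30 to 89, I.D. 91 or 92) are the ones given in this guide; the function name, the list-of-integers input, and the returned dictionary are assumptions made for the example.

```python
PPS_IDS = {91: "PPS-1", 92: "PPS-2"}


def decode_pps_display(codes):
    """Decode one pass of the PPS status display.

    codes is a list of integers read from the display after the Local
    power switch is pressed momentarily to on: 00, the code level (xx),
    the PPS I.D. (yy), then any active status codes.
    """
    if len(codes) < 3 or codes[0] != 0:
        raise ValueError("sequence must start with 00, xx, yy")
    level, ident = codes[1], codes[2]
    if not 30 <= level <= 89:
        raise ValueError("code level outside 30-89")
    if ident not in PPS_IDS:
        raise ValueError("PPS I.D. must be 91 or 92")
    return {
        "code_level": level,
        "pps": PPS_IDS[ident],
        "active_status_codes": list(codes[3:]),
    }
```

Reading 00, 45, 92, 06 would decode to code level 45 on PPS-2 with status code 06 active.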
Isolation
1. The 2105 Model Exx/Fxx operator panel Line Cord indicator is slow blinking (1
per second) if a status code is still active. If the PPS status display is blank,
momentarily press the 2105 Model Exx/Fxx operator panel Local power switch
up (on) to display any active codes. If no codes are displayed, replace the PPS.
If codes are displayed, continue with the next step.
2. Use the table below to look up the code and perform the action.
3. After the fault is repaired, go to “MAP 1500: Ending a Service Action” on
page 68.
Table 15. PPS Status Display Codes
Status Code
Description and Action
00-xx-yy
Description: PPS code level. 00 is displayed, followed by the PPS code level (xx, 3x-8x) and then the
PPS I.D. (yy, 91=PPS-1, 92=PPS-2). This sequence will repeat a few times at the start of a PPS
power on and when the 2105 Model Exx/Fxx operator panel Local power switch is momentarily pressed
to on.
Action: None.
01
Description: PPS Fan #1 fault. The fan rotation sensor is reporting the fan is below minimum speed.
Action: Replace PPS Fan #1 (left fan). The visual symptoms automatically reset when the FRU is
replaced. Then go to: “MAP 1500: Ending a Service Action” on page 68.
02
Description: PPS Fan #2 fault. The fan rotation sensor is reporting the fan is below minimum speed.
Action: Replace PPS Fan #2 (right fan). The visual symptoms automatically reset when the FRU is
replaced. Then go to: “MAP 1500: Ending a Service Action” on page 68.
03
Description: 390 V battery has a low charge.
When fully discharged the battery can require up to 25 hours to become fully charged. The 03 status
code will no longer display when the batteries are fully charged.
v If status code 03 is still displayed after 25 hours a permanent error will be logged.
v When status code 03 is no longer displayed, the 390 V Battery has been fully charged. Go to
“MAP 1500: Ending a Service Action” on page 68.
04
Description: 390 V battery fault. The 390 V battery set is not detected properly.
Action: Go to “MAP 2470: Battery Set Detection Problem” on page 103.
05
Description: System on 390 V battery. The system has lost customer line cord input to both PPS and
is on the 390 V battery set to save volatile customer data before powering off.
Action: Have the customer restore power to the 2105 Model Exx/Fxx. If the power system is set to
local power mode, press the operator panel local switch to on (|, up) to power on the 2105 Model
Exx/Fxx. If the power is set to remote power mode, have the customer power on the 2105 Model
Exx/Fxx. Then go to: “MAP 1500: Ending a Service Action” on page 68.
06
Description: PPS communication fault to the other PPS in this rack due to hardware communication
problem or both PPS reporting as the same logical PPS (PPS-1 or PPS-2).
Action:
v Go to “MAP 2340: PPS Status Code 06” on page 77.
07
Description: PPS A/C input phase is missing.
Action:
v Use the power checks for this line cord listed in the service guide Install chapter for this rack. Use
the service terminal Repair Menu, Replace a FRU option to prepare the PPS to be powered off for
the checks.
v If line cord power is not good, contact the customer.
v If line cord power is good, replace the PPS. Use the service terminal Repair Menu, Replace a
FRU option, Rack Power Cooling FRUs option.
08
Description: If the PPS UEPO PWR indicator is off, then the PPS line input is missing. (If the UEPO
PWR indicator is on, go to the next 08 in table.)
Action:
v There are three types of PPS: one for the complete input voltage range, one for a low input voltage
range, and one for a high input voltage range. The high input voltage range PPS will act like the
line cord input is missing if the customer is providing power at the low input voltage range.
For more information, refer to ″Primary Power Supply, 2105 Model Exx/Fxx and Expansion
Enclosure″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v Use the power checks for this line cord listed in the service guide Install chapter for this rack. Use
the service terminal Repair Menu, Replace a FRU option to prepare the PPS to be powered off for
the checks.
v If line cord power is not good, contact the customer.
v If line cord power is good, replace the PPS. Use the service terminal Repair Menu, Replace a
FRU option, Rack Power Cooling FRUs option.
08
Description: If the PPS UEPO PWR indicator is on, then only this PPS has a UEPO condition. (If the
UEPO PWR indicator is off, go to the prior 08 in this table.)
Action: Go to “MAP 2360: 2105 Model Exx/Fxx UEPO Problems” on page 82.
09
Description: PPS over-temperature condition.
Action:
v Check that no other fault codes are displayed, that the room air temperature is within limits, and
that airflow is not blocked.
v Replace the PPS. Use the service terminal Repair Menu, Replace a FRU option, Rack Power
Cooling FRUs option.
10
Description: PPS Over-current Fault.
Action:
v If any output circuit breaker is tripped, go to “MAP 2520: PPS Output Circuit Breaker Tripped” on
page 107.
v If no output circuit breaker is tripped, replace the PPS. Use the service terminal Repair Menu,
FRU Replace Menu options, Rack Power Cooling FRUs option.
11
Description: PPS Over-voltage Fault.
Action: Replace the PPS. Use the service terminal Repair Menu, FRU Replace Menu options, Rack
Power Cooling FRUs option.
12
Description: PPS Under-voltage Fault.
Action:
v If there is also a status code 10, repair it first.
v If there is no other status code, replace the PPS. Use the service terminal Repair Menu, FRU
Replace Menu options, Rack Power Cooling FRUs option.
13
Description: PPS Output CB tripped.
Action: Go to “MAP 2520: PPS Output Circuit Breaker Tripped” on page 107.
14
Description: PPS Internal logic error
Action: Replace the PPS. Use the service terminal Repair Menu, Replace a FRU option, Rack Power
Cooling FRUs option.
15
Description: Battery low early warning
Action: The 2105 Model Exx/Fxx is on battery and the battery set has gone low. When the customer
restores line cord power, the battery set will be automatically recharged.
16
Description: Input CB tripped
Action: If the input circuit breaker tripped and no output circuit breaker tripped, there is a problem
inside the PPS. Do not reset the input circuit breaker (CB00) to the on position (up). Replace the
PPS. Use the service terminal Repair Menu, Replace a FRU option, Rack Power Cooling FRUs
option.
If the input circuit breaker was switched off intentionally (not by a problem), switch the input circuit
breaker back to the on position.
3x-8x
Description: The PPS code level. See the description for status code 00-xx-yy above.
91
Description: 91 is the ID status code for PPS-1. See the description for status code 00 above.
92
Description: 92 is the ID status code for PPS-2. See the description for status code 00 above.
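The value ranges in Table 15 can be summarized as a small lookup. The following Python sketch is illustrative only: the names `classify`, `decode_display`, and `MAP_FOR_CODE` are hypothetical, the reading of "3x-8x" as the values 30 through 89 is an assumption, and nothing here is part of the PPS microcode or the service terminal.

```python
# Illustrative sketch only; all names are hypothetical.
# Ranges follow Table 15: 00 = marker, 3x-8x = code level (assumed 30-89),
# 91/92 = PPS ID, 01-16 = status codes.

PPS_IDS = {91: "PPS-1", 92: "PPS-2"}

# Status codes whose action always points to another MAP.
MAP_FOR_CODE = {4: "MAP 2470", 6: "MAP 2340", 13: "MAP 2520"}


def classify(value: int) -> str:
    """Return the kind of a two-digit value shown on the PPS status display."""
    if value == 0:
        return "marker"
    if value in PPS_IDS:
        return "pps_id"
    if 30 <= value <= 89:
        return "code_level"
    if 1 <= value <= 16:
        return "status"
    return "unknown"


def decode_display(values):
    """Split a displayed sequence into code level, PPS ID, and status codes."""
    result = {"code_level": None, "pps_id": None, "status": []}
    for value in values:
        kind = classify(value)
        if kind == "code_level":
            result["code_level"] = value
        elif kind == "pps_id":
            result["pps_id"] = PPS_IDS[value]
        elif kind == "status":
            result["status"].append(value)
    return result


if __name__ == "__main__":
    # Example sequence: 00, code level 42, PPS-2, then status codes 03 and 10.
    print(decode_display([0, 42, 92, 3, 10]))
```

Status codes 08 and 10 are left out of `MAP_FOR_CODE` because their actions depend on an indicator or circuit breaker state, as described in the table.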
MAP 2360: 2105 Model Exx/Fxx UEPO Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The 2105 Model Exx/Fxx operator panel UEPO (Unit Emergency Power Off) switch
is used to switch off the PPS 395 V dc output. The logic voltage for the PPS
internal logic, RPC card and operator panel are not switched off. To switch off all
logic voltage, the PPS input circuit breaker must be switched to off.
Note: Each PPS supplies the other PPS with logic voltage only for the PPS internal
logic through the PPS to PPS communication cable. This occurs if the PPS
input circuit breaker is on and customer line cord power is present.
The PPS UEPO PWR indicator is on when the PPS has customer input power, the
input circuit breaker is on and the PPS internal logic is providing UEPO logic
voltage. The UEPO LOOP-STBY indicator is on when the UEPO loop circuit is
completed with the Unit Emergency switch on (up).
Figure 29. 2105 Primary Power Supply Locations (s009048)
Labels shown: Front View; Rear View; Indicators (UEPO PWR, UEPO LOOP-STBY, PWR GOOD,
PWR UNIT FAULT, ON BATTERY); PPS Digital Status (two digits).
Isolation
The 2105 Model Exx/Fxx will be powered off during this isolation. Ensure it is not in
use by the customer. This isolation does a complete checkout of the UEPO
functions.
1. The 2105 Model Exx/Fxx should be in local power control mode for this MAP.
Ensure the RPC card local/remote switch for each RPC card is set to local
(down). If they are set to remote (up), set them to the down position. When the
repair is complete, set them back to their original position.
2. Power off the 2105 Model Exx/Fxx.
3. Ensure the input circuit breaker for each PPS is set to on (up).
4. Ensure that the 2105 Model Exx/Fxx operator panel Unit Emergency switch is
set to on (up).
5. Ensure that the 2105 Model Exx/Fxx operator panel Local/Remote switch,
inside the front cover, is in the back position (partially covering the connector).
6. Ensure that the 2105 Model Exx/Fxx operator panel Local Power rocker switch
is not stuck in the down or up position. It is a momentary contact rocker
switch.
7. Is the PPS UEPO PWR indicator on?
v Yes, continue with the next step.
v No, go to the install chapter and perform the customer line cord power
checks. If no problems are found, replace the PPS and then return here.
Use the service terminal Repair Menu, Replace a FRU option.
8. Is the PPS UEPO LOOP-STBY indicator on?
v Yes, continue with the next step.
v No, the UEPO loop is open. Ensure the UEPO cable is plugged into PPS
connector J6 and operator panel UEPO card connectors J1 or J2. If it still
fails, replace the following FRUs one at a time until the UEPO LOOP-STBY
indicator comes on: the PPS to UEPO card cable, the operator panel UEPO
card, the PPS. Then go to the next step.
9. Switch the operator panel UEPO switch to the off position (O, down).
Is the PPS UEPO LOOP-STBY indicator off?
v Yes, continue with the next step.
v No, the UEPO switch is not opening the UEPO loop circuit. Replace the
operator panel UEPO card and then the PPS, one at a time, until the switch
works properly. Then continue with the next step.
10. The UEPO is working properly. Set the operator panel UEPO switch to the on
position (up). Return to the procedure that sent you here, or go to “MAP 1500:
Ending a Service Action” on page 68.
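The indicator checks in steps 7 through 10 can be read as a short decision sequence. The Python sketch below is a hedged illustration under that reading; the function name and its boolean parameters are hypothetical and simply stand for what the service representative observes.

```python
# Illustrative sketch of MAP 2360 steps 7 through 10; names are hypothetical.

def uepo_check(uepo_pwr_on: bool,
               loop_stby_on_with_switch_on: bool,
               loop_stby_on_with_switch_off: bool) -> str:
    """Return the first step that fails, or report that the UEPO works."""
    if not uepo_pwr_on:
        # Step 7, No branch.
        return "Step 7: perform line cord power checks, then replace the PPS"
    if not loop_stby_on_with_switch_on:
        # Step 8, No branch: the UEPO loop is open.
        return "Step 8: check UEPO cable, then replace cable, UEPO card, PPS"
    if loop_stby_on_with_switch_off:
        # Step 9, No branch: the switch is not opening the loop.
        return "Step 9: replace operator panel UEPO card, then PPS"
    return "Step 10: UEPO is working properly"


if __name__ == "__main__":
    print(uepo_check(True, True, False))
```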
MAP 2370: Automatic Power On Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The 2105 subsystem has three power control modes.
v Automatic Power Mode: The 2105 Model Exx/Fxx will power on when power is
present on one or both mainline power cables. This happens only once after both
line cords have been powered off. The operator panel Local Power switch can
still power the subsystem on and off.
v Local Power Control Mode: Subsystem power is controlled by the 2105 Model
Exx/Fxx operator panel Local Power switch.
v Remote Power Control Mode: Subsystem power is controlled by the host S/370
power control interface connection. The operator panel Local switch can power
the subsystem off but not on.
See Table 16
Table 16. RPC Card Configuration Switch Settings
Power Mode  RPC Card  Local/Remote Switch  DIP Switch 1  DIP Switch 2  DIP Switch 3  DIP Switch 4
Automatic   RPC 1     Remote               On            Off           Off           Off
Automatic   RPC 2     Remote               Off           On            Off           Off
Remote      RPC 1     Remote               On            Off           On            Off
Remote      RPC 2     Remote               Off           On            On            Off
Local       RPC 1     Local                On            Off           Off           Off
Local       RPC 2     Local                Off           On            Off           Off
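The switch settings in Table 16 can be expressed as a lookup keyed on the card and its switch positions. This is a minimal Python sketch for illustration; `TABLE_16` and `power_mode` are hypothetical names, and any combination not listed in the table is reported as `None` rather than guessed.

```python
# Illustrative sketch only; names are hypothetical.
# Key: (RPC card, Local/Remote switch, DIP 1, DIP 2, DIP 3, DIP 4)
TABLE_16 = {
    ("RPC 1", "Remote", "On", "Off", "Off", "Off"): "Automatic",
    ("RPC 2", "Remote", "Off", "On", "Off", "Off"): "Automatic",
    ("RPC 1", "Remote", "On", "Off", "On", "Off"): "Remote",
    ("RPC 2", "Remote", "Off", "On", "On", "Off"): "Remote",
    ("RPC 1", "Local", "On", "Off", "Off", "Off"): "Local",
    ("RPC 2", "Local", "Off", "On", "Off", "Off"): "Local",
}


def power_mode(card, local_remote, dips):
    """Return the power mode for the given settings, or None if not in the table."""
    return TABLE_16.get((card, local_remote, *dips))


if __name__ == "__main__":
    print(power_mode("RPC 1", "Remote", ("On", "Off", "On", "Off")))
```

In every listed row, DIP switches 1 and 2 differ only by card, and DIP switch 3 is what separates Remote from Automatic when the Local/Remote switch is set to Remote.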
Isolation
1. Use the service terminal Repair Menu, Display / Repair Problems Needing
Repair option to repair any related power problems before continuing.
2. You must take the 2105 Model Exx/Fxx away from the customer before
continuing with this procedure.
3. Use the 2105 Model Exx/Fxx operator panel Local power switch to power off.
4. Ensure the RPC Interconnect Cable is connected.
5. Ensure the RPC card to Electronics Cage Cables are connected.
6. Ensure the switches on each RPC card are set for automatic mode per the
table above.
7. Set the input MAIN LINE circuit breaker (CB00) to off (down) on PPS 1 and
PPS 2.
8. Set the PPS 1 input CB to on (up).
Did the 2105 Model Exx/Fxx power on?
v Yes, do the following steps:
a. Power the 2105 Model Exx/Fxx off.
b. Set the PPS 1 input CB to off.
c. Set the PPS 2 input CB to on. When the 2105 Model Exx/Fxx powers on,
return to the procedure that sent you here or go to “MAP 1500: Ending a
Service Action” on page 68.
v No, continue with the next step.
9. Set the PPS 1 input CB to off.
10. Set the PPS 2 input CB to on.
Did the 2105 Model Exx/Fxx power on?
v Yes, do the following steps:
a. Power off the 2105 Model Exx/Fxx.
b. Set the PPS 2 input CB to off.
c. Replace RPC1.
d. Go to step 8.
v No, do the following steps:
a. Set the PPS 2 input CB to off.
b. Replace the following FRUs one at a time until the procedure works:
– RPC1
– RPC2
– RPC Interconnect Cable
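The outcome of steps 8 through 10 depends on which input circuit breaker, when set to on alone, causes a power on. The Python sketch below is an illustration of that branching only; the function and parameter names are hypothetical.

```python
# Illustrative sketch of MAP 2370 steps 8 through 10; names are hypothetical.

def automatic_power_on_action(on_with_pps1_only: bool,
                              on_with_pps2_only: bool) -> str:
    """Map the two breaker tests to the action given in the procedure."""
    if on_with_pps1_only:
        # Step 8, Yes branch: repeat the test with only the PPS 2 input CB on.
        return "Power off, test with PPS 2 input CB only, then end the service action"
    if on_with_pps2_only:
        # Step 10, Yes branch.
        return "Replace RPC1, then go to step 8"
    # Step 10, No branch.
    return "Replace RPC1, RPC2, RPC Interconnect Cable one at a time"


if __name__ == "__main__":
    print(automatic_power_on_action(False, True))
```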
MAP 2380: Isolating 2105 Expansion Enclosure UEPO Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The 2105 Expansion Enclosure operator panel UEPO (Unit Emergency Power Off)
switch is used to switch off the PPS 395 V dc output in the 2105 Expansion
Enclosure only. The logic voltage for the PPS internal logic, RPC card and operator
panel are not switched off. To switch off all logic voltage, the PPS input circuit
breaker must be switched to off.
Note: Each PPS supplies the other PPS with logic voltage only for the PPS internal
logic through the PPS to PPS communication cable. This occurs if the PPS
input circuit breaker is on and customer line cord power is present.
The 2105 Expansion Enclosure operator panel UEPO switch only powers off the
2105 Expansion Enclosure, not the 2105 Model Exx/Fxx. The 2105 Expansion
Enclosure is powered on using the 2105 Model Exx/Fxx operator panel local/remote
power control switch.
The PPS UEPO PWR indicator is on when the PPS has customer input power, the
input circuit breaker is on and the PPS internal logic is providing UEPO logic
voltage. The UEPO LOOP-STBY indicator is on when the UEPO loop circuit is
completed with the Unit Emergency switch on (up).
Figure 30. 2105 Primary Power Supply Locations (s009048)
Labels shown: Front View; Rear View; Indicators (UEPO PWR, UEPO LOOP-STBY, PWR GOOD,
PWR UNIT FAULT, ON BATTERY); PPS Digital Status (two digits).
Isolation
The 2105 Expansion Enclosure and 2105 Model Exx/Fxx will be powered off during
this isolation. Ensure it is not in use by the customer. This isolation does a complete
checkout of the UEPO functions.
1. The 2105 Model Exx/Fxx should be in local power control mode for this MAP.
Ensure the RPC card local/remote switch for each RPC card is set to local
(down). If they are set to remote (up), set them to the down position. When the
repair is complete, set them back to their original position.
2. Power off the 2105 Model Exx/Fxx, which also powers off the 2105 Expansion
Enclosure.
3. Ensure the input circuit breaker for each 2105 Expansion Enclosure PPS is set
to on (up).
4. Ensure that the 2105 Expansion Enclosure operator panel Unit Emergency
switch is set to on (up).
5. Ensure that the 2105 Expansion Enclosure operator panel Local/Remote switch,
inside the front cover, is in the back position (partially covering the connector).
6. Is each 2105 Expansion Enclosure PPS UEPO PWR indicator on?
v Yes, continue with the next step.
v No, go to the install chapter and perform the customer line cord power
checks. If no problems are found, replace the PPS and then return here. Use
the service terminal Repair Menu, Replace a FRU option.
7. Is each 2105 Expansion Enclosure PPS UEPO LOOP-STBY indicator on?
v Yes, continue with the next step.
v No, the UEPO loop is open. Ensure the UEPO cable is plugged into PPS
connector J6 and operator panel UEPO card connectors J1 or J2. If it still
fails, replace the following FRUs one at a time until the UEPO LOOP-STBY
indicator comes on: the PPS to UEPO card cable, the operator panel UEPO
card, the PPS. Then go to the next step.
8. Switch the 2105 Expansion Enclosure operator panel UEPO switch to the off
position (O, down).
Is each 2105 Expansion Enclosure PPS UEPO LOOP-STBY indicator off?
v Yes, continue with the next step.
v No, the UEPO switch is not opening the UEPO loop circuit. Replace the
operator panel UEPO card and then the PPS, one at a time, until the switch
works properly. Then continue with the next step.
9. The UEPO is working properly. Set the 2105 Expansion Enclosure operator
panel UEPO switch to the on position (up). You may now power up the 2105
Model Exx/Fxx if needed. Return to the procedure that sent you here, or go to
“MAP 1500: Ending a Service Action” on page 68.
MAP 2390: Remote Power On Not Working
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
The 2105 Model Exx/Fxx power can be controlled in three modes:
1. Local - With line cord power present, only the operator panel Local power
switch controls power on and power off. RPC card Local/Remote switch in
local position (down) and RPC card switch 1 DIP position 3 to either position.
2. Automatic Mode - Loss of power to both line cords causes a power off after the
2105 Model Exx/Fxx has de-staged customer data using the batteries for up to
5 minutes. When one or both line cords have power again, a power on
automatically occurs. The automatic power will only occur once after each
power loss to both line cords. The operator panel Local power switch can also
control power on and off. RPC card Local/Remote switch in remote position (up)
and RPC card switch 1 DIP position 3 is off (left).
3. Remote Mode - With line cord power present, a remote control power cable
from a host system controls power on and power off. The operator panel Local
power switch cannot control a power off. If the remote power signal is creating a
power off condition, the operator panel Local power switch cannot control a
power on. RPC card Local/Remote switch in remote position (up) and RPC card
switch 1 DIP position 3 is on (right).
It only requires one host system to power on the 2105 Model Exx/Fxx, even if
remote power control cables from other host systems that are powered off are
connected. A single system cannot power off the 2105 Model Exx/Fxx unless all the
host systems with remote power control cables attached are powered off.
Each RPC card passes a 4.4 volt signal to the HDI card, which is then
connected to pin x of each HDI host port connector. That signal goes to the host
which controls two return lines. The pick line return is pulsed momentarily to begin
the 2105 Model Exx/Fxx power on. The hold line return is held active to keep 2105
Model Exx/Fxx powered on. When the hold line drops, the 2105 Model Exx/Fxx will
power off if no other hold lines from other hosts are active.
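The pick and hold line behavior described above can be modeled as a small state machine. The following Python sketch is illustrative only: the class and method names are hypothetical, and it assumes that a host activates its hold line at the same time as it pulses its pick line.

```python
# Illustrative model of the pick/hold remote power behavior; names are hypothetical.
# Assumption: a host that pulses its pick line also holds its hold line active.

class RemotePowerModel:
    def __init__(self):
        self.powered_on = False
        self.hold_active = {}  # host name -> True while its hold line is active

    def pick_pulse(self, host: str) -> None:
        """A momentary pick pulse from any one host begins the power on."""
        self.hold_active[host] = True
        self.powered_on = True

    def drop_hold(self, host: str) -> None:
        """Power goes off only when no host is still holding its hold line."""
        self.hold_active[host] = False
        if not any(self.hold_active.values()):
            self.powered_on = False


if __name__ == "__main__":
    model = RemotePowerModel()
    model.pick_pulse("host A")
    model.pick_pulse("host B")
    model.drop_hold("host A")
    print(model.powered_on)  # host B still holds, so power stays on
    model.drop_hold("host B")
    print(model.powered_on)  # no hold lines active, so power goes off
```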
Isolation
1. This procedure requires the 2105 Model Exx/Fxx be taken away from customer
use so it can be powered off and on. Ensure all customer activity is stopped
before going to the next step.
2. Determine the type of remote power control installed/configured. For more
information see the description section above.
v Remote Mode - Controlled from one or more host systems, each having a
remote power control cable connected to the HDI card in the tailgate.
Continue with the next step.
v Automatic Mode - Controlled by line cord power changes. Go to “MAP 2370:
Automatic Power On Problem” on page 84.
3. Set the 2105 Model Exx/Fxx to local power control mode and power off.
v Set switch 1 DIP position 3 to off (left) for both RPC cards.
v Press the 2105 Model Exx/Fxx operator panel local power switch
momentarily to off. Wait up to 5 minutes for power off to complete.
4. Set the 2105 Model Exx/Fxx to remote power control mode.
v Set switch 1 DIP position 3 to on (right) for both RPC cards.
v Ensure the Local/Remote switch is set to on (up) for both RPC cards.
5. Ensure the host remote power control cables are properly connected to the
HDI card in the tailgate and also at each host system.
6. Ensure the HDI card to RPC card cable is properly connected to HDI card J1
and J6 on both RPC cards.
7. Determine if more than one host system is connected to the HDI card in this
tailgate.
Is there more than one host system remote power control cable connected?
v Yes, choose one of the following:
– If remote power DOES work from any of those host systems, go to step
8.
– If remote power DOES NOT work from any of those host systems, go to
step 9 on page 90.
v No, go to step 10 on page 90.
8. This step isolates the problem to the 2105 Model Exx/Fxx or a host system.
v Use step 3 to power down the 2105 Model Exx/Fxx.
v Use step 4 to change back to remote power control mode.
v At the 2105 Model Exx/Fxx HDI card, unplug two remote power control
cables, one from a host system that works and one from a host system that
does not. Swap the two cables and plug them back in.
v Attempt to power on from the host system that originally worked.
Does the 2105 Model Exx/Fxx power on?
– Yes, the 2105 Model Exx/Fxx HDI port works with one host system
remote power control cable plugged in and fails with the other host
system power control cable plugged in. The problem is in the host
system or the remote power control cable from that system.
– No, the 2105 Model Exx/Fxx HDI port fails with a host system remote
power control cable that worked when connected to a different HDI port.
The problem is internal to the 2105 Model Exx/Fxx. Replace the HDI
card and HDI to RPC cards cable until the problem is fixed. (The 2105
Model Exx/Fxx can power on with only one RPC working, therefore the
RPC cards are not included here.) When the problem is corrected go to
“MAP 1500: Ending a Service Action” on page 68.
9. More than one host system cannot power on the 2105 Model Exx/Fxx. Do step
3 on page 89 to set the 2105 Model Exx/Fxx to local power mode. Attempt to
power on using the operator panel local power switch.
Does the 2105 Model Exx/Fxx power on?
v Yes, it only fails in remote power mode. Replace the HDI card and HDI to
RPC cards cable until the problem is fixed. (The 2105 Model Exx/Fxx can
power on with only one RPC working, therefore the RPC cards are not
included here.) When the problem is corrected go to “MAP 1500: Ending a
Service Action” on page 68.
v No, go to “MAP 2400: 2105 Model Exx/Fxx Local Power On Problems” on
page 91.
10. This tests more than one remote power control connector on the HDI card in
the tailgate. Unplug the remote power control cable from the HDI card
connector and plug it into a different connector. Attempt to power on the 2105
Model Exx/Fxx from the host system.
Does it power on?
v Yes, one or more host ports on the HDI card are failing. Replace the
following FRUs until the problem is fixed, HDI card and HDI to RPC cable.
Then go to “MAP 1500: Ending a Service Action” on page 68.
v No, continue with the next step.
11. Isolate the failure to the 2105 Model Exx/Fxx (not sending or receiving the
remote power control signal) or the host system (not receiving or returning the
power control signals).
v Use step 3 on page 89 to power off the 2105 Model Exx/Fxx.
v Both RPC cards supply +4.4v to each HDI card host port, pin 1 and 2. Use
a volt-meter to measure the voltage present at a free connector.
Do pins 1 and 2 have +4.4v present?
– Yes, the voltage is leaving the 2105 Model Exx/Fxx, go to step 12.
– No, the +4.4v from both RPC cards is not reaching the HDI card host
port connectors. Replace the HDI card and HDI to RPC cards cable until
the problem is fixed. When the problem is corrected go to “MAP 1500:
Ending a Service Action” on page 68.
12. Ensure the remote power control cable is plugged into the HDI card. Ensure
the host system is powered up and it has attempted to power on the attached
devices. This should leave the hold line active at +5v. Measure the
voltage at the HDI connector pin 5 that the cable is plugged into.
Is +5v present?
v Yes, go to step 13.
v No, go to step 14 on page 91.
13. Measure the pick line voltage at the HDI connector pin 5 that the cable is
plugged into. The voltage will momentarily pulse when the host system
requests the attached devices to power on. You may need a second person at
the host system to create the power on condition. Is +5v momentarily present?
v Yes, both needed signals are being returned to the 2105 Model Exx/Fxx.
Replace the HDI card and HDI to RPC cards cable until the problem is
fixed. (Only one RPC card is needed to power on the 2105 Model Exx/Fxx
and because there are two present, they are not part of the FRU group.)
When the problem is corrected go to “MAP 1500: Ending a Service Action”
on page 68.
v No, continue with the next step.
14. The 2105 Model Exx/Fxx is sending the voltage but not receiving one or both
signals needed to power on. The problem is either in the remote power control
cable or the host system control of the signals. Use the host system
documentation to ensure the host system is receiving the voltage and then
returning the control signals back to the 2105 Model Exx/Fxx. If the host is
returning the signals back, the remote power control cable may have one or
more open lines. Correct the problem and then go to “MAP 1500: Ending a
Service Action” on page 68.
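Steps 11 through 14 reduce to three measurements: the supply voltage leaving the 2105, the hold line return, and the pick line pulse. The Python sketch below shows one way to read that logic; the function and parameter names are hypothetical, and the booleans stand for the volt-meter readings described in the steps.

```python
# Illustrative sketch of MAP 2390 steps 11 through 14; names are hypothetical.

REPLACE_2105_FRUS = "2105 side: replace HDI card and HDI to RPC cards cable"
CHECK_HOST_SIDE = "Host side: check host system and remote power control cable"


def remote_power_diagnosis(supply_present: bool,
                           hold_line_present: bool,
                           pick_pulse_seen: bool) -> str:
    """Return where the procedure places the fault for a set of readings."""
    if not supply_present:
        # Step 11, No branch: supply voltage is not reaching the host ports.
        return REPLACE_2105_FRUS
    if hold_line_present and pick_pulse_seen:
        # Step 13, Yes branch: both signals return, so the fault is internal.
        return REPLACE_2105_FRUS
    # Step 12 or 13, No branch leads to step 14.
    return CHECK_HOST_SIDE


if __name__ == "__main__":
    print(remote_power_diagnosis(True, True, False))
```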
MAP 2400: 2105 Model Exx/Fxx Local Power On Problems
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
The 2105 Model Exx/Fxx is not powering on properly. Only one of the two 2105
Model Exx/Fxx power systems is needed to power on the 2105 Model Exx/Fxx.
However, this MAP will require both power systems to be functioning.
Isolation
1. At the 2105 Model Exx/Fxx, are the Local/Remote power switches on both
RPC cards [Figure 31] set to Local mode (down)?
v Yes, go to step 3 on page 92.
v No, continue with the next step.
2. Set the RPC card Local/Remote power switches to Local mode (down).
Attempt to power on the 2105 Model Exx/Fxx using the 2105 Model Exx/Fxx
operator panel Local Power switch [Figure 33].
Does it power on?
v Yes, the 2105 Model Exx/Fxx only fails in remote power control mode. Go to
“MAP 2390: Remote Power On Not Working” on page 88.
v No, continue with the next step.
Figure 31. 2105 Model Exx/Fxx RPC Local/Remote Switch Location (S008612m)
Labels shown: Rear View; RPC1; RPC2; RPC Power Select Switch (REMOTE, LOCAL); Address Switches.
3. Observe the primary power supply (PPS) to RPC control cables.
v PPS-1 connector J4 to RPC-1 connector J2.
v PPS-2 connector J4 to RPC-2 connector J2.
Are both cables properly connected?
v Yes, continue with the next step.
v No, before reconnecting the cable, go to the PPS it should be connected to
and set the input circuit breaker to the off position. The 2105 Model Exx/Fxx
RPC cards can stay powered on while the cable is connected. Connect the
cable. Set the input circuit breaker to on (up), then attempt to power on the
2105 Model Exx/Fxx again. If it still fails, continue with the next step. If it
works, go to “MAP 1500: Ending a Service Action” on page 68.
4. Ensure each PPS input circuit breaker is set to on (up).
Figure 32. 2105 Primary Power Supply Locations (s009048)
Labels shown: Front View; Rear View; Indicators (UEPO PWR, UEPO LOOP-STBY, PWR GOOD,
PWR UNIT FAULT, ON BATTERY); PPS Digital Status (two digits).
5. Observe each PPS UEPO PWR indicator. Is the indicator on?
v Yes, continue with the next step.
v No, go to “MAP 2350: Isolating PPS Status Indicator Codes” on page 80 for
status code 08 and perform the actions listed.
6. Ensure the 2105 Model Exx/Fxx operator panel UEPO switches are set to on
(up).
7. Observe the 2105 Model Exx/Fxx PPS UEPO LOOP-STBY indicator.
Is it on?
v Yes, continue with the next step.
v No, go to “MAP 2360: 2105 Model Exx/Fxx UEPO Problems” on page 82.
8. Observe the PWR GOOD indicator.
Is it slow blinking?
v Yes, the PPS is in standby mode, waiting for a power on request. Continue
with the next step.
v No, replace the PPS. If the 2105 Model Exx/Fxx still fails to power on,
return to the beginning of this MAP.
9. Observe the PWR UNIT FAULT indicator.
Is it on?
v Yes, use the PPS status code displayed to repair the problem. Go to “MAP
2350: Isolating PPS Status Indicator Codes” on page 80.
v No, continue with the next step.
10. Observe the PPS Status Code display.
Is a status code displayed?
v Yes, use the PPS status code displayed to repair the problem. Go to “MAP
2350: Isolating PPS Status Indicator Codes” on page 80. Return to the
beginning of this MAP after the repair is complete.
v No, continue with the next step.
11. Attempt to power on the 2105 Model Exx/Fxx. It is best to have the 2105 Model
Exx/Fxx in local power control mode instead of remote power control mode.
Ensure the Power Select switch on each 2105 Model Exx/Fxx RPC card is in
the Local position (down). Press the 2105 Model Exx/Fxx operator panel Local
power control switch momentarily to the on position (up).
Note: Remember to return these switches to their original position after the
repair is complete.
Figure 33. 2105 Model Exx/Fxx Operator Panel Locations (S008811m)
Labels shown: Front View; Rear View; Unit Emergency; L/R SWITCH (LOCAL, REMOTE); Local Power;
Ready; Power Complete; Line Cord 1; Line Cord 2; Messages; Cluster 1; Cluster 2.
12. Observe each 2105 Model Exx/Fxx PPS. Find the condition that now exists.
v The PPS PWR GOOD indicator is on solid, which is normal operation. 390V output
is being supplied to electronics cage and storage cage power supplies. The
2105 Model Exx/Fxx should be powering on. If not, reenter the service guide
with the new symptom(s).
v A PPS status code is displayed. Go to “MAP 2350: Isolating PPS Status
Indicator Codes” on page 80.
v The PPS PWR GOOD indicator is still slow blinking. Continue with the next step.
13. Replace the PPS.
Does it still fail?
v Yes, continue with the next step.
v No, go to “MAP 1500: Ending a Service Action” on page 68.
14. Replace the RPC card for the PPS that is slow blinking and then attempt to
power on (RPC-1 for PPS-1, RPC-2 for PPS-2). If the clusters are in Ready,
use the service terminal FRU Replace menu option to replace the RPC card. If
it still fails, call the next level of support. If it no longer fails go to “MAP 1500:
Ending a Service Action” on page 68.
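The indicator observations in steps 5 through 10 form an ordered checklist in which the first failing check decides where to go next. The Python sketch below illustrates that order; the function and parameter names are hypothetical and stand for what is observed on each PPS.

```python
# Illustrative sketch of MAP 2400 steps 5 through 10; names are hypothetical.

def map_2400_next(uepo_pwr_on: bool,
                  loop_stby_on: bool,
                  pwr_good_slow_blink: bool,
                  unit_fault_on: bool,
                  status_code_shown: bool) -> str:
    """Return the next action for the first check that does not pass."""
    if not uepo_pwr_on:
        return "MAP 2350, status code 08"   # step 5
    if not loop_stby_on:
        return "MAP 2360"                   # step 7
    if not pwr_good_slow_blink:
        return "Replace the PPS"            # step 8
    if unit_fault_on or status_code_shown:
        return "MAP 2350"                   # steps 9 and 10
    return "Attempt to power on (step 11)"


if __name__ == "__main__":
    print(map_2400_next(True, True, True, False, False))
```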
MAP 2410: RPC Power Mode Switch Mismatch
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The RPC card switch settings for local or remote power control must be set the
same on both RPC cards. If they are not, a problem log will be created and the
RPC card set for remote power mode will be fenced even if the problem is with the
RPC card set for local mode.
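The fencing rule in the paragraph above can be stated in a few lines. This Python sketch is illustrative only; `fenced_card` is a hypothetical name, and each argument is assumed to be either "Local" or "Remote".

```python
# Illustrative sketch of the mismatch rule; names are hypothetical.
# Assumption: each switch value is either "Local" or "Remote".

def fenced_card(rpc1_switch: str, rpc2_switch: str):
    """Return the card that is fenced on a mismatch, or None if the settings match."""
    if rpc1_switch == rpc2_switch:
        return None
    # On a mismatch, the card set for remote power mode is the one fenced,
    # even if the card set for local mode is the one at fault.
    return "RPC 1" if rpc1_switch == "Remote" else "RPC 2"


if __name__ == "__main__":
    print(fenced_card("Local", "Remote"))
```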
The 2105 subsystem has three power control modes.
v Local Power Control Mode: Subsystem power is controlled by the 2105 Model
Exx/Fxx operator panel Local Power switch.
v Remote Power Control Mode: Subsystem power is controlled by the host S/370
power control interface connection. The operator panel Local switch can power
the subsystem off but not on.
v Automatic Power Mode: The 2105 Model Exx/Fxx will power on when power is
present on one or both mainline power cables. This happens only once after both
line cords have been powered off. The operator panel Local Power switch can
still power the subsystem on and off.
Table 17. RPC Card Configuration Switch Settings
Power Mode   RPC Card   Local/Remote Switch   DIP Switch 1   DIP Switch 2   DIP Switch 3   DIP Switch 4
Remote       RPC 1      Remote                On             Off            On             Off
Remote       RPC 2      Remote                Off            On             On             Off
Automatic    RPC 1      Remote                On             Off            Off            Off
Automatic    RPC 2      Remote                Off            On             Off            Off
Local        RPC 1      Local                 On             Off            Off            Off
Local        RPC 2      Local                 Off            On             Off            Off
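The rows of Table 17 can be expressed as a lookup, which makes the "both cards must be set for the same mode" rule easy to check. This is only an illustrative sketch: the names RpcSwitches, TABLE_17, power_mode, and modes_match are hypothetical and are not part of any 2105 service tool. True stands for a DIP switch that is On.

```python
from typing import NamedTuple, Optional, Tuple


class RpcSwitches(NamedTuple):
    """Observed switch settings on one RPC card (hypothetical structure)."""
    local_remote: str                      # "Local" or "Remote"
    dip: Tuple[bool, bool, bool, bool]     # DIP switches 1 to 4, True = On


# Rows transcribed from Table 17: (power mode, RPC card number) -> settings.
TABLE_17 = {
    ("Remote", 1): RpcSwitches("Remote", (True, False, True, False)),
    ("Remote", 2): RpcSwitches("Remote", (False, True, True, False)),
    ("Automatic", 1): RpcSwitches("Remote", (True, False, False, False)),
    ("Automatic", 2): RpcSwitches("Remote", (False, True, False, False)),
    ("Local", 1): RpcSwitches("Local", (True, False, False, False)),
    ("Local", 2): RpcSwitches("Local", (False, True, False, False)),
}


def power_mode(card: int, observed: RpcSwitches) -> Optional[str]:
    """Return the Table 17 power mode that matches the card, or None."""
    for (mode, table_card), expected in TABLE_17.items():
        if table_card == card and expected == observed:
            return mode
    return None


def modes_match(rpc1: RpcSwitches, rpc2: RpcSwitches) -> bool:
    """True when both cards are set to the same valid Table 17 mode."""
    mode1 = power_mode(1, rpc1)
    mode2 = power_mode(2, rpc2)
    return mode1 is not None and mode1 == mode2
```

A setting that matches no row of the table is reported as None, so an unlisted combination is never treated as a valid mode.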
Isolation
1. Observe the problem log details ″last occurrence″ timestamp field. The following
procedures will have you check if the timestamp has been updated. When the
RPC card is resumed, the timestamp will be updated if the failure is still
occurring.
2. Observe the Local/Remote switch on each RPC card.
Are the switches set the same?
v Yes, go to step 4 on page 96.
v No, continue with the next step.
3. Use the table above and determine which RPC card (RPC-1 or RPC-2) is set
incorrectly for the customer.
Is the incorrectly set Local/Remote switch in the Remote position?
v Yes, go to step 5 on page 96.
v No, go to step 6 on page 96.
4. Observe the RPC Local/Remote switches.
Are the switches both set to Local (down)?
v Yes, go to step 5.
v No, go to step 6.
5. The RPC card in the problem log is reporting the switch in Remote mode. It
should be Local mode.
v Use the Repair Menu, Replace a FRU options to replace the failing RPC
card. Ensure the switches are set correctly. Then go to “MAP 1500: Ending a
Service Action” on page 68.
6. The RPC card not in the problem log is reporting the switch in Local mode. It
should be Remote mode.
a. Set both RPC card switches to Local within 5 seconds.
b. Use the Repair Menu, Replace a FRU options for the RPC card in the
problem log to reset the fence condition. Do not replace the FRU.
c. Use the Repair Menu, Replace a FRU options to replace the RPC card not
listed in the problem log. (Ensure the switches are set the same as the RPC
card that was just removed.)
7. Display the original problem log details ″last occurrence″ field.
Was the timestamp updated?
v Yes, the new RPC card did not fix the problem. Call the next level of support.
v No, set both RPC card switches to Local within 5 seconds. Go to the next
step.
8. Display the original problem log details ″last occurrence″ field.
Was the timestamp updated?
v Yes, the new RPC card did not fix the problem. Call the next level of support.
v No, go to “MAP 1500: Ending a Service Action” on page 68.
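The branching in steps 2 through 4 of this MAP can be summarized in a few lines. The sketch below only restates those branches; the function name next_step and the correct_setting parameter (the setting the customer requires, "Local" or "Remote") are assumptions made for illustration.

```python
def next_step(rpc1_switch: str, rpc2_switch: str, correct_setting: str) -> int:
    """Return the step (5 or 6) that steps 2 to 4 of MAP 2410 lead to.

    Each switch value is "Local" or "Remote". correct_setting is the
    position the customer requires (a hypothetical input).
    """
    if rpc1_switch == rpc2_switch:
        # Step 4: the switches agree, so branch on whether both are Local.
        return 5 if rpc1_switch == "Local" else 6
    # Step 3: the switches differ, so find the one that is set incorrectly.
    incorrect = rpc1_switch if rpc1_switch != correct_setting else rpc2_switch
    return 5 if incorrect == "Remote" else 6
```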
MAP 2420: 2105 Expansion Enclosure Power On Problem
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
The 2105 Expansion Enclosure is not powering on properly from the 2105 Model
Exx/Fxx. Only one of the two 2105 Expansion Enclosure power systems is needed
to power on the 2105 Model Exx/Fxx. However, this MAP will require both power
systems to be functioning.
Isolation
1. Does the 2105 Model Exx/Fxx that this 2105 Expansion Enclosure is attached to
power on?
v Yes, continue with the next step.
v No, go to “MAP 2400: 2105 Model Exx/Fxx Local Power On Problems” on
page 91.
2. Observe the 2105 Model Exx/Fxx primary power supply (PPS) to 2105
Expansion Enclosure RPC control cables.
v 2105 Expansion Enclosure PPS-1 connector J4 to 2105 Model Exx/Fxx
RPC-1 connector J3.
v 2105 Expansion Enclosure PPS-2 connector J4 to 2105 Model Exx/Fxx
RPC-2 connector J3.
Are both cables properly connected?
v Yes, continue with the next step.
v No, before reconnecting the cable, go to the 2105 Expansion Enclosure
PPS it should be connected to and set the input circuit breaker to the off
position. The 2105 Model Exx/Fxx RPC cards can stay powered on while
the cable is connected. Connect the cable. Set the input circuit breaker to
on (up), then attempt to power the 2105 Expansion Enclosure on again. If it
still fails continue with the next step.
3. Ensure each 2105 Expansion Enclosure PPS Main Line CB200 circuit breaker
is set to on (up).
[Figure: Primary Power Supply, front and rear views. Labels: UEPO PWR, UEPO LOOP-STBY, PWR GOOD, PWR UNIT FAULT, and ON BATTERY indicators; PPS Digital Status (two digits).]
Figure 34. 2105 Primary Power Supply Locations (s009048)
4. Observe each 2105 Expansion Enclosure PPS UEPO PWR indicator. Is the
indicator on?
v Yes, continue with the next step.
v No, go to “MAP 2350: Isolating PPS Status Indicator Codes” on page 80 for
status code 8 and perform the actions listed.
5. Ensure the 2105 Expansion Enclosure operator panel UEPO switches are set
to on (up).
6. Observe the 2105 Expansion Enclosure PPS UEPO LOOP-STBY indicator.
Is it on?
v Yes, continue with the next step.
v No, go to “MAP 2380: Isolating 2105 Expansion Enclosure UEPO Problems”
on page 86.
7. Observe the PPS Good indicator.
Is it slow blinking?
v Yes, the PPS is in standby mode, waiting for a power on request. Continue
at the next step.
v No, replace the PPS. If the 2105 Expansion Enclosure still fails to power on,
return to the beginning of this MAP.
8. Observe the PPS Fault indicator.
Is it on?
v Yes, use the PPS status code displayed to repair the problem. Go to “MAP
2350: Isolating PPS Status Indicator Codes” on page 80.
v No, continue with the next step.
9. Observe the PPS Status Code display.
Is a status code displayed?
v Yes, use the PPS status code displayed to repair the problem. Go to “MAP
2350: Isolating PPS Status Indicator Codes” on page 80. Return to the
beginning of this MAP after the repair is complete.
v No, continue with the next step.
10. Attempt to power on the 2105 Expansion Enclosure. It is best to have the 2105
Model Exx/Fxx in local power control mode instead of remote power control
mode. Ensure the Power Select switch on each 2105 Model Exx/Fxx RPC card
is in the Local position (down). Press the 2105 Model Exx/Fxx operator panel
Local power control switch momentarily to the on position (up).
Note: Remember to return these switches to their original position after the
repair is complete.
[Figure: 2105 Model Exx/Fxx operator panel, front and rear views. Labels: Unit Emergency, LOCAL/REMOTE (L/R) switch, Local Power, Ready (Cluster 1, Cluster 2), Power Complete (Line Cord 1, Line Cord 2), Messages (Cluster 1, Cluster 2).]
Figure 35. 2105 Model Exx/Fxx Operator Panel Locations (S008811m)
11. Observe each 2105 Expansion Enclosure PPS. Find the condition that now
exists.
v The PPS Pwr Good indicator is on solid which is normal operation. 390V
output is being supplied to electronics cage and storage cage power
supplies. The 2105 Model Exx/Fxx should be powering on. If not, reenter the
service guide with the new symptom(s).
v A PPS status code is displayed. Go to “MAP 2350: Isolating PPS Status
Indicator Codes” on page 80.
v The PPS Pwr Good indicator is still slow blinking. Continue at the next step.
12. Replace the PPS.
Does it still fail?
v Yes, continue with the next step.
v No, go to “MAP 1500: Ending a Service Action” on page 68.
13. Replace the RPC card for the PPS that is slow blinking and then attempt to
power on. (RPC-1 for PPS-1, RPC-2 for PPS-2) If the 2105 Model Exx/Fxx
clusters are in Ready, use the service terminal FRU Replace menu option to
replace the RPC card. If it still fails, call the next level of support. If it no longer
fails go to “MAP 1500: Ending a Service Action” on page 68.
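The indicator checks in steps 4 through 9 of this MAP follow a fixed order, which the sketch below restates. It is an illustration only: PpsObservation and next_action are hypothetical names, and the returned strings simply name the MAP or action given in the corresponding step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PpsObservation:
    """What the service representative sees on one Expansion Enclosure PPS."""
    uepo_pwr_on: bool
    uepo_loop_stby_on: bool
    pwr_good: str                  # "slow blinking", "on", or "off"
    fault_on: bool
    status_code: Optional[str]     # None when the status display is blank


def next_action(obs: PpsObservation) -> str:
    """Walk steps 4 to 9 of MAP 2420 in order and return the first exit."""
    if not obs.uepo_pwr_on:                      # step 4
        return "MAP 2350 (status code 8)"
    if not obs.uepo_loop_stby_on:                # step 6
        return "MAP 2380"
    if obs.pwr_good != "slow blinking":          # step 7
        return "Replace the PPS"
    if obs.fault_on or obs.status_code is not None:   # steps 8 and 9
        return "MAP 2350"
    return "Attempt power on (step 10)"
```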
MAP 2430: One RPC Card Firmware Down Level
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
The firmware code in one RPC card is not at the latest level available.
Isolation
1. The firmware installed on the RPC card is down level from the latest available
on the 2105 Model Exx/Fxx LIC code library. The problem log that sent you here
displays the RPC card that is down level in the FRUs list.
2. Return to the service terminal and follow the displayed instructions to load the
RPC code.
Note: Do not press F3 to escape out of the problem. Do not use the LIC Menu
options to update the RPC card firmware.
MAP 2440: Isolating 2105 Model Exx/Fxx Power Off Problems
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
The following must occur for the 2105 Model Exx/Fxx to power off. Both RPC cards
must receive a power off request. This is from the 2105 Model Exx/Fxx operator
panel if in Local mode or from the HDI card if in Remote mode. Both RPC cards
must agree that they have received a power off request. If one RPC card is fenced
(quiesced), the other card can power off the 2105 Model Exx/Fxx without getting
agreement.
If a pinned data condition exists, the power off request will be ignored. The power
off request will work after the pinned data condition is cleared.
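The power off conditions described above can be written as a small predicate. This is a sketch of the stated rules only, not of the RPC firmware; the function name and parameters are hypothetical, and the case where both cards are fenced is not covered by the description, so it is treated here (as an assumption) as no power off.

```python
def power_off_allowed(rpc1_request: bool,
                      rpc2_request: bool,
                      rpc1_fenced: bool = False,
                      rpc2_fenced: bool = False,
                      pinned_data: bool = False) -> bool:
    """Return True when the described conditions permit a power off."""
    if pinned_data:
        # A pinned data condition causes the request to be ignored.
        return False
    if rpc1_fenced and rpc2_fenced:
        # Not covered by the description; assumed to block power off.
        return False
    if rpc1_fenced:
        return rpc2_request        # the other card acts without agreement
    if rpc2_fenced:
        return rpc1_request
    return rpc1_request and rpc2_request   # both cards must agree
```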
Isolation
1. Connect the service terminal to a cluster that will not power off.
From the service terminal Main Service Menu, select:
Utilities Menu
Pinned Data Menu
Display Pinned Data
Are any volumes displayed with retryable, non-retryable or FC status?
v Yes, go to “MAP 4520: Pinned Data and/or Volume Status Unknown” on
page 363.
v No, continue with the next step.
2. This procedure will power off the 2105 Model Exx/Fxx. Ensure the customer is
not using it.
3. Observe the 2105 Model Exx/Fxx operator panel Line Cord and Cluster
Message indicators. If the cluster Line Cord and Message indicators are
blinking rapidly, a power off is already in progress. Wait for the power off to
complete; this can take up to 5 minutes. If one or both line cord indicators are
still on solid, the 2105 Model Exx/Fxx cannot power off. Go to the next step.
4. The settings of the RPC card switches control where the 2105 Model Exx/Fxx
is powered off from. Only the switches listed below must be set the same on
both RPC cards. (RPC card DIP switches 1 and 2 are set opposite each other
as they define RPC card 1 or 2.) There are four valid switch settings. Find the
description that matches your settings. Ensure that the correct power off
procedure is being used.
v RPC card Local/remote switch in Local (down) and RPC card DIP switch (at
bottom of card) position 3 in off (to left). Use the 2105 Model Exx/Fxx
operator panel Local Power switch to power off.
v RPC card Local/remote switch in Local (down) and RPC card DIP switch (at
bottom of card) position 3 in on (to right). Use the 2105 Model Exx/Fxx
operator panel Local Power switch to power off.
v RPC card Local/remote switch in remote (up) and RPC card DIP switch (at
bottom of card) position 3 in off (to left). Use the 2105 Model Exx/Fxx
operator panel Local Power switch to power off.
v RPC card Local/remote switch in remote (up) and RPC card DIP switch (at
bottom of card) position 3 in on (to right). All attached host systems must be
powered off. When the last host system powers off, the 2105 Model Exx/Fxx
should power off.
5. Are the RPC switches set to use the 2105 Model Exx/Fxx operator panel Local
Power switch?
v Yes, continue with step 7.
v No, continue with the next step.
6. Set the RPC card DIP switch position 3 to off (to left) for both RPC cards and
then attempt to power down using the operator panel Local Power switch.
Does the 2105 Model Exx/Fxx power off now?
v Yes, power off only fails in remote mode. Return the DIP switch position 3
back to on (to right) for both RPC cards. Go to “MAP 2390: Remote Power
On Not Working” on page 88.
v No, power off fails in both remote and local modes. Leave the switches set
for Local mode. (After the problem is fixed, remember to set the switches
back to remote mode.) Continue with the next step.
7. Connect the service terminal and use the Repair Menu, Show / Repair
Problems Needing Repair option to repair any related power problems (PPS,
RPC, cluster). If a problem is found and repaired, retry the operation that sent
you here. If no problems are found go to the next step.
8. Check the operation of the operator panel Local Power switch. Momentarily
press the Local Power switch to on (up). Observe both PPS status displays;
they should display the PPS code level with the repeated sequence 00-xx-yy
(xx=code level, yy=PPS I.D.).
Do both PPS display the code level sequence?
v Yes, the Local Power switch cables are connected to both RPC cards. Go
to the next step.
v No, the PPS that did not display the code level should have created a new
problem log. Use the Repair Menu, Show / Repair Problems Needing Repair
option to repair the problem. If no related problem is found, go to “MAP
24A0: PPS Power On Problem” on page 104.
[Figure: Primary Power Supply, front and rear views. Labels: UEPO PWR, UEPO LOOP-STBY, PWR GOOD, PWR UNIT FAULT, and ON BATTERY indicators; PPS Digital Status (two digits).]
Figure 36. 2105 Primary Power Supply Locations (s009048)
9. The 2105 Model Exx/Fxx will only power off if both PPS power off. The PWR
GOOD indicator on the PPS will be slow blinking when the PPS is powered off
to standby mode. Standby mode is when the main output voltages are off, but
the PPS internal logic voltages and line cord input voltages are still on.
Press the 2105 Model Exx/Fxx operator panel Local Power switch momentarily
to off (down). Wait up to 5 minutes for the PPS PWR GOOD indicators to slow
flash (indicates powered off to standby mode).
Find the condition that applies to you.
v Both PPS PWR GOOD indicators are slow blinking. The 2105 Model
Exx/Fxx powered off successfully. Return to the
procedure that sent you here, or go to “MAP 1500: Ending a Service Action”
on page 68.
v Both PPS PWR GOOD indicators are on solid. Continue with the next
step.
v One PPS PWR GOOD indicator is on solid and the other is slow
blinking. One PPS powered off and the other did not. Ensure the PPS to
RPC card cable and all RPC card cables are properly connected. Do the
following:
– Momentarily press the operator panel Local Power switch to on. This will
cause both PPS to be powered on again. Wait until both PPS PWR
GOOD indicators are on solid. This allows the working PPS power
system to keep the 2105 Model Exx/Fxx powered on while the possible
failing FRUs are replaced.
– Replace the following FRUs until both PPS power off from the operator
panel Local Power switch: the PPS that failed to power off, the PPS to
RPC card cable, and the RPC card for that PPS. Use the service terminal
Repair Menu, Replace a FRU option. Once the problem has been
repaired, return to the procedure that sent you here or go to “MAP 1500:
Ending a Service Action” on page 68.
10. Both RPC cards must agree with each other to power off the 2105 Model
Exx/Fxx. Both RPC cards do not have to agree if one RPC card is already
fenced (quiesced) either by a problem or by using the service terminal Utility
menu options.
11. Use the service terminal Utility Menu, Resource Management Menu, Quiesce a
Resource option to quiesce RPC-1.
12. Press the operator panel Local Power switch momentarily to off.
Does the 2105 Model Exx/Fxx power off?
v Yes, one of the following FRUs is failing. RPC-1 card, 2105 Model Exx/Fxx
operator panel or RPC-1 to Operator Panel cable. Power the 2105 Model
Exx/Fxx on and then use the Repair Menu, Replace A FRU option to
replace the FRUs until it powers off. Then go to “MAP 1500: Ending a
Service Action” on page 68.
v No, resume RPC-1 and repeat this procedure for RPC-2. If this does not
repair the problem, call the next level of support.
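The four valid switch settings listed in step 4 of this MAP reduce to a single rule: only the combination of Local/remote switch in remote (up) and DIP switch position 3 on relies on the host systems to power off. The sketch below restates that rule; power_off_method is a hypothetical name used only for illustration.

```python
def power_off_method(local_remote: str, dip3_on: bool) -> str:
    """Return how the 2105 Model Exx/Fxx is powered off for a switch setting.

    local_remote is the RPC card Local/remote switch ("Local" or "Remote");
    dip3_on is True when RPC card DIP switch position 3 is on (to right).
    """
    if local_remote not in ("Local", "Remote"):
        raise ValueError("local_remote must be 'Local' or 'Remote'")
    if local_remote == "Remote" and dip3_on:
        # All attached host systems must be powered off.
        return "host systems"
    # The other three settings use the operator panel Local Power switch.
    return "operator panel Local Power switch"
```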
MAP 2460: Battery Charge Low
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The 390V Battery Set did not reach full charge in 30 hours. An uncharged battery
set will be charged at a high rate for up to 5 hours with a switched 750 mA current.
It is then charged at a low rate for up to 25 hours with a constant 750 mA current. It
then begins a trickle charge.
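The 30 hour limit is the sum of the two charge phases: up to 5 hours at high rate plus up to 25 hours at low rate. The sketch below models only this worst-case timeline (a phase may end sooner on a real battery set); the constant and function names are hypothetical.

```python
HIGH_RATE_HOURS = 5      # switched 750 mA current, up to 5 hours
LOW_RATE_HOURS = 25      # constant 750 mA current, up to 25 hours
FULL_CHARGE_LIMIT_HOURS = HIGH_RATE_HOURS + LOW_RATE_HOURS  # 30 hours


def charge_phase(elapsed_hours: float) -> str:
    """Return the worst-case charge phase for an uncharged battery set."""
    if elapsed_hours < 0:
        raise ValueError("elapsed_hours must not be negative")
    if elapsed_hours < HIGH_RATE_HOURS:
        return "high rate"
    if elapsed_hours < FULL_CHARGE_LIMIT_HOURS:
        return "low rate"
    return "trickle"
```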
Isolation
1. Ensure the circuit breaker on the master battery (under PPS-1) is set to on.
2. Ensure the cable between the master and slave battery is connected.
3. Ensure both cables between the master battery and PPS 1 are connected.
4. The 03 will automatically go blank when the battery set reaches full charge in
not more than 30 hours. The 03 is always displayed for 5 minutes (PPS code
level 20 or greater) when PPS 1 powers on. Then the battery charge level is
checked.
5. Wait up to 30 hours for the batteries to reach full charge.
MAP 2470: Battery Set Detection Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The battery has a low charge or PPS-1 has detected a battery fault condition. If
code 03 is displayed, the battery is low and is charging. A battery that is completely
discharged can require up to 25 hours to become fully charged. The system will
report a permanent battery failure if the condition persists beyond the normal
charge time.
If code 04 is displayed, a battery failure is indicated. This condition may have been
introduced during replacement of PPS-1 or the battery. Do the following actions to
reset and then retry the battery failure condition.
Note: If the battery set is the FRU, both halves of the battery must be replaced at
the same time.
Isolation
1. Ensure that both PPS 1 to battery signal cables are connected (PPS-1 J5B
connector and PPS-1 J5A connector).
2. Ensure the 390 V battery to battery cable is connected.
3. Ensure the battery CB is in the ON position (up).
4. Press the PPS-1 system power MAIN LINE circuit breaker (CB00) to OFF
(down).
5. Wait 10 seconds and then set the PPS-1 system power MAIN LINE circuit
breaker (CB00) to ON (up).
6. Press the 2105 Model Exx/Fxx operator panel Local power switch momentarily
to the On position.
7. If code 04 is still displayed, replace the 390V battery set. See ″390 V Battery
Set Removal and Replacement, 2105 Model Exx/Fxx and Expansion Enclosure″
in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2 book.
8. If code 03 is displayed, the battery is charging. Wait up to 30 hours. If code 03
is still displayed, the battery is not being charged. Replace the PPS and the two
battery signal cables. Use the service terminal Repair Menu, Replace a FRU,
Power Cooling FRUs menu options.
9. When the repair is complete go to “MAP 1500: Ending a Service Action” on
page 68.
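After the reset in steps 4 through 6, steps 7 through 9 of this MAP depend only on which status code is displayed and, for code 03, how long it has been displayed. The sketch below restates that decision. The function name is hypothetical, and treating any other display as "repair complete" is an assumption.

```python
def battery_action(status_code: str, hours_waited: float = 0.0) -> str:
    """Return the action from steps 7 to 9 of MAP 2470 for a PPS-1 code."""
    if status_code == "04":
        return "Replace the 390 V battery set"
    if status_code == "03":
        if hours_waited < 30:
            return "Wait: the battery is charging"
        # Code 03 still displayed after 30 hours: battery is not charging.
        return "Replace the PPS and the two battery signal cables"
    # Assumption: no battery code displayed means the repair is complete.
    return "MAP 1500: Ending a Service Action"
```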
MAP 2490: PPS Input Phase Missing
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The model number of this 2105 Model Exx/Fxx requires that all PPS have three
phase input power. This allows for maximum power output. If single phase input
power is used, only 60% of maximum power output is available.
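As a worked example of the 60% figure, the sketch below computes the output available for a given input. The function name is hypothetical and the wattage in the usage note is purely illustrative; this guide does not state the PPS maximum output.

```python
SINGLE_PHASE_FRACTION = 0.60   # 60% of maximum output on single phase input


def available_output(max_output_watts: float, input_phases: int) -> float:
    """Return the PPS output available for one or three phase input power."""
    if input_phases == 3:
        return max_output_watts
    if input_phases == 1:
        return max_output_watts * SINGLE_PHASE_FRACTION
    raise ValueError("input_phases must be 1 or 3")
```

For an illustrative maximum of 1000 W, single phase input would leave about 600 W available.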
Isolation
1. The PPS powered up and detected single phase input power when it should
have three phase input power. If the three phase input power had dropped to
single phase after power up, a PPS status code 07 would be displayed.
2. Use the service guide install chapter procedures to check the customer input to
the PPS line cords. Use the service terminal Repair Menu, Replace a FRU,
Rack Power Cooling FRUs option to prepare the PPS to be powered off for the
power checks. The PPS line cord will need to be disconnected from the
customer power source.
3. When the problem is repaired go to “MAP 1500: Ending a Service Action” on
page 68.
MAP 24A0: PPS Power On Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Each time the 2105 Model Exx/Fxx operator panel Local power switch is
momentarily pressed to on (up), the PPS status display should display a sequence
of 2-character codes. If it does not, either the PPS is not providing power to its
RPC card, the RPC card is not sending a power on request to the PPS, or the PPS
itself is failing.
[Figure: Primary Power Supply, front and rear views. Labels: UEPO PWR, UEPO LOOP-STBY, PWR GOOD, PWR UNIT FAULT, and ON BATTERY indicators; PPS Digital Status (two digits).]
Figure 37. 2105 Primary Power Supply Locations (s009048)
Isolation
1. Switch the failing PPS input circuit breaker to off. Unplug the PPS to PPS
communication cable from the J3 connector. (This removes both power sources
from the PPS logic.)
2. Plug the cable back to the J3 connector. Switch the input circuit breaker to on.
3. Observe the PPS UEPO PWR indicator.
Is the indicator on solid?
v Yes, the PPS has customer line cord input power. Go to the next step.
v No, either the customer line cord power is off or the PPS is failing. Use the
instructions in ″Check the Customer’s Circuit Breaker with the Power On″ in
chapter 5 of the Enterprise Storage Server Service Guide, Volume 2, to
measure the input voltages. If the input voltage is present, replace the PPS.
Use the service terminal Repair Menu, Replace a FRU option.
4. Observe the PPS UEPO Loop Stby indicator.
Is the indicator on solid?
v Yes, the UEPO is working correctly. Go to the next step.
v No, the UEPO is not working correctly. Go to “MAP 2360: 2105 Model
Exx/Fxx UEPO Problems” on page 82.
5. Observe the PPS PWR GOOD indicator.
Find the indicator condition you have.
v On solid. The PPS powered on without a power on request from the 2105
Model Exx/Fxx operator panel local power switch (while in local power mode).
Replace the following FRUs until this no longer occurs. Failing PPS, the RPC
card for this PPS, the RPC card for the other PPS. Use the service terminal
Repair Menu, Replace a FRU option. If it still fails, call the next level of
support.
v Slow flashing. The PPS is in the expected standby mode. Go to the next
step.
v Off. Replace the failing PPS. Use the service terminal Repair Menu, Replace
a FRU option.
6. Press the 2105 Model Exx/Fxx operator panel Local power control switch
momentarily to the on position (up). A sequence of 2 character status codes
should be displayed and then the PPS PWR GOOD indicator should be on
solid.
Find the indicator condition you have.
v No status codes displayed, PWR GOOD indicator on. The PPS is
powered on properly but is not displaying progress codes. Replace the failing
PPS. Use the service terminal Repair Menu, Replace a FRU option.
v No status codes displayed, PWR GOOD indicator off. Replace the
following FRUs until status codes are displayed: failing PPS, RPC card,
operator panel local power control, PPS to RPC cable, RPC to operator panel
cable. Use the service terminal Repair Menu, Replace a FRU option.
v Status code displayed, PWR GOOD indicator off. Go to “MAP 2350:
Isolating PPS Status Indicator Codes” on page 80. Look up each status code
and repair the one that indicates a failure.
v Status code displayed, PWR GOOD indicator on. The PPS powered on
normally. Return to the original procedure that sent you here or go to “MAP
1500: Ending a Service Action” on page 68.
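The four conditions in step 6 of this MAP form a two-by-two decision table on whether status codes were displayed and whether the PWR GOOD indicator is on. The sketch below restates it; STEP_6_ACTIONS, FRU_ORDER, and step_6_action are hypothetical names used for illustration.

```python
# Replacement order listed in step 6 when nothing is displayed and
# the PWR GOOD indicator is off.
FRU_ORDER = [
    "failing PPS",
    "RPC card",
    "operator panel local power control",
    "PPS to RPC cable",
    "RPC to operator panel cable",
]

# (status codes displayed, PWR GOOD on) -> action from step 6.
STEP_6_ACTIONS = {
    (False, True): "Replace the failing PPS",
    (False, False): "Replace FRUs in FRU_ORDER until status codes are displayed",
    (True, False): "MAP 2350: Isolating PPS Status Indicator Codes",
    (True, True): "PPS powered on normally: return or go to MAP 1500",
}


def step_6_action(status_codes_displayed: bool, pwr_good_on: bool) -> str:
    """Look up the step 6 action for the observed indicator condition."""
    return STEP_6_ACTIONS[(status_codes_displayed, pwr_good_on)]
```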
MAP 24B0: Cannot Power Off, Pinned Data
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
When a pinned data condition occurs a problem log is created. The power control
microcode will not allow the 2105 Model Exx/Fxx to power off until after the pinned
data condition is repaired. The attempt to power off also disables all the host
system interfaces.
Isolation
An attempt to power off the 2105 Model Exx/Fxx failed because a pinned data
condition already exists. There are two ways to power off successfully:
v Repair the pinned data condition and then retry the power off. Go to “MAP 4520:
Pinned Data and/or Volume Status Unknown” on page 363.
v If the 2105 Model Exx/Fxx needs to be powered off in an emergency, and if the
customer agrees that the pinned data can be lost, then set the operator panel
UEPO switch to off (down).
MAP 24F0: Both RPC Cards Firmware Down Level
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
The firmware code in both RPC cards is not at the latest level available.
Isolation
1. The firmware installed on both RPC cards is down level from the latest available
on the 2105 Model Exx/Fxx LIC code library.
From the service terminal Main Service Menu, select:
Licensed Internal Code Maintenance Menu
Multiple LIC Activation
(Concurrent option)
Note: Do not use the Licensed Internal Code Maintenance Menu, Firmware LIC
menu option.
2. Go to “MAP 1500: Ending a Service Action” on page 68.
MAP 2520: PPS Output Circuit Breaker Tripped
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A tripped PPS output circuit breaker will display a status code 10. An over-current
condition can cause this. The loads connected to this circuit breaker will be
disconnected until the problem is isolated.
Isolation
1. Ensure that the circuit breaker (CB) is still tripped.
2. Disconnect the power cable from the connector beneath the tripped CB.
3. Reset the CB to on (up).
Does the CB trip?
v Yes, replace the PPS. Use the service terminal, Repair Menu, Replace a
FRU menu options.
v No, continue with the next step.
4. Disconnect the other ends of the power cable. Each power cable supplies the
input for up to three power supplies. Manually trace the power cable from the
PPS to the other power supplies. Observe each power supply input indicator to
ensure the input power is already missing before disconnecting the power cable.
5. Reconnect the PPS power cable beneath the tripped CB.
Reset the CB to on (up). Does the CB trip?
v Yes, replace the power supply cable and then repeat this step.
v No, continue with the next step.
6. Reconnect the power cable to one power supply input and then set the CB to
the on position (up).
Does the CB trip?
v Yes, replace the power supply that was just connected. Use the service
terminal, Repair Menu, Replace a FRU menu options.
v No, repeat this step until all the power supplies are connected and the CB no
longer trips. Then use the service terminal Repair Menu, End of Call Status
menu option.
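This MAP isolates the over-current source by elimination: first the PPS alone, then the power cable, then each power supply as it is reconnected. The sketch below restates that order. The function and parameter names are hypothetical, and each boolean stands for the service representative's observation of whether the CB tripped in that configuration.

```python
from typing import Dict, Optional


def isolate_tripped_cb(trips_with_cable_removed: bool,
                       trips_with_cable_only: bool,
                       trips_when_connected: Dict[str, bool]) -> Optional[str]:
    """Return the first FRU to replace, or None if the CB no longer trips.

    trips_when_connected maps each power supply, in the order it is
    reconnected in step 6, to whether the CB tripped after connecting it.
    """
    if trips_with_cable_removed:       # step 3: nothing attached to the CB
        return "PPS"
    if trips_with_cable_only:          # step 5: cable attached, far ends open
        return "power supply cable"
    for supply, trips in trips_when_connected.items():   # step 6
        if trips:
            return supply
    return None                        # all supplies connected, CB holds
```

A result of None corresponds to the "No" branch of step 6, where the procedure ends with the End of Call Status menu option.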
MAP 2540: Power Problem Detected By Cluster Bay
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The cluster bay detected an over-voltage or under-voltage condition with the power
it receives from the electronics cage power supplies. A power supply might be out
of specification, or there might be a bad connection between the power supply or
supplies and the cluster bay. The cluster bay may be affecting the power supply
voltage regulation.
Isolation
1. Use the service terminal to display and repair any related electronics cage
power problems. Use the service terminal Repair Menu, Show / Repair
Problems Needing Repair option.
2. Replace each electronics cage power supply one at a time. Use the service
terminal Repair Menu, Replace a FRU, Electronics Cage Power Cooling FRUs
option.
3. If all three electronics cage power supplies have been replaced and it still fails,
one of the following FRUs may be failing:
v Electronics cage power planar. Go to “MAP 4790: Repairing the Electronics
Cage” on page 395.
v Cluster Bay Power Planar. Go to “MAP 4700: Replacing Cluster FRUs” on
page 375.
v Cluster Bay Power Planar To Docking Connector Cable. Go to “MAP 4700:
Replacing Cluster FRUs” on page 375.
4. If it still fails, call the next level of support.
MAPs 3XXX SSA DASD Drawer Isolation Procedures
Procedures in the MAP 3XXX group of the Isolate chapter cover the SSA DASD in
the 2105 Model Exx/Fxx, 2105 Expansion Enclosure, and 2105 Model 100 units.
Using the SSA DASD drawer Maintenance Analysis Procedures (MAPs)
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
These maintenance analysis procedures (MAPs) describe how to analyze a
continuous failure that has occurred in a DDM bay or SSA DASD Model 020 or 040
drawer. Failing field-replaceable units (FRUs) of the DDM bay or SSA DASD drawer
can be isolated with these MAPs.
To locate a DDM bay or SSA DASD Model 020 or 040 drawer in a 2105, see
″Locating a DDM Bay or SSA DASD Model 020 or 040 Drawer in a 2105 Rack″ in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
To locate a FRU in a DDM bay or SSA DASD Model 020 or 040 drawer in a 2105,
see:
v ″DDM Bay, Component Physical Location Codes″ in chapter 7 of the Enterprise
Storage Server Service Guide, Volume 3.
v ″SSA DASD Drawer Component Physical Location Codes, Model 020 Drawer″ in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
v ″SSA DASD Drawer Component Physical Location Codes, Model 040 Drawer″ in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
To isolate the FRUs in the failing DDM bay or SSA DASD drawer, do the actions
and answer the questions given in these MAPs.
See “SSA DASD Model 020 Drawer Indicators and Power Switch” on page 9 for
locations and descriptions of the indicators and switches.
Attention: Do not power off the 2105 rack, DDM bay, or SSA DASD drawer unless
instructed to do so.
Attention: If all steps in these MAPs have been followed, and verification of the
repair is still unsuccessful, call the next level of support.
Attention: Disk drive modules are fragile. Handle them with care, and keep them
well away from strong magnetic fields.
MAP 3000: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The SSA link between two adjoining disk drive modules (DDMs) is failing. The
failing link is between two adjoining DDMs, on the same backplane, in the same left
or right group of four DDMs. See Figure 38 for the relationship of the DDM and
backplane FRUs involved with this failure.
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
v DDM locations in SSA DASD Model 020 or 040 drawer, two adjoining DDMs in
DDM drawer positions 1 to 4, 5 to 8, 9 to 12, or 13 to 16
v DDM locations in DDM bay, two adjoining DDMs in DDM drawer positions 1 to 8
Figure 38. SSA Link Failure, Two Adjoining DDMs (S007656l)
[Figure labels: Backplane or DDM Bay Backplane (Front or Back); DDM; DDM]
Isolation
1. Review if any other problems (pending or open) have a single DDM as the
FRU.
Are there any pending or open problems with a single DDM as the FRU?
v Yes, go to step 2.
v No, go to step 3.
2. Compare the single DDM FRU in the pending or open problem with the DDMs
in the problem you are working on.
Is the DDM in the open or pending problem the same as one of the DDMs in
the problem you are working on?
v Yes, repair the problem with the single DDM FRU first; doing so should fix the
problem you are working on.
v No, go to step 3.
3. Replace the first of the two DDMs displayed on the service terminal, then verify
the repair.
Note: If the amber check indicator on one of the two DDMs is on, replace that
DDM first. See Figure 6 on page 14.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 4.
4. Replace the second DDM displayed on the service terminal with the DDM
removed in step 3, then verify the repair.
Note: The service terminal will determine if the second DDM being replaced is
in the same array as the first DDM. If both DDMs are in the same array,
the service terminal will instruct you to wait for sparing to complete.
When sparing for the first DDM replacement completes, the second DDM
can be replaced.
DDM sparing time can be many hours. Sparing time varies with system usage
and the storage capacity of the DDM being spared. An 18 GB drive may take 36
hours to spare on a heavily used system.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 5.
5. Replace the front or back backplane displayed on the service terminal or the
frame assembly, then verify the repair.
v SSA DASD Model 020
– Front backplane, see ″Front Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
– Back backplane, see ″Back Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v SSA DASD Model 040
– Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Note: For SSA DASD Model 040 drawers, the backplanes are both
replaced at the same time by replacing the frame assembly.
v DDM bay
– Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, call the next level of support.
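The replace-and-verify order in MAP 3000 can be summarized as a short sketch. The Python below is illustrative only and is not part of the service terminal software. The function names, the enclosure labels ("model_020", "model_040", "ddm_bay"), and the replace_and_verify callback are assumptions made for this example. The sketch also leaves out the wait for sparing between the two DDM replacements.

```python
from typing import Callable, List, Optional, Sequence


def map_3000_fru_order(displayed_ddms: Sequence[str],
                       enclosure: str,
                       amber_ddm: Optional[str] = None) -> List[str]:
    """Return the FRUs in the order MAP 3000 steps 3 to 5 replace them."""
    first, second = displayed_ddms
    # Note in step 3: a DDM whose amber check indicator is on goes first.
    if amber_ddm == second:
        first, second = second, first
    # Step 5: Model 020 drawers have separately replaceable backplanes;
    # Model 040 drawers and DDM bays replace the whole frame assembly.
    last = "backplane" if enclosure == "model_020" else "frame assembly"
    return [first, second, last]


def isolate(frus: Sequence[str],
            replace_and_verify: Callable[[str], bool]) -> Optional[str]:
    """Replace FRUs one at a time; stop when repair verification passes."""
    for fru in frus:
        if replace_and_verify(fru):
            return fru
    # Every FRU was replaced and verification still fails.
    return None
```

A return value of None from isolate corresponds to the last outcome in step 5: call the next level of support.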
MAP 3010: Isolating a Degraded SSA Link
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
The 40 MB/s SSA link between two adjoining disk drive modules (DDMs) is
degraded and is running at 20 MB/s. The degraded link is between two adjoining
DDMs, on the same backplane. See Figure 39 for the relationship of the DDM and
backplane FRUs involved with this failure.
v Drawer models, SSA DASD Model 040, or SSA DASD DDM bay
v DDM locations in SSA DASD Model 040, two adjoining DDMs in DDM drawer
positions 1 to 4, 5 to 8, 9 to 12, or 13 to 16
v DDM locations in DDM bay, two adjoining DDMs in DDM drawer positions 1 to 8
Figure 39. SSA Link Failure, Two Adjoining DDMs (S007656l)
[Figure labels: Backplane or DDM Bay Backplane (Front or Back); DDM; DDM]
Isolation
1. Replace the first of the two DDMs displayed on the service terminal, then verify
the repair.
Note: If the amber check indicator on one of the two DDMs is on, replace that
DDM first. See Figure 6 on page 14.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to step 2.
2. Replace the second DDM displayed on the service terminal with the DDM
removed in step 1, then verify the repair.
Note: The service terminal will determine if the second DDM being replaced is
in the same array as the first DDM. If both DDMs are in the same array,
the service terminal will instruct you to wait for sparing to complete.
When sparing for the first DDM replacement completes, the second DDM
can be replaced.
DDM sparing time can be many hours. Sparing time varies with system usage
and the storage capacity of the DDM being spared. An 18 GB drive may take 36
hours to spare on a heavily used system.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to step 3.
3. Replace the frame assembly displayed on the service terminal, then verify
the repair.
v SSA DASD Model 040
– Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Note: For SSA DASD Model 040 drawers, the backplanes are both
replaced at the same time by replacing the frame assembly.
v DDM bay
– Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, call the next level of support.
MAP 3050: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
An SSA link failed between a DDM and the SSA device card. The failing FRU is
either a center DDM, a signal or bypass card, an SSA device cable, or an SSA
device card. See Figure 40 for the relationship of the DDM, signal or bypass card,
backplane, SSA device cable and SSA device card FRUs involved with this failure.
v DDM bay A
v DDM bay B
Figure 40. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008041l)
[Figure labels: SSA Device Card; SSA Device Cable (two); Bypass Card; Passthrough Card (two); DDM; DDM Bay - A; DDM Bay - B; DDM Bay Backplane (Front or Back) (two)]
Isolation
1. Review if any other problems (pending or open) have a single DDM as the
FRU.
Are there any pending or open problems with a single DDM as the FRU?
v Yes, go to step 2.
v No, go to step 3 on page 114.
2. Compare the single DDM FRU in the pending or open problem with the DDM
in the problem you are working on.
Is the DDM in the open or pending problem the same as the DDM in the
problem you are working on?
v Yes, repair the open or pending problem with the single DDM FRU first; doing so
should fix the problem you are working on.
v No, go to step 3.
3. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, verify that the SSA cables are connected correctly; go to step 4.
v No, continue with step 6.
4. Verify that the SSA cables are connected correctly. Look at the cables
displayed on the Detail Problem screen. Compare the cables displayed with
the cabling of the DDM bay. See Locating an SSA Cable.
Are any of the cables connected wrong?
v Yes, connect the cables to the correct connectors, then go to step 5.
v No, go to step 6.
5. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable you just connected correctly. Proceed
through the repair but do not replace any FRU or disconnect any cables. This
will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, go to step 6.
6. Locate the SSA cables displayed on the service terminal as possible FRUs.
For this isolation procedure, one of the SSA cables is connected between
a DDM bay and an SSA device card. The other SSA cable is connected
between the same DDM bay and another DDM bay. The service terminal will
identify the drawer and its SSA connector, and the SSA device card and its
SSA connector. To locate a drawer see ″Locating a DDM Bay or SSA DASD
Model 020 or 040 Drawer in a 2105 Rack″ in chapter 7 of the Enterprise
Storage Server Service Guide, Volume 3. To locate SSA cable connectors on a
drawer, see Figure 41 on page 115.
Note: The SSA device card cable connector is in the format R1-Bx-Ky-yy,
where Bx is the bay location, Ky is the card location, and yy is the cable
connector. To locate an SSA device card cable connector, see Figure 42
on page 115.
Figure 41. DDM bay SSA Connectors (S007693l)
Note: The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy,
where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
Use the figure below to locate an SSA device card cable connector.
Figure 42. Cluster SSA Device Card Connector Locations (S008022m)
[Figure labels: Cluster 1/2 (Model Exx/Fxx), front view; SSA device card connectors B2, B1, A2, A1; SSA device card locations R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy, and R1-Tx-P2-K9-yy; callouts "(Model F10/F20 only)" and "(Model E10/E20 only)"]
a. Disconnect one of the two SSA device cables shown in Figure 40 on
page 113, and listed in the Problem FRU list.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
b. Inspect the cable connectors for bent pins and correct any problems found.
Reconnect both ends of the SSA device cable and ensure a good connection.
c. Run the repair verification. Select one cable from the Problem FRU list and
follow the repair process and verification without actually replacing the
cable.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, select one of the following.
– If you have inspected only one cable, repeat the above steps on the
second cable.
– If you have inspected both cables, go to step 7.
7. Locate DDM bay A; it may be in the front or rear of the 2105. Observe all of
the DDM bay DDM Ready and Check indicators. See Figure 43.
Are any of the DDM bay DDM indicators on?
v Yes, go to step 8.
v No, there is a DDM bay power problem, go to “MAP 3395: Isolating an SSA
DASD DDM Bay Power Problem” on page 259.
8. Locate DDM bay B; it may be in the front or rear of the 2105. Observe all of
the DDM bay DDM Ready and Check indicators.
Are any of the DDM bay DDM indicators on?
v Yes, go to step 9.
v No, there is a DDM bay power problem, go to “MAP 3395: Isolating an SSA
DASD DDM Bay Power Problem” on page 259.
Figure 43. DDM bay DDM Indicator Locations (S008021l)
9. Replace the DDM displayed on the service terminal, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 10.
10. Replace the SSA device card displayed on the service terminal, then verify the
repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 11.
11. Replace the passthrough cards displayed on the service terminal. Replace
these cards one at a time, see ″Bypass and Passthrough Cards, DDM Bay″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2. After
each card is replaced, verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No,
– If all of the cards shown in Figure 40 on page 113, have been replaced,
go to step 12.
– If all of the cards shown in Figure 40 on page 113, have NOT been
replaced, repeat this step until all of the cards have been replaced.
12. Replace one of the two SSA device cables displayed on the service terminal
FRU list, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing.
– If both of the SSA device cables shown in Figure 40 on page 113, have
been replaced, go to step 13.
– If both of the SSA device cables shown in Figure 40 on page 113, have
NOT been replaced, repeat this step until all of the cables have been
replaced.
13. Replace the DDM bay frames displayed on the service terminal, one at a time:
v DDM bay
– Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing.
– If all of the backplanes shown in Figure 40 on page 113 have been
replaced and the SSA link is still failing, call the next level of support.
– If all of the backplanes shown in Figure 40 on page 113, have NOT been
replaced, repeat this step until all of the backplanes have been replaced.
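The R1-Tx-P2-Ky-yy connector location format described in step 6 can be checked mechanically. The Python below is a minimal sketch, not a tool supplied with the 2105. The names SsaConnector and parse_ssa_connector are hypothetical, and the pattern accepts any numeric card slot because the valid slots depend on the model.

```python
import re
from typing import NamedTuple

# R1-Tx-P2-Ky-yy: Tx is the cluster (1 or 2), Ky is the card slot,
# and yy is the cable connector (A1, A2, B1, or B2).
_LOCATION = re.compile(r"^R1-T([12])-P2-K(\d+)-(A1|A2|B1|B2)$")


class SsaConnector(NamedTuple):
    cluster: int      # 1 or 2
    card_slot: int    # the y in Ky; not checked against the model
    connector: str    # A1, A2, B1, or B2


def parse_ssa_connector(code: str) -> SsaConnector:
    """Split an SSA device card connector location code into its fields."""
    match = _LOCATION.match(code.strip())
    if match is None:
        raise ValueError(f"not an R1-Tx-P2-Ky-yy location code: {code!r}")
    return SsaConnector(int(match.group(1)), int(match.group(2)),
                        match.group(3))
```

For example, parse_ssa_connector("R1-T2-P2-K3-B1") yields cluster 2, card slot 3, and connector B1, while a code with cluster T3 or connector C1 raises ValueError.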
MAP 3060: Isolating a Degraded SSA Link
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
A 40 MB/s SSA link is degraded and is running at 20 MB/s, between a DDM and
the SSA device card. The degraded FRU is either a center DDM, a signal or bypass
card, an SSA device cable, or an SSA device card. See Figure 44 for the relationship
of the DDM, signal or bypass card, backplane, SSA device cable and SSA device
card FRUs involved with this failure.
v DDM bay A
v DDM bay B
Figure 44. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008041l)
[Figure labels: SSA Device Card; SSA Device Cable (two); Bypass Card; Passthrough Card (two); DDM; DDM Bay - A; DDM Bay - B; DDM Bay Backplane (Front or Back) (two)]
Isolation
1. Locate the SSA cables displayed on the service terminal as possible FRUs. For
this isolation procedure, one of the SSA cables is connected between a DDM
bay and an SSA device card. The other SSA cable is connected between the
same DDM bay and another DDM bay. The service terminal will identify the
drawer and its SSA connector, and the SSA device card and its SSA connector.
To locate a drawer see ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3. To locate SSA cable connectors on a drawer, see Figure 45
on page 119.
Note: The SSA device card cable connector is in the format R1-Bx-Ky-yy,
where Bx is the bay location, Ky is the card location, and yy is the cable
connector. To locate an SSA device card cable connector, see Figure 46
on page 119.
Figure 45. DDM bay SSA Connectors (S007693l)
Note: The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy,
where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
Use the figure below to locate an SSA device card cable connector.
Figure 46. Cluster SSA Device Card Connector Locations (S008022m)
[Figure labels: Cluster 1/2 (Model Exx/Fxx), front view; SSA device card connectors B2, B1, A2, A1; SSA device card locations R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy, and R1-Tx-P2-K9-yy; callouts "(Model F10/F20 only)" and "(Model E10/E20 only)"]
a. Disconnect one of the two SSA device cables shown in Figure 44 on
page 118, and listed in the Problem FRU list.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
b. Inspect the cable connectors for bent pins and correct any problems found.
There should be six pins in each plug. If there are fewer than six pins, replace
the cable. Reconnect both ends of the SSA device cable and ensure a good
connection.
c. Run the repair verification. Select one cable from the Problem FRU list and
follow the repair process and verification without actually replacing the cable.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded:
– If you have inspected only one cable, repeat the above steps on the
second cable.
– If you have inspected both cables, go to step 2.
2. Replace the passthrough and bypass cards displayed on the service terminal.
Replace these cards one at a time, see ″Bypass and Passthrough Cards, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
After each card is replaced, verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded:
– If all of the cards shown in Figure 44 on page 118, have been replaced, go
to step 3.
– If all of the cards shown in Figure 44 on page 118, have NOT been
replaced, repeat this step until all of the cards have been replaced.
3. Replace one of the two SSA device cables displayed on the service terminal
FRU list, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded:
– If both of the SSA device cables shown in Figure 44 on page 118, have
been replaced, go to step 4.
– If both of the SSA device cables shown in Figure 44 on page 118, have
NOT been replaced, repeat this step until all of the cables have been
replaced.
4. Replace the DDM displayed on the service terminal, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to step 5.
5. Replace the SSA device card displayed on the service terminal, then verify the
repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to step 6.
6. Replace the DDM bay frames displayed on the service terminal, one at a time:
v DDM bay
– Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded:
– If all of the backplanes shown in Figure 44 on page 118 have been
replaced and the SSA link is still degraded, call the next level of support.
– If all of the backplanes shown in Figure 44 on page 118, have NOT been
replaced, repeat this step until all of the backplanes have been replaced.
MAP 3077: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
An SSA link between a DDM and two SSA device cards is failing. The failing link
includes two SSA device cards, one bypass card, one passthrough card, three SSA
cables, and the DDM bay backplane. See Figure 47 for the relationship of these
FRUs.
The failure or incorrect connection of any of these components can cause the link
to fail. Other failures can also cause the link to fail. For example, a hot reset line to
the SSA device card can cause the connection between the two loop inputs to
appear to be open.
v Drawer models, DDM bay
Figure 47. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008141l)
Isolation
1. Write the following information on a piece of paper.
a. The Problem ID of this problem.
b. The number of the failing cluster, cluster 1 or 2.
c. The number of the other cluster:
v If cluster 1 is the failing cluster, record the other cluster as cluster 2.
v If cluster 2 is the failing cluster, record the other cluster as cluster 1.
2. Press F3 on the service terminal to list other problems.
Are there any other problems whose Failing Cluster is the other cluster written
down in step 1c?
v Yes, repair and verify them now. Repairing these problems may correct this
problem. After repair verification, continue with the next step.
v No, continue with step 4.
3. Did the repair of the other problems resolve the problem recorded in the last
step (problem ID not displayed)?
v Yes, this problem is resolved.
v No, continue with the next step.
4. Return to the original problem. Select one of the SSA device cards from the
Possible FRU to Replace list. Continue through the repair and verify process
but do not replace any FRU.
Did the verification test run without error?
v Yes, the problem is resolved. This problem was caused by a condition that
has now been resolved.
v No, continue with the next step.
5. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, verify that the SSA cables are connected correctly, continue with the
next step.
v No, continue with step 8 on page 124.
6. Verify that the SSA cables are connected correctly. Locate all of the three SSA
cables displayed by the service terminal as possible FRUs. These SSA cables
will each be connected between a DDM bay and an SSA device card. The
service terminal FRU Location will identify the DDM bay and SSA connector
where each end of the SSA cable is connected. To locate the DDM bay see
″Locating a DDM Bay or SSA DASD Model 020 or 040 Drawer in a 2105
Rack″ in chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
To locate SSA cable connectors on a DDM bay, see Figure 48 on page 123.
Figure 48. DDM bay SSA Connector Locations (S007693l)
Note: The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy,
where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
To locate an SSA device card cable connector, see Figure 49.
Figure 49. Cluster SSA Device Card SSA Connector Locations (S008022m)
[Figure labels: Cluster 1/2 (Model Exx/Fxx), front view; SSA device card connectors B2, B1, A2, A1; SSA device card locations R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy, and R1-Tx-P2-K9-yy; callouts "(Model F10/F20 only)" and "(Model E10/E20 only)"]
Are any of the cables connected wrong?
v Yes, connect the cables to the correct connectors, then continue with the next
step.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
v No, go to step 8 on page 124.
7. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable you just connected correctly. Proceed
through the repair but do not replace any FRU or disconnect any cables. This
will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, continue with the next step.
8. Locate the DDM bay; it may be located in the front or rear of the 2105.
Observe all of the DDM bay DDM Ready and Check indicators.
Are any of the DDM bay DDM indicators on?
v Yes, go to step 9.
v No, there is a DDM bay problem, go to “MAP 3395: Isolating an SSA DASD
DDM Bay Power Problem” on page 259.
Figure 50. DDM bay DDM Indicator Locations (S008021l)
9. Replace the DDM displayed on the service terminal, then verify the repair. See
″SSA Disk Drive Module, DDM Bay″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, continue with the next step.
10. Replace one of the SSA device cards displayed on the service terminal, then
verify the repair. See ″SSA Service Card, Cluster Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, continue with the next step.
11. Replace the other SSA device card displayed on the service terminal, then
verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, continue with the next step.
12. Replace the bypass card displayed on the service terminal, then verify the
repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Verify the jumpers on the bypass card are in the correct positions before
replacing the card. See the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, continue with the next step.
13. Replace the passthrough card displayed on the service terminal, then verify
the repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, continue with the next step.
14. Replace the first SSA device cable displayed on the FRU list on the service
terminal. To locate the cable, see step 6 on page 122.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to the next step.
15. Replace the second SSA device cable displayed on the FRU list on the service
terminal. To locate the cable, see step 6 on page 122.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to the next step.
16. Replace the third SSA device cable displayed on the FRU list on the service
terminal. To locate the cable, see step 6 on page 122.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to the next step.
17. Replace the backplane in the DDM bay, then verify the repair:
See ″Frame Assembly, DDM Bay″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
Note: For a DDM bay, the backplanes are replaced by replacing the frame
assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, call the next level of support.
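The replace-and-verify sequence in steps 9 through 17 can be summarized as a simple loop. The following Python sketch is illustrative only and is not part of the service terminal code. The FRU labels are shorthand for the FRUs named in those steps, and replace_and_verify is a hypothetical callback that stands in for the service action and the result of repair verification.

```python
from typing import Callable, Iterable, Optional

# FRU order taken from steps 9-17 of this MAP; the label strings are illustrative.
FRU_ORDER = [
    "DDM",
    "SSA device card (first)",
    "SSA device card (second)",
    "bypass card",
    "passthrough card",
    "SSA cable (first)",
    "SSA cable (second)",
    "SSA cable (third)",
    "DDM bay frame assembly (backplane)",
]


def isolate(frus: Iterable[str],
            replace_and_verify: Callable[[str], bool]) -> Optional[str]:
    """Replace FRUs in order until repair verification passes.

    Returns the FRU whose replacement passed verification, or None when
    every FRU was tried and the link still fails (call the next level of
    support in that case).
    """
    for fru in frus:
        # Hypothetical callback: True means verification ran without error.
        if replace_and_verify(fru):
            return fru
    return None
```

A return value of None corresponds to the final "call the next level of support" branch of the MAP.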
MAP 3078: Isolating a Degraded SSA Link
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
A 40 MB/s SSA link between a DDM and two SSA device cards is degraded and is
running at 20 MB/s. The degraded link includes two SSA device cards, one bypass
card, one passthrough card, three SSA cables, and the DDM bay backplane. See
Figure 51 for the relationship of these FRUs. The failure or incorrect connection of
any of these components can cause the link to run at a slower speed.
v Drawer models, DDM bay
Figure 51. SSA Link Failure, Passthrough and Bypass Card Link Between a DDM and SSA Device Card (S008141l)
Isolation
1. Locate all three SSA cables displayed by the service terminal as
possible FRUs. These SSA cables will each be connected between a DDM
bay and an SSA device card. The service terminal FRU Location will identify
the DDM bay and SSA connector where each end of the SSA cable is
connected. To locate the DDM bay, see ″Locating a DDM Bay or SSA DASD
Model 020 or 040 Drawer in a 2105 Rack″ in chapter 7 of the Enterprise
Storage Server Service Guide, Volume 3. To locate SSA cable connectors on a
DDM bay, see Figure 52 on page 127.
Figure 52. DDM bay SSA Connector Locations (S007693l)
Note: The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy,
where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
To locate an SSA device card cable connector, see Figure 53.
(Figure content: front view of cluster 1 and cluster 2, Model Exx/Fxx, showing
the SSA device cards at locations R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy,
R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy (Model F10/F20 only), and R1-Tx-P2-K9-yy
(Model E10/E20 only). Each SSA device card has connectors B2, B1, A2, and A1.)
Figure 53. Cluster SSA Device Card SSA Connector Locations (S008022m)
Disconnect both ends of each of these SSA cables.
Note: To prevent damage to the SSA device cable connector screws, always
use the special screwdriver (SSA tool, P/N 32H7059) to turn them. This
screwdriver is in the 2105 ship group.
Inspect the cable connectors for bent pins and correct any problems found.
There should be three pins in each plug. If there are fewer than three pins,
replace the cable. Reconnect both ends of the SSA device cables and ensure
a good connection.
Continue with the next step.
2. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select any of the cables. Proceed through the repair but do
not replace any FRU or disconnect any cables. This will simulate a repair and
run verification.
Did verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, continue with the next step.
3. Replace the bypass card displayed on the service terminal, then verify the
repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Verify the jumpers on the bypass card are in the correct positions before
replacing the card. See the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
4. Replace the passthrough card displayed on the service terminal, then verify
the repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
5. Replace the first SSA device cable displayed on the FRU list on the service
terminal. To locate the cable, see step 1 on page 126.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to the next step.
6. Replace the second SSA device cable displayed on the FRU list on the service
terminal. To locate the cable, see step 1 on page 126.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to the next step.
7. Replace the third SSA device cable displayed on the FRU list on the service
terminal. To locate the cable, see step 1 on page 126.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to the next step.
8. Replace the DDM displayed on the service terminal, then verify the repair. See
″SSA Disk Drive Module, DDM Bay″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
9. Replace one of the SSA device cards displayed on the service terminal, then
verify the repair. See ″SSA Service Card, Cluster Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
10. Replace the other SSA device card displayed on the service terminal, then
verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
11. Replace the backplane in the DDM bay, then verify the repair:
See ″Frame Assembly, DDM Bay″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
Note: For a DDM bay, the backplanes are replaced by replacing the frame
assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, call the next level of support.
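The SSA device card connector format described in step 1 of this MAP (R1-Tx-P2-Ky-yy) can be checked mechanically when cable ends are recorded by hand. The following Python sketch is an illustration only. The function and type names are hypothetical, and the sketch assumes that the slot number after K is one or more decimal digits, as in the K1 through K9 labels of Figure 53.

```python
import re
from typing import NamedTuple

# R1-Tx-P2-Ky-yy: Tx is the cluster (1 or 2), Ky is the card slot,
# yy is the cable connector (A1, A2, B1, or B2).
_PATTERN = re.compile(r"^R1-T([12])-P2-K(\d+)-(A1|A2|B1|B2)$")


class DeviceCardConnector(NamedTuple):
    cluster: int    # 1 or 2
    slot: int       # card slot number (assumed to be decimal digits)
    connector: str  # "A1", "A2", "B1", or "B2"


def parse_device_card_connector(code: str) -> DeviceCardConnector:
    """Split an SSA device card connector location into its fields."""
    match = _PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not an SSA device card connector location: {code!r}")
    return DeviceCardConnector(int(match.group(1)),
                               int(match.group(2)),
                               match.group(3))
```

For example, parse_device_card_connector("R1-T2-P2-K3-B1") yields cluster 2, slot 3, connector B1. The sketch does not check whether a slot exists on a particular model.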
MAP 3080: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The SSA link between two DDMs is failing. One of the following conditions is
present:
v The failing link is between two end DDMs, on different backplanes, and the
bypass card that links them. See Figure 54.
v The failing link is between two center DDMs, on the same backplane, and the
bypass card they are connected to. See Figure 55.
v Drawer models, SSA DASD Model 020 or 040 drawer
v DDM locations in drawer, two DDMs in DDM drawer positions 1 and 16, 8 and 9,
4 and 5, or 12 and 13.
v Bypass card location in drawer, lower left (J8 and J9), upper right (J1 and J16),
lower right (J12 and J13), or upper left (J4 and J5).
Figure 54. SSA Link Failure, Bypass Card and Two DDMs (S008144m)
Figure 55. SSA Link Failure, Bypass Card and Two DDMs (S008143l)
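The DDM position pairs and bypass card locations listed above can be kept as a small lookup table. The following Python sketch is an illustration only. It assumes that each DDM pair belongs to the bypass card whose connector numbers match the DDM positions (for example, DDMs 1 and 16 with the card at J1 and J16). The description above lists the pairs and the card locations separately, so confirm the pairing against the FRU shown on the service terminal. The function name is hypothetical.

```python
from typing import Tuple

# Assumed pairing: DDM positions matched to the bypass card with the same
# connector numbers. Values are (card location in drawer, connectors).
BYPASS_CARD_BY_DDM_PAIR = {
    frozenset({1, 16}): ("upper right", "J1 and J16"),
    frozenset({8, 9}): ("lower left", "J8 and J9"),
    frozenset({4, 5}): ("upper left", "J4 and J5"),
    frozenset({12, 13}): ("lower right", "J12 and J13"),
}


def bypass_card_for(ddm_a: int, ddm_b: int) -> Tuple[str, str]:
    """Return the bypass card location for a pair of DDM drawer positions."""
    key = frozenset({ddm_a, ddm_b})
    if key not in BYPASS_CARD_BY_DDM_PAIR:
        raise ValueError(f"DDM positions {ddm_a} and {ddm_b} are not a "
                         "pair covered by this MAP")
    return BYPASS_CARD_BY_DDM_PAIR[key]
```

The order of the two positions does not matter, because the lookup key is an unordered set.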
Isolation
1. Check whether any other problems (pending or open) have a single DDM as the
FRU.
Are there any pending or open problems with a single DDM as the FRU?
v Yes, go to step 2.
v No, go to step 3 on page 131.
2. Compare the single DDM FRU in the pending or open problem with the DDMs
in the problem you are working on.
Is the DDM in the open or pending problem the same as one of the DDMs in
the problem you are working on?
v Yes, repair the problem with the single DDM FRU first. That repair should
also fix the problem you are working on.
v No, go to step 3.
3. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, verify that the SSA cables are connected correctly, go to step 4.
v No, continue with step 6.
4. Verify that the SSA cables are connected correctly. Look at the cables
displayed on the Detail Problem screen. Compare the cables displayed with
the cabling of the drawer or DDM bay. See Locating an SSA Cable.
Are any of the cables connected wrong?
v Yes, connect the cables to the correct connectors, then go to step 5.
v No, go to step 6.
5. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable you just connected correctly. Proceed
through the repair but do not replace any FRU or disconnect any cables. This
will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, go to step 6.
6. Determine if the SSA DASD drawer bypass card jumpers are correct.
Is the SSA DASD drawer with the error a newly installed drawer or was the
bypass card in the drawer just replaced?
v Yes, continue with step 7.
v No, continue with step 9 on page 132.
7. Select the bypass card from the Possible FRUs to Repair List. Remove the
drawer bypass card, see ″Bypass Cards, 7133 Model 020/040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Verify the jumpers on the bypass card are in the correct positions before
replacing the card. See the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Are the jumpers correct?
v Yes, reinstall the bypass card and go to step 9 on page 132.
v No, continue with the next step.
8. Correctly install the jumpers, reinstall the bypass card, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 9 on page 132.
9. Replace the first of the two DDMs displayed on the service terminal, then
verify the repair.
Note: If the amber check indicator on one of the two DDMs is on, replace that
DDM first, see Figure 6 on page 14.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 10.
10. Replace the second DDM displayed on the service terminal using the DDM
removed in step 9, then verify the repair.
Notes:
a. If the first DDM is the same capacity as the second DDM on the FRU list, use
the first DDM to replace the second DDM.
b. The service terminal will determine if the second DDM being replaced is in
the same array as the first DDM. If both DDMs are in the same array, the
service terminal will instruct you to wait for sparing to complete. When
sparing for the first DDM replacement completes, the second DDM can be
replaced.
c. DDM sparing time can be many hours. Sparing time varies with system
usage and the storage capacity of the DDM being spared. An 18 GB drive
may take 36 hours to spare on a heavily used system.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 11.
11. Replace the bypass card displayed on the service terminal, then verify the
repair. See ″Bypass Cards, 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Verify the jumpers on the new bypass card are in the correct positions before
replacing the card. See the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 12.
12. Replace the front or back backplane or frame assembly displayed on the
service terminal, then verify the repair.
v SSA DASD Model 020
– Front backplane, see ″Front Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
– Back backplane, see ″Back Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v SSA DASD Model 040
– Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Note: For SSA DASD Model 040 drawers, the backplanes are both
replaced at the same time by replacing the frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, go to step 13.
13. Determine if the drawer is an SSA DASD Model 020.
Is the drawer an SSA DASD Model 020 drawer?
v Yes, replace the backplane not previously replaced, then verify the repair.
– SSA DASD Model 020
- Front backplane, see ″Front Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
- Back backplane, see ″Back Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
– If verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
– If verification failed, seek technical aid.
v No, the SSA link is still failing, call the next level of support.
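The notes in step 10 of this MAP describe two decisions: whether the DDM removed in step 9 can be reused, and whether the second DDM can be replaced yet. The following Python sketch restates those notes and is illustrative only. On the machine, the service terminal makes the array and sparing determination. The function names are hypothetical, and the sketch assumes that no wait is needed when the two DDMs are in different arrays, which the notes imply but do not state.

```python
def reuse_first_ddm(first_capacity_gb: int, second_capacity_gb: int) -> bool:
    """Note a: the DDM removed first can replace the second DDM only when
    both DDMs have the same capacity."""
    return first_capacity_gb == second_capacity_gb


def may_replace_second_ddm(same_array: bool, sparing_complete: bool) -> bool:
    """Note b: when both DDMs are in the same array, wait until sparing for
    the first DDM replacement completes.

    Assumption: DDMs in different arrays need no wait.
    """
    return (not same_array) or sparing_complete
```

Because sparing can take many hours (note c), plan the second replacement accordingly when both DDMs are in the same array.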
MAP 3081: Isolating a Degraded SSA Link
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The 40 MB/s SSA link between two DDMs is degraded and is running at 20 MB/s.
One of the following conditions is present:
v The degraded link is between two end DDMs, on different backplanes, and the
bypass card that links them. See Figure 56.
v The degraded link is between two center DDMs, on the same backplane, and the
bypass card they are connected to. See Figure 57.
v Drawer models, SSA DASD Model 040
v DDM locations in drawer, two DDMs in DDM drawer positions 1 and 16, 8 and 9,
4 and 5, or 12 and 13.
v Bypass card location in drawer, lower left (J8 and J9), upper right (J1 and J16),
lower right (J12 and J13), or upper left (J4 and J5).
Figure 56. SSA Link Failure, Bypass Card and Two DDMs (S008144m)
Figure 57. SSA Link Failure, Bypass Card and Two DDMs (S008143l)
Isolation
1. Replace the bypass card displayed on the service terminal, then verify the
repair. See ″Bypass Cards, 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Verify the jumpers on the new bypass card are in the correct positions before
replacing the card. See the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
2. Replace the first of the two DDMs displayed on the service terminal, then verify
the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to step 3.
3. Replace the second DDM displayed on the service terminal using the DDM
removed in step 2, then verify the repair.
Notes:
a. If the first DDM is the same capacity as the second DDM on the FRU list, use
the first DDM to replace the second DDM.
b. The service terminal will determine if the second DDM being replaced is in
the same array as the first DDM. If both DDMs are in the same array, the
service terminal will instruct you to wait for sparing to complete. When
sparing for the first DDM replacement completes, the second DDM can be
replaced.
c. DDM sparing time can be many hours. Sparing time varies with system
usage and the storage capacity of the DDM being spared. An 18 GB drive
may take 36 hours to spare on a heavily used system.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, go to step 4.
4. Replace the frame assembly displayed on the service terminal, then verify the
repair.
v SSA DASD Model 040
– Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Note: For SSA DASD Model 040 drawers, the backplanes are both
replaced at the same time by replacing the frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, call the next level of support.
MAP 3082: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
An SSA link between two DDMs is failing. The DDMs are in the same drawer. The
failing link goes through two DDMs, a bypass card, an SSA device card, two SSA
cables, and a drawer backplane. See Figure 58 for the relationship of these FRUs.
The failure of any of these components can cause the link to fail. Other failures can
also cause the link to fail. For example, a hot reset line to the SSA device card can
cause the connection between the two loop inputs to appear to be open.
v Drawer models, SSA DASD Model 020 or 040 drawer
Figure 58. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device
Card (S008142l)
Isolation
1. Write the following information on a piece of paper.
a. The Problem ID of this problem.
b. The number of the failing cluster, cluster 1 or 2.
c. The number of the other cluster:
v If cluster 1 is the failing cluster, record the other cluster as cluster 2.
v If cluster 2 is the failing cluster, record the other cluster as cluster 1.
2. Press F3 on the service terminal to list other problems.
Are there any other problems whose Failing Cluster is the other cluster written
down in step 1c?
v Yes, repair and verify them now. Repairing these problems may correct this
problem. After repair verification, continue with the next step.
v No, continue with step 4.
3. Select any FRU under Probable FRUs to Replace. Continue through repair
and verify, but do not actually replace any FRU.
Did the verify run without error?
v Yes, the problem is resolved. This problem was caused by another problem
that has now been resolved.
v No, continue with the next step.
4. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, verify that the SSA cables are connected correctly, go to the next step.
v No, continue with step 10 on page 139.
5. Verify that the two SSA cables are connected correctly. Look at the cables
displayed under Possible FRUs to Replace. The Resource Location Code
gives the location of the connectors at both ends of the cable. On
the Detail Problem screen, compare the cables displayed with the cabling from
the drawer to the SSA device card.
The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy, where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
Use the drawing below to locate the SSA cable connectors on an SSA device
card.
(Figure content: front view of cluster 1 and cluster 2, Model Exx/Fxx, showing
the SSA device cards at locations R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy,
R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy (Model F10/F20 only), and R1-Tx-P2-K9-yy
(Model E10/E20 only). Each SSA device card has connectors B2, B1, A2, and A1.)
Figure 59. Cluster SSA Device Card SSA Connector Locations (S008022m)
The drawer card cable connector is in the format Rx-Yy-Jzz, where:
v Rx is rack 2, 3, or 4
v Yy is the drawer location
v Jzz is the cable connector
To locate a drawer see ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3.
Use the drawing below to locate the SSA cable connectors on a drawer.
(Figure content: rear views of the 7133 Model 020 and 7133 Model 040 drawers,
each showing SSA cable connectors J1, J4, J5, J8, J9, J12, J13, and J16.)
Figure 60. Drawer SSA Connector Locations (S008762p)
Are any of the cables connected wrong?
v Yes, connect the cables to the correct connectors, then go to the next step.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
v No, go to step 5 on page 136.
6. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable you just connected correctly. Proceed
through the repair but do not replace any FRU or disconnect any cables. This
will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, go to the next step.
7. Select the drawer bypass card listed under Possible FRUs to Replace.
Remove the drawer bypass card, see ″Bypass and Passthrough Cards, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Verify the jumpers on the bypass card are in the correct positions before
replacing the card. See the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Are the jumpers correct?
v Yes, reinstall the bypass card and continue with the next step.
v No, continue with the next step.
8. Move the jumpers to the correct positions. Reinstall the bypass card. Select
the bypass card from the FRUs to Replace list. Continue through repair and
verify, but do not actually replace any FRU.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, go to the next step.
9. Replace the SSA device card displayed on the service terminal, then verify the
repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to the next step.
10. Replace the first of the two DDMs displayed on the service terminal, then
verify the repair. See the ″SSA Disk Drive Model, 7133 Model 020/040″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Note: If the amber check indicator on one of the two DDMs is on, replace that
DDM first, see Figure 6 on page 14.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to the next step.
11. Replace the second DDM displayed on the service terminal with the DDM
removed in the last step, then verify the repair. See ″SSA Disk Drive Model,
7133 Model 020/040″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Note: It may take many hours before the second DDM can be replaced.
The service terminal will determine if the second DDM being replaced is in the
same array as the first DDM. If both DDMs are in the same array, the service
terminal will instruct you to wait for sparing to complete. When sparing for the
first DDM replacement completes, the second DDM can be replaced. DDM
sparing time for 18 GB DDMs can be up to 36 hours. Sparing time varies with
system usage and the storage capacity of the DDM being spared.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to the next step.
12. Replace the bypass card displayed on the service terminal, then verify the
repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to the next step.
13. Do not replace either of the SSA cables in the FRU list. Both of these cables
would have to be open to cause these failure symptoms. Continue with the
next step.
14. Replace the front or back backplane or frame assembly displayed on the
service terminal:
v SSA DASD Model 020
– Front backplane, see ″Front Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
– Back backplane, see ″Back Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v SSA DASD Model 040
– Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Note: For SSA DASD Model 040 drawers, the backplanes are both
replaced at the same time by replacing the frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, call the next level of support.
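The sparing rule in the note under step 11 can be summarized in a short sketch. This Python is illustrative only: the function and parameter names are hypothetical, the service terminal makes this decision for you, and the sketch assumes that no wait is needed when the two DDMs are in different arrays.

```python
def second_ddm_action(first_array, second_array, sparing_complete):
    """Return 'replace' or 'wait' for the second DDM (hypothetical helper)."""
    # Assumption: DDMs in different arrays do not depend on each other's sparing.
    if first_array != second_array:
        return "replace"
    # Same array: sparing for the first DDM replacement must complete first.
    return "replace" if sparing_complete else "wait"


if __name__ == "__main__":
    # Both DDMs are in the same (hypothetical) array and sparing is still running.
    print(second_ddm_action("array-1", "array-1", sparing_complete=False))
```

Because sparing can take many hours, a "wait" result here corresponds to the service terminal instructing you to wait before continuing the repair.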
MAP 3083: Isolating a Degraded SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
A 40 MB/s SSA link between two DDMs is degraded and is running at 20 MB/s. The
DDMs are in the same drawer. The failing link goes through two DDMs, a bypass
card, an SSA device card, two SSA cables, and a drawer backplane. See Figure 61
for the relationship of these FRUs. The degradation of any of these components
can cause the link to run slower.
v Drawer models, SSA DASD Model 040
Figure 61. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device
Card (S008142l)
Isolation
1. Examine each of the two SSA cables. Look at the cables displayed under
Possible FRUs to Replace. The Resource Location Code gives the location of
the connectors at both ends of each cable. On the Detail Problem
screen, compare the cables displayed with the cabling from the drawer to the
SSA device card.
The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy, where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
Use the drawing below to locate the SSA cable connectors on an SSA device
card.
[Drawing: front view of cluster 1 and cluster 2 (Model Exx/Fxx). SSA device
card locations shown: R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy,
R1-Tx-P2-K4-yy (Model F10/F20 only), R1-Tx-P2-K9-yy (Model E10/E20 only).
Connectors on each SSA device card: B2, B1, A2, A1.]
Figure 62. Cluster SSA Device Card SSA Connector Locations (S008022m)
The drawer card cable connector is in the format Rx-Yy-Jzz, where:
v Rx is rack 2, 3, or 4
v Yy is the drawer location
v Jzz is the cable connector
To locate a drawer, see ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3.
Use the drawing below to locate the SSA cable connectors on a drawer.
[Drawing: rear views of the 7133 Model 020 and 7133 Model 040 drawers. SSA
connectors shown on each drawer: J1, J4, J5, J8, J9, J12, J13, J16.]
Figure 63. Drawer SSA Connector Locations (S008762p)
Disconnect both ends of each of these SSA cables.
Note: To prevent damage to the SSA device cable connector screws, always
use the special screwdriver (SSA tool, P/N 32H7059) to turn them. This
screwdriver is in the 2105 ship group.
Inspect the cable connectors for bent pins and correct any problems found.
There should be six pins in each plug. If there are fewer than six pins, replace
the cable. Reconnect both ends of each SSA device cable and ensure a good
connection.
Continue with the next step.
2. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select any cable from the FRU list. Proceed through the repair
but do not replace any FRU or disconnect any cables. This will simulate a repair
and run verification.
Did verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, continue with the next step.
3. Replace the bypass card displayed on the service terminal, then verify the
repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
4. Replace the SSA device card displayed on the service terminal, then verify the
repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
5. Replace the first of the two DDMs displayed on the service terminal, then verify
the repair. See ″SSA Disk Drive Model, 7133 Model 020/040″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
6. Replace the second DDM displayed on the service terminal with the DDM
removed in the last step, then verify the repair. See ″SSA Disk Drive Model,
7133 Model 020/040″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Note: It may take many hours before the second DDM can be replaced.
The service terminal will determine if the second DDM being replaced is in the
same array as the first DDM. If both DDMs are in the same array, the service
terminal will instruct you to wait for sparing to complete. When sparing for the
first DDM replacement completes, the second DDM can be replaced. DDM
sparing time for 18 GB DDMs can be up to 36 hours. Sparing time varies with
system usage and the storage capacity of the DDM being spared.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, continue with the next step.
7. Do not replace either of the SSA cables in the FRU list. Both of these cables
would have to be open to cause these failure symptoms. Continue with the next
step.
8. Replace the front or back backplane or frame assembly displayed on the
service terminal:
v Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: For SSA DASD Model 040 drawers, the backplanes are both replaced
at the same time by replacing the frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still degraded, call the next level of support.
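The two connector location formats described in step 1 of this MAP (R1-Tx-P2-Ky-yy for the SSA device card and Rx-Yy-Jzz for the drawer) can be checked mechanically. The following Python is a minimal sketch, not a service tool: the function names are hypothetical, and it assumes that the slot, drawer, and connector fields are written as one or two decimal digits.

```python
import re

# SSA device card connector: R1-Tx-P2-Ky-yy (cluster 1 or 2; A1, A2, B1, or B2).
DEVICE_CARD_RE = re.compile(r"R1-T([12])-P2-K(\d{1,2})-(A1|A2|B1|B2)")
# Drawer connector: Rx-Yy-Jzz (rack 2, 3, or 4).
# Assumption: Yy and Jzz each carry one or two decimal digits.
DRAWER_RE = re.compile(r"R([234])-Y(\d{1,2})-J(\d{1,2})")


def parse_device_card_connector(code):
    """Split an SSA device card connector code into its fields."""
    m = DEVICE_CARD_RE.fullmatch(code)
    if m is None:
        raise ValueError("not an SSA device card connector code: %r" % code)
    return {"cluster": int(m.group(1)),
            "slot": int(m.group(2)),
            "connector": m.group(3)}


def parse_drawer_connector(code):
    """Split a drawer connector code into its fields."""
    m = DRAWER_RE.fullmatch(code)
    if m is None:
        raise ValueError("not a drawer connector code: %r" % code)
    return {"rack": int(m.group(1)),
            "drawer": int(m.group(2)),
            "connector": int(m.group(3))}


if __name__ == "__main__":
    print(parse_device_card_connector("R1-T2-P2-K3-B1"))
    print(parse_drawer_connector("R2-Y1-J16"))
```

Rejecting anything that does not match the whole pattern is deliberate: a code for rack 1 or cluster 3 is more likely a transcription error than a valid location.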
MAP 3085: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
An SSA link failed between two SSA device cards. The failing FRU is one of the
FRUs displayed in the FRU list. See Figure 64 for the relationship of these FRUs.
v Drawer models, DDM bay
– SSA device cards connected through the DDM bay
[Drawing: components in the link: SSA device card, SSA device cable,
passthrough card, bypass card, DDM bay backplane, SSA device cable, SSA
device card.]
Figure 64. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S007649l)
Isolation
1. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, verify that the SSA cables are connected correctly, go to step 2 on
page 145.
v No, continue with step 4 on page 145.
2. Verify that the SSA cables are connected correctly. Look at the cables
displayed on the Detail Problem screen. Compare the cables displayed with
the cabling of the drawer or DDM bay. See Locating an SSA Cable.
Are any of the cables connected incorrectly?
v Yes, connect the cables to the correct connectors, go to step 3.
v No, go to step 4.
3. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable you just connected correctly. Proceed
through the repair but do not replace any FRU or disconnect any cables. This
will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, go to step 4.
4. Locate the two SSA cables displayed on the service terminal as possible
FRUs. For this isolation procedure, the SSA cables will be connected between
a DDM bay and SSA device cards. The service terminal will identify the DDM
bays and their SSA connectors, and the SSA device cards and their SSA
connectors. To locate a drawer see ″Locating a DDM Bay or SSA DASD Model
020 or 040 Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage
Server Service Guide, Volume 3. To locate SSA cable connectors on a DDM
bay, see Figure 65.
Note: The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy,
where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
To locate an SSA device card cable connector, see Figure 66 on page 146.
Figure 65. DDM bay SSA Connector Locations (S007693l)
[Drawing: front view of cluster 1 and cluster 2 (Model Exx/Fxx). SSA device
card locations shown: R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy,
R1-Tx-P2-K4-yy (Model F10/F20 only), R1-Tx-P2-K9-yy (Model E10/E20 only).
Connectors on each SSA device card: B2, B1, A2, A1.]
Figure 66. Cluster SSA Device Card SSA Connector Locations (S008022m)
a. Disconnect the SSA device cable from the cluster SSA device card and the
DDM bay.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
b. Inspect the cable connectors for bent pins and correct any problems found.
Reconnect both ends of the SSA device cable and ensure a good connection.
c. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable. Proceed through the repair but do not
replace any FRU or disconnect any cables. This will simulate a repair and
run verification.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, go to step 5.
5. Locate the DDM bay; it may be in the front or rear of the 2105. Observe all
of the DDM bay DDM and card indicators.
Are any of the DDM bay indicators on?
v Yes, go to step 6.
v No, there is a DDM bay problem, go to “MAP 3395: Isolating an SSA DASD
DDM Bay Power Problem” on page 259.
6. Replace the first SSA device card displayed on the service terminal, then verify
the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 7 on page 147.
7. Replace the other SSA device card displayed on the service terminal, then
verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 8.
8. Replace the bypass card displayed on the service terminal, then verify the
repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: Verify the jumpers on the bypass card are in the correct positions
before replacing the card, see the SSA DASD Model 020 and 040
Drawer Bypass Card Jumper Settings figure in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 9.
9. Replace the passthrough card displayed on the service terminal, then verify
the repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, go to step 10.
10. Replace the SSA device cables displayed on the service terminal one at a
time, then verify each repair.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, if you have not replaced the other cable, replace it and verify the
repair. If both cables have been replaced, and the SSA link is still failing, go
to step 11.
11. Replace the frame (DDM bay) assembly displayed on the service terminal:
v DDM bay
– Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, the SSA link is still failing, call the next level of support.
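Steps 6 through 11 of this MAP follow one pattern: replace one FRU, run repair verification, and stop as soon as verification runs without error. The following Python sketch shows that pattern only. The callback `replace_and_verify` is hypothetical and stands in for the manual replacement plus the verification run on the service terminal.

```python
# Replacement order taken from steps 6 through 11 of this MAP.
MAP_3085_ORDER = [
    "first SSA device card",
    "other SSA device card",
    "bypass card",
    "passthrough card",
    "first SSA device cable",
    "second SSA device cable",
    "DDM bay frame assembly",
]


def isolate(order, replace_and_verify):
    """Replace FRUs in order until repair verification passes.

    replace_and_verify is a hypothetical callback that returns True when
    repair verification runs without error for the FRU just replaced.
    Returns the FRU that resolved the problem, or None when every FRU has
    been replaced and the next level of support must be called.
    """
    for fru in order:
        if replace_and_verify(fru):
            return fru
    return None


if __name__ == "__main__":
    # Simulated run in which the bypass card is the failing FRU.
    print(isolate(MAP_3085_ORDER, lambda fru: fru == "bypass card"))
```

Keeping the order in a plain list makes it easy to compare against other MAPs, which use the same pattern with a different FRU sequence.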
MAP 3086: Isolating a Degraded SSA Link
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
A 40 MB/s SSA link between two SSA device cards is degraded and is
running at 20 MB/s. The degraded FRU is one of the FRUs displayed in the FRU
list. See Figure 67 for the relationship of these FRUs.
v Drawer models, DDM bay
– SSA device cards connected through the DDM bay
[Drawing: components in the link: SSA device card, SSA device cable,
passthrough card, bypass card, DDM bay backplane, SSA device cable, SSA
device card.]
Figure 67. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S007649l)
Isolation
1. Locate the two SSA cables displayed on the service terminal as possible FRUs.
For this isolation procedure, the SSA cables will be connected between a DDM
bay and SSA device cards. The service terminal will identify the DDM bays and
their SSA connectors, and the SSA device cards and their SSA connectors. To
locate a drawer, see ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3. To locate SSA cable connectors on a DDM bay, see Figure 68
on page 149.
Note: The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy,
where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
To locate an SSA device card cable connector, see Figure 69 on page 149.
Figure 68. DDM bay SSA Connector Locations (S007693l)
[Drawing: front view of cluster 1 and cluster 2 (Model Exx/Fxx). SSA device
card locations shown: R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy,
R1-Tx-P2-K4-yy (Model F10/F20 only), R1-Tx-P2-K9-yy (Model E10/E20 only).
Connectors on each SSA device card: B2, B1, A2, A1.]
Figure 69. Cluster SSA Device Card SSA Connector Locations (S008022m)
a. Disconnect the SSA device cables from the cluster SSA device cards and
the DDM bay.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
b. Inspect the cable connectors for bent pins and correct any problems found.
Each connector should have three pins. If there are fewer than three pins,
replace the cable. Reconnect both ends of the SSA device cables and ensure a
good connection.
c. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable. Proceed through the repair but do not
replace any FRU or disconnect any cables. This will simulate a repair and
run verification.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 8 on page 150.
v No, continue with the next step.
2. Replace the bypass card displayed on the service terminal, then verify the
repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: Verify the jumpers on the bypass card are in the correct positions before
replacing the card, see the ″SSA DASD Model 020 and 040 Drawer
Bypass Card Jumper Settings″ figure in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 8.
v No, the SSA link is still degraded, continue with the next step.
3. Replace the passthrough card displayed on the service terminal, then verify the
repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 8.
v No, the SSA link is still degraded, continue with the next step.
4. Replace the SSA device cables displayed on the service terminal one at a time,
then verify each repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 8.
v No, if you have not replaced the other cable, replace it and verify the repair.
If both cables have been replaced, and the SSA link is still degraded, go to
step 5.
5. Replace the first SSA device card displayed on the service terminal, then verify
the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 8.
v No, the SSA link is still degraded, continue with the next step.
6. Replace the other SSA device card displayed on the service terminal, then
verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 8.
v No, the SSA link is still degraded, continue with the next step.
7. Replace the frame (DDM bay) assembly displayed on the service terminal:
v DDM bay
– Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 8.
v No, the SSA link is still degraded, call the next level of support.
8. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
MAP 3095: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
An SSA link between two DDMs is failing. The DDMs are in separate DDM bays.
The failing link goes through two passthrough cards, a bypass card, SSA cable(s),
and possibly an SSA device adapter. See Figure 70 for the relationship of these
FRUs.
The failure or incorrect connection of any of these components can cause the link
to fail. Other failures can also cause the link to fail. For example, a hot reset line to
the SSA device card can cause the connection between the two loop inputs to
appear to be open.
v Drawer models, DDM bay
[Drawing: a DDM in DDM bay A and a DDM in DDM bay B, with each bay's
backplane, the two passthrough cards, the bypass card, the SSA device cables,
and the SSA device card.]
Figure 70. SSA Link Failure, Signal or Bypass Card Link Between a DDM and SSA Device Card (S008140l)
Isolation
1. Write the following information on a piece of paper.
a. The Problem ID of this problem.
b. The number of the failing cluster, cluster 1 or 2.
c. The number of the other cluster:
v If cluster 1 is the failing cluster, record the other cluster as cluster 2.
v If cluster 2 is the failing cluster, record the other cluster as cluster 1.
2. Press F3 on the service terminal to list other problems.
Are there any other problems whose Failing Cluster is the other cluster written
down in step 1c?
v Yes, repair and verify them now. Repairing these problems may correct this
problem. After repair verification, continue with the next step.
v No, go to step 5 on page 152.
3. Did the repair of the other problems resolve the problem recorded in the last
step (problem ID not displayed)?
v Yes, this problem is resolved.
v No, continue with the next step.
4. Return to the original problem. Select the SSA device card from the Possible
FRU to Replace list. Continue through the repair and verify process but do not
replace any FRU.
Did the verification test run without error?
v Yes, the problem is resolved. This problem was caused by another problem
that has now been resolved.
v No, continue with the next step.
5. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, verify that the SSA cables are connected correctly, continue with the
next step.
v No, continue with step 10 on page 153.
6. Locate the SSA cables displayed on the service terminal as possible FRUs.
One of these SSA cables will be connected between two separate DDM bays.
The service terminal will identify the drawer and SSA connector that each end
of the SSA cable is connected to. To locate a DDM bay, see ″Locating a DDM
Bay or SSA DASD Model 020 or 040 Drawer in a 2105 Rack″ in chapter 7 of
the Enterprise Storage Server Service Guide, Volume 3. To locate SSA cable
connectors on an SSA DASD drawer, see Figure 71.
Is the SSA cable connected to the correct connectors?
v Yes, continue with the next step.
v No, connect the cable correctly. Continue with the next step.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group. After the cable is
connected correctly, go to step 9 on page 153.
7. Disconnect both ends of the SSA device cable.
Note: To prevent damage to the SSA device cable connector screws, always
use the special screwdriver (SSA tool, P/N 32H7059) to turn them. This
screwdriver is in the 2105 ship group.
Inspect the cable connectors for bent pins and correct any problems found.
Reconnect both ends of the SSA device cable and ensure a good connection.
Continue with the next step.
Figure 71. DDM bay SSA Connector Locations (S007693l)
8. Locate the two remaining SSA cables in the Possible FRU list. These SSA
cables will be connected between a DDM bay and an SSA device card. The
service terminal will identify the drawer and its SSA connector, and the SSA
device card and its SSA connector.
Locate the DDM bay end of the SSA cable, see the instructions in step 6 on
page 152.
Note: The SSA device card cable connector is in the format R1-Tx-P2-Ky-yy,
where:
v Tx is the cluster, 1 or 2
v Ky is the card location, slot
v yy is the cable connector, A1, A2, B1, or B2
To locate an SSA device card cable connector, see Figure 72.
Are the SSA cables connected to the correct connectors?
v Yes, go to step 10.
v No, connect the cable correctly. After the cable is connected correctly, go to
step 9.
[Drawing: front view of cluster 1 and cluster 2 (Model Exx/Fxx). SSA device
card locations shown: R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy,
R1-Tx-P2-K4-yy (Model F10/F20 only), R1-Tx-P2-K9-yy (Model E10/E20 only).
Connectors on each SSA device card: B2, B1, A2, A1.]
Figure 72. Cluster SSA Device Card SSA Connector Locations (S008022m)
9. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select any cable in the Possible FRUs to Replace list.
Proceed through the repair but do not replace any FRU or disconnect any
cables. This will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Go to step 21 on page 155.
v No, continue with the next step.
10. Replace the SSA device card displayed on the service terminal, then verify the
repair. See ″SSA Service Card, Cluster Bay″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21 on page 155.
v No, the SSA link is still failing, continue with the next step.
11. Replace the first of the two DDMs displayed on the service terminal, then verify
the repair.
Note: If the amber check indicator on one of the two DDMs is on, replace that
DDM first, see Figure 6 on page 14. See ″SSA Disk Drive Module, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21 on page 155.
v No, the SSA link is still failing, continue with the next step.
12. Replace the second DDM displayed on the service terminal with the DDM
removed in the last step, then verify the repair. See ″SSA Disk Drive Model,
7133 Model 020/040″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Note: It may take many hours before the second DDM can be replaced.
The service terminal will determine if the second DDM being replaced is in the
same array as the first DDM. If both DDMs are in the same array, the service
terminal will instruct you to wait for sparing to complete. When sparing for the
first DDM replacement completes, the second DDM can be replaced. DDM
sparing time for 18 GB DDMs can be up to 36 hours. Sparing time varies with
system usage and the storage capacity of the DDM being spared.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21 on page 155.
v No, the SSA link is still failing, continue with the next step.
13. Replace the bypass card displayed on the service terminal, then verify the
repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Verify the jumpers on the bypass card are in the correct positions before
replacing the card, see the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21 on page 155.
v No, the SSA link is still failing, continue with the next step.
14. Replace the first passthrough card displayed on the service terminal, then
verify the repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21 on page 155.
v No, the SSA link is still failing, continue with the next step.
15. Replace the second passthrough card displayed on the service terminal, then
verify the repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
Use the card removed in the last step.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21.
v No, the SSA link is still failing, continue with the next step.
16. Replace the SSA device cable that connects the two DDM bays. This cable is
displayed in the FRU list on the service terminal. To locate the cable, see step
6 on page 152.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21.
v No, the SSA link is still failing, continue with the next step.
17. Replace the second SSA device cable displayed on the FRU list on the service
terminal. To locate the cable, see step 8 on page 153.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21.
v No, the SSA link is still failing, continue with the next step.
18. Replace the third SSA device cable displayed on the FRU list on the service
terminal. To locate the cable, see step 8 on page 153.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21.
v No, the SSA link is still failing, continue with the next step.
19. Replace the frame assembly (backplane) in DDM bay A, see ″Frame
Assembly, DDM Bay″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Note: For DDM bays, the backplanes are replaced by replacing the frame
assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21.
v No, the SSA link is still failing, continue with the next step.
20. Replace the backplane in DDM bay B, then verify the repair:
v DDM bay, see ″Frame Assembly, DDM Bay″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Note: For DDM bays, the backplanes are replaced by replacing the frame
assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 21.
v No, the SSA link is still failing, call the next level of support.
21. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
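Steps 1 and 2 of this MAP amount to a small lookup: record the failing cluster, derive the other cluster, and then look for other problems whose Failing Cluster is that other cluster. The following Python is a sketch of that bookkeeping only. The dictionary keys `id` and `failing_cluster` and the sample problem IDs are hypothetical and do not reflect the service terminal's actual data format.

```python
def other_cluster(failing_cluster):
    """Return the cluster that is not the failing cluster (step 1c)."""
    if failing_cluster not in (1, 2):
        raise ValueError("cluster must be 1 or 2")
    return 2 if failing_cluster == 1 else 1


def problems_on_other_cluster(problems, problem_id, failing_cluster):
    """List other problems whose failing cluster is the other cluster (step 2).

    problems is a list of dicts with the hypothetical keys 'id' and
    'failing_cluster'. The problem being worked on is excluded.
    """
    target = other_cluster(failing_cluster)
    return [p for p in problems
            if p["id"] != problem_id and p["failing_cluster"] == target]


if __name__ == "__main__":
    problems = [
        {"id": "P-001", "failing_cluster": 1},  # the problem being worked on
        {"id": "P-002", "failing_cluster": 2},
        {"id": "P-003", "failing_cluster": 1},
    ]
    print(problems_on_other_cluster(problems, "P-001", failing_cluster=1))
```

An empty result corresponds to the "No" branch of step 2; a non-empty result lists the problems to repair and verify first.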
MAP 3096: Isolating a Degraded SSA Link
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
A 40 MB/s SSA link between two DDMs is degraded and is running at 20 MB/s. The
DDMs are in separate DDM bays. The degraded link goes through two passthrough
cards, a bypass card, and an SSA cable. See Figure 73 for the relationship of these
FRUs.
The degradation of any of these components can cause the link to run slower.
v Drawer models, DDM bay
Figure 73. SSA Link Degraded, Two Passthrough and Bypass Card Link Between Two DDMs (S008384l)
Figure labels: DDM Bay - A, DDM Bay - B, DDM (two), DDM Bay Backplane (two), Passthrough Card (two), Bypass Card, SSA Device Cable.
Isolation
1. Locate the SSA cable displayed on the service terminal as a possible FRU. This
SSA cable will be connected between two separate DDM bays. The service
terminal will identify the drawer and SSA connector that each end of the SSA
cable is connected to. To locate a DDM bay, see ″Locating a DDM Bay or SSA
DASD Model 020 or 040 Drawer in a 2105 Rack″ in chapter 7 of the
Enterprise Storage Server Service Guide, Volume 3. To locate SSA cable
connectors on a SSA DASD drawer, see Figure 74 on page 157.
Continue with the next step.
2. Disconnect both ends of the SSA device cable.
Note: To prevent damage to the SSA device cable connector screws, always
use the special screwdriver (SSA tool, P/N 32H7059) to turn them. This
screwdriver is in the 2105 ship group.
Inspect the cable connectors for bent pins and correct any problems found.
Each connector should have three pins. If there are fewer than three pins,
replace the cable. Reconnect both ends of the SSA device cable and ensure a
good connection. Continue with the next step.
Figure 74. DDM bay SSA Connector Locations (S007693l)
3. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select any cable in the Possible FRUs to Replace list.
Proceed through the repair but do not replace any FRU or disconnect any
cables. This will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Go to step 12 on page 158.
v No, continue with the next step.
4. Replace the bypass card displayed on the service terminal, then verify the
repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Verify the jumpers on the bypass card are in the correct positions before
replacing the card, see the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 12 on page 158.
v No, the SSA link is still degraded, continue with the next step.
5. Replace the first passthrough card displayed on the service terminal, then
verify the repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 12 on page 158.
v No, the SSA link is still degraded, continue with the next step.
6. Replace the second passthrough card displayed on the service terminal, then
verify the repair. See ″Bypass and Passthrough Cards, DDM Bay″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
Use the card removed in the last step.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 12 on page 158.
v No, the SSA link is still degraded, continue with the next step.
7. Replace the SSA device cable that connects the two DDM bays. This cable is
displayed in the FRU list on the service terminal. To locate the cable, see step
1 on page 156.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 12 on page 158.
v No, the SSA link is still degraded, continue with the next step.
8. Replace the first of the two DDMs displayed on the service terminal, then
verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 12.
v No, the SSA link is still degraded, continue with the next step.
9. Replace the second DDM displayed on the service terminal with the DDM
removed in the last step, then verify the repair. See ″SSA Disk Drive Model,
7133 Model 020/040″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Note: It may take many hours before the second DDM can be replaced.
The service terminal will determine if the second DDM being replaced is in the
same array as the first DDM. If both DDMs are in the same array, the service
terminal will instruct you to wait for sparing to complete. When sparing for the
first DDM replacement completes, the second DDM can be replaced. DDM
sparing time for 18 GB DDMs can be up to 36 hours. Sparing time varies with
system usage and the storage capacity of the DDM being spared.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 12.
v No, the SSA link is still degraded, continue with the next step.
10. Replace the frame assembly (backplane) in DDM bay A, see ″Frame
Assembly, DDM Bay″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Note: For DDM bays, the backplanes are replaced by replacing the frame
assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 12.
v No, the SSA link is still degraded, continue with the next step.
11. Replace the frame assembly (backplane) in DDM bay B, then verify the repair.
See ″Frame Assembly, DDM Bay″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Note: For DDM bays, the backplanes are replaced by replacing the frame
assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 12.
v No, the SSA link is still degraded, call the next level of support.
12. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
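The replace-and-verify sequence in steps 4 through 11 above can be sketched as a simple loop. This is an illustrative model only, not part of the service terminal software. The name `first_fru_that_clears_fault` and the `verification_passes` callback are hypothetical stand-ins for the manual repair and the service terminal's repair verification. The sketch also simplifies steps 6 and 9, where the part removed in the previous step is reused.

```python
from typing import Callable, Optional, Sequence

# Replacement order transcribed from steps 4 through 11 of MAP 3096.
MAP_3096_FRU_ORDER = (
    "bypass card",
    "first passthrough card",
    "second passthrough card",
    "SSA device cable",
    "first DDM",
    "second DDM",
    "frame assembly, DDM bay A",
    "frame assembly, DDM bay B",
)


def first_fru_that_clears_fault(
    fru_order: Sequence[str],
    verification_passes: Callable[[str], bool],
) -> Optional[str]:
    """Walk the FRU list in order and stop at the first clean verification.

    verification_passes is a hypothetical callback: it stands for
    "replace this FRU, then run repair verification" and returns True
    when verification runs without error.
    Returns None when every FRU was tried and the link is still degraded,
    which in the MAP means calling the next level of support.
    """
    for fru in fru_order:
        if verification_passes(fru):
            return fru
    return None


if __name__ == "__main__":
    # Assumed scenario for illustration: the SSA device cable is the bad part.
    faulty = "SSA device cable"
    cleared_by = first_fru_that_clears_fault(
        MAP_3096_FRU_ORDER, lambda fru: fru == faulty
    )
    print(cleared_by)
```

The ordering is the only information taken from the MAP. Everything else in the sketch is an assumption made for illustration.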
MAP 3100: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The SSA link between two DDMs is failing. The failing link runs between two DDMs
in different drawers or DDM bays, through two signal and/or bypass cards and the
SSA cable that links them. See Figure 75 for the relationship of the DDM, signal
and/or bypass card, and backplane FRUs involved with this failure.
DDM locations in drawers
v SSA DASD Model 020 or 040 drawers:
– Drawer-A DDM 1, 4, 5, 8, 9, 13, or 16
– Drawer-B DDM 1, 4, 5, 8, 9, 13, or 16
v DDM bays:
– DDM 1 or 8
Figure 75. SSA Link Failure, Passthrough/Bypass Cards and Two DDMs (S007650l)
Figure labels: Drawer-A, Drawer-B, DDM (two), Passthrough or Bypass Cards, SSA Device Cable, Backplane or DDM Bay Backplane (Front or Back) (two).
Isolation
1. Review if any other problems (pending or open) have a single DDM as the
FRU.
Are there any pending or open problems with a single DDM as the FRU?
v Yes, go to step 2.
v No, go to step 3.
2. Compare the single DDM FRU in the pending or open problem with the DDMs
in the problem you are working on.
Is the DDM in the open or pending problem the same as one of the DDMs in
the problem you are working on?
v Yes, repair the problem with the single DDM FRU first. That repair should
also fix the problem you are working on.
v No, go to step 3.
3. Determine if the SSA cables to the failing drawers have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, verify that the SSA cables are connected correctly, go to step 4 on
page 160.
v No, continue with step 6 on page 160.
4. Verify that the SSA cables are connected correctly. Look at the cables
displayed on the Detail Problem screen. Compare the cables displayed with
the cabling of the drawer or DDM bay. See Locating an SSA Cable.
Are any of the cables connected wrong?
v Yes, connect the cables to the correct connectors, go to step 5.
v No, go to step 6.
5. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable you just connected correctly. Proceed
through the repair but do not replace any FRU or disconnect any cables. This
will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Go to step 41 on page 168.
v No, go to step 6.
6. Determine if Drawer-A, in Figure 75 on page 159, is a Model 040.
Note: For this repair, pick one of the drawers to be Drawer-A and the other
drawer to be Drawer-B. Use these drawer names for the service call.
Is Drawer-A a Model 040?
v Yes, go to step 12 on page 161.
v No, go to step 7.
7. Determine if Drawer-A, in Figure 75 on page 159, is a DDM bay.
Is Drawer-A a DDM bay?
v Yes, go to step 19 on page 163.
v No, go to step 8.
8. Use Figure 76 on page 161 in the following steps to locate the switch and
indicators on the SSA DASD drawer power control panel:
Note: Drawer A is a SSA DASD Model 020 drawer.
Power Switch (On/Off)
Power Indicator (green)
Check Indicator (amber)
Figure 76. SSA DASD Model 020 Power Control Panel Locations (S008020m)
9. Go to the front of the 2105 and locate Drawer-A with a DDM shown for
replacement. Observe the SSA DASD drawer green power indicator on the
drawer power control panel.
Is the green drawer power indicator on?
v Yes, go to step 11.
v No, continue with the next step.
10. Press and release the drawer power switch, on the drawer power control
panel.
Is the Power indicator on the drawer power control panel now on?
v Yes, determine if the problem is resolved. Return to the service terminal
Detail Problem screen. Select any FRU. Proceed through the repair but do
not replace any FRU or disconnect any cables. This will simulate a repair
and run verification.
v No, go to “MAP 3352: Isolating SSA DASD Drawer Power Problems” on
page 219.
11. Observe the SSA DASD drawer amber check indicator on the drawer power
control panel.
Is the Check indicator on the drawer power control panel on or blinking?
v Yes, go to “MAP 3150: Isolating an SSA DASD Drawer Power Problem” on
page 188.
v No, go to step 20 on page 163.
12. Go to the rear of the drawer. Observe the PWR (power) indicators on both
power supply assemblies, see Figure 77 on page 162.
Are both PWR indicators off?
v Yes, go to step 14 on page 162.
v No, go to step 13.
13. At the rear of the drawer, observe the CHK/PWR Good (check/power)
indicators on both power supply assemblies, see Figure 77 on page 162.
Are either of the CHK/PWR indicators on green?
v Yes, go to step 20 on page 163.
v No, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model
040” on page 172.
Figure 77. SSA DASD Model 040 Power Supply Assembly Indicator Locations (S008019m)
14. Verify that both drawer power cables are plugged into the drawer power supply
assemblies. Verify that the other ends of these cables are plugged into the
primary power supplies. See ″2105 Model 100 Rack Cable Removals and
Replacements″ in chapter 4, of the 2105 Model 100 Attachment to ESS
Service Guide book to determine where the power cables should be plugged.
Observe the PWR indicators on both of the power supply assemblies.
Are either of the drawer power supply PWR indicators now on?
v Yes, go to step 17.
v No, go to step 15.
15. Go to the front of the 2105 Model Exx/Fxx and press the Local Power switch
to On (up), then release it. This should reset any PPS internal circuit breakers
that are tripped.
Are both of the drawer power supply PWR indicators still off?
v Yes, go to “MAP 1320: Isolating Problems Using Visual Symptoms” on
page 58 in chapter 3 of this book, and determine if any visual failure
symptoms are present:
– If visual failure symptoms are found, repair them.
– If visual failure symptoms are not found, call your next level of support.
v No, the drawer now has power. Continue with the next step.
16. Run verification to determine if the problem is now resolved. Select any FRU
and go through the verification and repair process, but do not replace any
FRU.
Was verification successful?
v Yes, the problem is resolved. Go to step 41 on page 168.
v No, repair the new problem that was generated by the verification process.
17. At the rear of the drawer, observe the CHK/PWR GOOD indicators on both of
the power supply assemblies.
Are either of the CHK/PWR GOOD indicators on green?
v Yes, the drawer now has power, go to step 16.
v No, continue with the next step.
18. At the rear of the drawer, locate the power switch on each power supply
assembly:
Note: Pull the switch out before moving it up or down.
a. Set both power switches to off (down).
b. Wait about 10 seconds.
c. Set both power switches to on (up).
Are either of the CHK/PWR GOOD indicators on green?
v Yes, the drawer now has power, go to step 16 on page 162.
v No, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model
040” on page 172.
19. Locate DDM bay-A; it may be in the front or rear of the 2105. Observe
all of the DDM bay DDM indicators, see Figure 78.
Are any of the DDM bay indicators on?
v Yes, go to step 32 on page 165.
v No, there is a DDM bay power problem, go to “MAP 3395: Isolating an SSA
DASD DDM Bay Power Problem” on page 259.
Figure 78. DDM Bay DDM Indicator Locations (S008021l)
20. Determine if Drawer-B, in Figure 75 on page 159, is a Model 040.
Is Drawer-B a Model 040?
v Yes, go to step 25 on page 164.
v No, continue with the next step.
21. Determine if Drawer-B, in Figure 75 on page 159, is a DDM bay.
Is Drawer-B a DDM bay?
v Yes, go to step 32 on page 165.
v No, continue with the next step.
22. Use Figure 76 on page 161 in the following steps to locate the switch and
indicators on the SSA DASD drawer power control panel:
Note: Drawer-B is a SSA DASD Model 020 drawer.
Power Switch (On/Off)
Power Indicator (green)
Check Indicator (amber)
23. Go to the front of Drawer-B. Observe the SSA DASD drawer green power
indicator on the drawer power control panel.
Is the green drawer power indicator on?
v Yes, go to step 24 on page 164.
v No, press and release the drawer power switch, on the drawer power
control panel.
Is the SSA DASD drawer power indicator now on?
– Yes, determine if the problem is resolved. Return to the service terminal
Detail Problem screen. Select any FRU. Proceed through the repair but
do not replace any FRU or disconnect any cables. This will simulate a
repair and run verification.
– No, go to “MAP 3352: Isolating SSA DASD Drawer Power Problems” on
page 219.
24. Observe the SSA DASD drawer amber check indicator on the drawer power
control panel, see Figure 76 on page 161.
Is the SSA DASD drawer check indicator on or blinking?
v Yes, go to “MAP 3150: Isolating an SSA DASD Drawer Power Problem” on
page 188.
v No, go to step 33 on page 165.
25. Go to the rear of Drawer-B. Observe the PWR (power) indicators on both
power supply assemblies, see Figure 77 on page 162.
Are both PWR indicators off?
v Yes, go to step 27.
v No, go to step 26.
26. At the rear of the drawer, observe the CHK/PWR Good (check/power)
indicators on both power supply assemblies, see Figure 77 on page 162.
Are either of the CHK/PWR indicators on green?
v Yes, go to step 33 on page 165.
v No, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model
040” on page 172.
27. Verify that both drawer power cables are plugged into the drawer power supply
assemblies. Verify that the other ends of these cables are plugged into the
primary power supplies. See ″2105 Model 100 Rack Cable Removals and
Replacements″ in chapter 4, of the 2105 Model 100 Attachment to ESS
Service Guide book to determine where the power cables should be plugged.
Observe the PWR indicators on both of the power supply assemblies.
Are either of the drawer power supply PWR indicators now on?
v Yes, go to step 30 on page 165.
v No, go to step 28.
28. Go to the front of the 2105 Model Exx/Fxx and press the Local Power switch
to On (up), then release it. This should reset any PPS internal circuit breakers
that are tripped.
Are both of the drawer power supply PWR indicators still off?
v Yes, go to “MAP 1320: Isolating Problems Using Visual Symptoms” on
page 58 in chapter 3, volume 1 of this book, and determine if any visual
failure symptoms are present:
– If visual failure symptoms are found, repair them.
– If visual failure symptoms are not found, call your next level of support.
v No, the drawer now has power. Continue with the next step.
29. Run verification to determine if the problem is now resolved. Select any FRU
and go through the verification and repair process, but do not replace any
FRU.
Was verification successful?
v Yes, the problem is resolved. Go to step 41 on page 168.
v No, repair the new problem that was generated by the verification process.
30. At the rear of the drawer, observe the CHK/PWR GOOD indicators on both of
the power supply assemblies.
Are either of the CHK/PWR GOOD indicators on green?
v Yes, the drawer now has power, go to step 29 on page 164.
v No, continue with the next step.
31. At the rear of the drawer, locate the power switch on each power supply
assembly:
Note: Pull the switch out before moving it up or down.
a. Set both power switches to off (down).
b. Wait about 10 seconds.
c. Set both power switches to on (up).
Are either of the CHK/PWR GOOD indicators on green?
v Yes, the drawer now has power, go to step 29 on page 164.
v No, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model
040” on page 172.
32. Locate DDM bay-B; it may be in the front or rear of the 2105. Observe
all of the DDM bay DDM indicators, see Figure 78 on page 163.
Are any of the DDM bay indicators on?
v Yes, go to step 33.
v No, there is a DDM bay power problem, go to “MAP 3395: Isolating an SSA
DASD DDM Bay Power Problem” on page 259.
33. Locate the SSA cable displayed on the service terminal as a possible FRU.
For this isolation procedure, the SSA cable will be connected between two
separate drawers or DDM bays. The service terminal FRU Location will identify
the drawer and SSA connector to which each end of the SSA cable is
connected. To locate a drawer, see ″Locating a DDM Bay or SSA DASD Model
020 or 040 Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage
Server Service Guide, Volume 3. Use the drawing below to locate SSA cable
connectors on a drawer. Select the cable shown on the service terminal for
repair.
Figure 79. SSA DASD Model 020 and 040 drawer SSA Connectors (S008762p)
Figure labels: 7133 Model 020 (Rear View) and 7133 Model 040 (Rear View), each showing SSA connectors J1, J4, J5, J8, J9, J12, J13, and J16.
Figure 80. DDM Bay SSA Connectors (S007693l)
a. Disconnect the SSA device cable between the two drawers.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
b. Inspect the cable connectors for bent pins and correct any problems found.
Reconnect both ends of the SSA device cable, ensure good connection.
c. Run the repair verification. Determine if the problem is resolved. Return to
the service terminal Detail Problem screen. Select any FRU. Proceed
through the repair but do not replace any FRU or disconnect any cables.
This will simulate a repair and run verification.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 41 on page 168.
v No, go to step 34.
34. Replace the first of the two DDMs displayed on the service terminal, then
verify the repair.
Note: If the amber check indicator on one of the two DDMs is on, replace that
DDM first, see Figure 6 on page 14.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 41 on page 168.
v No, the SSA link is still failing, go to step 35.
35. Replace the second DDM displayed on the service terminal with the DDM
removed in step 34, then verify the repair.
Note: The service terminal will determine if the second DDM being replaced is
in the same array as the first DDM. If both DDMs are in the same array,
the service terminal will instruct you to wait for sparing to complete.
When sparing for the first DDM replacement completes, the second
DDM can be replaced.
DDM sparing time can be many hours. Sparing time varies with system usage
and the storage capacity of the DDM being spared. An 18 GB drive may take
36 hours to spare on a heavily used system.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 41 on page 168.
v No, the SSA link is still failing, go to step 36.
36. Replace the first of the two passthrough or bypass cards displayed on the
service terminal, then verify the repair. See ″Bypass and Passthrough Cards,
DDM Bay″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2.
If you are replacing a bypass card, verify the jumpers on the bypass card are
in the correct positions before replacing the card, see the ″SSA DASD Model
020 and 040 Drawer Bypass Card Jumper Settings″ figure in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 41 on page 168.
v No, the SSA link is still failing, go to step 37.
37. Replace the second passthrough or bypass card displayed on the service
terminal, then verify the repair. See ″Bypass and Passthrough Cards, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Use the card removed in step 36.
If you are replacing a bypass card, verify the jumpers on the bypass card are
in the correct positions before replacing the card, see the ″SSA DASD
Model 020 and 040 Drawer Bypass Card Jumper Settings″ figure in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 41 on page 168.
v No, the SSA link is still failing, go to step 38 on page 168.
38. Replace the SSA device cable displayed on the service terminal, see ″SSA
Cables, DDM Bay and 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 41.
v No, the SSA link is still failing, go to step 39.
39. Replace the backplane in Drawer-A, see “MAP 3400: Replacing an SSA DASD
Drawer Backplane or Frame” on page 263.
Note: For SSA DASD Model 040 drawers or DDM bays, the backplanes are
replaced by replacing the frame (DDM bay) assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 41.
v No, the SSA link is still failing, go to step 40.
40. Replace the backplane in Drawer-B, see “MAP 3400: Replacing an SSA DASD
Drawer Backplane or Frame” on page 263, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 41.
v No, the SSA link is still failing, call the next level of support.
41. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
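The power-indicator decisions for a Model 040 Drawer-A (steps 12 and 13 above) can be summarized as a small decision function. This is a hedged sketch for readers who prefer to see the branching in one place. The function name and the boolean inputs are hypothetical, and the returned strings simply name the step or MAP that the procedure sends you to.

```python
from typing import Sequence


def model_040_power_check(
    pwr_on: Sequence[bool],
    chk_pwr_good_green: Sequence[bool],
) -> str:
    """Model steps 12 and 13 of MAP 3100 for a Model 040 Drawer-A.

    pwr_on: state of the PWR indicator on each power supply assembly.
    chk_pwr_good_green: whether each CHK/PWR Good indicator is on green.
    Both inputs are assumptions for illustration; the real observation
    is made by looking at the rear of the drawer.
    """
    # Step 12: both PWR indicators off means the drawer has no input power.
    if not any(pwr_on):
        return "step 14: verify the drawer power cables"
    # Step 13: at least one CHK/PWR Good indicator green means power is good.
    if any(chk_pwr_good_green):
        return "step 20: continue with Drawer-B"
    # Otherwise the power supplies are not delivering power to the drawer.
    return "MAP 3105: isolate the loss of power to the Model 040"


if __name__ == "__main__":
    print(model_040_power_check([False, False], [False, False]))
    print(model_040_power_check([True, False], [True, False]))
    print(model_040_power_check([True, True], [False, False]))
```

The same pattern applies to Drawer-B in steps 25 and 26, with the branch targets changed to steps 27 and 33.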
MAP 3101: Isolating a Degraded SSA Link
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The 40 MB/s SSA link between two DDMs is degraded and is running at 20 MB/s.
The degraded link runs between two DDMs in different drawers or DDM bays,
through two signal and/or bypass cards and the SSA cable that links them. See
Figure 81 for the relationship of the DDM, signal and/or bypass card, and
backplane FRUs involved with this failure.
DDM locations in drawers
v SSA DASD Model 040:
– Drawer-A DDM 1, 4, 5, 8, 9, 13, or 16
– Drawer-B DDM 1, 4, 5, 8, 9, 13, or 16
v DDM bays:
– Both are DDM 8
Figure 81. SSA Link Failure, Passthrough/Bypass Cards and Two DDMs (S007650l)
Figure labels: Drawer-A, Drawer-B, DDM (two), Passthrough or Bypass Cards, SSA Device Cable, Backplane or DDM Bay Backplane (Front or Back) (two).
Isolation
1. Locate the SSA cable displayed on the service terminal as a possible FRU. For
this isolation procedure, the SSA cable will be connected between two separate
drawers or DDM bays. The service terminal FRU Location will identify the
drawer and SSA connector to which each end of the SSA cable is connected.
To locate a drawer, see Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3. Use the drawing below to locate SSA cable connectors on a
drawer. Select the cable shown on the service terminal for repair.
Figure 82. SSA DASD Model 020 and 040 Drawer SSA Connectors (S008762p)
Figure labels: 7133 Model 020 (Rear View) and 7133 Model 040 (Rear View), each showing SSA connectors J1, J4, J5, J8, J9, J12, J13, and J16.
Figure 83. DDM bay SSA Connectors (S007693l)
a. Disconnect both ends of the SSA device cable between the two drawers.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
b. Inspect the cable connectors for bent pins and correct any problems found.
There should be six pins in each plug. If there are fewer than six pins, replace
the cable. Reconnect both ends of the SSA device cable and ensure a good
connection.
c. Run the repair verification. Determine if the problem is resolved. Return to
the service terminal Detail Problem screen. Select any FRU. Proceed
through the repair but do not replace any FRU or disconnect any cables.
This will simulate a repair and run verification.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 9 on page 172.
v No, continue with the next step.
2. Replace the first of the two passthrough or bypass cards displayed on the
service terminal, then verify the repair. See Bypass and Passthrough Cards,
DDM Bay in chapter 4 of the Enterprise Storage Server Service Guide, Volume
2.
If you are replacing a bypass card, verify the jumpers on the bypass card are in
the correct positions before replacing the card, see the SSA DASD Model 020
and 040 Drawer Bypass Card Jumper Settings figure in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 9 on page 172.
v No, the SSA link is still degraded, continue with the next step.
3. Replace the second passthrough or bypass card displayed on the service
terminal, then verify the repair. See Bypass and Passthrough Cards, DDM Bay
in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Use the card removed in step 2.
If you are replacing a bypass card, verify the jumpers on the bypass card are in
the correct positions before replacing the card, see the SSA DASD Model 020
and 040 Drawer Bypass Card Jumper Settings figure in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 9 on page 172.
v No, the SSA link is still degraded, continue with the next step.
4. Replace the SSA device cable displayed on the service terminal, see SSA
Cables, DDM Bay and 7133 Model 020/040 in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 9 on page 172.
v No, the SSA link is still degraded, go to step 5.
5. Replace the first of the two DDMs displayed on the service terminal, then verify
the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 9 on page 172.
v No, the SSA link is still degraded, continue with the next step.
6. Replace the second DDM displayed on the service terminal with the DDM
removed in step 5, then verify the repair.
Note: The service terminal will determine if the second DDM being replaced is
in the same array as the first DDM. If both DDMs are in the same array,
the service terminal will instruct you to wait for sparing to complete.
When sparing for the first DDM replacement completes, the second DDM
can be replaced.
DDM sparing time can be many hours. Sparing time varies with system usage
and the storage capacity of the DDM being spared. An 18 GB drive may take 36
hours to spare on a heavily used system.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 9.
v No, the SSA link is still degraded, continue with the next step.
7. Replace the backplane in Drawer-A, see “MAP 3400: Replacing an SSA DASD
Drawer Backplane or Frame” on page 263.
Note: For SSA DASD Model 040 drawers or DDM bays, the backplanes are
replaced by replacing the frame (DDM bay) assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 9.
v No, the SSA link is still degraded, continue with the next step.
8. Replace the backplane in Drawer-B, see “MAP 3400: Replacing an SSA DASD
Drawer Backplane or Frame” on page 263, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 9.
v No, the SSA link is still degraded, call the next level of support.
9. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
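The note in step 6 describes a gating rule for the second DDM replacement. The following Python sketch restates that rule. The function name and its parameters are hypothetical, introduced only for illustration; the service terminal makes this determination itself. The sketch also assumes that DDMs in different arrays need no wait, which the note implies but does not state.

```python
def may_replace_second_ddm(first_array: str, second_array: str,
                           first_sparing_complete: bool) -> bool:
    """Return True when the second DDM can be replaced now.

    Assumption: array membership is given as a simple label per DDM.
    """
    if first_array != second_array:
        # Different arrays: the note describes no sparing wait (implied).
        return True
    # Same array: wait until sparing for the first DDM replacement completes.
    return first_sparing_complete
```

For example, with both DDMs in the same array and sparing still running, the function returns False, which corresponds to the service terminal instructing you to wait.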
MAP 3105: Isolating a Loss of Power to a SSA DASD Model 040
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Another MAP determined that both power supply assemblies in a SSA DASD Model
040 drawer are failing to provide power to the drawer. This MAP determines the
cause of the SSA DASD Model 040 power problem.
v Drawer model, SSA DASD Model 040
v One or more of the four indicators on the drawer power supply assemblies is not
on (green).
Isolation:
1. Go to the rear of the SSA DASD Model 040 drawer. Observe the PWR (power)
indicators and the CHK/PWR Good (check/power) indicators on both power
supply assemblies.
Do the following steps on the power supply assembly with both indicators NOT
on green:
a. Set both of the PWR/FLT Reset SW (power/fault reset switches) on the rear
of the power supply assembly to Off (down).
Note: Pull a switch out before moving it up or down.
b. Wait 20 seconds for power to drop completely.
c. Set both of the switches to On (up).
Figure 84. SSA DASD Model 040 Power Supply Locations (S008019m)
2. Are the CHK/PWR Good (check/power) indicators on both drawer power supply
assemblies off?
v Yes, call your next level of support.
v No, the problem may be resolved. Verify the repair, go to “MAP 3500: Verifying an
SSA DASD Drawer Repair” on page 279.
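The decision in step 2 of this MAP can be summarized in a few lines of Python. This is a minimal sketch for illustration only: the function name, the constant name, and the tuple representation of the two CHK/PWR Good indicators are assumptions, not part of any service terminal interface.

```python
from typing import Tuple

# Wait time from step 1b, between switching the supplies Off and On.
RESET_WAIT_SECONDS = 20


def map_3105_next_action(chk_pwr_good_on: Tuple[bool, bool]) -> str:
    """Choose the next action after the power/fault reset in step 1.

    chk_pwr_good_on holds one flag per power supply assembly,
    True when that assembly's CHK/PWR Good indicator is on.
    """
    if not any(chk_pwr_good_on):
        # Both indicators are off: the reset did not restore power.
        return "call next level of support"
    # At least one indicator is on: the problem may be resolved.
    return "verify repair (MAP 3500)"
```

Calling map_3105_next_action((False, False)) selects the escalation path; any other combination selects repair verification.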
MAP 3120: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
An SSA link failed between a DDM and the SSA device card. See Figure 85 for the
relationship of the DDM, passthrough or bypass card, backplane, SSA device cable
and SSA device card FRUs involved with this failure.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Figure 85. SSA Link Failure, Passthrough or Bypass Card Link Between a DDM and SSA
Device Card (S007652l)
Isolation
1. Review if any other problems (pending or open) have a single DDM or SSA
device card as the FRU.
Are there any pending or open problems with a single DDM or SSA device
card as the FRU?
v Yes, go to step 2.
v No, go to step 3.
2. Compare the single DDM or SSA device card FRU in the pending or open
problem with the DDM in the problem you are working on.
Is the FRU in the open or pending problem the same as the FRU in the
problem you are working on?
v Yes, repair the open or pending problem with the single FRU first. That repair
should fix the problem you are working on.
v No, go to step 3.
3. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, verify that the SSA cables are connected correctly, go to step 4.
v No, continue with step 6 on page 175.
4. Verify that the SSA cables are connected correctly. Look at the SSA cables
displayed on the Detail Problem screen. Compare the SSA cables displayed
with the cabling of the drawer or DDM bay. See ″Locating an SSA Cable″ in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
Are any of the SSA cables connected wrong?
v Yes, connect the SSA cables to the correct connectors, go to step 5.
v No, go to step 6 on page 175.
5. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select the cable you just connected correctly. Proceed
through the repair but do not replace any FRU or disconnect any cables. This
will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Go to step 27 on page 180.
v No, go to step 6.
6. Determine if the drawer in Figure 75 on page 159 is a Model 040.
Is the drawer a Model 040?
v Yes, go to step 13 on page 176.
v No, go to step 7.
7. Determine if the drawer in Figure 75 on page 159 is a DDM bay.
Is the drawer a DDM bay?
v Yes, go to step 20 on page 177.
v No, go to step 8.
8. Use Figure 86 in the following steps to locate the switch and indicators on the
SSA DASD drawer power control panel:
Note: The drawer is a SSA DASD Model 020 drawer.
[Figure callouts: Power Switch (On/Off), Power Indicator (green), Check Indicator (amber)]
Figure 86. SSA DASD Model 020 Power Control Panel Locations (S008020m)
9. Locate the SSA DASD Model 020. Observe the SSA DASD drawer green
power indicator on the drawer power control panel.
Is the green drawer power indicator on?
v Yes, go to step 12 on page 176.
v No, continue with the next step.
10. Press and release the drawer power switch, on the drawer power control
panel. Observe the green drawer Power On indicator.
Is the drawer Power On indicator on (green)?
v Yes, determine if the problem is resolved. Return to the service terminal
Detail Problem screen. Select any FRU. Proceed through the repair but do
not replace any FRU or disconnect any cables. This will simulate a repair
and run verification. Continue with the next step.
v No, go to “MAP 3352: Isolating SSA DASD Drawer Power Problems” on
page 219.
11. Was the verification successful?
v Yes, the problem is resolved, at the service terminal continue the repair
process to return the resources to the customer and close the problem.
v No, repair the new problem from the verification process.
12. Observe the SSA DASD drawer amber check indicator on the drawer power
control panel.
Is the amber Check indicator on or flashing?
v Yes, the SSA DASD drawer check indicator is on or blinking, go to “MAP
3150: Isolating an SSA DASD Drawer Power Problem” on page 188.
v No, the SSA DASD drawer check indicator is off, go to step 21 on page 178.
13. Go to the rear of the 7133 Model 040 drawer. Observe the PWR (power)
indicators on both power supply assemblies.
Are both PWR indicators off?
v Yes, go to step 15.
v No, go to step 14.
14. At the rear of the drawer, observe the CHK/PWR Good (check/power)
indicators on both power supply assemblies.
Is either CHK/PWR Good indicator on (green)?
v Yes, go to step 21 on page 178.
v No, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model
040” on page 172.
15. Verify that both drawer power cables are plugged into the drawer power supply
assemblies. Verify that the other ends of these cables are plugged into the
primary power supplies. See ″Bulk Power Supply Connection Physical Location
Codes″ in chapter 7 of the 2105 Model 100 Attachment to ESS Service Guide
book to determine where the power cables should be plugged. Go to the rear
of the drawer and observe the PWR indicators on both of the power supply
assemblies.
Is either of the drawer power supply PWR indicators now on?
v Yes, go to step 18 on page 177.
v No, go to step 16.
16. Go to the front of the 2105 Model E10/E20 and press the Local Power switch
to On (up), then release it. This should reset any PPS internal circuit breakers
that are tripped.
Note: Pressing the Local Power switch momentarily to On (up) clears any
power errors that were generated by the failure. It also restores any
power that was removed because of these failures. It does not affect
2105 power.
Are both of the drawer power supply PWR indicators still off?
v Yes, go to “MAP 1320: Isolating Problems Using Visual Symptoms” on
page 58 in chapter 3, volume 1 of this book, and determine if any visual
failure symptoms are present:
– If visual failure symptoms are found, repair them.
– If visual failure symptoms are not found, call your next level of support.
v No, the drawer now has power. Continue with the next step.
17. Run verification to determine if the problem is now resolved. Select any FRU
and go through the verification and repair process, but do not replace any
FRU.
Was verification successful?
v Yes, the problem is resolved. Go to step 27 on page 180.
v No, repair the new problem that was generated by the verification process.
18. Go to the rear of the drawer and observe the CHK/PWR GOOD indicators on
both of the power supply assemblies.
Is either of the CHK/PWR GOOD indicators on (green)?
v Yes, the drawer now has power, go to step 17 on page 176.
v No, continue with the next step.
19. Go to the rear of the drawer and locate the power switch on each power
supply assembly:
Note: Pull the switch out before moving it up or down.
a. Set both power switches to off (down).
b. Wait about 10 seconds.
c. Set both power switches to on (up).
Is either of the CHK/PWR GOOD indicators on (green)?
v Yes, the drawer now has power, go to step 17 on page 176.
v No, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model
040” on page 172.
Figure 87. SSA DASD Model 040 Power Supply Indicator Locations (S008019m)
20. Locate the DDM bay; it may be in the front or rear of the 2105.
Observe all of the DDM bay DDM indicators, see Figure 88 on page 178.
Are any of the DDM bay indicators on?
v Yes, go to step 21 on page 178.
v No, there is a DDM bay power problem, go to “MAP 3395: Isolating an SSA
DASD DDM Bay Power Problem” on page 259.
Figure 88. DDM bay DDM Indicator Locations (S008021l)
21. Locate the SSA cable displayed on the service terminal as a possible FRU.
For this isolation procedure, the SSA cable will be connected between a
drawer and an SSA device card. The service terminal FRU Location will
identify the drawer and its SSA connector, and the SSA device card and its
SSA connector. To locate a drawer, see ″Locating a DDM Bay or SSA DASD
Model 020 or 040 Drawer in a 2105 Rack″ in chapter 7 of the Enterprise
Storage Server Service Guide, Volume 3. To locate SSA cable connectors on a
drawer, see Figure 89.
Note: The SSA device card cable connector is in the format R1-Tx-P2-Kx-yy,
where
v R1 is rack 1
v Tx is the cluster, 1 or 2
v P2 is the cluster planar
v Kx is the SSA device card location, slot
v yy is the cable connector, A1, A2, B1, or B2
To locate an SSA device card cable connector, see Figure 89 and Figure 90 on
page 179.
Figure 89. DDM bay SSA Connector Locations (S007693l)
[Figure: front view of Cluster 1 and Cluster 2 (Model Exx/Fxx) showing SSA device card connectors A1, A2, B1, and B2, and SSA device card locations R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy, and R1-Tx-P2-K9-yy. The figure marks some locations "Model F10/F20 only" and "Model E10/E20 only".]
Figure 90. Cluster SSA Device Card SSA Connector Locations (S008022m)
a. Disconnect the SSA device cable from the SSA device card and the SSA
DASD drawer.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
b. Inspect the cable connectors for bent pins and correct any problems found.
Reconnect both ends of the SSA device cable and ensure a good connection.
c. Run the repair verification. Go to the Problem Detail screen on the service
terminal. Select any FRU for replacement, then go through the repair and
verification procedure, but do not remove or replace any FRU. This
verifies whether the problem is resolved.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 27 on page 180.
v No, go to step 22.
22. Replace the DDM displayed on the service terminal, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 27 on page 180.
v No, the SSA link is still failing, go to step 23.
23. Replace the SSA device card displayed on the service terminal, then verify the
repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 27 on page 180.
v No, the SSA link is still failing, go to step 24.
24. Replace the passthrough or bypass card displayed on the service terminal,
then verify the repair. See ″Bypass Cards, 7133 Model 020/040″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
Note: Verify the jumpers on the bypass card are in the correct positions
before replacing the card, see the ″SSA DASD Model 020 and 040
Drawer Bypass Card Jumper Settings″ figure in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 27.
v No, the SSA link is still failing, go to step 25.
25. Replace the SSA device cable displayed on the service terminal probable FRU
list, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 27.
v No, the SSA link is still failing, go to step 26.
26. Replace the backplane or frame assembly displayed on the service terminal:
v SSA DASD Model 020
– Front backplane, see ″Front Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
– Back backplane, see ″Back Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v SSA DASD Model 040
– Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Note: For SSA DASD Model 040 drawers, the backplanes are both
replaced at the same time by replacing the frame assembly.
v DDM bay
– Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 27.
v No, the SSA link is still failing, call the next level of support.
27. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
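Steps 21 through 26 of this MAP follow one pattern: perform a repair action, run repair verification, and stop at the first action after which verification runs without error. The Python sketch below illustrates that pattern. The function name, the action strings, and the verify callback are hypothetical stand-ins; in practice the service terminal drives each replacement and its verification.

```python
from typing import Callable, List, Optional, Tuple

# Repair actions in the order MAP 3120 lists them (steps 21 through 26).
MAP_3120_ACTIONS: List[str] = [
    "reseat SSA device cable",
    "replace DDM",
    "replace SSA device card",
    "replace passthrough or bypass card",
    "replace SSA device cable",
    "replace backplane or frame assembly",
]


def run_repair_ladder(
    actions: List[str],
    verify: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Try each action in order until verification passes.

    verify(action) is a hypothetical callback that returns True when
    repair verification runs without error after that action.
    Returns the action that resolved the problem (or None) and the
    list of actions attempted.
    """
    attempted: List[str] = []
    for action in actions:
        attempted.append(action)
        if verify(action):
            return action, attempted
    # No action resolved the problem: call the next level of support.
    return None, attempted
```

A result of None corresponds to the final "call the next level of support" branch in step 26.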
MAP 3121: Isolating a Degraded SSA Link
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
A 40 MB/s SSA link between a DDM and the SSA device card is degraded and is
running at 20 MB/s. See Figure 91 for the relationship of the DDM, passthrough or
bypass card, backplane, SSA device cable and SSA device card FRUs involved
with this degraded link.
v Drawer models, SSA DASD Model 040 or SSA DASD DDM bay
Figure 91. SSA Link Failure, Passthrough or Bypass Card Link Between a DDM and SSA
Device Card (S007652l)
Isolation
1. Locate the SSA cable displayed on the service terminal as a possible FRU. For
this isolation procedure, the SSA cable will be connected between a drawer and
an SSA device card. The service terminal FRU Location will identify the drawer
and its SSA connector, and the SSA device card and its SSA connector. To
locate a drawer, see ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3. To locate SSA cable connectors on a drawer, see Figure 92
on page 182.
Note: The SSA device card cable connector is in the format R1-Tx-P2-Kx-yy,
where
v R1 is rack 1
v Tx is the cluster, 1 or 2
v P2 is the cluster planar
v Kx is the SSA device card location, slot
v yy is the cable connector, A1, A2, B1, or B2
To locate an SSA device card cable connector, see Figure 92 on page 182 and
Figure 93 on page 182.
Figure 92. DDM bay SSA Connector Locations (S007693l)
[Figure: front view of Cluster 1 and Cluster 2 (Model Exx/Fxx) showing SSA device card connectors A1, A2, B1, and B2, and SSA device card locations R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy, and R1-Tx-P2-K9-yy. The figure marks some locations "Model F10/F20 only" and "Model E10/E20 only".]
Figure 93. Cluster SSA Device Card SSA Connector Locations (S008022m)
a. Disconnect the SSA device cable from the SSA device card and the SSA
DASD drawer.
Note: To prevent damage to the SSA device cable connector screws,
always use the special screwdriver (SSA tool, P/N 32H7059) to turn
them. This screwdriver is in the 2105 ship group.
b. Inspect the cable connectors for bent pins and correct any problems found.
Each connector should have three pins. If there are fewer than three pins,
replace the cable. Reconnect both ends of the SSA device cable and ensure a
good connection.
c. Run the repair verification. Go to the Problem Detail screen on the service
terminal. Select any FRU for replacement, then go through the repair and
verification procedure, but do not remove or replace any FRU. This verifies
whether the problem is resolved.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 7 on page 183.
v No, continue with the next step.
2. Replace the passthrough or bypass card displayed on the service terminal, then
verify the repair. See ″Bypass Cards, 7133 Model 020/040″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: Verify the jumpers on the bypass card are in the correct positions before
replacing the card, see the ″SSA DASD Model 020 and 040 Drawer
Bypass Card Jumper Settings″ figure in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 7.
v No, the SSA link is still degraded, continue with the next step.
3. Replace the DDM displayed on the service terminal, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 7.
v No, the SSA link is still degraded, continue with the next step.
4. Replace the SSA device card displayed on the service terminal, then verify the
repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 7.
v No, the SSA link is still degraded, continue with the next step.
5. Replace the SSA device cable displayed on the service terminal probable FRU
list, then verify the repair.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 7.
v No, the SSA link is still degraded, continue with the next step.
6. Replace the backplane or frame assembly displayed on the service terminal:
v SSA DASD Model 040
– Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Note: For SSA DASD Model 040 drawers, the backplanes are both
replaced at the same time by replacing the frame assembly.
v DDM bay
– Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
Did repair verification run without error?
v Yes, the problem is resolved. Go to step 7.
v No, the SSA link is still degraded, call the next level of support.
7. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
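The SSA device card cable connector format described in the note in step 1 (R1-Tx-P2-Kx-yy) is regular enough to check mechanically. The Python sketch below parses such a location string. The function and class names are hypothetical, and accepting any decimal slot number after K is an assumption; the fixed R1 and P2 fields, the cluster values 1 and 2, and the connector names A1, A2, B1, and B2 come from the note.

```python
import re
from typing import NamedTuple, Optional

# R1 (rack 1) - Tx (cluster 1 or 2) - P2 (cluster planar)
# - Kx (device card slot) - yy (cable connector).
_LOCATION = re.compile(r"R1-T([12])-P2-K(\d+)-(A1|A2|B1|B2)")


class SsaConnector(NamedTuple):
    cluster: int     # 1 or 2
    slot: int        # SSA device card location (slot)
    connector: str   # A1, A2, B1, or B2


def parse_ssa_connector(code: str) -> Optional[SsaConnector]:
    """Parse a location such as 'R1-T2-P2-K4-B1'; return None if malformed."""
    match = _LOCATION.fullmatch(code.strip())
    if match is None:
        return None
    return SsaConnector(int(match.group(1)), int(match.group(2)), match.group(3))
```

For instance, parse_ssa_connector("R1-T2-P2-K4-B1") yields cluster 2, slot 4, connector B1, while a string naming cluster 3 or connector C1 is rejected.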
MAP 3123: Array Repair Required
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
This failure indicates that a DDM failure occurred during an array build. The array
needs to be rebuilt.
v Drawer models, SSA DASD Model 020 or 040 drawer, or DDM bay
Isolation
1. Repair any other problems before continuing with this MAP.
2. Display the problem and record the information with the FRU Engineering
Name. This information should be rank## or ssa##, where ## is a one- or
two-digit number.
3. Record the SRN and the rank or SSA number, then call your next level of
support. They will help you and the system operator through the array disband
and rebuild.
4. This problem will have to be manually closed after the rebuild is started.
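Step 2 expects the FRU Engineering Name to have the form rank## or ssa## with a one or two digit number. A small Python sketch of that check follows, useful when recording the value for the next level of support. The function name is hypothetical, and the sketch assumes the prefix is written in lowercase exactly as shown in the step.

```python
import re
from typing import Optional, Tuple

# "rank" or "ssa" followed by a one or two digit number.
_FRU_NAME = re.compile(r"(rank|ssa)(\d{1,2})")


def parse_fru_engineering_name(name: str) -> Optional[Tuple[str, int]]:
    """Return (kind, number) for names like 'rank7' or 'ssa12', else None."""
    match = _FRU_NAME.fullmatch(name.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

A value that does not parse, such as a three digit number, suggests the wrong field was recorded.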
MAP 3124: Isolating Between DDM Hardware and Microcode Failures
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
This failure indicates that either the hardware or the microcode of a DDM has failed.
This MAP determines which has failed.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Display the problem logs. From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
2. Review the SRN portion of each one-line problem description.
Does this same SRN appear in more than one problem?
v Yes, this is a complex problem that the maintenance procedures are unable
to resolve. Call your next level of support.
v No, select the DDM in this problem for replacement. Follow the service
terminal instructions for the replacement of the DDM.
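The test in step 2 is a duplicate check across the problem log. The Python sketch below shows one way to express it. The function name and the list-of-strings representation of the log are assumptions, and the SRN values used with it are placeholders, not real service request numbers.

```python
from collections import Counter
from typing import List


def map_3124_action(current_srn: str, logged_srns: List[str]) -> str:
    """Decide the next action from the SRNs of all problems needing repair.

    logged_srns is a hypothetical list with one SRN per logged problem,
    including the problem being worked on.
    """
    counts = Counter(logged_srns)
    if counts[current_srn] > 1:
        # Same SRN in more than one problem: a complex problem.
        return "call next level of support"
    # The SRN is unique to this problem: replace the DDM it names.
    return "replace DDM"
```

With a log of ["SRN-A", "SRN-B", "SRN-A"], working on "SRN-A" leads to escalation, while working on "SRN-B" leads to DDM replacement.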
MAP 3125: Isolating an Unexpected SSA SRN
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
The cluster received an unexpected service request number (SRN) from the SSA.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, continue with step 2.
v No, continue with step 3.
2. Look at the SSA cables displayed on the Detail Problem screen. Compare the
SSA cables displayed with the cabling of the drawer or DDM bay.
Are any of the SSA cables connected wrong?
v Yes, connect the cables to the correct connectors. Verify the repair, go to
“MAP 3500: Verifying an SSA DASD Drawer Repair” on page 279.
v No, go to step 3.
3. The problem cannot be corrected with a service procedure.
4. Call your next level of support.
Note: An unassisted repair can disrupt customer operation and may lose
customer data.
MAP 3126: Isolating an Unexpected SSA Test Result
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
The cluster received unexpected results from the SSA.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Determine if the SSA jumpers or SSA cables to the failing drawer have just
been changed or installed.
Have the SSA jumpers or cables just been changed or installed?
v Yes, continue with step 2.
v No, continue with step 3.
2. Look at the SSA cables displayed on the Detail Problem screen. Compare the
SSA cables displayed with the cabling of the drawer or DDM bay.
Are any of the SSA cables connected wrong?
v Yes, connect the jumper cables to the correct connectors. Verify the repair,
go to “MAP 3500: Verifying an SSA DASD Drawer Repair” on page 279.
v No, continue with the next step.
3. Check if there are any other open problems:
v If there are no other problems to repair, go to step 5 on page 186.
v If there are other problems, repair them before continuing with this MAP, then
continue with the next step.
4. If this problem is still open after repairing the other problems, continue with the
next step.
5. The problem cannot be corrected with a service procedure.
6. Call your next level of support.
Note: An unassisted repair can disrupt customer operation and may lose
customer data.
MAP 3127: Formatting of a DDM Has Not Completed
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
A disk drive module (DDM) is still formatting from a previous installation or repair.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Wait for the formatting of the DDM to complete. Formatting is complete when
the indicators on the DDM stop flickering.
2. Retry the verification test.
MAP 3128: Isolating an Unknown DDM Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
DDM Failure(s) have left array(s) with no spares.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Check for any other DDM or SSA problems:
Display problems needing repair.
Press F3 on the service terminal until the Main Service Menu is displayed, then
select:
Repair Menu
Show / Repair Problems Needing Repair.
v If there are other DDM or SSA problems, repair and test them.
v If there are not any other DDM or SSA problems, continue with the next step.
2. Call your next level of support.
MAP 3129: Isolating an Array Repair Required Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
Array is not available for customer use. There may be multiple problems that can
be repaired to restore access. If no problems are found, call your next level of
support.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Check for any other DDM or SSA problems:
Display problems needing repair.
Press F3 on the service terminal until the Main Service Menu is displayed, then
select:
Repair Menu
Show / Repair Problems Needing Repair.
v If there are other DDM or SSA problems, repair and test them.
v If there are not any other DDM or SSA problems, continue with the next step.
2. Call your next level of support.
MAP 3142: Isolating Multiple DDMs on an SSA Loop Cannot be
Accessed
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Multiple DDMs on an SSA loop cannot be accessed.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Check if there are any other open problems:
Note: Priority should be given to problems with the same ssaxx (SSA device
card) or rsDDMxxxx as Failing Resource.
Note the problem ID of the problem you are working on. To find other problems,
press F3 until the Main Service Menu is displayed.
From the service terminal Main Service Menu, select:
Repair Menu
Show/Repair Problems
v If there are no other problems that can be repaired, go to step 3.
v If there are other problems, repair them before continuing with this MAP, then
continue with the next step.
2. If this problem is still open after repairing the other problems, continue with the
next step.
3. The problem cannot be corrected with a service procedure.
4. Call your next level of support.
Note: An unassisted repair can disrupt customer operation and may lose
customer data.
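The note in step 1 gives priority to other problems that name the same ssaxx (SSA device card) or rsDDMxxxx as the Failing Resource. The Python sketch below orders a list of other open problems that way. The function name and the (problem ID, failing resource) tuples are hypothetical, and the resource names shown with it are placeholders that only follow the ssaxx and rsDDMxxxx patterns.

```python
from typing import List, Tuple

Problem = Tuple[str, str]  # (problem ID, failing resource)


def prioritize_problems(current_resource: str,
                        others: List[Problem]) -> List[Problem]:
    """Put problems with the same failing resource first.

    sorted() is stable, so problems keep their original relative
    order within each group.
    """
    return sorted(others, key=lambda problem: problem[1] != current_resource)
```

Given problems on ssa02, ssa01, rsDDM0105, and ssa01, with ssa01 as the current failing resource, the two ssa01 problems move to the front and the rest follow in their original order.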
MAP 3150: Isolating an SSA DASD Drawer Power Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
You might have been sent here because:
v The system problem determination procedures sent you here.
v Another MAP sent you here.
v Drawer model, SSA DASD Model 020 drawer
Isolation
1. Review the service terminal screen that sent you to this MAP. One of the FRUs
named is rsssaPwrTray#, where # is a one- or two-digit number. Use the FRU
location to determine which SSA DASD drawer to service.
2. Inspect the failing SSA DASD drawer.
Is the SSA DASD drawer emitting smoke or a smell of burning?
v Yes, perform the following actions:
a. If the SSA DASD drawer is powered on, power it off, refer to ″Drawer
Power, 7133 Model 020″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Press and release the drawer power switch on the SSA DASD drawer
power control panel.
b. At the back of the SSA DASD drawer, unplug all three power cables from
the fan-and-power-supply assemblies.
c. Allow the SSA DASD drawer to cool.
d. Go to “MAP 3356: Isolating SSA DASD Drawer Power On Problems” on
page 227.
v No, go to step 3 on page 189.
3. Observe the SSA DASD drawer indicators, see “SSA DASD Model 020 Drawer
Indicators and Power Switch” on page 9.
Is this SSA DASD drawer's amber drawer check indicator on or blinking?
v Yes, go to step 4.
v No, go to step 8 on page 191.
4. Check the indicators on the fan-and-power-supply assemblies in the failing SSA
DASD drawer.
Does any fan-and-power-supply assembly in the SSA DASD drawer have its
fan-and-power CHK (check) indicator on or blinking?
Figure 94. SSA DASD Drawer Fan-and-Power-Supply Assembly Indicators (S008029l)
v Yes, check for the following conditions:
– If the fan-and-power CHK (check) indicator is permanently on, go to step 5
on page 190.
– If the fan-and-power CHK (check) indicator is blinking:
a. Select the rsssaM2PwrSup## listed on the service terminal as a FRU for
the problem being repaired. Follow the service terminal instructions to
replace the fan-and-power-supply assembly, see ″Fan and Power
Supply Assembly, 7133 Model 020″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
b. Return to the service terminal and verify the repair:
- If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources to
the customer.
- If repair verification is not successful, go to step 6 on page 190.
v No, go to “MAP 3354: Isolating an SSA DASD Drawer Multiple DDM
Redundant Visual Power Fault” on page 223.
Figure 95. 2105 Model Exx/Fxx Operator Panel Locations (S008810m)
[Figure labels, front view: Unit Emergency; Local Power; Ready (Cluster 1, Cluster 2); Power Complete; Line Cord 1; Line Cord 2; Messages (Cluster 1, Cluster 2)]
5. Check the indicators on the fan-and-power-supply assemblies in the failing SSA
DASD drawer.
Does any fan-and-power-supply assembly whose fan-and-power CHK (check)
indicator is on have its PWR (power) indicator on?
Note: The fan-and-power supply PWR (power) indicators may be hidden
behind the fan mounting latches.
v Yes, perform the following repairs:
a. Select the rsssaM2PwrSup## listed on the service terminal as a FRU for
the problem being repaired. Follow the service terminal instructions to
replace the fan-and-power-supply assembly, see ″Fan and Power Supply
Assembly, 7133 Model 020″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
b. Return to the service terminal and verify the repair:
– If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources
to the customer.
– If repair verification is not successful, go to step 6.
v No, go to “MAP 3350: Isolating SSA DASD Drawer Power Problems” on
page 212.
6. Check the indicators on the fan-and-power-supply assembly #1 in the failing
SSA DASD drawer.
Is the fan-and-power CHK (check) indicator on fan-and-power-supply assembly
#1 (Fan 1 on Figure 94 on page 189) on or blinking?
v Yes, perform the following:
a. Select the rsssaPwrTray## listed on the service terminal as a FRU for the
problem being repaired. Follow the service terminal instructions to replace
the right-power-distribution-tray assembly in the failing drawer, see
″Power Distribution Tray Assembly, 7133 Model 020″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
b. Check the Power Complete, Line Cord 1 and 2 indicators in Figure 95, on
the front of the 2105 Model Exx/Fxx.
– If both indicators are on, go to step 6c.
– If either indicator is off or blinking, press the Local Power switch in
Figure 95 on page 190, to On (up) for two seconds then release it. Go
to step 6c.
Note: Pressing the Local Power switch resets any tripped electronic
circuit breakers in the PPS that control power to the SSA DASD
drawer.
c. Return to the service terminal and verify the repair:
– If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources
to the customer.
– If repair verification is not successful, go to step 7.
v No, go to step 7.
7. Check the indicators on the fan-and-power-supply assembly #3 in the failing
SSA DASD drawer.
Is the fan-and-power CHK (check) indicator on fan-and-power-supply assembly
#3 (Fan 3 on Figure 94 on page 189) on or blinking?
v Yes, perform the following:
a. Select the rsssaPwrTray## listed on the service terminal as a FRU for the
problem being repaired. Follow the service terminal instructions to replace
the left-power-distribution-tray assembly in the failing drawer, see ″Power
Distribution Tray Assembly, 7133 Model 020″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
b. Check the Power Complete, Line Cord 1 and 2 indicators in Figure 95 on
page 190, on the front of the 2105 Model Exx/Fxx.
– If both indicators are on, go to step 7c.
– If either indicator is off or blinking, press the Local Power switch in
Figure 95 on page 190, to On (up) for two seconds then release it. Go
to step 7c.
Note: Pressing the Local Power switch resets any tripped electronic
circuit breakers in the PPS that control power to the SSA DASD
drawer.
c. Return to the service terminal and verify the repair:
– If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources
to the customer.
– If repair verification is not successful, go to step 8.
v No, call your next level of support.
8. Check the Drawer Power indicator on the failing SSA DASD drawer.
Is this SSA DASD drawer's Drawer Power indicator off?
v Yes, go to step 9.
v No, call your next level of support.
9. Check if the SSA DASD drawer is powered on (check whether any disk drive
modules have indicators that are on).
Is the failing SSA DASD drawer powered on?
v Yes, go to “MAP 3352: Isolating SSA DASD Drawer Power Problems” on
page 219.
v No, perform the following actions:
a. Power on the SSA DASD drawer, press and release the drawer power
switch on the SSA DASD drawer power control panel.
b. Go to step 2 on page 188 in this MAP.
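The indicator checks in steps 3, 4, 5, 8, and 9 of this MAP form a small decision tree. The following is a minimal sketch of that tree, not part of the service terminal code. The names Led, DrawerObservation, and next_action are hypothetical, and the sketch assumes that a permanently-on CHK indicator is evaluated before a blinking one when both are present, which the MAP does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Led(Enum):
    OFF = "off"
    ON = "on"
    BLINKING = "blinking"


@dataclass
class DrawerObservation:
    """What the service representative sees on one Model 020 drawer (hypothetical type)."""
    drawer_check: Led           # amber drawer check indicator (step 3)
    fan_chk: tuple              # CHK indicator of each fan-and-power-supply assembly
    fan_pwr_on: tuple           # PWR indicator of each assembly (True = on)
    drawer_power_on: bool       # Drawer Power indicator (step 8)
    any_ddm_indicator_on: bool  # any DDM indicator lit (step 9)


def next_action(obs: DrawerObservation) -> str:
    """Return the first action that steps 3, 4, 5, 8, and 9 lead to."""
    if obs.drawer_check is not Led.OFF:                   # step 3: Yes
        if all(chk is Led.OFF for chk in obs.fan_chk):    # step 4: No
            return "go to MAP 3354"
        if any(chk is Led.ON for chk in obs.fan_chk):     # step 4: permanently on, so step 5
            chk_and_pwr = any(
                chk is Led.ON and pwr
                for chk, pwr in zip(obs.fan_chk, obs.fan_pwr_on)
            )
            if chk_and_pwr:
                return "replace fan-and-power-supply assembly, then verify"
            return "go to MAP 3350"
        # step 4: CHK indicator blinking
        return "replace fan-and-power-supply assembly, then verify"
    if obs.drawer_power_on:                               # step 8: No
        return "call next level of support"
    if obs.any_ddm_indicator_on:                          # step 9: Yes
        return "go to MAP 3352"
    return "power on the drawer, then return to step 2"   # step 9: No
```

The sketch stops at the first repair or exit; steps 6 and 7 (power distribution trays) are only reached after a failed repair verification and are not modeled here.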
MAP 3151: Isolating an SSA DASD Drawer Visual Power Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
v Drawer model, SSA DASD Model 020 drawer
Isolation
You might have been sent here because:
v A visual symptom sent you here.
v Another MAP sent you here.
v A customer observed a problem that was not detected by the system problem
determination procedures.
1. Inspect the failing SSA DASD drawer.
Is the SSA DASD drawer emitting smoke or a smell of burning?
v Yes, perform the following actions:
a. If the SSA DASD drawer is powered on, power it off, refer to ″Drawer
Power, 7133 Model 020″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Press and release the drawer power switch on the SSA DASD drawer
power control panel.
b. At the back of the SSA DASD drawer, unplug all three power cables from
the fan-and-power-supply assemblies.
c. Allow the SSA DASD drawer to cool.
d. Go to “MAP 3356: Isolating SSA DASD Drawer Power On Problems” on
page 227.
v No, go to step 2.
2. Use the service terminal to determine if there are any related power problems
with the RPC or SSA DASD drawer.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair.
Are there any open power problems (SSA, DDM, or RPC card)?
v Yes, follow the instructions on the service terminal to repair the power
problem. This repair should also fix your visual symptom.
v No, from the visual symptoms you should already know the SSA DASD
drawer location, go to step 3 on page 193.
3. Observe the SSA DASD drawer indicators, see “SSA DASD Model 020 Drawer
Indicators and Power Switch” on page 9.
Is this SSA DASD drawer's amber drawer check indicator on or blinking?
v Yes, go to step 4.
v No, go to step 8 on page 195.
4. Check the indicators on the fan-and-power-supply assemblies in the failing SSA
DASD drawer.
Does any fan-and-power-supply assembly in the SSA DASD drawer have its
fan-and-power CHK (check) indicator on or blinking?
Figure 96. SSA DASD Drawer Fan-and-Power-Supply Assembly Indicators (S008029l)
v Yes, check for the following conditions:
– If the fan-and-power CHK (check) indicator is permanently on, go to step 5
on page 194.
– If the fan-and-power CHK (check) indicator is blinking:
a. Replace the fan-and-power-supply assembly with the blinking CHK
(check) indicator, see ″Fan and Power Supply Assembly, 7133 Model
020″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2.
b. Go to “MAP 3500: Verifying an SSA DASD Drawer Repair” on
page 279 to verify the repair.
- If repair verification is not successful, go to step 6 on page 194.
- If repair verification is successful, go to “MAP 3360: Ending a
DASD Service Action” on page 231.
v No, go to “MAP 3354: Isolating an SSA DASD Drawer Multiple DDM
Redundant Visual Power Fault” on page 223.
Figure 97. 2105 Model Exx/Fxx Operator Panel Locations (S008810m)
[Figure labels, front view: Unit Emergency; Local Power; Ready (Cluster 1, Cluster 2); Power Complete; Line Cord 1; Line Cord 2; Messages (Cluster 1, Cluster 2)]
5. Check the indicators on the fan-and-power-supply assemblies in the failing SSA
DASD drawer.
Does any fan-and-power-supply assembly whose fan-and-power CHK (check)
indicator is on have its PWR (power) indicator on?
Note: The fan-and-power supply PWR (power) indicators may be hidden
behind the fan mounting latches.
v Yes, perform the following repairs:
a. Replace all fan-and-power-supply assemblies whose
fan-and-power-supply CHK (check) and PWR (power) indicators
are both on, see ″Fan and Power Supply Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
b. Go to “MAP 3500: Verifying an SSA DASD Drawer Repair” on page 279
to verify the repair.
– If repair verification is successful, go to “MAP 3360: Ending a DASD
Service Action” on page 231.
– If repair verification is not successful, go to step 6.
v No, go to “MAP 3351: Isolating SSA DASD Drawer Visual Power Problems”
on page 216.
6. Check the indicators on fan-and-power-supply assembly number 1 in the failing
SSA DASD drawer.
Is the fan-and-power CHK (check) indicator on the fan-and-power-supply
assembly number 1 (Fan 1 on Figure 96 on page 193) on or blinking?
v Yes, perform the following:
a. Replace the right-power-distribution-tray assembly in the failing drawer,
see ″Power Distribution Tray Assembly, 7133 Model 020″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
b. Check the Power Complete, Line Cord 1 and 2 indicators in Figure 97, on
the front of the 2105 Model Exx/Fxx.
– If both indicators are on, go to step 6c on page 195.
– If either indicator is off or blinking, press the Local Power switch in
Figure 97 on page 194, to On (up) for two seconds then release it. Go
to step 6c.
Note: Pressing the Local Power switch resets any tripped electronic
circuit breakers in the PPS that control power to the SSA DASD
drawer.
c. Go to “MAP 3500: Verifying an SSA DASD Drawer Repair” on page 279
to verify the repair.
– If repair verification is not successful, go to step 7.
– If repair verification is successful, go to “MAP 3360: Ending a
DASD Service Action” on page 231.
v No, go to step 7.
7. Check the indicators on fan-and-power-supply assembly number 3 in the failing
SSA DASD drawer.
Is the fan-and-power CHK (check) indicator on the fan-and-power-supply
assembly number 3 (Fan 3 on Figure 96 on page 193) on or blinking?
v Yes, perform the following:
a. Replace the left-power-distribution-tray assembly in the failing drawer, see
″Power Distribution Tray Assembly″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
b. Check the Power Complete, Line Cord 1 and 2 indicators in Figure 97 on
page 194, on the front of the 2105 Model Exx/Fxx.
– If both indicators are on, go to step 7c.
– If either indicator is off or blinking, press the Local Power switch in
Figure 97 on page 194, to On (up) for two seconds then release it. Go
to step 7c.
Note: Pressing the Local Power switch resets any tripped electronic
circuit breakers in the PPS that control power to the SSA DASD
drawer.
c. Go to “MAP 3500: Verifying an SSA DASD Drawer Repair” on page 279
to verify the repair.
– If repair verification is not successful, go to step 8.
– If repair verification is successful, go to “MAP 3360: Ending a
DASD Service Action” on page 231.
v No, call your next level of support.
8. Check the Drawer Power indicator on the failing SSA DASD drawer.
Is this SSA DASD drawer's Drawer Power indicator off?
v Yes, go to step 9.
v No, call your next level of support.
9. Check if the SSA DASD drawer is powered on (check whether any disk drive
modules have indicators that are on).
Is the failing SSA DASD drawer powered on?
v Yes, go to “MAP 3352: Isolating SSA DASD Drawer Power Problems” on
page 219.
v No, perform the following actions:
a. Power on the SSA DASD drawer, press and release the drawer power
switch on the SSA DASD drawer power control panel.
b. Go to step 2 on page 192 in this MAP.
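Steps 6 and 7 of this MAP pair each fan-and-power-supply assembly with a power distribution tray, then decide whether the Local Power switch must be pressed before verification. The following is a minimal sketch of those two decisions. The function names are hypothetical, and the sketch reads "both indicators are on" as "every checked operator panel indicator is steadily on", which is an interpretation rather than something the MAP spells out.

```python
# Assembly number -> power distribution tray to replace (steps 6 and 7).
TRAY_FOR_ASSEMBLY = {1: "right", 3: "left"}


def tray_to_replace(assembly_number, chk_on_or_blinking):
    """Return 'right', 'left', or None when no tray replacement is called for."""
    if not chk_on_or_blinking:
        return None
    return TRAY_FOR_ASSEMBLY.get(assembly_number)


def needs_local_power_reset(indicators):
    """indicators maps an operator panel indicator name to 'on', 'off', or 'blinking'.

    Pressing the Local Power switch (On, two seconds) resets tripped electronic
    circuit breakers in the PPS, so it is needed whenever an indicator is not on.
    """
    return any(state != "on" for state in indicators.values())
```

For example, tray_to_replace(3, True) selects the left tray, and a blinking Line Cord 2 indicator makes needs_local_power_reset return True.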
MAP 3155: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The SSA link between two drawers is failing. The failing link is between two SSA
DASD Model 020 drawers. DDM Z was found next to DDM X in the loop when
DDM Y was expected. The problem could be the bypass card in either drawer, or
Drawer B could be powered off. See Figure 98 for the relationship of the drawers
and bypass card and DDMs involved with this failure.
Figure 98. SSA Link Failure, Two SSA DASD Drawers (S007653n)
[Figure labels: Drawer-A (7133 Model 020) with DDM X; Drawer-B (7133 Model 020); SSA Device Cable; two Bypass Cards; DDM Y; DDM Z; three Backplanes (Front or Back)]
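The symptom in the description (DDM Z found next to DDM X when DDM Y was expected) amounts to comparing the expected loop order with the neighbor actually seen. The following is a minimal sketch of that comparison, assuming the expected order is known as a simple list; the function name skipped_ddms is hypothetical and does not correspond to any service terminal function.

```python
def skipped_ddms(expected_order, ddm, found_next):
    """Return the DDMs expected between ddm and the neighbor actually found.

    expected_order lists the DDMs in loop order; the loop wraps around, so the
    walk continues from the last entry back to the first.
    """
    n = len(expected_order)
    target = expected_order.index(found_next)
    k = (expected_order.index(ddm) + 1) % n
    skipped = []
    while k != target:
        skipped.append(expected_order[k])
        k = (k + 1) % n
    return skipped
```

With an expected order of X, Y, Z, a report that Z follows X yields Y as the skipped DDM, which points at the link into Drawer B.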
Isolation
1. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, continue with step 2.
v No, continue with step 3 on page 197.
2. Look at the SSA cables displayed on the Detail Problem screen. Compare the
SSA cables displayed with the cabling of the drawer or DDM bay. See ″Locating
an SSA Cable″ in chapter 7 of the Enterprise Storage Server Service Guide,
Volume 3.
Are any of the SSA cables connected wrong?
v Yes, connect the cables to the correct connectors. Determine if the problem
is resolved. Return to the service terminal Detail Problem screen. Select any
of the FRUs. Proceed through the repair but do not replace any FRU or
disconnect any cables. This will simulate a repair and run verification.
v No, go to step 3.
3. Locate drawer-B, see ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3. Look at the Possible FRUs to Replace list. The Bypass Card
with 35% probability will be in Drawer B. Use the location code of that card to
find Drawer B. Use Figure 99 in the following steps to locate the switch and
indicators on the SSA DASD drawer-B power control panel:
v Power Switch (On/Off)
v Power Indicator (green)
v Check Indicator (amber)
Figure 99. SSA DASD Model 020 Power Control Panel Locations (S008020m)
4. Observe the green drawer power indicator on the power control panel of the
SSA DASD drawer shown in the service terminal FRU list (Drawer B).
Is the green drawer power indicator on?
v Yes, continue with step 7 on page 198.
v No, power the SSA DASD drawer on.
Press and release the drawer power switch on the drawer power control
panel.
5. Observe the green drawer power indicator on the power control panel.
Is the green drawer power indicator on?
v Yes, powering the drawer on may have fixed the problem. Determine if the
problem is resolved. Return to the service terminal Detail Problem screen.
Select the bypass card for repair. Proceed through the repair but do not
replace any FRU or disconnect any cables. This will simulate a repair and run
verification. Continue with the next step.
v No, the drawer has a power problem, go to “MAP 3352: Isolating SSA DASD
Drawer Power Problems” on page 219.
6. Was the verification successful?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, call your next level of support.
7. Replace the bypass card that is the first FRU card in the problem Possible
FRUs to Replace list, then verify the repair.
Was the verification successful?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, continue with the next step.
8. Replace the other bypass card, then verify the repair.
Was the verification successful?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, call your next level of support.
MAP 3158: Isolating an SSA Link Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The SSA link between two drawers is failing. The failing link is between two SSA
DASD Model 040 drawers. DDM Z was found next to DDM X in the loop when
DDM Y was expected. The problem could be the bypass card in either drawer, or
Drawer B could be powered off. See Figure 100 on page 199 for the relationship of
the drawers and bypass card and DDMs involved with this failure.
Figure 100. SSA Link Failure, Two SSA DASD Drawers (S007654n)
[Figure labels: Drawer-A (7133 Model 040) with DDM X; Drawer-B (7133 Model 040); SSA Device Cable; two Bypass Cards; DDM Y; DDM Z; three Backplanes (Front or Back)]
Isolation
1. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, continue with step 2.
v No, continue with step 3.
2. Look at the SSA cables displayed on the Detail Problem screen. Compare the
SSA cables displayed with the cabling of the drawers and/or DDM bays. See
″Locating an SSA Cable″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3.
Are any of the SSA cables connected wrong?
v Yes, connect the SSA cables to the correct connectors. Determine if the
problem is resolved. Return to the service terminal Detail Problem screen.
Select one of the cables. Proceed through the repair but do not replace any
FRU or disconnect any cables. This will simulate a repair and run verification.
– If the repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources to
the customer.
– If the repair verification is not successful, continue with the next step.
v No, go to step 3.
3. Locate drawer-B, see ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3. Look at the possible FRUs to Replace list. The bypass card
with 35% probability will be in Drawer B. Use the location code of that card to
find Drawer B.
Go to the rear of drawer-B. Observe the PWR (power) indicators on both power
supply assemblies.
Are both PWR indicators off?
v Yes, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model 040”
on page 172.
v No, go to step 4.
4. Observe the CHK/PWR Good (check/power) indicators on both power
supply assemblies.
Is either of the indicators on (green)?
v Yes, go to step 6.
v No, go to step 5.
Figure 101. SSA DASD Model 040 Power Supply Assembly Locations (S008019m)
5. Power both drawer power supplies Off then On:
a. Turn the Power/Reset switches on both power supply assemblies Off, pull
the switch out then push down.
b. Wait about twenty seconds.
c. Turn the Power/Reset switches on both power supply assemblies On, pull
the switch out then push up.
Is either of the green CHK/PWR indicators now On?
v Yes, determine if the problem is resolved. Return to the service terminal
Detail Problem screen. Select the bypass card for repair. Proceed through the
repair but do not replace any FRU or disconnect any cables. This will
simulate a repair and run verification.
v Return to the service terminal and verify the repair.
– If the repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources to
the customer.
– If the repair verification is not successful, continue with the next step.
v No, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model 040”
on page 172.
6. Select and replace the first bypass card indicated by the service terminal, see
″Bypass Cards, 7133 Model 020/040″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
7. Return to the service terminal and verify the repair.
v If the repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources to the
customer.
v If the repair verification is not successful, continue with the next step.
8. Select and replace the second bypass card indicated by the service terminal,
see ″Bypass Cards, 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
9. Return to the service terminal and verify the repair.
v If the repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources to the
customer.
v If the repair verification is not successful, call your next level of support.
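The power indicator checks in steps 3 through 5 of this MAP can be summarized as two small functions, one for the state before the power cycle and one for the state after it. This is a minimal sketch only; both function names are hypothetical, and each argument is assumed to hold one True/False value per power supply assembly.

```python
def map_3158_power_check(pwr_on, chk_pwr_good_green):
    """Steps 3 and 4: decide what to do from the indicators on both supplies."""
    if not any(pwr_on):              # step 3: both PWR indicators are off
        return "go to MAP 3105"
    if any(chk_pwr_good_green):      # step 4: either CHK/PWR Good indicator is green
        return "go to step 6: replace the first bypass card"
    return "go to step 5: power both supplies Off, wait about 20 seconds, then On"


def map_3158_after_power_cycle(chk_pwr_green_after):
    """Step 5: decide what to do once both supplies have been powered Off then On."""
    if any(chk_pwr_green_after):
        return "simulate a repair on the bypass card and run verification"
    return "go to MAP 3105"
```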
MAP 3160: SSA DASD Drawer Isolating a Single DDM Redundant
Power Fault
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Only one DDM in the SSA DASD drawer is detecting a loss of redundant power or
cooling. This MAP helps you to isolate FRUs that are causing a power problem on
a SSA DASD drawer.
v Drawer models, SSA DASD Model 020 or 040 drawer or DDM bay
Isolation
v In the sequence shown, replace the following FRUs with new ones. After
replacement of each FRU, verification will test to see if the problem is fixed. If
verification completes successfully, the problem is resolved. If verification fails,
you will be directed to replace the next FRU and run verification again.
1. Replace the disk drive module, see ″SSA Disk Drive Model, 7133 Model 020/040″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Did repair verification run without error?
– Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
– No, the failure is still present, go to step 2.
2. Replace the backplane or frame displayed on the service terminal:
– SSA DASD Model 020 drawer
- Front backplane, see ″Front Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
- Back backplane, see ″Back Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
– SSA DASD Model 040 drawer
- Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
– DDM bay
- Frame assembly, see ″Frame Assembly, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Note: The DDM bay backplane is replaced by replacing the DDM Bay
frame assembly.
3. Return to the service terminal and verify the repair.
– If the repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources to
the customer.
– If the repair verification is not successful, call your next level of support.
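The isolation sequence in this MAP follows a common pattern: replace one FRU, run verification, and move to the next FRU only if verification fails. The following is a minimal sketch of that pattern. The function replace_until_verified and its replace_and_verify callback are hypothetical stand-ins for the manual replacement and the service terminal verification; they are not service terminal functions.

```python
from typing import Callable, Iterable, Optional


def replace_until_verified(
    frus: Iterable[str],
    replace_and_verify: Callable[[str], bool],
) -> Optional[str]:
    """Replace FRUs in the given order until verification succeeds.

    Returns the FRU whose replacement fixed the problem, or None when every
    FRU was replaced and verification still fails (call next level of support).
    """
    for fru in frus:
        if replace_and_verify(fru):
            return fru
    return None
```

For this MAP the ordered list would be the disk drive module followed by the backplane or frame assembly displayed on the service terminal.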
MAP 3180: Controller Card Failed or Wrong Drawer Type Installed
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
There are two possible causes of this error symptom:
1. A controller card has failed in a SSA DASD Model 040 or a DDM bay.
2. A SSA DASD Model 020 drawer has been installed where a SSA DASD Model
040 or DDM bay was expected.
If an attempt was made to install a different type of drawer onto an SSA loop than
was expected, the condition must be corrected. All of the drawers on the SSA loop
must be uninstalled then reinstalled. If the customer has any data on the SSA loop,
they will need to off load the data and reload it after the reinstallation.
v Drawer models, SSA DASD Model 020 or 040 drawer or DDM bay
Isolation
1. Use the service terminal to locate the controller card displayed as a Possible
FRU to Replace. Copy down the Resource Name of the card (rs40CtlCdxx or
rs8pkctlrxx). Also copy down the FRU Location Description for this controller
card (Rr-Yxx-CA or Rr-Ux-Wx-C5).
Is the Resource Name you recorded ″rs40CtlCdxx″?
v Yes, a SSA DASD Model 040 drawer was expected, copy down Model 040
then continue with the next step.
v No, the Resource Name you recorded is ″rs8pkctlrxx″. A DDM bay was
expected, copy down DDM bay then continue with the next step.
2. Locate the SSA DASD drawer or DDM bay indicated by the FRU
Location Description for the controller card. Ignore the (-CA or -C5) in the FRU
location code. Use ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3 to locate the SSA DASD drawer or DDM bay and to determine
which type of drawer is installed at that location.
Is the DDM bay or SSA DASD drawer the same type as you wrote down earlier
(Model 040 or DDM bay)?
v Yes, go to step 3 to repair the controller card.
v No, the wrong type of drawer was indicated when this drawer was installed.
This drawer must be removed and reinstalled. Be sure to enter the correct
drawer type information this time. The drawer will need to be installed on a
different loop if the loop was mixed (7133s and DDM bays on the same loop).
Go to step 4 to remove the drawer.
3. Select the controller card listed with Possible FRUs to replace using the service
terminal. See ″Controller Card Assembly, 7133 Model 040″ or ″Controller Card,
DDM Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume
2.
Did repair verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem. Do not perform any more steps in this MAP; follow the
instructions on the service terminal to end the call.
v No, the problem is not resolved, call your next level of support.
4. Use the service terminal to remove the drawer.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Remove Device Drawers
Select the drawer line with the Resource Location that matches the
controller card location, without the -CA or -C5. Continue through the
instructions to remove the drawer.
5. Use the service terminal to install the drawer.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Install a Device Drawer
Follow the install process, be sure to enter the correct drawer type
information this time.
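Steps 1 and 2 of this MAP rely on two small text conventions: the Resource Name prefix tells which drawer type was expected, and the drawer location is the controller card location without its -CA or -C5 suffix. The following is a minimal sketch of both conventions. The function names are hypothetical, and the sample values used with them (such as rs40CtlCd01 or R1-Y03-CA) are made up to match the patterns shown in step 1.

```python
def expected_drawer_type(resource_name):
    """Map a controller card Resource Name to the drawer type that was expected."""
    if resource_name.startswith("rs40CtlCd"):
        return "Model 040"
    if resource_name.startswith("rs8pkctlr"):
        return "DDM bay"
    raise ValueError(f"not a controller card resource name: {resource_name}")


def drawer_location(controller_location):
    """Drop the controller card suffix (-CA or -C5) to get the drawer location."""
    for suffix in ("-CA", "-C5"):
        if controller_location.endswith(suffix):
            return controller_location[: -len(suffix)]
    return controller_location
```

The stripped location is the value to match against the Resource Location when selecting the drawer line in step 4.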
MAP 3190: Wrong Drawer Type Installed
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
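This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.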
Description
A SSA DASD Model 040 drawer or a DDM bay has been installed where a different
drawer type was expected. All of the drawers on the SSA loop must be uninstalled
then reinstalled. If the customer has any data on the SSA loop, they will need to
offload the data, then reload it after the reinstallation.
v Drawer models, SSA DASD Model 040 drawer or DDM bay
Isolation
1. Use the service terminal to locate the drawer or DDM bay displayed as a
Possible FRU to Replace. Copy down the FRU Location Description (Rr-Yxx or
Rr-Ux-Wx).
2. Locate the improperly installed drawer. Use the location code copied down in
the last step. Use ″Locating a DDM Bay or SSA DASD Model 020 or 040
Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3 to locate the SSA DASD drawer or DDM bay, and to determine
which type of drawer is installed at that location. This drawer will need to be
removed from the loop and then reinstalled using the correct drawer type.
3. Use the service terminal to remove the drawer.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Remove Device Drawers
Select the drawer line with the Resource Location that matches the
location copied down in step 1. Continue through the instructions to
remove the drawer.
4. Use the service terminal to install the drawer.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Install a Device Drawer
Follow the install process. Be sure to enter the correct drawer type
information this time.
Note: 7133 drawers and DDM bays may not be mixed on the same
loop. If the previous installation attempted to mix both on the
same loop, this installation must be to an empty loop or to one
where all drawers are the same type.
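The loop rule in the note above can be sketched as a simple check. This Python example is illustrative only: the function name `can_install_on_loop` and the type labels `"7133"` and `"DDM bay"` are assumptions made for the sketch and are not part of the service terminal.

```python
def can_install_on_loop(new_drawer_type, installed_types):
    """Return True if a drawer of new_drawer_type may join the loop.

    installed_types lists the types of the drawers already on the loop.
    An empty loop accepts either type; otherwise every installed drawer
    must be the same type as the new drawer.
    """
    return all(t == new_drawer_type for t in installed_types)


# Hypothetical loops used only to exercise the check.
print(can_install_on_loop("7133", []))              # empty loop
print(can_install_on_loop("DDM bay", ["DDM bay"]))  # same type
print(can_install_on_loop("7133", ["DDM bay"]))     # mixed loop
```

A loop that already holds both types fails the check for either drawer type, which matches the requirement to move the installation to an empty loop or to a loop of one type.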
MAP 3200: Uninstalled SSA DDMs Connected to Loop A
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Installation of SSA DDM drawers on loop B failed, when loop A on the same SSA
device card had uninstalled DDMs. The SSA cables attached to loop A must be
disconnected.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Use the service terminal to locate the SSA device card displayed as a Possible
FRU to Replace. Copy down the FRU location.
2. Locate the cluster and the SSA device card using the information below and in
Figure 102.
Note: The SSA device card connector location is in the format R1-Tx-P2-Kx-yy,
where:
v R1 is rack 1
v Tx is the cluster, 1 or 2
v P2 is the cluster planar
v Kx is the SSA device card location, slot
v yy is the cable connector, A1, A2, B1, or B2
[Figure: front view of cluster 1 and cluster 2 (Model Exx/Fxx) showing the SSA
device cards and their connectors B2, B1, A2, and A1. Card locations, in the
order labeled: R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy,
(Model F10/F20 only), R1-Tx-P2-K9-yy, (Model E10/E20 only).]
Figure 102. Cluster SSA Device Card Locations (S008022m)
3. Disconnect the SSA device cables from SSA device card connectors A1 and A2
on the indicated card.
Note: To prevent damage to the SSA device cable connector screws, always
use the special screwdriver (SSA tool, P/N 32H7059) to turn them. This
screwdriver is in the 2105 ship group.
4. Locate the same SSA device card position in the other cluster. Disconnect the
SSA device cables from connectors A1 and A2 on this card also.
5. Go to the service terminal and press F3 until the Main Service Menu is
displayed.
Restart the installation process.
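The connector location format described in step 2 can be checked mechanically. The following Python sketch is an illustration, not a service tool: the function name `parse_connector_location` is hypothetical, and the pattern assumes the slot is written as K followed by digits.

```python
import re

# Pattern for the format R1-Tx-P2-Kx-yy described in the note in step 2.
LOCATION_RE = re.compile(r"^R1-T([12])-P2-K(\d+)-(A1|A2|B1|B2)$")


def parse_connector_location(code):
    """Split a connector location code into cluster, card slot, and connector."""
    match = LOCATION_RE.match(code)
    if match is None:
        raise ValueError(f"not a valid connector location: {code!r}")
    cluster, slot, connector = match.groups()
    return {
        "cluster": int(cluster),
        "slot": int(slot),
        "connector": connector,
        "loop": connector[0],  # A1 and A2 are on loop A, B1 and B2 on loop B
    }


print(parse_connector_location("R1-T2-P2-K3-A1"))
```

A code that does not match the format raises `ValueError`, which is a quick way to catch a location that was copied down incorrectly.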
MAP 3210: Uninstalled SSA DDMs Connected to Loop B
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Installation of SSA DDM drawers on loop A failed, when loop B on the same SSA
device card had uninstalled DDMs. The SSA cables attached to loop B must be
disconnected.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Isolation
1. Use the service terminal to locate the SSA device card displayed as a Possible
FRU to Replace. Copy down the FRU location.
2. Locate the cluster and the SSA device card using the information below and in
Figure 103.
Note: The SSA device card connector location is in the format R1-Tx-P2-Kx-yy,
where:
v R1 is rack 1
v Tx is the cluster, 1 or 2
v P2 is the cluster planar
v Kx is the SSA device card location, slot
v yy is the cable connector, A1, A2, B1, or B2
[Figure: front view of cluster 1 and cluster 2 (Model Exx/Fxx) showing the SSA
device cards and their connectors B2, B1, A2, and A1. Card locations, in the
order labeled: R1-Tx-P2-K1-yy, R1-Tx-P2-K2-yy, R1-Tx-P2-K3-yy, R1-Tx-P2-K4-yy,
(Model F10/F20 only), R1-Tx-P2-K9-yy, (Model E10/E20 only).]
Figure 103. Cluster SSA Device Card Locations (S008022m)
3. Disconnect the SSA device cables from SSA device card connectors B1 and B2
on the indicated card.
Note: To prevent damage to the SSA device cable connector screws, always
use the special screwdriver (SSA tool, P/N 32H7059) to turn them. This
screwdriver is in the 2105 ship group.
4. Locate the same SSA device card position in the other cluster. Disconnect the
SSA device cables from connectors B1 and B2 on this card also.
5. Go to the service terminal and press F3 until the Main Service Menu is
displayed.
Restart the installation process.
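Steps 3 and 4 disconnect the same two connectors on the same card position in both clusters. As a rough sketch, the four connector locations can be listed from the card slot and the loop. The function name `connectors_to_disconnect` and the example slot number are assumptions made for illustration.

```python
def connectors_to_disconnect(card_slot, loop):
    """List the connector locations to disconnect in both clusters.

    card_slot is the K number of the indicated SSA device card and
    loop is "A" or "B", the loop that has the uninstalled DDMs.
    """
    if loop not in ("A", "B"):
        raise ValueError("loop must be 'A' or 'B'")
    return [
        f"R1-T{cluster}-P2-K{card_slot}-{loop}{port}"
        for cluster in (1, 2)
        for port in (1, 2)
    ]


# Example: loop B on the card in slot K3 (slot chosen for illustration).
for location in connectors_to_disconnect(3, "B"):
    print(location)
```

The list follows the location format R1-Tx-P2-Kx-yy, so each entry can be compared directly with the labels in the figure.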
MAP 3220: Isolating too Few DDMs in an SSA DASD DDM Bay
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The wrong number of DDMs was found where eight were expected.
v Drawer model, SSA DASD DDM Bay
v Disk drive module (DDM) locations in DDM bay:
– New DDM locations: 1, 2, 3, 4, 5, 6, 7, and 8
[Figure: DDM bay with DDM positions 1 through 8, each marked N.
N = Newly Installed DDM]
Figure 104. Expected SSA DASD Drawer DDM Locations (S007657l)
Isolation
1. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, continue with step 2.
v No, continue with step 3 on page 208.
2. Verify that the SSA cables are connected correctly. Look at the cables displayed
on the Detail Problem screen. Compare the cables displayed with the cabling of
the drawer or DDM bay.
Are any of the cables connected wrong?
v Yes, connect the cables to the correct connectors. Use the service terminal
to verify that the problem is resolved. Select the cable that was incorrectly
connected from the cable list and continue through verification without
replacing the cable.
v No, go to step 3.
3. Check the drawer in the Additional Message area to see if the correct number of
DDMs are installed. See Figure 104 on page 207.
v All eight slots should contain DDMs. If too few new DDMs are installed,
remove any dummy DDMs and replace them with new DDMs.
Were any additional DDMs installed in the DDM bay?
v Yes, to verify that the problem has been corrected, select any cable from the
service terminal. Continue through verification without replacing the cable.
v No, go to step 4.
4. Observe the indicators on the following FRUs at the front of the DDM bay:
v DDMs (eight)
v Bypass card
v Controller card
Are any of the indicators on?
v Yes, call your next level of support.
v No, go to “MAP 3395: Isolating an SSA DASD DDM Bay Power Problem” on
page 259.
Figure 105. DDM bay Indicator Locations (S008018l)
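The isolation steps of MAP 3220 form a short decision sequence. The Python sketch below restates that sequence for review purposes only. The function name `map_3220_next_action` and its boolean parameters are hypothetical, and the returned strings paraphrase the actions in the steps above.

```python
def map_3220_next_action(cables_just_changed, cables_connected_wrong,
                         added_ddms, any_indicator_on):
    """Return the next action for MAP 3220 as a short string."""
    # Steps 1 and 2: check the cabling only if it was just changed or installed.
    if cables_just_changed and cables_connected_wrong:
        return "reconnect cables, then verify without replacing the cable"
    # Step 3: all eight slots should contain DDMs.
    if added_ddms:
        return "verify by selecting any cable, without replacing it"
    # Step 4: indicators on the DDMs, bypass card, and controller card.
    if any_indicator_on:
        return "call next level of support"
    return "go to MAP 3395"


# No cable changes, no DDMs added, and every indicator is off.
print(map_3220_next_action(False, False, False, False))
```

Each branch corresponds to one Yes or No answer in steps 1 through 4, so the function can be read side by side with the procedure.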
MAP 3280: Isolating too Few DDMs in an SSA Drawer
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The wrong number of new DDMs was found where 16 were expected.
v Drawer models, SSA DASD Model 020 or 040 drawer
v Disk drive module (DDM) locations in drawer:
– New DDM locations: 1 to 16
Figure 106. Expected SSA DASD Drawer DDM Locations (s007319l)
Isolation
1. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, continue with step 2.
v No, continue with step 3.
2. Verify that the SSA cables are connected correctly. Look at the cables
displayed on the Detail Problem screen. Compare the cables displayed with
the cabling of the drawer or DDM bay. See ″Locating an SSA Cable″ in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
Are any of the cables connected wrong?
v Yes, select the incorrectly connected cable, then connect the SSA cables to
the correct connectors. Continue the call through verification.
v No, go to step 3.
3. Determine if the drawer you are working on is a Model 040 drawer.
Is the drawer a Model 040?
v Yes, go to step 7 on page 210.
v No, go to step 4.
4. Go to the front of the 2105 and locate the SSA DASD drawer with the DDM
shown for replacement. See ″SSA DASD Drawer Component Physical
Location Codes, Model 020 Drawer″ in chapter 7 of the Enterprise Storage
Server Service Guide, Volume 3.
Use Figure 107 on page 210 in the following steps to locate the switch and
indicators on the SSA DASD drawer-B power control panel:
Power Switch (On/Off)
Power Indicator (green)
Check Indicator (amber)
Figure 107. SSA DASD Model 020 Power Control Panel Locations (S008020m)
5. Observe the green drawer power indicator on the power control panel of the
SSA DASD drawer shown in the service terminal FRU list (Drawer B).
Is the green drawer power indicator on?
v Yes, continue with step 9 on page 211.
v No, power the SSA DASD drawer on.
Press and release the drawer power switch on the drawer power control
panel.
6. Observe the green drawer power indicator on the power control panel.
Is the green drawer power indicator on?
v Yes, powering the drawer on may have fixed the problem. Determine if the
problem is resolved, go to “MAP 3500: Verifying an SSA DASD Drawer
Repair” on page 279.
v No, the drawer has a power problem, go to “MAP 3352: Isolating SSA
DASD Drawer Power Problems” on page 219.
7. Go to the rear of the drawer. Observe the PWR (power) indicators on both
power supply assemblies.
Are both PWR indicators off?
v Yes, go to “MAP 3105: Isolating a Loss of Power to a SSA DASD Model
040” on page 172.
v No, continue with the next step.
Figure 108. SSA DASD Model 040 Power Supply Assembly Indicators (S008019m)
8. Observe CHK/PWR Good (check/power) indicators on both power supply
assemblies.
Is either of the indicators on (green)?
v Yes, go to step 9.
v No, go to “MAP 3380: Isolating 7133 Model 040 SSA DASD Drawer Power
Problems” on page 234.
9. Check the drawer in the FRU list to see if the correct number of DDMs are
installed in the correct positions. See Figure 106 on page 209.
If too few new DDMs are installed, remove any dummy DDMs and replace them
with new DDMs.
10. If the problem is not resolved, call your next level of support.
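Steps 3 through 8 of MAP 3280 branch on the drawer model before checking power. The Python sketch below summarizes only that power branch. The function name `map_3280_power_branch` and its parameters are assumptions made for illustration, and the sketch treats any drawer that is not a Model 040 as a Model 020, as step 3 does.

```python
def map_3280_power_branch(model, power_on=False, power_on_after_switch=False,
                          both_pwr_off=False, any_pwr_good_on=False):
    """Return where steps 3 to 8 of MAP 3280 lead for a drawer.

    model is "020" or "040". The boolean arguments record what the
    service representative observes on the drawer indicators.
    """
    if model == "040":
        # Steps 7 and 8: power supply assemblies at the rear of the drawer.
        if both_pwr_off:
            return "MAP 3105"
        return "step 9" if any_pwr_good_on else "MAP 3380"
    # Steps 4 to 6: power control panel at the front of a Model 020 drawer.
    if power_on:
        return "step 9"
    return "MAP 3500" if power_on_after_switch else "MAP 3352"


print(map_3280_power_branch("020", power_on=False, power_on_after_switch=True))
print(map_3280_power_branch("040", both_pwr_off=True))
```

Only when the power checks pass does the procedure reach step 9, where the DDM count itself is examined.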
MAP 3300: Repair Alternate Cluster to Run SSA Loop Test
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so. If
you are not familiar with these MAPs, read “Using the SSA DASD drawer
Maintenance Analysis Procedures (MAPs)” on page 108 first.
Description
During a repair or installation, the SSA Loop Verify test could not run from both
clusters because one of the clusters is failing. To verify SSA loop operation, the
SSA Loop test must be run from both clusters. The other (failing) cluster or cluster
communications must be repaired before the SSA loop repair or installation can be
completed.
v Drawer model, SSA DASD DDM Bay
Isolation
1. Check for open cluster or cluster communications problems.
From the service terminal Main Service Menu, select:
Repair Menu
Show/Repair Problems Needing Repair
Look for all cluster and cluster communications problems.
Were any cluster or cluster communications problems found?
v Yes, go to step 2.
v No, unexpected results, call your next level of support.
2. Repair all cluster and communications problems in the following order:
a. Cluster (local) problems
b. Cluster to cluster communications problems
c. Cluster (alternate) problems
When all cluster and cluster communications problems are resolved, check to
see if the original problem is resolved. Go to “MAP 3500: Verifying an SSA
DASD Drawer Repair” on page 279.
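The repair order in step 2 amounts to sorting the open problems by category. A minimal Python sketch follows. The category labels, the function name `order_repairs`, and the sample problem descriptions are all hypothetical and do not come from the service terminal.

```python
# Repair order from step 2: local cluster first, then cluster to cluster
# communications, then the alternate cluster.
REPAIR_ORDER = [
    "local cluster",
    "cluster to cluster communications",
    "alternate cluster",
]


def order_repairs(problems):
    """Sort (category, description) pairs into the step 2 repair order."""
    return sorted(problems, key=lambda p: REPAIR_ORDER.index(p[0]))


# Hypothetical open problems, listed out of order.
open_problems = [
    ("alternate cluster", "problem C"),
    ("local cluster", "problem A"),
    ("cluster to cluster communications", "problem B"),
]
for category, description in order_repairs(open_problems):
    print(category, "-", description)
```

Because Python's `sorted` is stable, problems within one category keep the order in which they were listed.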
MAP 3350: Isolating SSA DASD Drawer Power Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you to isolate FRUs that are causing a power problem on a SSA
DASD drawer.
You are here because of one or more of the following:
v A fan-and-power-supply assembly has its fan-and-power CHK (check) indicator
on.
v Another MAP sent you here.
v Drawer model, SSA DASD Model 020 or 040 drawer
Isolation
1. Observe PWR (power) indicators on the fan-and-power-supply assemblies in the
failing drawer, see Model 020 drawer in Figure 109 on page 213.
Note: The fan-and-power-supply PWR (power) indicators may be hidden
behind the fan mounting latches.
a. Determine if the fan-and-power-supply with the failing PWR (power) indicator
(off) is in drawer fan position 1, 2, or 3.
b. Observe the PWR (power) indicators on the fan-and-power-supply or power
supply assemblies in the same fan position in the other drawers in the same
rack.
v SSA DASD Model 020 drawers, see PWR (power) indicators on
fan-and-power-supply assemblies 1, 2, and 3.
v SSA DASD Model 040 drawers, see PWR (power) indicators on power
supply assemblies 1 and 2 (position 3 is unused).
Figure 109. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations (S008030p)
Is the PWR (power) indicator off, on another fan-and-power-supply assembly
in the same fan position in another drawer?
v Yes, observe the state of the Power Complete Line Cord indicators. Use
the state of these indicators with “MAP 1320: Isolating Problems Using
Visual Symptoms” on page 58.
v No, go to step 2.
2. Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch to On (up) for about two seconds.
[Figure: front view of the 2105 Model Exx/Fxx operator panel, with items labeled
Unit Emergency, Local Power, Ready (Cluster 1, Cluster 2), Power Complete
(Line Cord 1, Line Cord 2), and Messages (Cluster 1, Cluster 2).]
Figure 110. 2105 Model Exx/Fxx Operator Panel Locations (S008810m)
At the rear of the 2105, is the fan-and-power-supply assembly PWR (power)
indicator still off?
v Yes, go to step 3.
v No, the problem may be resolved. Verify the repair. Select any
fan-and-power-supply assembly shown on the service terminal. Proceed
through the repair but do not replace any FRU or disconnect any cables. This
will simulate a repair and run verification.
3. Is the fan-and-power-supply with the PWR (power) indicator that is off in drawer
fan position 2 (center)?
v Yes, select the fan-and-power-supply from the problem FRU list on the
service terminal. Follow the service terminal instructions to replace the SSA
DASD drawer fan-and-power-supply. See ″Fan and Power Supply Assembly,
7133 Model 020″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Before you verify the repair, do the following:
– Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press
the Local Power switch in Figure 110, to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, go to the service terminal.
Indicate that replacement is complete and verify the repair.
- If verification is not successful, call your next level of support.
- If verification is successful, the problem is resolved. Return to the
service terminal and Continue Repair Process to return the resources to
the customer and cancel the problem.
– If the failing PWR (power) indicator is still off, replace the drawer power
cable connected to the failing fan. See ″2105 Model 100 Rack Cable
Removals and Replacements″ in chapter 4, of the 2105 Model 100
Attachment to ESS Service Guide book. Connect the new cable to the
original connector on the primary power supply.
– Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press
the Local Power switch to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, go to the service terminal.
Indicate that replacement is complete and verify the repair.
- If verification is not successful, call your next level of support.
- If verification is successful, the problem is resolved. Return to the
service terminal and Continue Repair Process to return the resources to
the customer and cancel the problem.
– If the failing PWR (power) indicator is still off, call your next level of
support.
v No, select the fan-and-power-supply from the problem FRU list on the service
terminal. Follow the service terminal instructions to replace the SSA DASD
drawer fan-and-power-supply. See ″Fan and Power Supply Assembly, 7133
Model 020″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2. After the repair, go to step 4.
4. Determine where the drawer power cable, from the failing SSA DASD drawer
fan-and-power-supply, connects to a primary power supply. Also determine
where the other drawer power cable from this drawer connects to the same
primary power supply. Use the cabling information in Figure 111 for primary
power supply connector names and locations.
Figure 111. 2105 Primary Power Supply Connectors (S007380l)
a. Locate where the two drawer power cables, from the failing
fan-and-power-supply, connect to the same primary power supply.
Disconnect both cables from the primary power supply. Swap the two
drawer power cable connectors and reconnect them to the primary power
supply.
b. Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch to On (up) for about two seconds.
At the rear of the 2105, is the SSA DASD drawer PWR (power) indicator still
off?
v Yes, reconnect the swapped drawer power cables to the correct
connectors on the PPS. Replace the SSA DASD drawer
fan-and-power-supply assembly, go to ″Fan and Power Supply Assembly,
7133 Model 020″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2. After the repair:
– Go to the operator panel on the front of the 2105 Model Exx/Fxx.
Press the Local Power switch to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, go to “MAP 3500: Verifying
an SSA DASD Drawer Repair” on page 279.
- If repair verification is successful, go to “MAP 3360: Ending a DASD
Service Action” on page 231.
- If repair verification fails, repair the problem from the verification.
– If the failing PWR (power) indicator is still off, replace the drawer
power cable connected to the failing fan, see ″Cables, 2105 Model
Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2. Connect the new cable to
the original connector on the primary power supply.
– Go to the operator panel on the front of the 2105 Model Exx/Fxx.
Press the Local Power switch to On (up) for about two seconds.
– Return to the service terminal. Select any FRU for repair. Proceed
through the repair but do not replace any FRU or disconnect any
cables. This will simulate repair and run verification.
- If verification is successful, the problem is resolved. Return to the
service terminal and Continue Repair Process to return the
resources to the customer and cancel the problem.
- If repair verification fails, repair the problem from the verification.
v No, the internal electronic circuit breaker for the original connector on the
power supply has failed:
1) Replace the primary power supply with the swapped cables, see
″Primary Power Supply, 2105 Model Exx/Fxx and Expansion
Enclosure″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
2) Reconnect the swapped cables to the correct location on the new
primary power supply.
3) Go to the operator panel on the front of the 2105 Model Exx/Fxx.
Press the Local Power switch to On (up) for about two seconds.
4) Return to the service terminal. Select any FRU for repair. Proceed
through the repair but do not replace any FRU or disconnect any
cables. This will simulate repair and run verification. Verify the repair,
go to
– If verification is successful, the problem is resolved. Return to the
service terminal and Continue Repair Process to return the
resources to the customer and cancel the problem.
– If repair verification fails, repair the problem from the verification.
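The cable swap in step 4 separates a drawer-side fault from a failed connector on the primary power supply (PPS). The Python sketch below restates the two outcomes. The function name `interpret_cable_swap` is hypothetical, and the returned action lists are paraphrases of the step, not service terminal output.

```python
def interpret_cable_swap(pwr_indicator_still_off):
    """Interpret the drawer PWR indicator after swapping the two power cables."""
    if pwr_indicator_still_off:
        # The indicator stays off on a different PPS connector, so the
        # procedure turns to the drawer side of the connection.
        return [
            "reconnect the cables to their correct PPS connectors",
            "replace the fan-and-power-supply assembly",
            "if the indicator stays off, replace the drawer power cable",
        ]
    # The indicator came on with a different PPS connector, so the
    # electronic circuit breaker for the original connector has failed.
    return [
        "replace the primary power supply",
        "reconnect the cables to the correct connectors on the new supply",
    ]


for action in interpret_cable_swap(pwr_indicator_still_off=False):
    print(action)
```

In both outcomes the procedure then presses the Local Power switch and runs verification from the service terminal.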
MAP 3351: Isolating SSA DASD Drawer Visual Power Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you to isolate FRUs that are causing a power problem on a SSA
DASD drawer.
You are here because of one or more of the following:
v A fan-and-power-supply assembly has its fan-and-power CHK (check) indicator
on.
v Another MAP sent you here.
v Drawer model, SSA DASD Model 020 or 040 drawer
Isolation
1. Observe PWR (power) indicators on the fan-and-power-supply assemblies in the
failing drawer, see Model 020 drawer in Figure 112 on page 218.
Note: The fan-and-power-supply PWR (power) indicators may be hidden
behind the fan mounting latches.
a. Determine if the fan-and-power-supply with the failing PWR (power) indicator
(off) is in drawer fan position 1, 2, or 3.
b. Observe the PWR (power) indicators on the fan-and-power-supply or power
supply assemblies in the same fan position in the other drawers in the same
rack.
v SSA DASD Model 020 drawers, see PWR (power) indicators on
fan-and-power-supply assemblies 1, 2, and 3.
v SSA DASD Model 040 drawers, see PWR (power) indicators on power
supply assemblies 1 and 2 (position 3 is unused).
Is the PWR (power) indicator off, on another fan-and-power-supply or power
supply assembly in the same fan position in another drawer?
v Yes, observe the state of the Power Complete Line Cord indicators. Use the
state of these indicators with “MAP 1320: Isolating Problems Using Visual
Symptoms” on page 58 in chapter 3, volume 1 of this book.
v No, go to step 2 on page 218.
Figure 112. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations (S008030p)
2. Go to the operator panel on the front of the 2105 Model Exx/Fxx, rack 1. Press
the Local Power switch to On (up) for about two seconds.
At the rear of the 2105, is the SSA DASD drawer PWR (power) indicator still
off?
v Yes, go to step 3 on page 219.
v No, the problem may be resolved. Verify the repair, go to “MAP 3500:
Verifying an SSA DASD Drawer Repair” on page 279.
– If repair verification is successful, go to “MAP 3360: Ending a DASD
Service Action” on page 231.
– If repair verification fails, repair the problem from the verification.
[Figure: front view of the 2105 Model Exx/Fxx operator panel, with items labeled
Unit Emergency, Local Power, Ready (Cluster 1, Cluster 2), Power Complete
(Line Cord 1, Line Cord 2), and Messages (Cluster 1, Cluster 2).]
Figure 113. 2105 Model Exx/Fxx Operator Panel Locations (S008810m)
3. Is the fan-and-power-supply with the PWR (power) indicator that is off in drawer
fan position 2 (center)?
v Yes, replace the SSA DASD drawer fan-and-power-supply assembly, go to
″Fan and Power Supply Assembly, 7133 Model 020″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
After the repair:
v Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, go to “MAP 3500: Verifying an
SSA DASD Drawer Repair” on page 279.
- If repair verification is successful, go to “MAP 3360: Ending a DASD
Service Action” on page 231.
- If repair verification fails, repair the problem from the verification.
– If the failing PWR (power) indicator is still off, replace the drawer power
cable connected to the failing fan, see ″Cables, 2105 Model Exx/Fxx and
Expansion Enclosure″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2. Connect the new cable to the original connector
on the primary power supply.
– Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press
the Local Power switch to On (up) for about two seconds.
– Verify the repair, go to “MAP 3500: Verifying an SSA DASD Drawer
Repair” on page 279.
- If repair verification is successful, go to “MAP 3360: Ending a DASD
Service Action” on page 231.
- If repair verification fails, repair the problem from the verification.
v No, go to step 4 on page 215.
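Step 1 of this MAP compares the PWR indicator in one fan position across the drawers in the rack. A minimal Python sketch of that comparison follows. The function name `same_position_off_elsewhere`, the data layout, and the drawer names are assumptions made for illustration.

```python
def same_position_off_elsewhere(rack, failing_drawer, position):
    """Return True if another drawer has PWR off in the same fan position.

    rack maps a drawer name to a dict of {position: indicator_on}.
    A position that a drawer does not use is simply absent.
    """
    return any(
        indicators.get(position) is False
        for drawer, indicators in rack.items()
        if drawer != failing_drawer
    )


# Hypothetical rack. The Model 040 drawer has no entry for position 3.
rack = {
    "drawer 1": {1: True, 2: False, 3: True},
    "drawer 2": {1: True, 2: False},
    "drawer 3": {1: True, 2: True, 3: True},
}
print(same_position_off_elsewhere(rack, "drawer 1", 2))
print(same_position_off_elsewhere(rack, "drawer 1", 3))
```

A True result corresponds to the Yes branch of step 1, which points to the Power Complete Line Cord indicators and MAP 1320 rather than to the drawer itself.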
MAP 3352: Isolating SSA DASD Drawer Power Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you to isolate FRUs that are causing a power problem on a SSA
DASD drawer.
You are here because of one or more of the following:
v A fan-and-power-supply assembly has its fan-and-power CHK (check) indicator
on.
v Another MAP sent you here.
v Drawer model, SSA DASD Model 020 drawer
Isolation
1. Does the fan-and-power-supply assembly in either position 2 or position 3 have
its PWR (power) indicator on?
Note: The fan-and-power supply PWR (power) indicators may be hidden
behind the fan mounting latches.
v Yes, go to step 2.
v No, go to “MAP 3350: Isolating SSA DASD Drawer Power Problems” on
page 212.
Figure 114. SSA DASD Model 020 Fan-and-Power-Supply Assembly Indicators (S008029l)
2. Does any fan-and-power-supply assembly in this SSA DASD drawer have its
fan-and-power CHK (check) indicator on?
v Yes, perform the following repairs:
a. Select the rsssaM1PwrSup## or rsssaM2PwrSup## listed as a FRU for the
problem being repaired. Replace the fan-and-power-supply assembly, see
″Fan and Power Supply Assembly, 7133 Model 020″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
b. Check the Power Complete, Line Cord 0 and 1 indicators in Figure 95 on
page 190, on the front of the 2105 Model E10/E20.
– If both indicators are on, go to step 2c.
– If either indicator is off or blinking, press the Local Power switch in
Figure 95 on page 190, to On (up) for two seconds then release it. Go
to step 2c.
Note: Pressing the Local Power switch resets any tripped electronic
circuit breakers in the PPS that control power to the SSA DASD
drawer.
c. Return to the service terminal and verify the repair:
– If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources
to the customer and cancel the problem.
– If repair verification is not successful, repair the problem from the
verification.
v No, in the sequence shown, replace the following FRUs with new ones.
Ensure that after each FRU replacement, you return to the service terminal to
verify the repair.
– If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources to
the customer and cancel the problem.
– If repair verification is not successful, replace the next FRU on the list then
verify it.
a. Drawer power control panel, see ″Power Control Panel, 7133 Model 020″
in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Return to the service terminal and verify the repair:
– If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources
to the customer and cancel the problem.
– If repair verification is not successful, repair the problem from the
verification.
b. Left-hand power-distribution tray assembly, see ″Power Distribution Tray
Assembly, 7133 Model 020″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Return to the service terminal and verify the repair:
– If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources
to the customer and cancel the problem.
– If repair verification is not successful, repair the problem from the
verification.
c. Right-hand power-distribution tray assembly, see ″Power Distribution Tray
Assembly, 7133 Model 020″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Return to the service terminal and verify the repair:
– If repair verification is successful, the problem is closed. Return to the
service terminal and Continue Repair Process to return the resources
to the customer and cancel the problem.
– If repair verification is not successful, repair the problem from the
verification.
MAP 3353: Isolating SSA DASD Drawer Visual Power Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you to isolate FRUs that are causing a power problem on a SSA
DASD drawer.
You are here because of one or more of the following:
v A fan-and-power-supply assembly has its fan-and-power CHK (check) indicator
on.
v Another MAP sent you here.
v Drawer model, SSA DASD Model 020 drawer
Isolation
1. Does the fan-and-power-supply assembly in either position 2 or position 3 have
its PWR (power) indicator on?
Note: The fan-and-power supply PWR (power) indicators may be hidden
behind the fan mounting latches.
v Yes, go to step 2.
v No, go to “MAP 3351: Isolating SSA DASD Drawer Visual Power Problems”
on page 216.
Figure 115. SSA DASD Model 020 Fan-and-Power-Supply Assembly Indicators (S008029l)
2. Does any fan-and-power-supply assembly in this SSA DASD drawer have its
fan-and-power CHK (check) indicator on?
v Yes, perform the following repairs:
a. Replace the fan-and-power-supply assembly whose fan-and-power CHK
(check) indicator is on, see ″Fan and Power Supply Assembly, 7133
Model 020″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2.
b. Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch to On (up) for two seconds, then release it.
Note: Pressing the Local Power switch momentarily to On (up) clears
any power errors that were generated by the failure. It also
restores any power that was removed because of these failures. It
does not affect 2105 power.
Figure 116. 2105 Model Exx/Fxx Operator Panel Locations (S008810m)
(Front view. Panel labels: Unit Emergency; Local Power; Ready, Cluster 1 and
Cluster 2; Power Complete, Line Cord 1 and Line Cord 2; Messages, Cluster 1
and Cluster 2.)
c. Verify the repair, go to “MAP 3500: Verifying an SSA DASD Drawer
Repair” on page 279.
– If repair verification is successful, go to “MAP 3360: Ending a DASD
Service Action” on page 231.
– If repair verification fails, go to “MAP 3350: Isolating SSA DASD
Drawer Power Problems” on page 212.
v No, in the sequence shown, replace the following FRUs with new ones.
Ensure that after each FRU replacement, you go to “MAP 3500: Verifying an
SSA DASD Drawer Repair” on page 279 to verify the repair.
– If repair verification is successful, the repair is complete.
– If repair verification fails, replace the next FRU listed.
a. Drawer power control panel, see ″Power Control Panel, 7133 Model 020″
in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
b. Left-hand power-distribution tray assembly, see ″Power Distribution Tray
Assembly, 7133 Model 020″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
c. Right-hand power-distribution tray assembly, see ″Power Distribution Tray
Assembly, 7133 Model 020″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
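The replace-and-verify sequence used in this MAP can be pictured as a simple loop. The following Python sketch is illustrative only and is not part of the service terminal software. The function name and the replace and verify callbacks are hypothetical stand-ins for the manual FRU replacement and for the MAP 3500 verification.

```python
from typing import Callable, Iterable, Optional

# FRU order taken from the "No" branch of step 2 in MAP 3353.
MAP_3353_FRUS = [
    "Drawer power control panel",
    "Left-hand power-distribution tray assembly",
    "Right-hand power-distribution tray assembly",
]


def replace_until_verified(
    frus: Iterable[str],
    replace: Callable[[str], None],
    verify: Callable[[], bool],
) -> Optional[str]:
    """Replace FRUs in order; return the FRU whose replacement verified.

    Returns None if every FRU was replaced and verification never succeeded.
    """
    for fru in frus:
        replace(fru)      # hypothetical: the manual replacement procedure
        if verify():      # hypothetical: the repair verification result
            return fru
    return None
```

The loop stops at the first FRU whose replacement verifies. A result of None means that every listed FRU was replaced without a successful verification.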
MAP 3354: Isolating an SSA DASD Drawer Multiple DDM Redundant
Visual Power Fault
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Multiple DDMs in the SSA DASD drawer are detecting a loss of redundant power or
cooling.
This MAP helps you to isolate FRUs that are causing a power problem on a SSA
DASD drawer.
v Drawer model, SSA DASD Model 020 drawer
Isolation
1. Check for the following conditions:
a. Observe the Power Card indicators 1, 2, 3, and 4. Note which
indicators are on, and which indicators are off.
Note: Some indicators may be hidden from view by the internal cabling. If
required, move the cables using a non-conductive tool such as a
wooden pencil.
b. Go to step 2.
Figure 117. SSA DASD drawer Power Card Indicators (s007227l)
2. Perform the following actions:
a. Find the row whose pattern of Power Card indicators matches the pattern of
the Power Card indicators of the SSA DASD drawer in Table 18 on
page 225.
b. In the sequence given in that row, replace the FRUs with new ones. Ensure
that for each FRU replacement, you go to “MAP 3500: Verifying an SSA
DASD Drawer Repair” on page 279 to verify the repair.
v If repair verification is successful, go to “MAP 3360: Ending a DASD
Service Action” on page 231.
v If repair verification fails, replace the next FRU for this indicator pattern.
Table 18. Power Card Indicator (Ind.) Patterns

Power Card  Power Card  Power Card  Power Card
Ind. 1      Ind. 2      Ind. 3      Ind. 4      FRUs
----------  ----------  ----------  ----------  ----------------------------------------------------
On          Off         On          On          Replace the following FRUs:
                                                a. Fan-and-power-supply assembly in position 1, see
                                                   ″Fan and Power Supply Assembly, 7133 Model 020″ in
                                                   chapter 4 of the Enterprise Storage Server Service
                                                   Guide, Volume 2.
                                                b. Right-hand back-power card, see ″Back Power Cards,
                                                   7133 Model 020″ in chapter 4 of the Enterprise
                                                   Storage Server Service Guide, Volume 2.
                                                c. Right-hand power-distribution tray assembly, see
                                                   ″Power Distribution Tray Assembly, 7133 Model 020″
                                                   in chapter 4 of the Enterprise Storage Server
                                                   Service Guide, Volume 2.

On          On or Off   On          Off         Right-hand power-distribution tray assembly, see
                                                ″Power Distribution Tray Assembly, 7133 Model 020″ in
                                                chapter 4 of the Enterprise Storage Server Service
                                                Guide, Volume 2.

On          On          Off         On          a. Fan-and-power-supply assembly in position 3, see
                                                   ″Fan and Power Supply Assembly, 7133 Model 020″ in
                                                   chapter 4 of the Enterprise Storage Server Service
                                                   Guide, Volume 2.
                                                b. Left-hand power-distribution tray assembly, see
                                                   ″Power Distribution Tray Assembly, 7133 Model 020″
                                                   in chapter 4 of the Enterprise Storage Server
                                                   Service Guide, Volume 2.

Off         On          On          On          a. Left-hand back-power card, see ″Back Power Cards,
                                                   7133 Model 020″ in chapter 4 of the Enterprise
                                                   Storage Server Service Guide, Volume 2.
                                                b. Left-hand power-distribution tray assembly, see
                                                   ″Power Distribution Tray Assembly, 7133 Model 020″
                                                   in chapter 4 of the Enterprise Storage Server
                                                   Service Guide, Volume 2.

Off         On          On          Off         Fan-and-power-supply assembly in position 2 (″Fan and
                                                Power Supply Assembly, 7133 Model 020″ in chapter 4
                                                of the Enterprise Storage Server Service Guide,
                                                Volume 2).

On          On          On          On          The front backplane assembly if the reporting disk
                                                drive module is at the front of the SSA DASD drawer,
                                                see “MAP 3400: Replacing an SSA DASD Drawer Backplane
                                                or Frame” on page 263.
                                                The back backplane assembly if the reporting disk
                                                drive module is at the back of the SSA DASD drawer,
                                                see “MAP 3400: Replacing an SSA DASD Drawer Backplane
                                                or Frame” on page 263.

Note: Any other pattern of indicators means multiple problems. In such instances, solve those problems one at a
time.
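Table 18 can be read as a lookup from an observed indicator pattern to an ordered FRU list. The Python sketch below shows one way to encode it, assuming the indicator values are listed in the order Ind. 1 through Ind. 4. The names PATTERNS and frus_for are hypothetical, and the FRU strings are abbreviated from the table.

```python
from typing import List, Optional, Sequence

ON, OFF = "On", "Off"
ANY = None  # stands for the "On or Off" entry in the table

# (Ind. 1, Ind. 2, Ind. 3, Ind. 4) -> FRUs in replacement order.
PATTERNS = [
    ((ON, OFF, ON, ON), ["Fan-and-power-supply assembly, position 1",
                         "Right-hand back-power card",
                         "Right-hand power-distribution tray assembly"]),
    ((ON, ANY, ON, OFF), ["Right-hand power-distribution tray assembly"]),
    ((ON, ON, OFF, ON), ["Fan-and-power-supply assembly, position 3",
                         "Left-hand power-distribution tray assembly"]),
    ((OFF, ON, ON, ON), ["Left-hand back-power card",
                         "Left-hand power-distribution tray assembly"]),
    ((OFF, ON, ON, OFF), ["Fan-and-power-supply assembly, position 2"]),
    ((ON, ON, ON, ON), ["Front or back backplane assembly (MAP 3400), "
                        "matching the location of the reporting DDM"]),
]


def frus_for(observed: Sequence[str]) -> Optional[List[str]]:
    """Return the FRU list for an observed pattern, or None if no row matches.

    None corresponds to the table note: any other pattern means multiple
    problems, which are solved one at a time.
    """
    for pattern, frus in PATTERNS:
        if all(p is ANY or p == o for p, o in zip(pattern, observed)):
            return frus
    return None
```

For example, the pattern On, Off, On, Off matches the second row because indicator 2 may be either on or off in that row.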
MAP 3355: Isolating an SSA DASD Drawer Multiple DDM Redundant
Power Fault
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Multiple DDMs in the SSA DASD drawer are detecting a loss of redundant power or
cooling.
This MAP helps you to isolate FRUs that are causing a power problem on a SSA
DASD drawer.
v Drawer model, SSA DASD Model 020 drawer
Isolation
1. Check for the following conditions:
a. Observe the Power Card indicators 1, 2, 3, and 4. Note which
indicators are on, and which indicators are off.
Note: Some indicators may be hidden from view by the internal cabling. If
required, move the cables using a non-conductive tool such as a
wooden pencil.
b. Go to step 2.
Figure 118. SSA DASD drawer Power Card Indicators (s007227l)
2. Perform the following actions:
a. Find the row whose pattern of Power Card indicators matches the pattern of
the Power Card indicators of the SSA DASD drawer in Table 19 on
page 227.
b. In the sequence given in that row, select the FRU to be replaced from the
problem display. Replace the FRUs with new ones. Ensure that after each
FRU replacement, you indicate that the FRU replacement is complete and
then verify it.
v If repair verification is successful, the problem is resolved.
v If repair verification fails, replace the next FRU for this indicator pattern.
Table 19. Power Card Indicator (Ind.) Patterns

Power Card  Power Card  Power Card  Power Card
Ind. 1      Ind. 2      Ind. 3      Ind. 4      FRUs
----------  ----------  ----------  ----------  ----------------------------------------------------
On          Off         On          On          Replace the following FRUs:
                                                a. Fan-and-power-supply assembly in position 1, see
                                                   ″Fan and Power Supply Assembly, 7133 Model 020″ in
                                                   chapter 4 of the Enterprise Storage Server Service
                                                   Guide, Volume 2.
                                                b. Right-hand back-power card, see ″Back Power Cards,
                                                   7133 Model 020″ in chapter 4 of the Enterprise
                                                   Storage Server Service Guide, Volume 2.
                                                c. Right-hand power-distribution tray assembly, see
                                                   ″Power Distribution Tray Assembly, 7133 Model 020″
                                                   in chapter 4 of the Enterprise Storage Server
                                                   Service Guide, Volume 2.

On          On or Off   On          Off         Right-hand power-distribution tray assembly, see
                                                ″Power Distribution Tray Assembly, 7133 Model 020″ in
                                                chapter 4 of the Enterprise Storage Server Service
                                                Guide, Volume 2.

On          On          Off         On          a. Fan-and-power-supply assembly in position 3, see
                                                   ″Fan and Power Supply Assembly, 7133 Model 020″ in
                                                   chapter 4 of the Enterprise Storage Server Service
                                                   Guide, Volume 2.
                                                b. Left-hand power-distribution tray assembly, see
                                                   ″Power Distribution Tray Assembly, 7133 Model 020″
                                                   in chapter 4 of the Enterprise Storage Server
                                                   Service Guide, Volume 2.

Off         On          On          On          a. Left-hand back-power card, see ″Back Power Cards,
                                                   7133 Model 020″ in chapter 4 of the Enterprise
                                                   Storage Server Service Guide, Volume 2.
                                                b. Left-hand power-distribution tray assembly, see
                                                   ″Power Distribution Tray Assembly, 7133 Model 020″
                                                   in chapter 4 of the Enterprise Storage Server
                                                   Service Guide, Volume 2.

Off         On          On          Off         Fan-and-power-supply assembly in position 2 (″Fan and
                                                Power Supply Assembly, 7133 Model 020″ in chapter 4
                                                of the Enterprise Storage Server Service Guide,
                                                Volume 2).

On          On          On          On          The front backplane assembly if the reporting disk
                                                drive module is at the front of the SSA DASD drawer,
                                                see “MAP 3400: Replacing an SSA DASD Drawer Backplane
                                                or Frame” on page 263.
                                                The back backplane assembly if the reporting disk
                                                drive module is at the back of the SSA DASD drawer,
                                                see “MAP 3400: Replacing an SSA DASD Drawer Backplane
                                                or Frame” on page 263.

Note: Any other pattern of indicators means multiple problems. In such instances, solve those problems one at a
time.
MAP 3356: Isolating SSA DASD Drawer Power On Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you to isolate FRUs that are causing a power problem on a SSA
DASD drawer.
v Drawer model, SSA DASD Model 020 drawer
Isolation
1. Perform the following actions:
a. Remove all power from the SSA DASD drawer, see ″Drawer Power, 7133
Model 020″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2.
b. Remove all the fan-and-power-supply assemblies from the SSA DASD
drawer, see ″Fan and Power Supply Assembly, 7133 Model 020″ in chapter
4 of the Enterprise Storage Server Service Guide, Volume 2.
c. Remove all the disk drive modules from the SSA DASD drawer, see ″SSA
Disk Drive Model, 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2. Record which slot each DDM was
in so they can be returned to the same location.
d. Go to step 2.
2. Perform the following actions:
a. Reinstall a fan-and-power-supply assembly into position 3; that is, the
right-hand slot (viewed from the back of the SSA DASD drawer).
b. Connect the drawer power cable to the fan-and-power-supply assembly,
then power on the SSA DASD drawer:
c. Go to the front of the 2105 and locate the failing SSA DASD drawer.
d. Use Figure 119 in the following steps to locate the switch and indicators on
the SSA DASD drawer power control panel:
Figure 119. SSA DASD Model 020 Power Control Panel Locations (S008020m)
(Callouts: Power Switch (On/Off); Power Indicator (green); Check Indicator (amber).)
Press and release the drawer power switch, on the drawer power control
panel.
v If the SSA DASD drawer power indicator is on, go to step 3 on page 229.
v If the SSA DASD drawer power indicator is off, go to step 2e.
Note: Leave the SSA DASD drawer powered on for the remainder of this
MAP.
e. Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch, in Figure 120, to On (up) for about two seconds.
Note: Pressing the Local Power switch momentarily to On (up) clears any
power errors that were generated by the failure. It also restores any
power that was removed because of these failures. It does not affect
2105 power.
Figure 120. 2105 Model Exx/Fxx Operator Panel Locations (S008810m)
(Front view. Panel labels: Unit Emergency; Local Power; Ready, Cluster 1 and
Cluster 2; Power Complete, Line Cord 1 and Line Cord 2; Messages, Cluster 1
and Cluster 2.)
f. Observe the SSA DASD drawer power indicator on the power control panel
on the front of the failing drawer.
Is the power indicator on?
v Yes, go to step 3.
v No, replace the following FRUs one at a time and do steps 2e and 2f until
the drawer power indicator is on. When the problem is corrected (the
indicator is on), go to step 3.
– Drawer fan-and-power supply assembly (position 3), see ″Fan and
Power Supply Assembly, 7133 Model 020″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
– Front primary power supply, see ″2105 Model 100 Rack Bulk Power
Supply Removals and Replacements″ in chapter 4, of the 2105 Model
100 Attachment to ESS Service Guide book.
– Drawer power control panel, see ″Power Control Panel, 7133 Model
020″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2.
If the problem is not resolved, seek technical support.
3. Observe the failing SSA DASD drawer.
Does the SSA DASD drawer emit smoke or a smell of burning?
v Yes, perform the following repair:
a. Replace the fan-and-power-supply assembly.
b. Go to step 4.
v No, go to step 4.
4. Perform the following actions:
a. Reinstall a fan-and-power-supply assembly into position 2.
b. Connect the drawer power cable to the fan-and-power-supply assembly that is
in position 2.
c. Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch, in Figure 120 on page 229, to On (up) for about two
seconds.
Does the SSA DASD drawer emit smoke or a smell of burning?
v Yes, perform the following actions:
a. Replace the fan-and-power-supply assembly that is in position 2.
b. Connect the drawer power cable to the new fan-and-power-supply
assembly.
c. Go to step 5.
v No, go to step 5.
5. Perform the following actions:
a. Reinstall the fan-and-power-supply assembly into position 1.
b. If reinstalling a fan-and-power-supply assembly, connect the drawer
power cable to it.
c. Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch, in Figure 120 on page 229, to On (up) for about two
seconds.
Does the SSA DASD drawer emit smoke or a smell of burning?
v Yes, perform the following actions:
a. Replace the fan-and-power-supply assembly.
b. Go to step 6.
v No, go to step 6.
6. Reinstall a disk drive module into the slot from which it was originally removed,
see ″SSA Disk Drive Model, 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Does the SSA DASD drawer emit smoke or a smell of burning?
v Yes, perform the following repair:
a. Replace the disk drive module.
b. Go to step 7.
v No, go to step 7.
7. Reinstall the next disk drive module into the slot from which it was originally
removed.
Does the SSA DASD drawer emit smoke or a smell of burning?
v Yes, perform the following actions:
a. Replace the disk drive module.
b. Go to step 8 on page 231.
v No, go to step 8 on page 231.
8. Have you reinstalled all the disk drive modules?
v Yes, go to step 9.
v No, return to step 7 on page 230.
9. Have you solved the problem?
v Yes, go to the next step in the procedure that sent you to this procedure
and continue.
v No, remove all power from the SSA DASD drawer, and call for assistance.
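Steps 2 through 8 of this MAP follow a one-at-a-time isolation pattern: each part is reinstalled, the drawer is observed, and a part is replaced only if the fault appears after it goes back in. The Python sketch below illustrates that pattern under stated assumptions. The function names, the slot labels, and the fault_seen_after callback (standing in for the smoke or burning-smell check) are all hypothetical.

```python
from typing import Callable, Iterable, List


def reinstall_order(ddm_slots: Iterable[str]) -> List[str]:
    """Order used in MAP 3356: assemblies in positions 3, 2, 1, then each DDM."""
    assemblies = [
        f"fan-and-power-supply assembly, position {p}" for p in (3, 2, 1)
    ]
    ddms = [f"disk drive module, slot {slot}" for slot in ddm_slots]
    return assemblies + ddms


def isolate_by_reinstalling(
    parts: Iterable[str],
    fault_seen_after: Callable[[str], bool],
) -> List[str]:
    """Reinstall parts one at a time; collect each part that must be replaced."""
    to_replace = []
    for part in parts:
        # Reinstall `part`, restore power, then observe the drawer.
        if fault_seen_after(part):
            to_replace.append(part)  # replace this part before continuing
    return to_replace
```

Recording the original slot of each disk drive module in step 1c is what makes the reinstall order reproducible.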
MAP 3360: Ending a DASD Service Action
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Before some DASD visual symptom service actions can be completed, this
procedure must be done to ensure the status of the 2105 subsystem:
Display any related problems shown as needing repair and change their status to
closed.
Procedure
Use the description above and these procedures to complete the service action.
1. Display problems needing repair.
Press F3 on the service terminal until the Main Service Menu is displayed, then
select:
Repair Menu
Show / Repair Problems Needing Repair
Select a Problem to View or Repair
v Record the Problem ID of all problems with a Failing Resource of
rsrpc.....
Note: To find the Failing Resource, select the problem and display
the Detail Problem Record. Scroll down the screen until
Failing Resource... is displayed.
v Press F3 on the service terminal to display the next problem.
Record its Problem ID if its Failing Resource is rsrpc.....
Repeat this step until all related problem IDs have been
recorded.
2. Change the state of the open problem with a Failing Resource of rsrpc.... to
Closed.
Press F3 on the service terminal until the Main Service Menu is displayed, then
select:
Utility Menu
Problem Log Menu
Change A Problem State
Select a problem whose ID was recorded in the last step.
Press F4, select Closed, then press Enter.
v If this was the only problem with a Failing Resource of rsrpc....,
the repair is complete.
v If you recorded other problems with a Failing Resource of
rsrpc...., continue with the next step.
3. Close any other open problems recorded earlier.
Press F3 on the service terminal twice to display the Problem Log Menu, then
select:
Change A Problem State
Select a problem whose ID was recorded in step 1 on page 231.
Press F4, select Closed, then press Enter.
Repeat this step until all open problems recorded earlier are closed. When
these problems are all closed the repair is complete.
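The procedure above amounts to filtering the problem log by Failing Resource and changing the state of each match to Closed. The Python sketch below models that logic only. The Problem record, its field names, and the "Open" state value are hypothetical, and the service terminal is operated through its menus, not through code. The filter matches on the rsrpc prefix because the remainder of the resource name is not given here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Problem:
    problem_id: str
    failing_resource: str
    state: str = "Open"  # hypothetical state name


def close_related(problems: Iterable[Problem], prefix: str = "rsrpc") -> List[str]:
    """Close every open problem whose Failing Resource starts with `prefix`.

    Returns the IDs of the problems that were closed, in log order.
    """
    closed = []
    for problem in problems:
        if problem.state != "Closed" and problem.failing_resource.startswith(prefix):
            problem.state = "Closed"
            closed.append(problem.problem_id)
    return closed
```

Problems with any other Failing Resource are left untouched, which matches the instruction to record only the related problem IDs in step 1.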
MAP 3375: Isolating a Storage Cage Fan/Power Sense Card Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Only one DDM bay has sensed a storage cage fan/power sense card failure. The
other installed DDM bays, that monitor the same card, did not sense the failure. If
the storage cage fan/power sense card was failing, all of the DDM bays should
have reported the failure. This indicates that the storage cage fan/power sense card
is OK. The fault reporting path, through the DDM bay that reported the failure, is not
working correctly.
Isolation
1. Determine which DDM bay reported the storage cage fan/power sense card
failure and replace its DDM bay controller card. See ″Controller Card, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 2.
2. Replace the power planar to 8-pack planar cable to the DDM bay that reported
the failure. See ″Cables, 2105 Model Exx/Fxx and Expansion Enclosure″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Verify the repair. Return to the service terminal and select the DDM bay
controller card for replacement. Proceed through the repair but do not replace
the DDM bay controller card; this simulates a repair and runs verification.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 3.
3. Replace the 8-pack frame assembly (backplane) in the DDM bay that reported
the failure. See ″Frame Assembly, DDM Bay″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Verify the repair. Return to the service terminal and select the DDM bay
controller card for replacement. Proceed through the repair but do not replace
the DDM bay controller card; this simulates a repair and runs verification.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 4.
4. Replace the storage cage power planar. See ″Storage Cage Power Planar,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
After the replacement, verify the repair.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, call your next level of support.
MAP 3378: Isolating a Storage Cage Fan/Power Sense Card Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Multiple DDM bays have sensed a storage cage fan/power sense card failure. The
storage cage fan/power sense card is the most likely FRU. There is a small chance
that the storage cage power planar is failing.
Isolation
1. Replace the storage cage fan/power sense card. See ″Storage Cage Fan/Power
Sense Card, 2105 Model Exx/Fxx″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2. After the replacement, verify the repair.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 2.
2. Replace the storage cage power planar. See ″Storage Cage Power Planar,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
After the replacement, verify the repair.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, call your next level of support.
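MAP 3375 and MAP 3378 differ only in how many DDM bays reported the sense card failure, and that count determines the FRU order. The Python sketch below captures that reasoning. The function name and the bay labels are hypothetical, and the single-bay case assumes, as the MAP 3375 description does, that other installed DDM bays monitor the same card.

```python
from typing import Iterable, List


def sense_card_suspects(reporting_bays: Iterable[str]) -> List[str]:
    """Return FRUs in replacement order, based on how many DDM bays reported."""
    count = len(set(reporting_bays))
    if count == 0:
        return []
    if count == 1:
        # MAP 3375: the card is probably OK; the reporting path is suspect.
        return [
            "DDM bay controller card",
            "Power planar to 8-pack planar cable",
            "8-pack frame assembly (backplane)",
            "Storage cage power planar",
        ]
    # MAP 3378: several bays agree, so the sense card itself is most likely.
    return [
        "Storage cage fan/power sense card",
        "Storage cage power planar",
    ]
```

In both cases the storage cage power planar is the last FRU tried before calling the next level of support.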
MAP 3379: Analyzing a Storage Cage Fan/Power Sense Card Check
Summary Indicator On
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A storage cage fan/power sense card Check Summary indicator is on. This indicator
is on when the fan/power sense card detects a problem with one of the storage
cage fans or power supplies that it monitors.
Isolation
1. Use the service terminal to check for open problems:
From the service terminal Main Service Menu, select:
Repair Menu
Show/Repair Problems Needing Repair Menu
v If there are any open storage cage fan or power supply faults, select and
repair them.
v If there are not any open storage cage fan or power supply faults, go to the
next step.
2. Run the machine test on All SSA Loops.
From the service terminal Main Service Menu, select:
Machine Test Menu
SSA Loops Menu
Select SSA Loops by SSA Device Card
All Loops
Run the SSA loop test on all SSA loops attached to an SSA
device card
– If Machine Test found any problems, repair them.
– If Machine Test did not find any problems, replace the storage
cage fan/power sense card, see ″Storage Cage Fan/Power
Sense Card, 2105 Model Exx/Fxx″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Is the problem resolved?
- Yes, end call.
- No, call your next level of support.
MAP 3380: Isolating 7133 Model 040 SSA DASD Drawer Power
Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you to isolate FRUs that are causing a power problem on a SSA
DASD drawer.
You are here because of one or more of the following:
v A power supply assembly has its power supply CHK (check) indicator on.
v Another MAP sent you here.
v Drawer model, SSA DASD Model 040
Isolation
1. Observe PWR (power) indicators on the power supply assemblies in the failing
drawer.
a. Determine if the power supply with the failing PWR (power) indicator (off) is
in drawer power supply position 1 or 2.
b. Observe the PWR (power) indicators on the fan-and-power-supply or power
supply assemblies in the same fan position in the other drawers in the same
rack.
v SSA DASD Model 020 drawers, see PWR (power) indicators on
fan-and-power-supply assemblies 1, 2, and 3.
v SSA DASD Model 040 drawers, see PWR (power) indicators on power
supply assemblies 1 and 2 (position 3 is unused).
Figure 121. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations (S008019m)
Is the PWR (power) indicator off on a power supply or fan-and-power-supply
assembly in the same position in another drawer in the same rack?
v Yes, observe the state of the Power Complete Line Cord indicators. See
Figure 122 on page 236. Use the state of these indicators with “MAP 1320:
Isolating Problems Using Visual Symptoms” on page 58.
v No, go to step 2.
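Step 1 separates a rack-level power problem from a drawer-level one: if the same position is also dark in another drawer in the rack, the fault is probably upstream of the drawer. The Python sketch below expresses that comparison. The function name, the drawer labels, and the data layout are hypothetical.

```python
from typing import Dict


def step1_next_action(
    failing_position: int,
    other_drawers: Dict[str, Dict[int, bool]],
) -> str:
    """Decide where step 1 of MAP 3380 sends you.

    other_drawers maps a drawer label to {position: True if PWR is on}.
    """
    for indicators in other_drawers.values():
        # A position that a drawer does not use (for example, position 3 in
        # a Model 040 drawer) is absent from its dict and is skipped.
        if indicators.get(failing_position) is False:
            return "MAP 1320"  # use the Power Complete Line Cord indicators
    return "step 2"
```

A return value of "step 2" means that no other drawer shows the same position off, so isolation continues within the failing drawer.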
2. Go to the operator panel on the front of the 2105 Model E10/E20. Press the
Local Power switch to On (up) for about two seconds.
Note: Pressing the Local Power switch momentarily to On (up) clears any
power errors that were generated by the failure. It also restores any
power that was removed because of these failures. It does not affect
2105 power.
Figure 122. 2105 Model E10/E20 Operator Panel Locations (S008810m)
(Front view. Panel labels: Unit Emergency; Local Power; Ready, Cluster 1 and
Cluster 2; Power Complete, Line Cord 1 and Line Cord 2; Messages, Cluster 1
and Cluster 2.)
At the rear of the 2105, is the power supply assembly PWR (power) indicator
still off?
v Yes, go to step 3.
v No, the problem may be resolved. Verify the repair. Select any of the FRUs
shown on the service terminal. Proceed through the repair but do not replace
any FRU or disconnect any cables. This will simulate a repair and run
verification.
3. Is the power supply with the PWR (power) indicator that is off in drawer fan
position 2 (center)?
v Yes, select the power supply from the problem FRU list on the service
terminal. Follow the service terminal instructions to replace the SSA DASD
drawer power supply. See ″Power Supply Assembly, 7133 Model 040″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Before you verify the repair, do the following:
– Go to the operator panel on the front of the 2105 Model E10/E20. Press
the Local Power switch in Figure 122, to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, go to the service terminal.
Indicate that replacement is complete and verify the repair.
- If verification is not successful, call your next level of support.
- If verification is successful, the repair is complete. Return to the service
terminal and Continue Repair Process to return the resources to the
customer and cancel the problem.
– If the failing PWR (power) indicator is still off, replace the drawer power
cable connected to the failing power supply. See ″Cables, 2105 Model Exx/Fxx and
Expansion Enclosure″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2. Connect the new cable to the original connector
on the primary power supply.
– Go to the operator panel on the front of the 2105 Model E10/E20. Press
the Local Power switch to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, go to the service terminal.
Indicate that replacement is complete and verify the repair.
- If verification is not successful, call your next level of support.
- If verification is successful, the repair is complete. Return to the service
terminal and Continue Repair Process to return the resources to the
customer and cancel the problem.
– If the failing PWR (power) indicator is still off, call your next level of
support.
– Continue with the next step.
v No, go to step 5.
4. Proceed through the verification process.
v If verification is not successful, follow the instructions with the problem
produced by the verification failure.
v If verification is successful, the repair is complete. Return to the service
terminal and Continue Repair Process to return the resources to the
customer and cancel the problem.
5. Determine where the drawer power cable, from the failing SSA DASD drawer
power supply, connects to a primary power supply. Also determine where the
other drawer power cable from this drawer connects to the same primary power
supply. See ″Bulk Power Supply Connection Physical Location Codes″ in
chapter 7 of the 2105 Model 100 Attachment to ESS Service Guide book, to
determine where the power cables should be plugged. Refer to Figure 123 for
primary power supply connector names and locations.
Figure 123. 2105 Primary Power Supply Connectors (S007380l)
a. Locate where the drawer power cable from the failing power supply
assembly connects to the primary power supply. Disconnect this cable from
the primary power supply and reconnect it to an unused connector on the
primary power supply.
b. Go to the operator panel on the front of the 2105 Model E10/E20. Press the
Local Power switch to On (up) for about two seconds.
At the rear of the 2105, is the SSA DASD drawer PWR (power) indicator still
off?
v Yes, reconnect the swapped drawer power cable to its original connector
on the PPS. Replace the SSA DASD drawer power supply assembly, go
to ″Power Supply Assembly, 7133 Model 040″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2. After the repair:
– Go to the operator panel on the front of the 2105 Model E10/E20.
Press the Local Power switch to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, return to the service
terminal. Select any FRU for repair. Proceed through the repair but do
not replace any FRU or disconnect any cables. This will simulate
repair and run verification.
- If repair verification is successful, the problem is resolved. Return to
the service terminal and Continue Repair Process to return the
resources to the customer and cancel the problem.
- If repair verification fails, repair the problem from the verification.
– If the failing PWR (power) indicator is still off, replace the drawer
power cable connected to the failing power supply assembly, see
″Cables, 2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2. Connect
the new cable to the original connector on the primary power supply.
– Go to the operator panel on the front of the 2105 Model Exx/Fxx.
Press the Local Power switch to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, return to the service
terminal. Select any FRU for repair. Proceed through the repair but do
not replace any FRU or disconnect any cables. This will simulate
repair and run verification.
- If repair verification is successful, the problem is resolved. Return to
the service terminal and Continue Repair Process to return the
resources to the customer and cancel the problem.
- If repair verification fails, repair the problem from the verification.
v No, the internal electronic circuit breaker for the original connector on the
power supply has failed:
1) Replace the primary power supply on which you moved the drawer power
cable to an unused connector. See ″Primary Power Supply, 2105
Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
2) Reconnect the swapped drawer power cable to its original connector
on the new PPS.
3) Go to the operator panel on the front of the 2105 Model Exx/Fxx.
Press the Local Power switch to On (up) for about two seconds.
4) If the failing PWR (power) indicator is on, return to the service
terminal. Select any FRU for repair. Proceed through the repair but do
not replace any FRU or disconnect any cables. This will simulate
repair and run verification.
– If repair verification is successful, the problem is resolved. Return
to the service terminal and Continue Repair Process to return the
resources to the customer and cancel the problem.
– If repair verification fails, repair the problem from the verification.
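The connector-swap test in step 5 comes down to a two-way decision. The following Python sketch records that reasoning for illustration only. It is not part of the service terminal code, and the function name and return strings are hypothetical.

```python
def swap_test_suspect(pwr_indicator_on_after_swap: bool) -> str:
    """Name the suspect FRU after the drawer power cable has been moved
    to an unused primary power supply (PPS) connector and the Local
    Power switch has been pressed to On (up) for about two seconds."""
    if pwr_indicator_on_after_swap:
        # Power returned on a different connector, so the MAP concludes
        # that the internal electronic circuit breaker for the original
        # connector has failed. The PPS is replaced.
        return "primary power supply"
    # The indicator stayed off on the other connector. The MAP restores
    # the cable, replaces the drawer power supply assembly first, and
    # replaces the drawer power cable if the indicator is still off.
    return "SSA DASD drawer power supply assembly, then drawer power cable"
```

In either case the MAP finishes with a simulated repair on the service terminal so that verification runs.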
MAP 3381: Isolating a Storage Cage Fan/Power Sense Card Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Only one DDM bay sensed a storage cage fan/power sense card failure. No other
DDM bays are installed in the half-rack being sensed by the storage cage
fan/power sense card. The most likely FRUs are the storage cage fan/power sense
card or the DDM bay controller card in the reporting DDM bay. The problem could
be a failure in the error reporting path.
Isolation
1. Replace the storage cage fan/power sense card. See ″Storage Cage Fan/Power
Sense Card, 2105 Model Exx/Fxx″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 2.
2. Determine which DDM bay reported the storage cage fan/power sense card
failure and replace its DDM bay controller card. See ″Controller Card, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 3.
3. Replace the power planar to 8-pack planar cable to the DDM bay that reported
the failure. See ″Cables, 2105 Model Exx/Fxx and Expansion Enclosure″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Verify the repair. Return to the service terminal and select the DDM bay
controller card for replacement. Proceed through the repair but do not replace
the DDM bay controller card. This will simulate a repair and run verification.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 4.
4. Replace the 8-pack frame assembly (backplane) in the DDM bay that reported
the failure. See ″Frame Assembly, DDM Bay″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Verify the repair. Return to the service terminal and select the DDM bay
controller card for replacement. Proceed through the repair but do not replace
the DDM bay controller card. This will simulate a repair and run verification.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 5.
5. Replace the storage cage power planar. See ″Storage Cage Power Planar,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
After the replacement, verify the repair.
Is the storage cage fan/power sense card problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, call your next level of support.
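MAP 3381 is a fixed replacement sequence: each step replaces one FRU and moves on only if the problem remains. As a minimal sketch of that sequence (the list and function names are hypothetical and are not part of any service terminal interface):

```python
from typing import Optional

# Replacement order taken from steps 1 through 5 of MAP 3381.
FRU_ORDER = [
    "storage cage fan/power sense card",
    "DDM bay controller card",
    "power planar to 8-pack planar cable",
    "8-pack frame assembly (backplane)",
    "storage cage power planar",
]


def next_fru(replaced_count: int) -> Optional[str]:
    """Return the next FRU to replace, given how many have already been
    replaced without resolving the problem. Returns None when the list
    is exhausted, which is where the MAP says to call the next level of
    support."""
    if replaced_count < 0:
        raise ValueError("replaced_count must not be negative")
    if replaced_count < len(FRU_ORDER):
        return FRU_ORDER[replaced_count]
    return None
```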
MAP 3384: Isolating a Storage Cage Fan Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A storage cage cooling fan failure has been reported. It could be one of the storage
cage fans in the top of the 2105, or one of the two fans in the front of the 2105
Model E10/E20 between the DDM bays. The most likely FRU is the failing fan. The
fan fault reporting circuits could also be reporting a false fan error.
Isolation
1. Determine which storage cage fan reported the storage cage fan failure. Locate
the failing fan in the 2105, see chapter 7, volume 3 of this book for:
v ″2105 Model Exx/Fxx and Expansion Enclosure Storage Cage Fan (Top)
Location Codes″ in chapter 7 of the Enterprise Storage Server Service Guide,
Volume 3
v ″2105 Model Exx/Fxx and Expansion Enclosure Storage Cage Fan (Center)
Location Codes″ in chapter 7 of the Enterprise Storage Server Service Guide,
Volume 3
Is there a real fan, not a dummy fan, installed in the failing fan's location?
v Yes, go to step 4 on page 241.
v No, go to step 2.
2. Verify that the fan jumper for the failing fan is installed on the storage cage
power planar.
Is the storage cage power planar fan jumper installed correctly for the failing
fan?
v Yes, go to step 5 on page 241.
v No, install the storage cage power planar fan jumper. Continue with the next
step.
3. Verify the repair. Return to the service terminal and select the storage cage fan
for replacement. Proceed through the repair but do not replace the storage cage
fan. This will simulate a repair and run verification.
Is the storage cage fan problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 5 on page 241.
Figure 124. Storage Cage Power Planar Fan Jumper Locations (S008352p)
[Front view of the storage bay power planar. Connector labels: J11 through J18, J21 through J28, and J31 through J44.]
4. Replace the failing storage cage fan. See ″Storage Cage Fan (Center), 2105
Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
Is the storage cage fan problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 5.
5. Replace the storage cage fan/power sense card. See ″Storage Cage Fan/Power
Sense Card, 2105 Model Exx/Fxx″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Is the storage cage fan problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, continue with the next step.
6. Replace the DDM bay controller card. See ″Controller Card Removal and
Replacement, DDM Bay″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Is the storage cage fan problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, continue with the next step.
7. Disconnect the cable to the failing fan at the fan and the storage cage power
planar. Connect a storage cage fan FRU cable to the fan and the storage cage
power planar.
Is the storage cage fan problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, go to step 8.
8. Replace the storage cage power planar. See ″Storage Cage Power Planar,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
After the replacement, verify the repair.
Is the storage cage fan problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, call your next level of support.
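Steps 1 through 3 of MAP 3384 decide where the isolation starts, depending on whether a real fan or a dummy fan occupies the reported location. The Python sketch below restates that entry logic for illustration only. The function name and the returned text are hypothetical.

```python
def fan_map_entry(real_fan_installed: bool, jumper_installed: bool) -> str:
    """Return the first action for a reported storage cage fan failure.

    real_fan_installed: True if a real fan, not a dummy fan, is in the
        failing fan location (step 1).
    jumper_installed: True if the fan jumper for that location is
        installed on the storage cage power planar (step 2). Only
        checked when a dummy fan is installed.
    """
    if real_fan_installed:
        return "step 4: replace the failing storage cage fan"
    if jumper_installed:
        return "step 5: replace the storage cage fan/power sense card"
    return "install the fan jumper, then step 3: verify the repair"
```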
MAP 3387: Isolating a Storage Cage Power Supply Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A storage cage power supply failure has been reported. The failure could be the
storage cage power supply, its dc input voltage, or its error reporting path.
Isolation
1. Determine which storage cage power supply is failing. Locate the failing power
supply, see ″Rack, 2105 Model Exx/Fxx and Expansion Enclosure Storage
Cage Power Supply Location Codes″ in chapter 7 of the Enterprise Storage
Server Service Guide, Volume 3.
Is there a real power supply, not a dummy power supply, installed in the failing
power supply location?
v Yes, go to step 2.
v No, go to step 14 on page 246.
2. Observe the power switch on the failing storage cage power supply.
Is the storage cage power supply's power switch set to On (up)?
v Yes, go to step 3 on page 243.
v No, set the switch to On (up). Use the service terminal to verify the repair of
the storage cage power supply.
Figure 125. Storage Cage Power Supply Locations (S008495m)
[Storage cage power supply. Callouts: input power indicators, power switch, CHK/PWR Good indicator.]
3. Observe the two green input power indicators 1 on the failing storage cage
power supply:
PWR-1, PPS-1 Power
PWR-2, PPS-2 Power
Are both of the indicators on?
v Yes, go to step 12 on page 246.
v No, do one of the following:
– If the PWR-1 and PWR-2 indicators are both off, go to step 4.
– If only the PWR-1 indicator is off, go to step 5.
– If only the PWR-2 indicator is off, go to step 6.
4. Replace the failing storage cage power supply, then verify the repair.
Is the storage cage power supply problem corrected?
v Yes, use the service terminal to close the problem and end the call.
v No, call your next level of support.
5. Do the following steps only on PPS-1 and the failing storage cage power supply.
Go to step 7.
6. Do the following steps only on PPS-2 and the failing storage cage power supply.
Go to step 7.
7. Locate the primary power supply (PPS) circuit protector (CP) that supplies power to
the failing storage cage power supply:
Figure 126. Primary Power Supply CB and Connector Locations (S008496l)
[Rear view of the primary power supply. Labels: CB00, CB1 through CB5, J1 through J4, J5A, J5B, J6, and J7-1 through J7-5.]
Failing Storage Cage    CB Check for 2105 Model E10/E20      CB Check for Expansion Enclosure
Power Supply (SCPS)     and Expansion Enclosure Storage      Storage Cages 3 and 4 (lower)
                        Cages 1 and 2 (upper)
SCPS-1                  CP-3                                 CP-1
SCPS-2                  CP-4                                 CP-2
SCPS-3                  CP-3                                 CP-1
SCPS-4                  CP-4                                 CP-2
SCPS-5                  CP-3                                 CP-1
SCPS-6                  CP-4                                 CP-2
Is the input power CP for the failing storage cage power supply tripped (down)?
v Yes, go to “MAP 2520: PPS Output Circuit Breaker Tripped” on page 107.
v No, go to step 8.
8. Check the indicators on the front of the PPS.
Are the following PPS indicators as shown?
v PPS Good indicator, On
v PPS Fault indicator, Off
v Yes, go to step 9.
v No, go to “MAP 1320: Isolating Problems Using Visual Symptoms” on
page 58 in chapter 3, volume 1 of this book.
9. Locate the primary power supply (PPS) to storage cage power supply (SCPS)
cable for the failing PWR-1 or PWR-2 power indicator. Verify that the cable
is connected at the storage cage power supply and the PPS.
Use the correct table below for the failing SCPS and the storage cages it is
associated with (upper or lower):
v If the failing SCPS is in a 2105 Model E10/E20 rack, use Table 20 on
page 245.
v If the failing SCPS is in a 2105 Expansion Enclosure, storage cages 1 and 2
(upper), use Table 20 on page 245.
v If the failing SCPS is in a 2105 Expansion Enclosure, storage cages 3 and 4
(lower), use Table 21 on page 245.
2105 Model E10/E20 and Expansion Enclosure, Storage Cages 1 and 2
(upper)
Table 20. 2105 Model E10/E20 and Expansion Enclosure, Storage Cages 1 and 2 (upper)
Failing Storage Cage    Failing SCPS PWR     SCPS and PPS Connectors to Check
Power Supply (SCPS)     (Power) Indicator
SCPS-1                  PWR-1                SCPS-1, J2 and PPS-1, J7-3
SCPS-1                  PWR-2                SCPS-1, J1 and PPS-2, J7-3
SCPS-2                  PWR-1                SCPS-2, J2 and PPS-1, J7-4
SCPS-2                  PWR-2                SCPS-2, J1 and PPS-2, J7-4
SCPS-3                  PWR-1                SCPS-3, J2 and PPS-1, J7-3
SCPS-3                  PWR-2                SCPS-3, J1 and PPS-2, J7-3
SCPS-4                  PWR-1                SCPS-4, J2 and PPS-1, J7-4
SCPS-4                  PWR-2                SCPS-4, J1 and PPS-2, J7-4
SCPS-5                  PWR-1                SCPS-5, J2 and PPS-1, J7-3
SCPS-5                  PWR-2                SCPS-5, J1 and PPS-2, J7-3
SCPS-6                  PWR-1                SCPS-6, J2 and PPS-1, J7-4
SCPS-6                  PWR-2                SCPS-6, J1 and PPS-2, J7-4
Expansion Enclosure, Storage Cages 3 and 4 (lower)
Table 21. Expansion Enclosure, Storage Cages 3 and 4 (lower)
Failing Storage Cage    Failing SCPS PWR     SCPS and PPS Connectors to Check
Power Supply (SCPS)     (Power) Indicator
SCPS-1                  PWR-1                SCPS-1, J2 and PPS-1, J7-1
SCPS-1                  PWR-2                SCPS-1, J1 and PPS-2, J7-1
SCPS-2                  PWR-1                SCPS-2, J2 and PPS-1, J7-2
SCPS-2                  PWR-2                SCPS-2, J1 and PPS-2, J7-2
SCPS-3                  PWR-1                SCPS-3, J2 and PPS-1, J7-1
SCPS-3                  PWR-2                SCPS-3, J1 and PPS-2, J7-1
SCPS-4                  PWR-1                SCPS-4, J2 and PPS-1, J7-2
SCPS-4                  PWR-2                SCPS-4, J1 and PPS-2, J7-2
SCPS-5                  PWR-1                SCPS-5, J2 and PPS-1, J7-1
SCPS-5                  PWR-2                SCPS-5, J1 and PPS-2, J7-1
SCPS-6                  PWR-1                SCPS-6, J2 and PPS-1, J7-2
SCPS-6                  PWR-2                SCPS-6, J1 and PPS-2, J7-2
Is the storage cage P.S. cable connected correctly?
v Yes, go to step 10.
v No, reseat the cable as required.
– If the green PWR-1 or -2 Power indicator is now on, the problem is
resolved. Use the service terminal to verify the problem and close it.
– If the green PWR-1 or -2 Power indicator is still off, go to step 10.
10. Swap the two input power cables, J1 and J2, on the rear of the failing storage
cage power supply. Observe the status of the PWR-1 and -2 Power indicators.
Did the PWR-1 and PWR-2 Power indicators swap states (On now Off and Off now
On)?
v Yes, go to step 11.
v No, replace the storage cage power supply. See ″Storage Cage Power
Supply, 2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
If the problem is not resolved, call your next level of support.
11. Swap the two input power cables, J1 and J2, back to their original positions.
Replace the primary P.S. to storage cage P.S. cable associated with the
PWR-1 or -2 power indicator that is Off. See ″Cables, 2105 Model Exx/Fxx and
Expansion Enclosure″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
If the problem is not resolved, call your next level of support.
12. Replace the storage cage power supply. See ″Storage Cage Power Supply,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2. Check whether the CHK/PWR GOOD
indicator is On (green).
Is the storage cage power supply problem resolved?
v Yes, the problem is resolved. Return to the service terminal and Continue
Repair Process to return the resources to the customer and cancel the
problem.
v No, continue with the next step.
13. Is the CHK/PWR GOOD indicator On (amber) on all installed storage cage
power supplies?
v Yes, go to “MAP 3391: Isolating a Storage Cage Power System Problem” on
page 253
v No, go to step 14.
14. Replace the storage cage fan/power sense card. See ″Storage Cage
Fan/Power Sense Card, 2105 Model Exx/Fxx″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Is the storage cage power supply problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, continue with the next step.
15. Replace the DDM bay controller card. See ″Controller Card Removal and
Replacement, DDM Bay″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Is the storage cage power supply problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, continue with the next step.
16. Replace the storage cage power planar. See ″Storage Cage Power Planar,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
After the replacement, verify the repair.
Is the storage cage power supply problem resolved?
v Yes, use the service terminal to close the problem and end the call.
v No, call your next level of support.
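The circuit protector table in step 7 and Tables 20 and 21 in step 9 follow a regular pattern: odd-numbered storage cage power supplies use the first position of a pair and even-numbered ones use the second, with positions 3 and 4 serving storage cages 1 and 2 (upper) and positions 1 and 2 serving storage cages 3 and 4 (lower). The Python sketch below reproduces the table rows from that pattern. It is an illustration derived from the tables in this MAP, not a documented wiring rule, and all names in it are hypothetical. Use the tables themselves as the reference when servicing a machine.

```python
# PPS monitored by each input power indicator (step 3).
PPS_FOR_INDICATOR = {"PWR-1": "PPS-1", "PWR-2": "PPS-2"}
# SCPS input connector paired with each indicator (Tables 20 and 21).
SCPS_INPUT_FOR_INDICATOR = {"PWR-1": "J2", "PWR-2": "J1"}


def _position(scps: int, lower_cages: bool) -> int:
    """Position number shared by the CP and the J7 connector."""
    if scps not in range(1, 7):
        raise ValueError("SCPS number must be 1 through 6")
    base = 1 if lower_cages else 3  # lower cages: 1-2, upper cages: 3-4
    return base + (scps - 1) % 2    # odd SCPS: base, even SCPS: base + 1


def circuit_protector(scps: int, lower_cages: bool = False) -> str:
    """CP to check for the failing SCPS (table in step 7)."""
    return f"CP-{_position(scps, lower_cages)}"


def connectors_to_check(scps: int, indicator: str,
                        lower_cages: bool = False) -> tuple:
    """SCPS and PPS connectors to check (Table 20, or Table 21 when
    lower_cages is True for Expansion Enclosure storage cages 3 and 4)."""
    if indicator not in PPS_FOR_INDICATOR:
        raise ValueError("indicator must be 'PWR-1' or 'PWR-2'")
    n = _position(scps, lower_cages)
    return (f"SCPS-{scps}, {SCPS_INPUT_FOR_INDICATOR[indicator]}",
            f"{PPS_FOR_INDICATOR[indicator]}, J7-{n}")
```

For example, `connectors_to_check(1, "PWR-1")` gives the first row of Table 20, and passing `lower_cages=True` gives the matching row of Table 21.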
MAP 3390: Isolating SSA DASD Drawer Visual Power Problems, Model
040 Drawer
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you to isolate FRUs that are causing a power problem on an SSA
DASD drawer.
You are here because of one or more of the following:
v A power supply assembly has its power supply CHK (check) indicator on.
v Another MAP sent you here.
v Drawer model, SSA DASD Model 040
Isolation
1. Use the service terminal to determine if there are any related power problems
with the RPC or SSA DASD drawer.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair.
Are there any open power problems?
v Yes, follow the instructions on the service terminal to repair the power
problem. This repair should also fix your visual symptom.
v No, from the visual symptoms you should already know the SSA DASD
drawer number and location, go to step 2.
2. Observe PWR (power) indicators 1 and 4 on the power supply assemblies
in the failing drawer.
Are one or both of the PWR indicators off?
v Yes, go to step 7 on page 249.
v No, go to step 3 on page 248.
Figure 127. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations
(s007602l)
3. Observe CHK/PWR (check/power) Good indicators 3 and 6 on the power
supply assemblies in the failing drawer.
Are one or both of the CHK/PWR Good indicators on with the color amber?
v Yes, go to step 4.
v No, go to step 5.
4. Locate the PWR/FLT (power/fault) Reset switch 2 or 5 on the power supply
assembly with the amber CHK/PWR Good indicator.
a. Turn the PWR/FLT Reset switch off: pull the switch handle out, then push the
switch down.
b. Wait about 10 seconds, then turn the PWR/FLT Reset switch on: pull the
switch handle out, then push the switch up.
c. Check the CHK/PWR Good indicator again.
Is the CHK/PWR Good indicator now green?
v Yes, the problem is now repaired.
v No, replace the power supply assembly with the amber indicator.
5. Go to the front of the failing drawer and observe the Fan Check indicators 8, 9,
and 10 on each of the three fan assemblies.
Are any of the Fan Check indicators on (amber)?
v Yes, replace the fan assembly with the Fan Check indicator on.
v No, go to step 6.
6. Observe the controller card check indicator 7.
Is the controller card check indicator on (amber)?
v Yes, replace the controller card with the check indicator on.
v No, call your next level of support.
Figure 128. Model 040 Drawer Indicators (S008416l)
[7133 Model 040, front view.]
7. Observe PWR (power) indicators 14 and 16 and the CHK/PWR Good
indicators 15 and 17 on the power supply assemblies in the failing drawer.
a. Determine if the power supply with the failing PWR (power) indicator (off) or
CHK/PWR Good indicator (on amber) is in drawer power supply position 1
or 2.
b. Observe the PWR (power) and CHK/PWR Good (Model 040 drawer only)
indicators on the fan-and-power-supply or power supply assemblies in the
same fan position in the other drawers in the same rack.
v SSA DASD Model 020 drawers, see PWR indicators on
fan-and-power-supply assemblies 1 11, and 2 12.
Note: Ignore PWR indicator on fan-and-power-supply assemblies 3 13
because it is not used in this analysis.
v SSA DASD Model 040 drawers, see PWR indicators 14 and 16 and
CHK/PWR Good indicators 15 and 17 on power supply assemblies 1
and 2 (there is no position 3 on Model 040 drawers 18).
Figure 129. SSA DASD Model 020 and 040 Drawer PWR (Power) Indicator Locations
(s007604p)
[Rear views of the 7133 Model 020 and the 7133 Model 040.]
Is the PWR (power) indicator off, or the CHK/PWR Good on amber (Model 040
drawer only) on another power supply or fan-and-power supply assembly in the
same position on other drawers in the same rack?
v Yes, observe the state of the Power Complete Line Cord indicators. Use the
state of these indicators with “MAP 1320: Isolating Problems Using Visual
Symptoms” on page 58 in chapter 3, volume 1 of this book.
v No, go to step 8.
8. Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch to On (up) for about two seconds.
Note: Pressing the Local Power switch momentarily to On (up) clears any
power errors that were generated by the failure. It also restores any
power that was removed because of these failures. It does not affect
2105 power.
Figure 130. 2105 Model Exx/Fxx Operator Panel Locations (S008810m)
[Front view of the operator panel. Labels: Unit Emergency; Local Power; Ready (Cluster 1, Cluster 2); Power Complete (Line Cord 1, Line Cord 2); Messages (Cluster 1, Cluster 2).]
At the rear of the 2105, is the power supply assembly PWR (power) indicator
still off?
v Yes, go to step 9.
v No, the problem may be resolved. Verify the repair. Select any of the power
supplies shown on the service terminal. Proceed through the repair but DO
NOT replace any FRU or disconnect any cables. This will simulate a repair
and run verification.
9. Is the power supply with the PWR (power) indicator that is off in drawer fan
position 2 (right)?
v Yes, replace the SSA DASD drawer power supply. See ″Power Supply
Assembly, 7133 Model 040″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Before you verify the repair, do the following:
– Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press
the Local Power switch, in Figure 130, to On (up) for about two seconds.
- If the failing PWR (power) indicator is now on, go to “MAP 3520: SSA
DASD Drawer Verification for Possible Problems” on page 280.
v If verification is not successful, call your next level of support.
v If verification is successful, the repair is complete.
- If the failing PWR (power) indicator is still off, replace the drawer power
cable connected to the failing power supply. See ″2105 Model 100 Rack
Cable Removals and Replacements″ in chapter 4, of the 2105 Model
100 Attachment to ESS Service Guide book. Connect the new cable to
the original connector on the primary power supply.
v No, go to step 10.
10. Determine where the drawer power cable, from the failing SSA DASD drawer
power supply, connects to a primary power supply. Also determine where the
other drawer power cable from this drawer connects to the same primary
power supply. See ″2105 Model 100 Rack Cable Removals and
Replacements″ in chapter 4, of the 2105 Model 100 Attachment to ESS
Service Guide book, to determine where the power cables should be plugged.
Refer to Figure 131 for primary power supply connector names and locations.
Figure 131. 2105 Primary Power Supply Connectors (S007380l)
a. Locate where the drawer power cable from the failing power supply
assembly connects to the primary power supply. Disconnect this cable from
the primary power supply and reconnect it to an unused connector on the
primary power supply.
b. Go to the operator panel on the front of the 2105 Model Exx/Fxx. Press the
Local Power switch to On (up) for about two seconds.
At the rear of the 2105, is the SSA DASD drawer PWR (power) indicator
still off?
v Yes, reconnect the swapped drawer power cable to its original connector
on the PPS. Replace the SSA DASD drawer power supply assembly, go
to ″Power Supply Assembly, 7133 Model 040″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2. After the repair:
– Go to the operator panel on the front of the 2105 Model Exx/Fxx.
Press the Local Power switch to On (up) for about two seconds.
– If the failing PWR (power) indicator is on, select any of the power
supplies shown on the service terminal. Proceed through the repair
but DO NOT replace any FRU or disconnect any cables. This will
simulate a repair and run verification.
– If the failing PWR (power) indicator is still off, replace the drawer
power cable connected to the failing power supply assembly. See
″2105 Model 100 Rack Cable Removals and Replacements″ in
chapter 4, of the 2105 Model 100 Attachment to ESS Service Guide
book. Connect the new cable to the original connector on the primary
power supply.
– Go to the operator panel on the front of the 2105 Model Exx/Fxx.
Press the Local Power switch to On (up) for about two seconds.
– Verify the repair: select any of the power supplies shown on the
service terminal. Proceed through the repair but DO NOT replace any
FRU or disconnect any cables. This will simulate a repair and run
verification.
v No, the internal electronic circuit breaker for the original connector on
the power supply has failed:
1) Replace the primary power supply on which you moved the drawer
power cable to an unused connector. See ″2105 Model 100 Rack
Bulk Power Supply Removals and Replacements″ in chapter 4, of
the 2105 Model 100 Attachment to ESS Service Guide book.
2) Reconnect the swapped drawer power cable to its original connector
on the new PPS.
3) Go to the operator panel on the front of the 2105 Model Exx/Fxx.
Press the Local Power switch to On (up) for about two seconds.
4) Verify the repair: select any of the power supplies shown on the
service terminal. Proceed through the repair but DO NOT replace
any FRU or disconnect any cables. This will simulate a repair and
run verification.
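Steps 2 through 6 of MAP 3390 check the drawer indicators in a fixed order, and the first abnormal indicator found decides the action. The Python sketch below restates that order for illustration only. The function and parameter names are hypothetical.

```python
def visual_triage(any_pwr_off: bool, any_chk_amber: bool,
                  any_fan_check_amber: bool,
                  controller_check_amber: bool) -> str:
    """Return the action for the first abnormal indicator, checked in
    the order used by steps 2, 3, 5, and 6 of MAP 3390."""
    if any_pwr_off:
        # Step 2: a power supply PWR (power) indicator is off.
        return "go to step 7"
    if any_chk_amber:
        # Step 3: a CHK/PWR Good indicator is on with the color amber.
        return "step 4: cycle the PWR/FLT Reset switch"
    if any_fan_check_amber:
        # Step 5: a Fan Check indicator is on (amber).
        return "replace the fan assembly"
    if controller_check_amber:
        # Step 6: the controller card check indicator is on (amber).
        return "replace the controller card"
    return "call your next level of support"
```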
MAP 3391: Isolating a Storage Cage Power System Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
SSA DASD DDM bay power problem.
A group of storage cage power supplies are failing. The storage cage power
supplies shut down when they cannot maintain their output voltage. This can be
caused by too few storage cage power supplies or by a short circuit on their output
voltage. All of the storage cage power supplies feed a common voltage bus. A short
on the bus will affect all attached storage cage power supplies. With this failure, the
CHK/POWER GOOD indicators on all associated storage cage power supplies will
be On (amber).
Note: The CHK/POWER GOOD indicator can be on with the color amber or green.
v Amber is CHK (check)
v Green is POWER GOOD
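The "too few storage cage power supplies" cause named above can be checked against the minimums that Table 22 gives in step 2 of the isolation procedure. The following Python sketch is illustrative only. The function names are hypothetical and are not part of any ESS service tool, and it assumes that the table rows are selected by the count of DDM bays installed in the affected pair of storage cages.

```python
# Illustrative sketch only: encodes the two rows of Table 22
# (Storage Cage Power Supply Installation Requirements).
# Function names are hypothetical, not part of any ESS tool.

def minimum_supplies_required(ddm_bays_installed: int) -> int:
    """Return the Table 22 minimum for a pair of storage cages.

    Assumption: rows are selected by the number of DDM bays installed
    (1 to 8 needs 4 supplies; any bays in the 9 to 16 range need 6).
    """
    if not 1 <= ddm_bays_installed <= 16:
        raise ValueError("Table 22 covers 1 to 16 DDM bays")
    return 4 if ddm_bays_installed <= 8 else 6


def supply_count_is_sufficient(ddm_bays_installed: int,
                               supplies_installed: int) -> bool:
    """More supplies than the minimum is acceptable, per the note in step 2."""
    return supplies_installed >= minimum_supplies_required(ddm_bays_installed)
```

For example, a pair of storage cages with 12 DDM bays and only 4 power supplies fails this check, which corresponds to the "No" branch of step 2.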
Isolation
1. Determine if the failing storage cage power supplies are associated with
storage cages 1 and 2 or storage cages 3 and 4.
To locate the failing power supply and which storage cage it is mounted in, see
″Rack, 2105 Model Exx/Fxx and Expansion Enclosure Storage Cage Power
Supply Location Codes″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3.
Note: A storage cage is the enclosure with four DDM bays in front and four
DDM bays in the rear.
v 2105 Model E10/E20
– Storage cage 1 and 2, storage cage power supplies
v 2105 Expansion Enclosure
– Storage cage 1 and 2, storage cage power supplies
– Storage cage 3 and 4, storage cage power supplies
Verify that the switches on the rear of all affected storage cage power supplies
are set to On (up).
Were all of the switches set to On (up)?
v Yes, go to step 2.
v No, set all of the switches to On (up), then go to step 2.
Figure 132. Storage Cage Power Supply Locations (S008495m)
(Callouts: storage cage power supply, input power indicators, power switch, CHK/PWR GOOD indicator)
2. Determine if the correct number of storage cage power supplies are installed.
Count the DDM bays and the storage cage power supplies installed in the
storage cages associated with the failing power supplies (storage cages 1 and
2 or 3 and 4).
Table 22. Storage Cage Power Supply Installation Requirements
Number of DDM Bays Installed    Minimum Number of Storage Cage Power Supplies Required
1 to 8                          4
1 to 8 and 9 to 16              6
Are the correct number of storage cage power supplies installed for the
number of DDM bays installed?
Note: It is OK to have more storage cage power supplies installed than required.
v Yes, go to step 3.
v No, install the missing storage cage power supplies. See ″Storage Cage
Power Supply, 2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
3. Go to the operator panel on the front of the 2105 Model E10/E20 and use the
Local Power switch to power the subsystem completely off, then on.
Go to the rear of the failing 2105 and observe the CHK/PWR GOOD indicators
on the failing storage cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD
indicators still On (amber)?
v Yes, there is an overcurrent on the output of the failing storage cage power
supplies, go to step 4 on page 255.
v No, go to step 20 on page 258.
4. Determine if the overcurrent is caused by the storage cage fans or the storage
cage fan/power sense card:
a. Power the subsystem off.
b. Disconnect all of the storage cage fans from their storage cage planar.
c. Remove the storage cage fan/power sense card from the failing 2105.
d. Power the subsystem on.
Attention: Do not leave subsystem power on for more than five minutes
with the cooling fans disconnected.
e. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD
indicators still On (amber)?
v Yes, the fan FRUs are not causing the overcurrent. Go to step 7.
v No, one of the disconnected fan FRUs is causing the overcurrent; go to step
5.
5. Inspect all of the storage cage fans, the fan sense card, and their cables for
obvious damage. Repair any problems found.
Were any problems found and repaired?
v Yes, verify the repair.
– If the problem was resolved, go to step 20 on page 258.
– If the problem was not resolved, go to step 6.
v No, go to step 6.
6. Determine which of the disconnected fan FRUs is causing the overcurrent:
a. Reconnect one of the disconnected storage cage fans.
Attention: Do not leave subsystem power on for more than five minutes
with the cooling fans disconnected.
b. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD
indicators On (amber)?
v Yes, the fan FRU you just reconnected is causing the overcurrent; replace
it. See ″Storage Cage Fan (Center), 2105 Model Exx/Fxx and Expansion
Enclosure″, ″Storage Cage Fan, 2105 Model Exx/Fxx and Expansion
Enclosure″, or ″Storage Cage Fan/Power Sense Card, 2105 Model
Exx/Fxx″, all in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2. Go to step 20 on page 258.
v No, repeat the above steps on each fan FRU until all of the storage cage
fans are reconnected and the storage cage fan/power sense card is
installed.
Note: After all of the fans are reconnected, reinstall the storage cage
fan/power sense card.
7. Reconnect any disconnected storage cage fan cables and reinstall the storage
cage fan/power sense card, as required. Continue with the next step.
8. Determine if the overcurrent is caused by the DDM bays associated with the
failing storage cage power supplies:
a. Power the subsystem off.
b. Remove the four screws that hold each DDM bay in the storage cages
associated with the failing storage cage power supplies.
c. Pull each DDM bay out about 5 cm (2 inches).
d. Power the subsystem on.
e. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD indicators
still On (amber)?
v Yes, the DDM bays are not causing the overcurrent. Go to step 11.
v No, one of the disconnected DDM bays is causing the overcurrent, go to step
9.
9. Determine which of the disconnected DDM bays is causing the overcurrent:
a. Power the subsystem off.
b. Reinstall one of the disconnected DDM bays.
c. Power the subsystem on.
d. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD indicators
On (amber)?
v Yes, the DDM bay you just reconnected is causing the overcurrent, go to
step 18 on page 258.
v No, repeat the above steps on each DDM bay until all of the DDM bays are
reinstalled.
10. Power the subsystem off. Reinstall all of the DDM bays. Continue with the next
step.
11. Determine if the overcurrent is caused by one of the storage cage power
supplies:
a. Power the subsystem off.
b. Remove the two mounting screws from all of the failing storage cage power
supplies. See ″Storage Cage Power Supply, 2105 Model Exx/Fxx and
Expansion Enclosure″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
c. Pull all of the storage cage power supplies, except one, out about 5 cm (2
inches).
d. Power the subsystem on.
e. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
f. Record which storage cage power supply is installed and the state of its
CHK/POWER GOOD indicator (amber or green).
g. Pull the storage cage power supply out about 5 cm (2 inches).
h. Repeat this test until each of the storage cage power supplies has been
installed and the state of its CHK/POWER GOOD indicator recorded.
After all storage cage power supplies have been tested, continue with the
next step.
12. Review the recorded results of the last step:
v If the CHK/PWR GOOD indicators were On (amber) for all storage cage
power supplies, go to step 13 on page 257.
v If the CHK/PWR GOOD indicator was On (amber) for only one storage
cage power supply, replace it. See ″Storage Cage Power Supply, 2105
Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2. Reinstall all of the storage cage
power supplies, then go to step 20 on page 258.
13. Power the subsystem off. Verify that all storage cage power supplies are
reinstalled correctly. Continue with the next step.
14. Determine if the overcurrent is caused by the storage cage planar or the power
planar to DDM bay backplane cables associated with the failing storage cage
power supplies:
a. Power the subsystem off.
b. Disconnect all of the power planar to DDM bay backplane cables from the
storage cage planar associated with the failing power supplies. See
″Cables, 2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
c. Power the subsystem on.
d. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD
indicators still On (amber)?
v Yes, the power planar to DDM bay backplane cables are not causing the
overcurrent. Go to step 17.
v No, one of the disconnected power planar to DDM bay backplane cables is
causing the overcurrent, go to step 15.
15. Determine which of the disconnected power planar to DDM bay backplane
cables is causing the overcurrent:
a. Power the subsystem off.
b. Reconnect one of the disconnected power planar to DDM bay backplane
cables.
c. Power the subsystem on.
d. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD
indicators On (amber)?
v Yes, the power planar to DDM bay backplane cable you just reconnected is
causing the overcurrent, replace it. See ″Cables, 2105 Model Exx/Fxx and
Expansion Enclosure″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2. After the repair, go to step 20 on page 258.
v No, repeat the above steps on each power planar to DDM bay backplane
cable until all of the cables are reinstalled.
16. Power the subsystem off. Reinstall all of the power planar to DDM bay
backplane cables. Continue with the next step.
17. Replace the storage cage power planar. See ″Storage Cage Power Planar,
2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
Reinstall all assemblies and FRUs removed as part of this procedure. See the
″Chapter Table of Contents″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
After the replacement, verify the repair.
Is the storage cage fan/power sense card problem resolved?
v Yes, end the call.
v No, call your next level of support.
18. Determine which of the DDM bay FRUs is causing the overcurrent:
Do the following steps on the DDM bay that is causing the overcurrent.
a. Power the subsystem off.
b. Remove all of the FRUs from the failing DDM bay:
v Disk drive modules (DDMs), see ″SSA Disk Drive Model, 7133 Model
020/040″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2.
Mark the DDMs for reinstallation in the same locations.
v DDM bay controller card, see ″Controller Card, DDM Bay″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
v DDM bay bypass and passthrough cards, see ″Bypass and Passthrough
Cards, DDM Bay″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
c. Power the subsystem on.
d. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD
indicators still On (amber)?
v Yes, replace the DDM bay frame assembly (backplane). See ″Frame
Assembly, DDM Bay″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2. After the repair, go to step 20.
v No, go to step 19.
19. Determine which of the removed DDM bay FRUs is causing the overcurrent:
a. Power the subsystem off.
b. Reconnect one of the disconnected DDM bay FRUs.
c. Power the subsystem on.
d. Observe the CHK/POWER GOOD indicators on all of the failing storage
cage power supplies.
Are all of the failing storage cage power supply CHK/POWER GOOD
indicators On (amber)?
v Yes, the DDM bay FRU you just reinstalled is causing the overcurrent,
replace it. See the ″Chapter Table of Contents″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
After the repair, go to step 20.
v No, repeat the above steps on each DDM bay FRU until all of the FRUs are
reinstalled.
If the problem is still present after all of the DDM bay FRUs are installed,
call your next level of support.
20. Reconnect all cables and reinstall all assemblies and FRUs removed as part of
this procedure. See the ″Chapter Table of Contents″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
21. Change the state of the problems related to this failure to Closed, if not
already closed.
Press F3 on the service terminal until the Main Service Menu is displayed,
then select:
Utility Menu
Problem Log Menu
Change A Problem State
Select problems with the following Resource to cancel:
v rs SSA xxxx
v rsDDMxxxx
v rsENCLOSURE
Press F4, select Cancel, then press Enter.
After all related problems are canceled, continue with the next step.
22. Run the DDM bay power test on all DDM bays related to the failing storage cage
power supplies.
From the service terminal Main Service Menu, select:
Machine Test Menu
SSA Loops Menu
Select SSA Loop by SSA Device Card
All SSA Loops
v If the test runs without error, the problem is resolved.
v If the test fails, repair the new problems.
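Steps 6, 9, 15, and 19 of this MAP all use the same technique: with every suspect FRU disconnected and the indicators no longer amber, reconnect the FRUs one at a time and observe the CHK/POWER GOOD indicators after each one. The Python sketch below models that loop for illustration only. The function and parameter names are hypothetical, and the `all_indicators_amber` callback stands in for the technician's visual check; it is not an interface of the service terminal.

```python
# Illustrative sketch of the one-at-a-time isolation used in steps
# 6, 9, 15, and 19. All names are hypothetical.
from typing import Callable, Optional, Sequence


def isolate_overcurrent_fru(
    frus: Sequence[str],
    all_indicators_amber: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Reconnect FRUs in order and return the first one whose
    reconnection turns the indicators amber again.

    Returns None if every FRU is reconnected without the failure
    returning, in which case the MAP continues with its next step.
    """
    reconnected = []
    for fru in frus:
        reconnected.append(fru)            # reconnect or reinstall one FRU
        if all_indicators_amber(reconnected):
            return fru                     # this FRU causes the overcurrent
    return None


# Simulated observation: pretend "fan 2" has the short circuit.
suspect = isolate_overcurrent_fru(
    ["fan 1", "fan 2", "fan 3"],
    lambda connected: "fan 2" in connected,
)
```

The sketch leaves out the power off and power on cycle that steps 9, 15, and 19 require around each reconnection, and the five-minute limit on running with the cooling fans disconnected. Those remain manual precautions.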
MAP 3395: Isolating an SSA DASD DDM Bay Power Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
SSA DASD DDM bay power problem.
All indicators on a DDM bay are off. This indicates that input power to the DDM
bay is missing.
v Drawer model, SSA DASD DDM Bay
Isolation
1. Did you start this service action from a problem displayed on a service terminal?
v Yes, go to step 4.
v No, continue with the next step.
2. Use the service terminal to look for any problems. Repair these problems first,
then continue with the next step.
3. Are the symptoms that originally sent you to this MAP repaired?
v Yes, the problem is resolved; end the service call.
v No, continue with the next step.
4. Determine if the failing storage cage power supplies are associated with storage
cages 1 and 2 or storage cages 3 and 4.
To locate the failing power supply and which storage cage it is mounted in, see
″Rack, 2105 Model Exx/Fxx and Expansion Enclosure Storage Cage Power
Supply Location Codes″ in chapter 7 of the Enterprise Storage Server Service
Guide, Volume 3.
Note: A storage cage is the enclosure with four DDM bays in front and four
DDM bays in the rear.
v 2105 Model E10/E20
– Storage cage 1 and 2, storage cage power supplies
v 2105 Expansion Enclosure
– Storage cage 1 and 2, storage cage power supplies
– Storage cage 3 and 4, storage cage power supplies
Is the failing DDM bay in storage cage 1 or 2?
v Yes, go to step 5.
v No, the failing DDM bay is in storage cage 3 or 4. Go to step 6.
5. Go to the rear of the 2105 Model E10/E20 or 2105 Expansion Enclosure.
Locate the storage cage power supplies mounted between storage cages 1 and
2.
Observe the CHK/POWER GOOD indicators on all of the storage cage 1 and 2
power supplies.
Figure 133. Storage Cage Power Supply Locations (S008495m)
(Callouts: storage cage power supply, input power indicators, power switch, CHK/PWR GOOD indicator)
Are all of the storage cage 1 and 2 power supply CHK/POWER GOOD
indicators On (amber)?
v Yes, go to “MAP 3391: Isolating a Storage Cage Power System Problem” on
page 253.
v No, go to step 7 on page 261.
6. Go to the rear of the 2105 Expansion Enclosure. Locate the storage cage power
supplies mounted between storage cages 3 and 4.
Observe the CHK/POWER GOOD indicators on all of the storage cage 3 and 4
power supplies.
Are all of the storage cage 3 and 4 power supply CHK/POWER GOOD
indicators On (amber)?
v Yes, go to “MAP 3391: Isolating a Storage Cage Power System Problem” on
page 253.
v No, go to step 7 on page 261.
7. Replace the power planar to 8-pack planar cable to the failing DDM bay. See
″Cables, 2105 Model Exx/Fxx and Expansion Enclosure″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Verify the repair. Return to the service terminal and run the SSA Loop Test
on the failing resource listed for this problem.
Is the problem resolved?
v Yes, end the call.
v No, call your next level of support.
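The branching in steps 4 through 7 of this MAP can be summarized in a short sketch. The Python below is illustrative only and uses hypothetical names. It assumes that each CHK/POWER GOOD indicator is reported as the string "amber" or "green", which are the two colors the guide names for that indicator.

```python
# Illustrative sketch of the MAP 3395 decision in steps 4 through 7.
# All names are hypothetical; indicator colors are plain strings.
from typing import Sequence

AMBER = "amber"


def supplies_to_observe(failing_bay_cage: int) -> str:
    """Pick the supply group to inspect from the failing bay's storage cage."""
    if failing_bay_cage in (1, 2):
        return "power supplies between storage cages 1 and 2"
    if failing_bay_cage in (3, 4):
        return "power supplies between storage cages 3 and 4"
    raise ValueError("storage cage must be 1, 2, 3, or 4")


def map_3395_next_action(indicator_colors: Sequence[str]) -> str:
    """All indicators amber points to MAP 3391; otherwise go to step 7."""
    if not indicator_colors:
        raise ValueError("observe at least one indicator")
    if all(color == AMBER for color in indicator_colors):
        return "go to MAP 3391"
    return "go to step 7: replace the power planar to 8-pack planar cable"
```

A single green indicator in the group is enough to rule out the common-bus failure that MAP 3391 covers, which is why the check uses `all` rather than `any`.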
MAP 3397: Isolating an SSA DASD DDM Bay Controller Card Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
SSA DASD DDM bay controller card problem.
The controller card failure indicator is on.
v Drawer model, SSA DASD DDM Bay
Isolation
1. Did you start this service action from a problem displayed on a service terminal?
v Yes, go to step 5.
v No, continue with the next step.
2. Use the service terminal to look for any problems. Repair these problems first,
then continue with the next step.
3. Are the symptoms that originally sent you to this MAP repaired?
v Yes, the problem is resolved; end the service call.
v No, continue with the next step.
4. Replace the controller card. See ″Controller Card, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
5. Determine the location code for the DDM bay that you just replaced the
controller card in. The DDM bay location code is in the format: Rx-Uy-Wz.
Do you know the drawer's location code?
v Yes, continue with the next step.
v No, determine the location code of the DDM bay. See ″Locating a DDM Bay
or SSA DASD Model 020 or 040 Drawer in a 2105 Rack″ in chapter 7 of the
Enterprise Storage Server Service Guide, Volume 3.
6. Verify that the controller card replacement resolved the problem.
From the service terminal Main Service Menu, select:
Machine Test Menu
SSA Loops Menu
SSA Loop by Storage Bay Drawer...
Select the line that has the DDM bay location code from the last step
(Rx-Uy-Wz). Press Enter on the next screen; the verification test will
run.
v If verification is successful, the problem is resolved. Return to the
service terminal and Continue Repair Process to return the
resources to the customer and cancel the problem.
v If verification is not successful, repair the problem that was created
by the test.
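Step 5 gives the DDM bay location code only as the pattern Rx-Uy-Wz. When recording codes for a service call, a quick format check can catch transcription errors before the code is matched against the service terminal list. The Python sketch below is a hypothetical helper, not part of the ESS tooling. Because this section does not state which values x, y, and z may take, the sketch assumes that each is a run of letters or digits.

```python
# Illustrative sketch: validate a location code of the form Rx-Uy-Wz.
# Assumption: x, y, and z are each one or more letters or digits;
# the guide gives only the pattern, not the allowed values.
import re
from typing import NamedTuple, Optional

_LOCATION_CODE = re.compile(r"R([0-9A-Za-z]+)-U([0-9A-Za-z]+)-W([0-9A-Za-z]+)")


class BayLocation(NamedTuple):
    r: str  # value following "R"
    u: str  # value following "U"
    w: str  # value following "W"


def parse_bay_location(code: str) -> Optional[BayLocation]:
    """Return the three fields, or None if the code is malformed."""
    match = _LOCATION_CODE.fullmatch(code.strip())
    if match is None:
        return None
    return BayLocation(*match.groups())
```

A code with a missing field, such as "R1-U2", returns None rather than raising an error, so the caller can prompt for the code again.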
MAP 3398: Isolating a DDM bay Controller Card Communications
Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the DDM bay unless instructed to do so.
Description
SSA DASD DDM bay controller card communications problem.
The DDM bay controller card has problems communicating with the bypass card or
the passthrough cards in the DDM bay. The cause of the failure may be the
controller card, bypass card, one of the passthrough cards, or the DDM bay
backplane.
v Drawer model, SSA DASD DDM Bay
Isolation
1. Locate the controller card in the FRU list. Select the controller card and
replace it.
After replacement, verify the repair:
v If the problem is resolved, end the call.
v If the problem is not resolved, continue with the next step.
2. Verify that the controller card check indicator is on (amber), see “DDM Bay
Indicators and Switches” on page 12.
v If the check indicator is on, continue with the next step.
v If the check indicator is not on, call your next level of support.
3. Select the bypass card from the FRU list for replacement.
a. Do not disconnect the SSA cables from the bypass card.
b. Follow the service terminal instructions to where you are told to remove the
card.
c. Pull the card out only until it is unplugged from the backplane.
d. Continue with the next step.
4. Check if the controller card check indicator is off with the bypass card out.
v If the check indicator is off, continue with the next step.
v If the check indicator is still on, plug the bypass card back in and go to step
6 on page 263.
5. Replace the bypass card and run verification. See ″Bypass and Passthrough
Cards, DDM Bay″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2.
Was the verification successful?
v Yes, the problem is resolved, end the call.
v No, continue with the next step.
6. Select the first passthrough card from the FRU list for replacement.
a. Do not disconnect the SSA cables from the passthrough card.
b. Follow the service terminal instructions to where you are told to remove the
card.
c. Pull the card out only until it is unplugged from the backplane.
d. Continue with the next step.
7. Check if the controller card check indicator is off with the passthrough card out.
v If the check indicator is off, continue with the next step.
v If the check indicator is still on, plug the passthrough card back in and go to
step 9.
8. Replace the passthrough card and run verification. See ″Bypass and
Passthrough Cards, DDM Bay″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Was the verification successful?
v Yes, the problem is resolved, end the call.
v No, continue with the next step.
9. Select the second passthrough card from the FRU list for replacement.
a. Do not disconnect the SSA cables from the passthrough card.
b. Follow the service terminal instructions to where you are told to remove the
card.
c. Pull the card out only until it is unplugged from the backplane.
d. Continue with the next step.
10. Check if the controller card check indicator is off with the passthrough card
out.
v If the check indicator is off, continue with the next step.
v If the check indicator is still on, plug the passthrough card back in and go to
step 12.
11. Replace the passthrough card and run verification. See ″Bypass and
Passthrough Cards, DDM Bay″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Was the verification successful?
v Yes, the problem is resolved, end the call.
v No, continue with the next step.
12. Select the DDM bay frame from the FRU list for replacement. See ″Frame
Assembly, DDM Bay″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2. Replace the DDM bay backplane then run verification.
Was the verification successful?
v Yes, the problem is resolved, end the call.
v No, call your next level of support.
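Steps 3 through 12 repeat one pattern for each card: unplug the card from the backplane, check whether the controller card check indicator goes off, and replace the card only if it does. The Python sketch below models that sequence for illustration only. It assumes that steps 1 and 2 are already complete (the controller card has been replaced and its check indicator is still on). All names are hypothetical, and the two callbacks stand in for the technician's observation and the service terminal verification.

```python
# Illustrative sketch of MAP 3398 steps 3 through 12. Names are hypothetical.
from typing import Callable

CARDS = ["bypass card", "first passthrough card", "second passthrough card"]
BACKPLANE = "DDM bay frame assembly (backplane)"


def map_3398_after_controller_card(
    indicator_off_with_card_out: Callable[[str], bool],
    replace_and_verify: Callable[[str], bool],
) -> str:
    """Walk the cards in order, then fall back to the backplane."""
    for card in CARDS:
        # Pull the card only until it unplugs from the backplane.
        if indicator_off_with_card_out(card):
            # The indicator went off: this card is suspect, so replace it.
            if replace_and_verify(card):
                return f"resolved: {card}"
        # Otherwise plug the card back in and try the next one.
    if replace_and_verify(BACKPLANE):
        return f"resolved: {BACKPLANE}"
    return "call your next level of support"
```

The order of `CARDS` mirrors the order of the steps. The backplane comes last because it is replaced only after every card has been ruled out.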
MAP 3400: Replacing an SSA DASD Drawer Backplane or Frame
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
This procedure is used for SSA failures when the service terminal repair process
cannot call out the backplane for replacement.
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Procedure
1. Record the MAP and step number that sent you to this MAP.
2. Verify you are at the SSA link repair screen that did not include the backplane
as a FRU.
3. Record the drawer number you are repairing and, for an SSA DASD Model 020,
whether you will be replacing the front or back backplane.
4. Press F3 on the service terminal until the Repair Menu is displayed, then select:
Replace a FRU
SSA Devices Menu
5. Move the cursor to the backplane or frame being replaced, front or back, and
press Enter.
6. Replace the selected backplane or frame:
v SSA DASD Model 020
– Front backplane, see ″Front Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
– Back backplane, see ″Back Backplane Assembly, 7133 Model 020″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v SSA DASD Model 040
– Frame assembly, see ″Frame Assembly, 7133 Model 040″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
v SSA DASD DDM bay
– DDM bay frame assembly (backplane). See ″Frame Assembly, DDM Bay″
in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
7. After the backplane or frame is replaced, follow the instructions displayed on the
service terminal to verify the repair process.
v If the repair verification runs without error, the problem is resolved.
v If the SSA link is still failing, look at the MAP step that sent you to this MAP.
– If that step is the last step in the procedure, call the next level of support.
– If there are more steps in the procedure, continue with that MAP.
MAP 3421: Storage Cage Fan/Power Sense Card R2 Cable Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The storage cage fan/power sense card in the bottom half of a 2105 Expansion
Enclosure has reported that it has no cage sense card R2 cable installed. This
cable is needed for proper control of fan speeds in the 2105 Expansion Enclosure
box.
The problem can be caused by one of the following:
v The cage sense card R2 cable is not connected correctly.
v The cage sense card R2 cable is failing.
v The lower fan/power sense card is reporting incorrectly.
v A DDM bay controller card is reporting incorrectly.
Figure 134. 2105 Primary Power Supply Connectors (5008774m)
Isolation
1. Locate the cage sense card R2 cable that is connected to the upper and lower
storage cage fan/power sense cards in the 2105 Expansion Enclosure. Verify
that the R2 cable is connected correctly to both sense cards.
Did you find and fix a problem with the R2 cable?
v Yes, verify the repair. Return to the service terminal and select the sense
card for replacement. Proceed through the repair but do not replace the
sense card. This will simulate a repair and run verification.
– If verification is successful, close the problem.
– If verification fails, continue with the next step.
v No, continue with the next step.
2. Replace the cage sense card R2 cable, and then verify the repair. Return to the
service terminal and select the sense card for replacement. Proceed through the
repair but do not replace the sense card. This will simulate a repair and run
verification.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
3. Replace the fan/power sense card shown as a FRU by the service terminal, then
verify the repair. See ″Storage Cage Fan/Power Sense Card, 2105 Model
Exx/Fxx″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume
2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
4. Replace the DDM bay controller card shown as a FRU by the service terminal,
then verify the repair. See ″Controller Card, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, call your next level of support.
MAP 3422: Storage Cage Fan/Power Sense Card R2 Jumper and Cable
Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The storage cage fan/power sense card in the top of the 2105 Expansion
Enclosure has reported one of the following:
v Missing cage sense card R2 jumper
v Missing cage sense card R2 cable
Figure 135. 2105 Primary Power Supply Connectors (5008774m)
Isolation
1. Check if there is a storage cage fan/power sense card in the bottom of the
2105 Expansion Enclosure.
Is there a lower storage cage fan/power sense card in the 2105 Expansion
Enclosure?
v Yes, go to step 6 on page 267.
v No, continue with the next step.
2. Inspect the upper storage cage fan/power sense card in the 2105 Expansion
Enclosure. Verify that the cage sense card R2 jumper is present and installed
correctly on the upper storage cage fan/power sense card.
Did you find and correct a problem with the R2 jumper?
v Yes, verify the repair. Return to the service terminal and select the sense
card for replacement. Proceed through the repair but do not replace the
sense card. This will simulate a repair and run verification.
– If verification is successful, close the problem.
– If verification fails, continue with the next step.
v No, continue with the next step.
3. Replace the R2 jumper and verify the repair. Return to the service terminal and
select the sense card for replacement. Proceed through the repair but do not
replace the sense card. This will simulate a repair and run verification.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
4. Replace the sense card and then verify the repair. See ″Storage Cage
Fan/Power Sense Card, 2105 Model Exx/Fxx″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
5. Replace the DDM bay controller card shown as a FRU by the service terminal,
then verify the repair. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, call your next level of support.
6. There are storage cage fan/power sense cards in both the top and the bottom of the
2105 Expansion Enclosure. The cage sense card R2 cable should run from the
top to the bottom sense cards.
v If the cable is missing or unplugged, install the cable.
v If the cable is already installed, continue with the next step.
7. Replace the cage sense card R2 cable, then verify the repair. Return to the
service terminal and select the sense card for replacement. Proceed through
the repair but do not replace the sense card. This will simulate a repair and
run verification.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
8. Replace the top storage cage fan/power sense card and then verify the repair.
See ″Storage Cage Fan/Power Sense Card, 2105 Model Exx/Fxx″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
9. Replace the bottom storage cage fan/power sense card and then verify the
repair.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
10. Replace the DDM bay controller card shown as a FRU by the service terminal,
then verify the repair. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, call your next level of support.
MAP 3423: Isolating a Storage Cage Fan/Power Sense Card R1 Jumper
Missing Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The storage cage fan/power sense card in 2105 Model Exx/Fxx has reported that
the cage sense card R1 jumper is missing.
The problem is one of the following:
v The cage sense card R1 jumper is missing
v The cage sense card R1 jumper is failing
v The fan/power sense card is reporting incorrectly
v A DDM bay controller card is reporting incorrectly.
Figure 136. 2105 Primary Power Supply Connectors (5008774m)
Isolation
1. Inspect the upper storage cage fan/power sense card in the 2105 Model
Exx/Fxx. Verify that the cage sense card R1 jumper is present and installed
correctly on the storage cage fan/power sense card.
Did you find and correct a problem with the R1 jumper?
v Yes, verify the repair. Return to the service terminal and select the sense
card for replacement. Proceed through the repair but do not replace the
sense card. This will simulate a repair and run verification.
– If verification is successful, close the problem.
– If verification fails, go to step 2 on page 269.
v No, replace the R1 jumper and verify the repair. Return to the service
terminal and select the sense card for replacement. Proceed through the
repair but do not replace the sense card. This will simulate a repair and run
verification.
– If the verification was successful, close the problem and end the call.
– If the verification was not successful, continue with the next step.
2. Replace the sense card and then verify the repair. See ″Storage Cage
Fan/Power Sense Card, 2105 Model Exx/Fxx″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
3. Replace the DDM bay controller card shown as a FRU by the service terminal,
then verify the repair. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, call your next level of support.
MAP 3424: Isolating a Storage Cage Fan/Power Sense Card R1 Jumper
Failing Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The storage cage fan/power sense card in 2105 Model Exx/Fxx has reported a
failure that is only possible in 2105 Expansion Enclosure. This indicates that the
2105 Model Exx/Fxx cage sense card R1 jumper is failing.
Figure 137. 2105 Primary Power Supply Connectors (5008774m)
Isolation
1. Replace the cage sense card R1 jumper, then verify the repair. Return to the
service terminal and select the sense card for replacement. Proceed through the
repair but do not replace the sense card. This will simulate a repair and run
verification.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
2. Replace the storage cage fan/power sense card shown as a FRU by the service
terminal, then verify the repair.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
3. Replace the DDM bay controller card shown as a FRU by the service terminal,
then verify the repair. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, call your next level of support.
MAP 3425: Isolating a Storage Cage Fan/Power Sense Card R2 Cable
Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
One of the storage cage fan/power sense cards in 2105 Expansion Enclosure has
reported a line open in the cage sense card R2 cable. This cable connects the
upper and lower storage cage fan/power sense cards.
The most likely cause of the problem is one of the following:
v The cage sense card R2 cable is failing
v The storage cage fan/power sense card that reported the failure is failing.
v A DDM bay controller card is reporting incorrectly.
Figure 138. 2105 Primary Power Supply Connectors (5008774m)
Isolation
1. Replace the cage sense card R2 cable, then verify the repair. Return to the
service terminal and select the sense card for replacement. Proceed through the
repair but do not replace the sense card. This will simulate a repair and run
verification.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
2. Replace the storage cage fan/power sense card that was shown as a FRU by
the service terminal, then verify the repair. See ″Storage Cage Fan/Power
Sense Card, 2105 Model Exx/Fxx″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
3. Replace the DDM bay controller card shown as a FRU by the service terminal,
then verify the repair. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, call your next level of support.
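The isolation steps above follow a pattern that recurs in the MAPs of this chapter: replace the most likely part, run verification, and move on to the next candidate only if verification fails. The Python sketch below models that pattern for illustration only. The names `isolate`, `replace`, and `verify` are hypothetical, and the service terminal is not known to expose any such programming interface.

```python
from typing import Callable, Iterable, Optional


def isolate(candidates: Iterable[str],
            replace: Callable[[str], None],
            verify: Callable[[], bool]) -> Optional[str]:
    """Replace candidate parts in order until verification passes.

    Returns the part whose replacement cleared the problem, or None
    when every candidate was tried (call the next level of support).
    """
    for part in candidates:
        replace(part)      # hypothetical hook: perform the replacement
        if verify():       # hypothetical hook: run repair verification
            return part    # close the problem and end the call
    return None


# Replacement order taken from MAP 3425: cable, sense card, controller card.
MAP_3425_ORDER = [
    "cage sense card R2 cable",
    "storage cage fan/power sense card",
    "DDM bay controller card",
]

if __name__ == "__main__":
    # Simulated run in which the second replacement clears the fault.
    replaced = []
    fixed = isolate(MAP_3425_ORDER, replaced.append,
                    lambda: len(replaced) == 2)
    print(fixed or "call your next level of support")
```

Keeping the replacement order in a plain list makes it easy to see that the cheapest, most likely part is tried first.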
MAP 3426: Isolating a Storage Cage Fan/Power Sense Card Location
Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The machine hardware is reporting different rack location information than was
entered manually at the service terminal. The problem must be corrected.
The possible causes of this condition are:
v A cage sense card R2 jumper has mistakenly been plugged onto the storage
cage fan/power sense card in 2105 Model Exx/Fxx
v A cage sense card R1 jumper has mistakenly been plugged onto the storage
cage fan/power sense card in the top half of 2105 Expansion Enclosure
v The DDM bay location selected by the service support representative for a DDM
bay was in the wrong 2105, and needs to be changed.
Figure 139. Fan Sense Card Jumper and Cable Locations (S008774m)
Isolation
1. Inspect the storage cage fan/power sense card in the 2105 Model Exx/Fxx. If a
2105 Expansion Enclosure is present, inspect the upper storage cage fan/power
sense card in it also.
Verify that the correct cage sense card Rx jumper is present and installed
correctly on the upper storage cage fan/power sense cards.
v 2105 Model Exx/Fxx, cage sense card R1 jumper
v 2105 Expansion Enclosure, cage sense card R2 jumper
Did you find and correct a problem with the Rx jumper?
v Yes, verify the repair. Return to the service terminal and select the sense
card for replacement. Proceed through the repair but do not replace the
sense card. This will simulate a repair and run verification.
– If verification is successful, close the problem.
– If verification fails, continue with the next step.
v No, continue with the next step.
2. Change the DDM bay location selected by the service support representative.
Look below the FRU list on the service terminal, at the line that starts with
″Additional Message...″. Look for the word ″Reported″, followed by the
Rack-Bay-Drawer location reported by the 2105. Then look for the word
″Entered″, followed by the Rack-Bay-Drawer location that was entered by the
service support representative.
3. Do the following steps to uninstall the drawer or drawers that you just installed:
a. Press F3 until the Main Service Menu is displayed.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawers) Menu
Remove Device Drawer
Select and quiesce the cluster you are powering off.
Attention: Select Continue to Remove Device Drawers.
b. Find the lines with the Resource Locations of the 7133 Drawers you just
installed. Select the highest line for one of the drawers you just installed.
That drawer, and all the drawers below it on the same loop, will be removed
from the loop.
Note: If you were doing a single drawer install, you must remove only that
drawer.
If you were doing a multiple drawer install, you must remove all of the new
drawers that you were installing.
c. Continue through the removal process. When complete, you may continue
with any operation desired.
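The jumper check in step 1 of MAP 3426 amounts to a small lookup: the 2105 Model Exx/Fxx expects an R1 jumper, and the upper sense card of the 2105 Expansion Enclosure expects an R2 jumper. The Python sketch below illustrates that rule only. The function name `check_jumper` and the message strings are hypothetical and are not produced by the machine.

```python
from typing import Optional

# Expected cage sense card jumper per enclosure, as listed in step 1.
EXPECTED_JUMPER = {
    "2105 Model Exx/Fxx": "R1",
    "2105 Expansion Enclosure": "R2",
}


def check_jumper(enclosure: str, installed: Optional[str]) -> Optional[str]:
    """Return a description of the jumper problem, or None if it is correct.

    `installed` is the jumper found on the upper sense card ("R1", "R2"),
    or None when no jumper is plugged.
    """
    expected = EXPECTED_JUMPER[enclosure]
    if installed is None:
        return f"{expected} jumper missing"
    if installed != expected:
        return f"{installed} jumper installed, expected {expected}"
    return None


if __name__ == "__main__":
    # An R2 jumper plugged onto the Model Exx/Fxx sense card by mistake.
    print(check_jumper("2105 Model Exx/Fxx", "R2"))
```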
MAP 3427: Isolating a Storage and DDM Bay Location Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The machine hardware is reporting different DDM bay location information than was
entered manually at the service terminal. The problem must be corrected.
The possible causes for this condition are:
v The cage sense card R2 cable has been plugged backwards. The end marked
Fan Sense Card Top Power Stack has been plugged into the lower sense card.
The end marked Fan Sense Card Bottom Power Stack has been plugged into
the upper sense card.
v The DDM bay location selected by the CE for a DDM bay was in the wrong
bay, and needs to be changed.
v A DDM bay controller card is reporting incorrectly.
Figure 140. Fan Sense Card Jumper and Cable Locations (S008774m)
Isolation
1. Inspect the 2105 Expansion Enclosure to determine whether there are storage
bays in the top and bottom of the rack.
v If there are storage bays in the top and bottom of the 2105 Expansion
Enclosure, go to step 2.
v If there are storage bays only in the top of the 2105 Expansion Enclosure, go
to step 3.
2. Verify that the cage sense card R2 cable is installed correctly to the top and
bottom sense cards.
v If you find and fix a problem, return to the service terminal and select the
sense card for replacement. Proceed through the repair but do not replace
the sense card. This will simulate a repair and run verification.
– If the verification was successful, close the problem and end the call.
– If the verification was not successful, continue with the next step.
v If you did not find a problem, continue with the next step.
3. Review the DDM bay location selected by the service support representative.
Look below the FRU list on the service terminal, at the line that starts with
Additional Message.... Look for the word Reported, followed by the
Rack-Bay-Drawer location reported by the 2105. Then look for the word
Entered:, followed by the Rack-Bay-Drawer location that was entered by the
service support representative.
Note: You can verify that the Reported location is correct by looking on the
Additional Messages line, to the right of the Reported
Rack-Bay-Drawer location. You may need to use the arrow keys on the
keyboard to scroll to the right. Look for the word DDMSN, followed by
the serial number of the DDM that was used to read the Reported
location. Following the serial number is the slot number in the DDM bay,
in parentheses, where the DDM is located. You should be able to find the
DDM with this serial number in the DDM bay slot indicated by the
Reported location. If this DDM is not in the DDM bay slot indicated, call
your next level of support.
v If the entered location is wrong, continue with the next step.
v If the reported location is wrong, replace the DDM bay controller card
shown as a FRU by the service terminal, then verify the repair. See
″Controller Card Removal and Replacement, DDM Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
– If the verification was successful, close the problem and end the
call.
– If the verification was not successful, call your next level of support.
4. Change the DDM bay location selected by the CE. Do the following steps to
uninstall the drawer or drawers that you just installed:
a. Press F3 until the Main Service Menu is displayed.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawers) Menu
Remove Device Drawer
Select and quiesce the cluster you are powering off.
Attention: Select Continue to Remove Device Drawers.
b. Find the lines with the Resource Locations of the 7133 Drawers you just
installed. Select the highest line for one of the drawers you just installed.
That drawer, and all the drawers below it on the same loop, will be removed
from the loop.
Note: If you were doing a single drawer install, you must remove only that
drawer.
If you were doing a multiple drawer install, you must remove all of the new
drawers that you were installing.
c. Continue through the removal process. When complete, you may reinstall
the drawers. Be careful to select the correct locations.
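The decision in step 3 of MAP 3427 can be read as a comparison of three values: the location you observe at the machine, the Reported location, and the Entered location. The Python sketch below shows one possible reading of that step and is not a definitive rule. The function name `next_action` is hypothetical, and the location strings are opaque placeholders because the display format of the Rack-Bay-Drawer location is not described here.

```python
from typing import Optional


def next_action(observed: str, reported: str, entered: str) -> Optional[str]:
    """Pick the follow-up action for a location mismatch (one reading of step 3).

    The entered location is checked first, matching the order of the
    bullets in step 3.
    """
    if entered != observed:
        return "uninstall the drawer and reinstall it with the correct location"
    if reported != observed:
        return "replace the DDM bay controller card, then verify the repair"
    return None  # both values match the observed location


if __name__ == "__main__":
    # Placeholder location strings; the real display format is not assumed.
    print(next_action(observed="loc-A", reported="loc-A", entered="loc-B"))
```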
MAP 3428: Isolating an SSA DASD Drawer Location Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The machine hardware is reporting different SSA DASD drawer location information
than was entered manually at the service terminal. The problem must be corrected.
The possible causes for this condition are:
v The power planar to DDM bay planar cable is plugged to the wrong connector
position on the storage cage power planar. See Figure 141 on page 277 and
Figure 142 on page 278.
v The DDM bay location selected by the service support representative for a DDM
bay was in the wrong location, and needs to be changed.
v A DDM bay controller card is reporting incorrectly.
Isolation
1. Review the DDM bay location entered by the service support representative.
Look below the FRU list on the service terminal, at the line that starts with
Additional Message.... Look for the word Reported, followed by the
Rack-Bay-Drawer location reported by the 2105. You can find the actual DDM
that was used to read the Reported location. Look on the Additional Messages
line, to the right of the Reported Rack-Bay-Drawer location. You may need to
use the arrow keys on the keyboard to scroll to the right. Look for the word
DDMSN, followed by the serial number of the DDM that was used to read the
Reported location. Following the serial number is the slot number in the DDM
bay, in parentheses, where the DDM is located. You should be able to find the
DDM with this serial number in the DDM bay slot indicated by the Reported
location. Then look for the word Entered:, followed by the Rack-Bay-Drawer
location that was entered by the service support representative. Carefully review
the location that the service support representative entered to determine if it is
correct.
v If the location entered by the service support representative is not correct, go
to step 2.
v If the location entered by the service support representative is correct, go to
step 3 on page 276.
2. Do the following steps to uninstall the drawer or drawers that you just installed:
a. Press F3 until the Main Service Menu is displayed.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawers) Menu
Remove Device Drawer
Select and quiesce the cluster you are powering off.
Attention: Select Continue to Remove Device Drawers.
b. Find the lines with the Resource Locations of the 7133 Drawers you just
installed. Select the highest line for one of the drawers you just installed.
That drawer, and all the drawers below it on the same loop, will be removed
from the loop.
Note: If you were doing a single drawer install, you must remove only that
drawer.
If you were doing a multiple drawer install, you must remove all of the new
drawers that you were installing.
c. Continue through the removal process. When complete, you may reinstall
the drawers. Be careful to select the correct locations. Complete the install
process. If any problems are found, proceed as directed by the service
panel and end this call. Do not proceed to the next step.
3. Replace the DDM bay controller card shown as a FRU by the service terminal,
then verify the repair. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, continue with the next step.
After the installation, go to step 5.
4. The power planar to DDM bay planar cable may be plugged into the wrong
connector position on the storage cage power planar. Remove the DDM bay in
question and verify that the power planar to DDM bay planar cable is plugged
correctly:
a. Remove the DDM bay from the 2105. See Frame Assembly Removal and
Replacement, DDM Bay in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2 book. Do only the steps necessary to remove and
replace the DDM bay.
b. Verify that the power planar to DDM bay planar cable is plugged correctly.
The most likely problem is that the cables to a pair of front and rear DDM
bays are swapped. See Figure 141 on page 277 and Figure 142 on page 278.
Did you find and correct a problem with the power planar to DDM bay planar
cable?
v Yes, continue with the next step.
v No, call your next level of support.
5. Verify the repair. Return to the service terminal and select the sense card for
replacement. Proceed through the repair but do not replace the sense card.
This will simulate a repair and run verification.
v If verification is successful, close the problem.
v If verification fails, work on the resulting problem.
Front view, 2105 Model Exx/Fxx and Expansion Enclosure: storage cages U1
through U4, power planars Q1 and Q2 with connectors J11 through J18 and J21
through J28, positions F1 through F3, and DDM bays W1 through W4 in each
storage cage.
Figure 141. DDM Bay Front Power Cable Locations (S008812s)
Note: The two lower storage cages (U3 and U4) are not present in the 2105 Model
Exx/Fxx.
Rear view, 2105 Model Exx/Fxx and Expansion Enclosure: storage cages U1
through U4, power planars Q1 and Q2 with connectors J11 through J18 and J21
through J28, positions F4 and F6, and DDM bays W5 through W8 in each storage
cage.
Figure 142. DDM Bay Rear Power Cable Locations (S008813s)
Note: The two lower storage cages (U3 and U4) are not present in the 2105 Model
Exx/Fxx.
MAP 3429: Isolating a DDM Location Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The machine hardware is reporting different DDM location information than was
created internally based on what was entered manually at the service terminal. The
problem must be corrected.
The possible causes for this condition are:
v The SSA loop has been cabled incorrectly.
v The DDM bay controller card is reporting the DDM location incorrectly.
Isolation
1. Look at the SSA cables displayed on the Detail Problem screen. Compare the
SSA cables displayed with the cabling of the DDM bay being Installed/Analyzed.
Are any of the SSA cables connected wrong?
v Yes, connect the jumper cables to the correct connectors, then verify the
repair. Return to the service terminal and select the sense card for
replacement. Proceed through the repair but do not replace the sense card.
This will simulate a repair and run verification.
– If the verification was successful, close the problem and end the call.
– If the verification was not successful, continue with the next step.
v No, continue with the next step.
2. Replace the DDM bay controller card shown as a FRU by the service terminal,
then verify the repair. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification was successful, close the problem and end the call.
v If the verification was not successful, call your next level of support.
MAP 3500: Verifying an SSA DASD Drawer Repair
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you to verify a repair to an SSA DASD drawer that generated a
problem because it was powered off. This MAP verifies whether the problem is resolved.
v Drawer models, SSA DASD Model 020 or 040 drawer or DDM bay
Isolation
1. Determine if the SSA DASD drawer with the problem was just installed into the
2105 or if DDMs were just installed into it.
Was the failing drawer or its DDMs just installed?
v Yes, the drawer or its DDMs were just installed.
At the service terminal press F3 until the screen that allows the restart of
installation is displayed. Restart the installation to verify the repair. If the
repair is verified, the installation will resume at the point that the original error
was detected.
v No, the drawer or its DDMs were not just installed.
Verify the repair using the service terminal. From the Main Service Menu,
select:
Machine Test Menu
SSA Loops Menu
Select the drawer you just repaired. Identify the drawer by the
location code.
Did the SSA device test run without error?
– Yes, go to step 2.
– No, follow the instructions displayed on the service terminal to correct the
problem.
2. Go to “MAP 1500: Ending a Service Action” on page 68.
MAP 3520: SSA DASD Drawer Verification for Possible Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP verifies that an SSA DASD drawer is operating correctly when visual
symptoms, or other reasons, indicate a possible problem.
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
1. Did you start this service action from a problem displayed on a service terminal?
v Yes, go to step 4.
v No, continue with the next step.
2. Use the service terminal to look for any problems. Repair these problems first
then continue with the next step.
3. Are the symptoms that originally sent you to this MAP repaired?
v Yes, the problem is resolved. End the service call.
v No, continue with the next step.
4. Record the location of the drawer or DDM bay that you have just repaired.
5. At the service terminal, press F3 until the Main Service Menu is displayed,
then select:
Machine Test Menu
SSA Loops Menu
Find the line that has the SSA Device drawer with the location you recorded.
6. Select a line with the recorded SSA Device drawer location to run the SSA loop
test. Select loop A or B for this test; it does not matter which you select.
This test will verify correct operation of all of the SSA DASD drawers on both
loops of that SSA device card.
MAP 3540: Unrelated Occurrence, Retry Web Operation
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The web process did not complete successfully because some unrelated
occurrence in the system caused the test to abort. Retrying the web process may
allow the verification test to run to completion. If there is a real problem, you will be
directed to a different MAP.
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
1. The customer operation probably failed because of a problem on the machine or
an error recovery by the machine.
2. Repair any problems that you find on the machine.
Note: These problems might have caused the Web operation to fail.
3. Even if you found no problems on the machine, have the customer retry the
Web operation that failed.
Did the Web operation complete successfully?
v Yes, the problem is resolved.
v No, the machine is still failing. Fix any additional problems that occurred on
the machine. If this does not allow the customer to complete the Web
operation, call the next level of support.
MAP 3560: Unrelated Occurrence, Retry Verification Test
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The verification test did not complete successfully because some unrelated
occurrence in the system caused the test to abort. Retrying the verification test will
allow the verification test to run to completion. If there is a real problem, you will be
directed to a different MAP.
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
Rerun the verification test. Press F3 once. At the new screen, select the Run
Verification Tests Again option.
Did repair verification run without error?
v If the verification ran without error, the problem is resolved.
v If the verification failed, continue with any problem displayed by the verification
process.
If this same problem continues to occur, there may be another problem on the
machine that prevents verification from running successfully. Resolve these
problems, then retry this problem. If verification still fails, call your next level
of support.
MAP 3570: Unrelated Event Caused Resume Fail
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The verification test did not complete successfully because some unrelated
occurrence in the system caused the test to abort. Retrying the verification test will
allow the verification test to run to completion. If there is a real problem, you will be
directed to a different MAP.
At the end of a repair process, a Resume process is performed that makes the
resource available for customer use. During the Resume process, an unrelated
event occurred that prevented the Resume from completing normally. You will need to
go through a pseudo repair process to complete the repair.
Isolation
1. Select the DDM listed in the Possible FRUs to Replace portion of the problem.
2. Proceed through the repair process. When the process instructs you to replace
the DDM, do not replace it. Continue through the repair process as if you had
replaced the DDM. If this repair process directs you to resolve other problems
before completing this problem, do so. Then return to this problem.
MAP 3600: Multiple DDMs Isolated on an SSA Loop
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Multiple DDMs cannot be accessed. The open links are on a drawer or DDM bay
boundary.
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
1. Determine if the SSA cables to the failing drawer have just been changed or
installed.
Have the SSA cables just been changed or installed?
v Yes, go to step 2.
v No, go to step 4.
2. Verify that the SSA cables are connected correctly. Look at the cables
displayed on the Detail Problem screen. Compare the cables displayed with
the cabling of the drawer or DDM bay.
Are any of the cables connected incorrectly?
v Yes, connect the cables to the correct connectors, then go to step 3.
v No, go to step 4.
3. Determine if the problem is resolved. Return to the service terminal Detail
Problem screen. Select any FRU in the Possible FRUs to Replace list or any
cable in the cable list. Proceed through the repair but do not replace any FRU
or disconnect any cables. This will simulate a repair and run verification.
Did verification run without error?
v Yes, the problem is resolved. Return to the service terminal and select
Continue Repair Process, to return the resources to the customer and
cancel the problem.
v No, go to step 4.
4. Look at the Additional Message in the Detail Problem Record. It gives
the name and location of one or more failing drawers. Find one of these failing
drawers. See “Locating a DDM Bay or SSA DASD Model 020 or 040 Drawer in
a 2105 Rack” in chapter 7 of the Enterprise Storage Server Service Guide,
Volume 3.
Continue with the next step.
5. Determine if the failing drawer is an SSA DASD Model 020 or 040 drawer or
DDM bay.
Is the failing drawer an SSA DASD Model 040?
v Yes, go to “MAP 3620: Multiple DDMs Isolated on an SSA Loop” on
page 296.
v No, go to step 6.
6. Determine if the failure is in a DDM bay.
Is the failure in a DDM bay?
v Yes, go to step 11 on page 284.
v No, the failing drawer is an SSA DASD Model 020 drawer. Go to step 7.
7. Use Figure 143 on page 284 in the following steps to locate the switch and
indicators on the SSA DASD drawer power control panel:
v Power Switch (On/Off)
v Power Indicator (green)
v Check Indicator (amber)
Figure 143. SSA DASD Model 020 Power Control Panel Locations (S008020m)
8. Go to the front of the 2105 and locate one of the SSA DASD drawers with a
DDM shown for replacement. Observe the SSA DASD drawer green power
indicator on the drawer power control panel.
Is the green drawer power indicator on?
v Yes, go to step 9.
v No, press and release the drawer power switch on the drawer power
control panel.
– If the SSA DASD drawer power indicator is on, go to “MAP 3500:
Verifying an SSA DASD Drawer Repair” on page 279.
– If the SSA DASD drawer power indicator is off, go to “MAP 3352:
Isolating SSA DASD Drawer Power Problems” on page 219.
9. Observe the SSA DASD drawer amber check indicator on the drawer power
control panel.
v If the SSA DASD drawer check indicator is on or blinking, go to “MAP 3150:
Isolating an SSA DASD Drawer Power Problem” on page 188.
v If the SSA DASD drawer check indicator is off, go to step 10.
10. Call the next level of support for instructions on rebuilding the array.
Attention: Attempting to rebuild the arrays, without correct procedures, may
result in the loss of customer data.
11. Observe the following indicators on the front of the DDM bay:
v DDMs (eight)
v Bypass card
v Controller card
Figure 144. DDM bay Indicator Locations (S008018l)
12. Go to the DDM bay and observe the indicators.
Note: The front of the DDM bay can be facing the front or rear of the 2105.
Are any of the indicators on?
v Yes, call your next level of support.
v No, go to “MAP 3395: Isolating an SSA DASD DDM Bay Power Problem” on
page 259.
MAP 3605: Isolating an Unexpected Result
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
v Drawer models, SSA DASD Model 020 or 040 drawer or SSA DASD DDM bay
Unexpected results were reported by an SSA component.
Isolation
An unexpected condition was detected. Call your next level of support.
MAP 3610: DDM Installation with New Rank Site Capacity
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first. Do not power off the SSA DASD drawer
unless instructed to do so.
Description
This section describes the conditions that created this state.
The full storage capacity of all DDMs (Disk Drive Modules) on an SSA loop can be
used only when all of the DDMs have the same storage capacity. There are times
when it is correct to add DDMs of a different capacity to a loop. This can happen
when a specific DDM is no longer manufactured and DDMs with a larger storage
capacity must be used. There are also times when there is a need to have mixed
capacity 7133 drawers on a single loop.
You have been sent to this MAP because multiple capacity arrays may be created
on this loop, and additional DDMs may be required as spares.
If you understand the conditions that created this state, go directly to the Isolation
section. If you need more information on allowing this new effective capacity, read
the following Detailed Description section.
Detailed Description
This section describes the details of the conditions that created this state. The
following Isolation section describes what to do to fix the condition.
1. The capacity of all DDMs on an SSA loop is most fully used when all DDMs
have the same storage capacity. There are times when there is a need to add
DDMs of a different capacity.
2. When arrays on an SSA Loop are the same capacity, one spare is created for
each of the first two arrays created. When larger storage capacity DDMs are
added to a loop, allowing higher capacity arrays, one larger capacity spare is
created for each of the first two larger capacity arrays.
3. There are two possible options for resolving this condition.
a. Give permission for the installation to continue with DDMs intermixed as
they currently are.
b. Remove the 7133 drawer(s) or DDM bay(s) that you have just installed.
4. The following items will help you determine the exact condition and what the
options mean.
5. On each SSA loop, DDMs are grouped together as Potential and Configured
Rank Sites. Each Rank Site consists of eight DDMs.
6. Arrays consist of seven or eight array member DDMs. All of the members of
any array are found on the same rank site. When there are seven members in
an array, the additional DDM in that rank site is always assigned as a spare.
All of the DDMs in an entire array combine so that the array is accessed as if
it were a single DDM.
7. Each JBOD (Just a Bunch Of Disks) DDM is accessed individually. DDMs are
chosen to be JBODs by rank site. A JBOD rank site may or may not contain a
spare. When any DDM in a rank site is chosen to be a JBOD DDM, that rank
site becomes a JBOD rank site. In a JBOD rank site all of the DDMs in that
rank site, except the spares, can only be used for JBOD. Intermixed capacity
DDMs in a JBOD rank site are not a problem.
8. There is a Utility that allows viewing the Rank Sites on an SSA Loop and the
capacities of the DDMs on those Rank Sites. The effective capacity of a Rank
Site is determined by the smallest capacity of any DDM on a rank site.
9. Configured rank sites contain those DDMs which have already been assigned
as array members, spares or JBOD DDMs. Since these rank sites contain
customer data, they will not be affected by this MAP. The effective capacity of
these rank sites is the same capacity as the smallest capacity DDM in the rank
site.
Note: There is a possible, but infrequent, situation where an array's effective
capacity will be smaller than the smallest DDM. See the note with
Description step 13.
10. All unassigned DDMs on a loop are considered to be Free and have been
grouped into potential rank sites.
Note: Some DDMs may have a status of Failed and may occur in either type of
rank site.
11. Whenever new DDMs are installed on a loop, these DDMs become Free
DDMs. Existing potential rank sites are dissolved releasing their Free DDMs
and any spare DDMs. Then all the Free DDMs, both new and previously
existing, are grouped together into new potential rank sites.
12. These Free DDMs are then placed in potential rank sites by capacity. The
largest DDMs are placed into rank sites first. When there are not enough
DDMs of the largest capacity to fill the next rank site, the next smaller capacity
is used. This continues until all the Free DDMs are in potential rank sites.
13. The capacity of an array is determined by the smallest capacity of the member
DDMs when the array is created. This will be the smallest DDM in the rank
site. If one of the DDMs in a rank site is to become a spare, the largest
capacity DDM is chosen for the spare. The rest of the DDMs will become
members of the array. The difference in capacity between a larger and a smaller
capacity DDM in the same rank site will be unused.
Note: If, after an array is created, all of the smaller drives fail and are
replaced by larger spares, the array capacity will then be less than the
smallest drive.
14. This condition occurred when one or more potential rank sites were found to
have a different effective capacity than previously existing rank sites.
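The grouping rule in items 8, 11, and 12 can be illustrated with a short sketch. This is only a model of the description above, not the microcode's actual logic. The function names and the capacity values (in GB) are hypothetical, and treating fewer than eight remaining DDMs as ungrouped leftovers is an assumption, because the text does not cover that case.

```python
RANK_SITE_SIZE = 8  # from the text: each rank site consists of eight DDMs


def group_free_ddms(free_capacities):
    """Group Free DDM capacities into potential rank sites, largest first.

    Returns (sites, leftover). Leaving fewer than eight remaining DDMs
    ungrouped is an assumption made for this sketch.
    """
    ordered = sorted(free_capacities, reverse=True)
    full = len(ordered) - len(ordered) % RANK_SITE_SIZE
    sites = [ordered[i:i + RANK_SITE_SIZE] for i in range(0, full, RANK_SITE_SIZE)]
    return sites, ordered[full:]


def effective_capacity(site):
    """The effective capacity of a rank site is its smallest DDM capacity."""
    return min(site)


# Hypothetical loop: six smaller and ten larger Free DDMs.
sites, leftover = group_free_ddms([18] * 6 + [36] * 10)
for number, site in enumerate(sites, start=1):
    print(number, site, effective_capacity(site))
```

In this example the first potential rank site holds eight of the larger DDMs, and the second holds the two remaining larger DDMs plus six smaller ones, so its effective capacity drops to the smaller value. A second rank site like this one, with a different effective capacity, is the kind of result that leads to this MAP.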
Isolation
1. Do you want to display the capacities and rank sites of the DDMs on this loop?
v Yes, go to step 3.
v No, continue with the next step.
2. Do you want to complete the installation with the DDMs that are currently on the
loop?
v Yes, go to step 8 on page 288.
v No, go to step 6 on page 288.
3. To display the capacities of the DDMs on this loop, perform the following:
a. Note the Loop Name (color) of the loop where the installation is being done.
b. From the service terminal select Exit Install, to display the Main Service
Menu, then select:
Utility Menu
Show Storage Facility Resources Menu
List DDMs on an SSA Loop by Rank Site
Select the line with the install Loop Name (color).
Scroll up and down on the screen to view the Rank Sites and
Capacities of the DDMs on this loop.
c. Continue with the next step.
4. Now that you have viewed the DDM capacities, do you want to complete the
installation with the DDMs that are currently on the loop?
v Yes, complete the installation, continue with the next step.
v No, go to step 7 to remove the drawer(s) or DDM bay(s) you just installed.
5. Return to the Install process on the Service Terminal. Press F3 until the Main
Service Menu is displayed.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Continue into the install process you performed before until the screen
that directed you to this MAP appears.
Go to step 8.
6. At the Service terminal, select Exit Install and you will be at the Main Service
Menu.
Continue with the next step.
7. Do the following steps to uninstall the drawer or drawers that you just installed:
a. From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawers) Menu
Remove Device Drawer
Select and quiesce the cluster you are powering off.
Attention: Select Continue to Remove Device Drawers.
b. Find the lines with the Resource Locations of the 7133 Drawers you just
installed. Select the highest line for one of the drawers you just installed.
That drawer, and all the drawers below it on the same loop, will be removed
from the loop.
Note: If you were doing a single drawer install, you must remove only that
drawer.
If you were doing a multiple drawer install, you must remove all of the new
drawers that you were installing.
c. Continue through the removal process. When complete, you may continue
with any operation desired.
8. Select Continue with Install.
This will continue through the install process to completion and the new
effective capacity will be accepted. Installation is complete.
MAP 3612: DDM Installation with Mixed Capacity Rank Site
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first. Do not power off the SSA DASD drawer
unless instructed to do so.
Description
This section describes the conditions that created this state.
The full storage capacity of all DDMs (Disk Drive Modules) on an SSA loop can be
used only when all of the DDMs have the same storage capacity. There are times
when DDMs of a different capacity are added to a loop. This can happen when a
specific DDM is no longer manufactured and a DDM with a larger storage capacity
must be used as a replacement. There are also times when it is desirable to install
7133 drawers that contain intermixed capacity DDMs.
You have been sent to this MAP to make sure that you intended to install different
size DDMs on this loop.
If you understand the conditions that created this state, go directly to the Isolation
section. If you need more information to determine whether to allow mixed DDM
capacities in a rank site, read the following Detailed Description section.
Detailed Description
This section describes the conditions that created this state. The following
Isolation section describes what to do to fix the condition.
1. The capacity of all DDMs on an SSA loop is most fully used when all DDMs
have the same storage capacity. There are times when there is a need to add
DDMs of a different capacity.
2. There are two possible options for resolving this condition.
a. Give permission for the installation to continue with DDMs intermixed as
they currently are.
b. Remove the 7133 drawer(s) or DDM bay(s) that you have just installed.
These may be reinstalled with different DDMs.
3. The following items will help you determine the exact condition and what the
options mean.
4. On each SSA loop, DDMs are grouped together as Potential and Configured
Rank Sites. Each Rank Site consists of eight DDMs.
5. Configured rank sites contain those DDMs which have already been assigned
as array members, spares or JBOD (Just a Bunch Of Disks) DDMs. Since
these rank sites contain customer data, they will not be affected by this MAP.
6. Most unassigned DDMs on a loop are considered to be Free and have been
grouped into potential rank sites. Some of these unassigned DDMs are
configured as spares, if needed, to allow for the configuration of potential rank
sites as arrays.
7. Arrays consist of seven or eight array member DDMs. All of the members of any
array are found on the same rank site. When there are seven members in an
array, the additional DDM in that rank site is always assigned as a spare. All of
the DDMs in an entire array combine so that the array is accessed as if it were
a single DDM.
8. A potential rank site will consist of seven Free DDMs and one spare DDM, or
eight Free DDMs.
9. Each JBOD DDM is accessed individually. DDMs are chosen to be JBODs by
rank site. A JBOD rank site may or may not contain a spare. When any DDM in
a rank site is chosen to be a JBOD DDM, that rank site becomes a JBOD rank
site. In a JBOD rank site all of the DDMs in that rank site, except the spares,
can only be used for JBOD. Intermixed capacity DDMs in a JBOD rank site are
not a problem.
10. Whenever new DDMs are installed on a loop, these DDMs become Free
DDMs. Existing potential rank sites are dissolved. When a potential rank site is
dissolved, any spare DDM in it is made Free so that all of its DDMs are
Free. All of the Free DDMs (both new and previously existing) are then grouped
together into new potential rank sites and any needed spares are created.
11. The DDMs are placed in rank sites by capacity. The largest DDMs are placed
into rank sites first. When there are not enough DDMs of the largest capacity
to fill the next rank site, the next smaller capacity DDMs are used until all the
Free DDMs are in rank sites.
12. The capacity of an array is determined by the smallest capacity of the member
DDMs when the array is created. This will be the smallest DDM in the rank
site. If one of the DDMs in a rank site is to become a spare, the largest
capacity DDM is chosen for the spare. The rest of the DDMs will become
members of the array. The difference in capacity between a larger and a smaller
capacity DDM in the same rank site will be unused.
13. When an array is made up of all the same capacity DDMs and spares, the
capacity of all of those DDMs will be fully used. You are in this MAP because
new DDMs, of different capacities, are being installed on a loop. When
configured into an array these DDMs will not allow the full capacity to be used.
One or more potential rank sites exist that have DDMs with different
capacities.
Note: Seldom will there be more than one such Rank Site.
14. There are two possible options for resolving this condition.
a. Give permission for the installation to continue with DDMs intermixed as
they currently are.
b. Remove the 7133 drawer(s) or DDM bay(s) that you have just installed.
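To help judge how much capacity a mixed rank site gives up, the rule in item 12 can be worked through in a small sketch. It is an illustration under stated assumptions, not the subsystem's real configuration code. The function name, the dictionary keys, and the capacity values (in GB) are hypothetical.

```python
def plan_array(site_capacities, with_spare):
    """Model item 12 of the Detailed Description for one rank site.

    If a spare is wanted, the largest DDM becomes the spare. Each array
    member is then used only up to the smallest member capacity.
    """
    ordered = sorted(site_capacities)
    spare = ordered.pop() if with_spare else None  # largest DDM becomes the spare
    members = ordered
    per_member = min(members)  # capacity actually used on every member
    unused = sum(capacity - per_member for capacity in members)
    return {
        "spare": spare,
        "members": members,
        "per_member_capacity": per_member,
        "unused_capacity": unused,
    }


# Hypothetical mixed rank site: two larger and six smaller DDMs.
plan = plan_array([36, 36, 18, 18, 18, 18, 18, 18], with_spare=True)
print(plan)
```

In this example one larger DDM becomes the spare, the other larger DDM joins the array as a member, and the difference between its capacity and the smaller member capacity goes unused. A rank site in which all eight DDMs have the same capacity yields zero unused capacity.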
Isolation
1. Do you want to display the capacities of the DDMs on this loop?
v Yes, go to step 3.
v No, continue with the next step.
2. Do you want to complete the installation with the DDMs that are currently on the
loop?
v Yes, go to step 8 on page 291.
v No, go to step 6 on page 291.
3. To display the capacities of the DDMs on this loop, perform the following:
a. Note the Loop Name (color) of the loop where the installation is being done.
b. From the service terminal select Exit Install, to display the Main Service
Menu, then select:
Utility Menu
Show Storage Facility Resources Menu
List DDMs on an SSA Loop by Rank Site
Select the line with the install Loop Name (color).
Scroll up and down on the screen to view the Rank Sites and
Capacities of the DDMs on this loop.
c. Continue with the next step.
4. Now that you have viewed the DDM capacities, do you want to complete the
installation with the DDMs that are currently on the loop?
v Yes, complete the installation, continue with the next step.
v No, go to step 7 to remove the drawer(s) or DDM bay(s) you just installed.
5. Return to the Install process on the Service Terminal. Press F3 until the Main
Service Menu is displayed.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Continue into the install process you performed before until the screen
that directed you to this MAP appears.
Go to step 8.
6. At the Service terminal, select Exit Install and you will be at the Main Service
Menu.
Continue with the next step.
7. Do the following steps to uninstall the drawer or drawers that you just installed:
a. Press F3 until the Main Service Menu is displayed.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawers) Menu
Remove Device Drawer
Select and quiesce the cluster you are powering off.
Attention: Select Continue to Remove Device Drawers.
b. Find the lines with the Resource Locations of the 7133 Drawers you just
installed. Select the highest line for one of the drawers you just installed.
That drawer, and all the drawers below it on the same loop, will be removed
from the loop.
Note: If you were doing a single drawer install, you must remove only that
drawer.
If you were doing a multiple drawer install, you must remove all of the new
drawers that you were installing.
c. Continue through the removal process. When complete, you may continue
with any operation desired.
8. Select Continue with Install.
This will continue through the install process to completion and the new
effective capacity will be accepted. Installation is complete.
MAP 3614: DDM Installation Introduces Different RPM
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first. Do not power off the SSA DASD drawer
unless instructed to do so.
Description
During the installation of new DDM Bay(s) or 7133 drawer(s), a DDM was found
that has a different RPM than other DDMs previously on the loop. This is permitted,
but not recommended. A DDM with a lower RPM will slow the access to any array
in which it is included. You may choose to leave this DDM in the loop. If you do,
you will not be notified if any other DDMs with this RPM are included in this
installation. On any new installations, you will only be notified of a still different RPM
DDM.
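The notification behavior described above can be sketched as follows. This is a simplified model, not the actual install code. The function name and the RPM values are hypothetical, and the sketch assumes that the loop already contains DDMs and that each newly found RPM is accepted when it is reported.

```python
def rpm_notifications(existing_rpms, new_ddm_rpms):
    """Return the RPM values that would trigger a notification, in order.

    Assumption: each new RPM is accepted when reported, so later DDMs with
    that same RPM do not cause another notification.
    """
    known = set(existing_rpms)
    notices = []
    for rpm in new_ddm_rpms:
        if rpm not in known:
            notices.append(rpm)
            known.add(rpm)  # accepted: this RPM is now permitted on the loop
    return notices


# Hypothetical loop of 10000 RPM DDMs receiving a drawer with two slower DDMs.
print(rpm_notifications([10000, 10000], [10000, 7200, 7200]))
```

Only the first DDM with the new RPM is reported. A later installation would report again only if it introduced a third, still different RPM.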
Isolation
1. Do you want to display the RPMs of the DDMs on this loop?
v Yes, go to step 3.
v No, continue with the next step.
2. Do you want to complete the installation with the DDMs that are currently on
the loop?
v Yes, go to step 10 on page 293.
v No, go to step 6.
3. To display the RPMs of the DDMs on this loop, perform the following:
a. Note the Loop Name (color) of the loop where the installation is being
done.
b. From the service terminal select Exit Install, to display the Main Service
Menu, then select:
Utility Menu
Show Storage Facility Resources Menu
List DDMs on an SSA Loop by Rank Site
Select the line with the install Loop Name (color).
Scroll up and down on the screen to view the Rank Sites and
Capacities of the DDMs on this loop.
c. Continue with the next step.
4. Now that you have viewed the DDM RPMs, do you want to complete
the installation with the DDMs that are currently on the loop?
v Yes, complete the installation, continue with the next step.
v No, go to step 7 on page 293 to remove the drawer(s) or DDM bay(s) you
just installed.
5. Return to the Install process on the Service Terminal. Press F3 until the Main
Service Menu is displayed.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Continue into the install process you performed before until the screen
that directed you to this MAP appears.
Go to step 10 on page 293.
6. At the Service terminal, select Exit Install and you will be at the Main Service
Menu.
Continue with the next step.
7. Do you want to leave the DDM bay or drawer on the loop and replace only
some of the DDMs that are currently in that drawer?
v Yes, continue with the next step.
v No, go to step 9.
8. Replace the desired DDMs and then return to Install for a reverification of the
DDMs being installed.
Do not replace DDMs in any other drawer or bay.
v If you were installing a single drawer, you may now replace any of the
DDMs in that drawer.
Do not replace DDMs in any other drawer or bay.
v If you were doing a multiple drawer install, you may replace any of the
DDMs in those drawers that were just newly installed.
Do not replace DDMs in any other drawer or bay.
After all the DDMs that you wish to replace have been replaced, go to step 5 on
page 292 to verify that the loop is now correct.
9. Do the following steps to uninstall the drawer or drawers that you just installed:
a. From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawers) Menu
Remove Device Drawer
Select and quiesce the cluster you are powering off.
Attention: Select Continue to Remove Device Drawers.
b. Find the lines with the Resource Locations of the 7133 Drawers you just
installed. Select the highest line for one of the drawers you just installed.
That drawer, and all the drawers below it on the same loop, will be
removed from the loop.
Note: If you were doing a single drawer install, you must remove only that
drawer.
If you were doing a multiple drawer install, you must remove all of the new
drawers that you were installing.
c. Continue through the removal process. When complete, you may continue
with any operation desired.
10. Select Continue with Install.
This will continue through the install process to completion and the new
effective capacity will be accepted. Installation is complete.
MAP 3616: No Intermix of Bus Speeds is Allowed
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first. Do not power off the SSA DASD drawer
unless instructed to do so.
Description
The installation of new 7133 drawers requires that all of the DDMs on a loop have
the same bus speed. 7133 Model 020 drawers have a bus speed of 20 MHz and
7133 Model 040 drawers have a bus speed of 40 MHz. Because of the different
bus speeds, 7133 Model 020 and 7133 Model 040 drawers cannot be mixed on the
same SSA loop.
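The rule above amounts to a simple compatibility check, sketched here for illustration. The bus speeds come from the description. The function name and the use of model strings as keys are hypothetical, and the sketch covers only 7133 Model 020 and 040 drawers because this MAP does not state a bus speed for DDM bays.

```python
# Bus speeds from the description above, keyed by 7133 drawer model.
BUS_SPEED_MHZ = {"020": 20, "040": 40}


def can_share_loop(existing_models, new_models):
    """Return True if all drawers, old and new, have the same bus speed."""
    all_models = list(existing_models) + list(new_models)
    speeds = {BUS_SPEED_MHZ[model] for model in all_models}
    return len(speeds) <= 1


print(can_share_loop(["020", "020"], ["020"]))  # same speed on the loop
print(can_share_loop(["020"], ["040"]))         # mixed speeds, not allowed
```

When the check fails, the newly installed drawer or drawers must be removed from the loop, as the Isolation steps describe.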
Isolation
1. Determine if you were installing one drawer or multiple drawers at the same
time.
Were you installing multiple drawers at the same time?
v Yes, go to step 3.
v No, continue with the next step.
2. You were installing a single drawer. Do the following steps to remove that
drawer.
Note: The customer will lose access to data on this loop while you are
removing the drawer. No data will be lost.
a. Press F3 until the Main Service Menu is displayed.
b. From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Remove Device Drawer
Attention: Select Continue to Remove Device Drawers.
c. Continue through the remove process. When complete, you can
continue with any other operation.
3. You were installing multiple drawers on the loop. Do the following steps to
remove all of those drawers on the loop.
Note: The customer will lose access to data on this loop while you are
removing the drawer. No data will be lost.
a. Press F3 until the Main Service Menu is displayed.
b. From the service terminal Main Service Menu, select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Remove Device Drawer
Attention: Select Continue to Remove Device Drawers.
c. Find the lines with the Resource Locations of the 7133 Drawers you
just installed. Select the highest line for one of the drawers you just
installed. That drawer, and all the drawers below it on the same loop,
will be removed from the loop.
d. Continue through the remove process. When complete, you can
continue with any other operation.
MAP 3618: Replacement DDM Has Slower RPM Than Called For
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first. Do not power off the SSA DASD drawer
unless instructed to do so.
Description
A DDM used for replacement has a slower RPM than was called for on the FRU
list. It is recommended that a replacement DDM have an equal or higher RPM than
called for on the FRU list. If a DDM with a lower RPM is spared into an array with
higher RPM DDMs, the performance of that array will be somewhat degraded.
If speed of repair is more important than performance, a slower speed DDM can be
used by activating the Allow Slower RPM Replacement switch. This flag will be
valid only for this repair.
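The acceptance rule in this description can be sketched as a small check. This is an illustration only. The function and parameter names are hypothetical, with allow_slower standing in for the Allow Slower RPM Replacement switch, and the RPM values are examples.

```python
def replacement_accepted(required_rpm, replacement_rpm, allow_slower=False):
    """Decide whether a replacement DDM's RPM is acceptable for this repair.

    An equal or higher RPM is always acceptable. A lower RPM is acceptable
    only when the slower-replacement switch has been set for this repair.
    """
    if replacement_rpm >= required_rpm:
        return True
    return allow_slower


print(replacement_accepted(10000, 10000))                     # equal RPM
print(replacement_accepted(10000, 7200))                      # slower, switch off
print(replacement_accepted(10000, 7200, allow_slower=True))   # slower, switch on
```

Because the switch is valid only for one repair, a fuller model would reset allow_slower to False after the repair completes.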
Isolation
1. Determine whether you want to degrade subsystem performance by allowing a
lower RPM replacement DDM to be installed (see Description above).
Do you want to install a lower RPM DDM and degrade loop performance?
v Yes, continue with the next step.
v No, go to step 5 on page 296.
2. You have chosen to degrade loop performance by allowing a replacement DDM
with a slower RPM than called for on the FRU list. This step sets the Allow
Slower RPM Replacement switch:
a. Return to the service terminal and record the number of the problem you are
working on.
b. Press F3 until the Main Service Menu is displayed.
c. From the service terminal Main Service Menu, select:
Configuration Option Menu
Change/Show Control Switches
d. Select Allow Slower RPM Replacement.
e. Change the value to True.
f. Continue with the next step.
3. Press F3 until the Main Service Menu is displayed.
a. From the service terminal Main Service Menu, select:
Repair Menu
Show/Repair Problems Needing Repair
b. Select the problem with the number you recorded in step 2a.
c. Select the DDM on the Possible FRUs to Replace list.
d. Continue with the next step.
4. Continue through the repair process until the DDM replacement is called. Do not
replace the DDM. Continue through the replace process as if you had replaced
the DDM.
Problem Isolation Procedures, CHAPTER 3
Did the Repair process complete successfully?
v Yes, this problem is resolved. Continue to the end of the repair process to
see if there are any additional problems.
v No, continue with the problem displayed on the service terminal. Continue
with the next step.
5. Replace the DDM with a correct RPM DDM.
a. Select the DDM on the Possible FRUs to Replace list.
b. Continue with the next step.
6. Continue through the repair process until the DDM replacement is called.
Replace the DDM with another DDM with the correct RPM. Continue through
the replace process.
Did the Repair process complete successfully?
v Yes, this problem is resolved. Continue to the end of the repair process to
see if there are any additional problems.
v No, continue with the problem displayed on the service terminal.
MAP 3619: This Repair Requires a Larger Capacity DDM
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first. Do not power off the SSA DASD drawer
unless instructed to do so.
Description
A replacement DDM must have the same storage capacity as, or a greater storage
capacity than, the DDM shown on the FRU list. The DDM used for replacement had a
smaller capacity than
is required. There are times when a larger capacity DDM is required than the DDM
being replaced. This occurs in cases where a failing DDM is replaced by a spare
that could also be used by a larger array on the loop. This replacement DDM must
have at least the capacity needed to be a spare for the larger array.
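As a rough illustration of the capacity rule above, the minimum acceptable capacity can be taken as the larger of the failing DDM's capacity and the capacity needed by the largest array on the loop that the spare might serve. The function name `minimum_replacement_capacity` and its inputs are hypothetical; the service terminal, not this sketch, determines the capacity actually requested on the FRU list.

```python
def minimum_replacement_capacity(failing_ddm_gb, loop_array_ddm_gb):
    """Return the smallest capacity (GB) a replacement DDM should have.

    failing_ddm_gb    -- capacity of the DDM being replaced
    loop_array_ddm_gb -- per-array DDM capacities on the same loop
                         (assumed input; may be empty)
    """
    # The replacement must be able to act as a spare for the largest
    # array on the loop, and must not be smaller than the failing DDM.
    return max([failing_ddm_gb, *loop_array_ddm_gb])
```

With these assumed inputs, replacing a 9.1 GB DDM on a loop that also holds an 18.2 GB array calls for a DDM of at least 18.2 GB.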
Isolation
1. Select the DDM listed in the Possible FRUs to Replace portion of the problem.
2. Proceed through the repair process to the DDM replacement. Replace the DDM
with a DDM that has the same or larger storage capacity than the DDM
requested in the FRUs to Replace portion of the problem.
MAP 3620: Multiple DDMs Isolated on an SSA Loop
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
Multiple DDMs cannot be accessed. The open links are on a drawer boundary.
v Drawer model, SSA DASD Model 040
Isolation
1. Go to the back of the 2105 and locate the SSA DASD Model 040 with a DDM
shown in the problem record. Observe the green PWR (power) indicators on
both drawer power supply assemblies.
Are both of the green PWR (power) indicators off?
v Yes, replace the controller card; go to ″Controller Card Assembly, 7133 Model
040″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
After the controller card is replaced, both PWR (power) indicators should be
on. If they are both on, go to “MAP 3500: Verifying an SSA DASD Drawer
Repair” on page 279 to complete the repair.
v No, continue with the next step.
Figure 145. SSA DASD Model 040 Power Supply Assembly Indicators (S008019m)
2. Observe the SSA DASD drawer amber power supply CHK/PWR (check/power)
Good indicators in Figure 145.
v If the SSA DASD drawer power supply CHK/PWR (check/power) Good
indicators are on or blinking (amber), go to “MAP 3105: Isolating a Loss of
Power to a SSA DASD Model 040” on page 172.
v If either of the SSA DASD drawer power supply CHK/PWR (check/power)
Good indicators is on (green), go to step 3.
3. Call the next level of support for instructions on rebuilding the array.
Attention: Attempting to rebuild the arrays, without correct procedures, may
result in the loss of customer data.
MAP 3621: New DDM Storage Capacity Smaller Than Original DDMs
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
One or more DDMs have been added to an SSA loop that have a smaller storage
capacity than the existing DDMs. All DDMs in an SSA loop must have the same
storage capacity.
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
1. Determine which of the DDMs added to the SSA loop have a smaller storage
capacity than the original DDMs. Remove those new DDMs and replace them with
DDMs that have the same or larger storage capacity than the existing DDMs.
2. Continue with the install or repair.
MAP 3623: New DDM Storage Capacity Less Than 4.5 GB
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
One or more DDMs have been added to an SSA loop that have a storage capacity
of less than 4.5 GB. All DDMs on an SSA loop must be 4.5 GB or larger, and they
must all have the same storage capacity.
v Drawer model, SSA DASD Model 020 drawer
Isolation
1. Determine which DDMs have been added to the SSA loop that have less than a
4.5 GB storage capacity. Remove those new DDMs, with a storage capacity of
less than 4.5 GB, and replace them with 4.5 GB or larger DDMs.
Note: The replacement DDMs must have the same storage capacity as the
existing DDMs in the SSA loop.
2. Continue with the install or repair.
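The two capacity conditions in this MAP (at least 4.5 GB, and the same capacity as the existing DDMs on the loop) can be sketched as follows. The function name `check_new_ddm` and the returned strings are hypothetical and are not service terminal output.

```python
MIN_CAPACITY_GB = 4.5  # minimum stated in the Description above

def check_new_ddm(capacity_gb, loop_capacity_gb):
    """Classify a newly added DDM against the loop's capacity rules.

    capacity_gb      -- capacity of the new DDM
    loop_capacity_gb -- capacity shared by the existing DDMs on the loop
    """
    if capacity_gb < MIN_CAPACITY_GB:
        return "below 4.5 GB minimum"
    if capacity_gb != loop_capacity_gb:
        return "capacity differs from existing DDMs"
    return "ok"
```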
MAP 3625: All DDMs on SSA Loop A Do Not Have the Same
Characteristics
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
DDMs have been added to SSA loop A that have different characteristics than the
existing DDMs or each other. All DDMs in an SSA loop must have the same storage
capacity, bus speed, and RPM.
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
1. Use the service terminal to locate the SSA device card displayed as a Possible
FRU to Replace. Copy that Resource Name (rsssaxx).
2. From the service terminal Main Service Menu, select:
Utility Menu
Show Storage Facility Resources Menu
List DDMs on an SSA Loop
Select loop A for the SSA device card Resource Name that you copied.
3. Observe the Capacity, RPM, and Rate (bus rate) of each DDM on the loop. All
DDMs on a loop must have the same characteristics.
As required to correct the problem, you will have to replace:
v Entire SSA DASD drawer or DDM bay, or
v Individual DDMs
Notes:
a. To correct the characteristics problem, only the DDMs, SSA DASD drawers,
or DDM bays that you just placed on the loop should be replaced.
b. The model of each DDM on the loop is shown. This tells you at least one
model of DDM that can be used on the loop. There may be other DDM
models with the same characteristics that can also be used on the same
loop.
Continue with the next step.
4. Determine if you need to replace individual DDMs or an SSA DASD drawer or
DDM bay.
Do you need to remove an entire SSA DASD drawer or DDM bay?
v Yes, go to step 6.
v No, go to step 5.
5. Remove any DDMs with the wrong characteristics and replace them with the
correct DDMs.
After this, to determine if there are any other problems, go to “MAP 3500: Verifying
an SSA DASD Drawer Repair” on page 279.
6. Remove the entire SSA DASD drawer or DDM bay that was just installed.
Press F3 until the Main Service Menu is displayed, then select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Remove Device Drawers
Select the SSA DASD drawer or DDM bay you are removing and follow the
instructions on the service terminal.
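The comparison in step 3 can be sketched as follows. This is an illustration under stated assumptions, not the service terminal's logic: the `Ddm` record, its field names, and the `newly_added` flag are hypothetical, and the values would be read by hand from the List DDMs on an SSA Loop display.

```python
from typing import NamedTuple

class Ddm(NamedTuple):
    location: str      # hypothetical label for the DDM position
    capacity_gb: float
    rpm: int
    rate_mb: int       # bus rate
    newly_added: bool  # True for DDMs just placed on the loop

def mismatched_new_ddms(ddms):
    """Return locations of newly added DDMs that do not match the loop."""
    existing = [d for d in ddms if not d.newly_added]
    if not existing:
        raise ValueError("no existing DDMs to compare against")
    # All existing DDMs are expected to share one set of characteristics.
    ref = existing[0]
    reference = (ref.capacity_gb, ref.rpm, ref.rate_mb)
    return [d.location for d in ddms
            if d.newly_added
            and (d.capacity_gb, d.rpm, d.rate_mb) != reference]
```

Only newly added DDMs are reported, which matches note a: the DDMs, drawers, or bays just placed on the loop are the ones to replace.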
MAP 3626: All DDMs on SSA Loop B Do Not Have the Same
Characteristics
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
DDMs have been added to SSA loop B that have different characteristics than the
existing DDMs. All DDMs in an SSA loop must have the same storage capacity, bus
speed, and RPM.
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
1. Use the service terminal to locate the SSA device card displayed as a Possible
FRU to Replace. Copy that Resource Name (rsssaxx).
2. From the service terminal Main Service Menu, select:
Utility Menu
Show Storage Facility Resources Menu
List DDMs on an SSA Loop
Select loop B for the SSA device card Resource Name that you copied.
3. Observe the Capacity, RPM, and Rate (bus rate) of each DDM on the loop. All
DDMs on a loop must have the same characteristics.
As required to correct the problem, you will have to replace:
v Entire SSA DASD drawer or DDM bay, or
v Individual DDMs
Notes:
a. To correct the characteristics problem, only the DDMs, SSA DASD drawers,
or DDM bays that you just placed on the loop should be replaced.
b. The model of each DDM on the loop is shown. This tells you at least one
model of DDM that can be used on the loop. There may be other DDM
models with the same characteristics that can also be used on the same
loop.
Continue with the next step.
4. Determine if you need to replace individual DDMs or an SSA DASD drawer or
DDM bay.
Do you need to remove an entire SSA DASD drawer or DDM bay?
v Yes, go to step 6.
v No, go to step 5.
5. Remove any DDMs with the wrong characteristics and replace them with the
correct DDMs.
After this, to determine if there are any other problems, go to “MAP 3500: Verifying
an SSA DASD Drawer Repair” on page 279.
6. Remove the entire SSA DASD drawer or DDM bay that was just installed.
Press F3 until the Main Service Menu is displayed, then select:
Install/Remove Menu
Device Drawer (DDM Bay or 7133 Drawer) Menu
Remove Device Drawers
Select the SSA DASD drawer or DDM bay you are removing and follow the
instructions on the service terminal.
MAP 3630: Isolating an SSA Device Card/DRAM Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
This MAP helps you isolate between a problem with the SSA device card or a
problem with both of its DRAM modules.
v 2105 Model Exx/Fxx
Isolation
1. Go to the rear of the 2105 Model Exx/Fxx and remove the failing SSA device
card, see ″SSA Device Card DRAM Module Removal and Replacement,
Cluster Bay (E10/E20)″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Verify that both DRAM 0 and 1 modules are installed correctly on the SSA
device card, see ″SSA Device Card DRAM Module Removal and Replacement,
Cluster Bay (E10/E20)″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
Reinstall the SSA device card and verify the repair.
Is the SSA device card or its DRAM modules still failing?
v Yes, remove the failing SSA device card, then go to step 2.
v No, the problem is resolved. Go to step 5 on page 302.
2. Remove the failing SSA device card from the 2105 Model Exx/Fxx. Get a new
SSA device card and install the DRAM modules from the original card onto it,
see ″SSA Device Card DRAM Module Removal and Replacement, Cluster Bay
(E10/E20)″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume
2.
Reinstall the new SSA device card, with the original DRAM modules, and verify
the repair.
Is the SSA device card or its DRAM modules still failing?
v Yes, remove the failing SSA device card, then go to step 3.
v No, the problem is resolved. Go to step 5 on page 302.
3. Remove the failing SSA device card from the 2105 Model Exx/Fxx. Install new
DRAM modules 1 and 2 onto the original SSA device card, see ″SSA Device
Card DRAM Module Removal and Replacement, Cluster Bay (E10/E20)″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
Reinstall the original SSA device card, with the new DRAM modules, and verify
the repair.
Is the SSA device card or its DRAM modules still failing?
v Yes, remove the failing SSA device card, then go to step 4.
v No, the problem is resolved. Go to step 5.
4. Remove the failing SSA device card from the 2105 Model Exx/Fxx. Remove the
new DRAM modules from the original SSA device card and install them onto the
new SSA device card, see ″SSA Device Card DRAM Module Removal and
Replacement, Cluster Bay (E10/E20)″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
Reinstall the new SSA device card, with the new DRAM modules, and verify the
repair.
Is the SSA device card or its DRAM modules still failing?
v Yes, seek technical support.
v No, the problem is resolved. Go to step 5.
5. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
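The four isolation steps above try each combination of original and new parts in a fixed order. A minimal sketch of that sequence follows; the names `SWAP_SEQUENCE` and `isolate` are hypothetical, and `verify` stands in for the manual repair verification, which in practice is run from the service terminal.

```python
# Order of card/DRAM combinations tried by steps 1 through 4.
SWAP_SEQUENCE = [
    ("original card", "original DRAM"),  # step 1: reseat and verify
    ("new card", "original DRAM"),       # step 2
    ("original card", "new DRAM"),       # step 3
    ("new card", "new DRAM"),            # step 4
]

def isolate(verify):
    """Return (step, card, dram) for the first combination that passes.

    verify(card, dram) -- assumed callable; returns True when the
                          repair verification runs without error.
    Returns None when every combination fails (seek technical support).
    """
    for step, (card, dram) in enumerate(SWAP_SEQUENCE, start=1):
        if verify(card, dram):
            return step, card, dram
    return None
```

For instance, if only the card is bad, the sketch stops at step 2 with the new card and the original DRAM modules.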
MAP 3640: Other Cluster Fenced - Unable to Verify SSA Loop
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
You are connected to one cluster and are attempting to verify a repair on an SSA
Loop. For this repair verification, a test must be run on both clusters. When
verification was run, it failed because the alternate cluster was fenced. There are
two situations that will cause this:
1. There is a problem on the alternate cluster that needs to be resolved before
verifying an SSA repair.
2. The failure on the SSA loop caused the alternate cluster to fence. With this
condition, the alternate cluster needs to be powered off and then on to clear the
fence.
Isolation
1. Examine the other problems to see whether any that need to be repaired are
not SSA loop problems.
a. Go to the list of other problems.
From the service terminal Main Service Menu, select:
Repair Menu
Show/Repair Problems Needing Repair
b. Look for any problem whose ESC does NOT equal 12xx, Cxxx, Dxxx, or
Exxx.
Are there any problems other than the above ESCs?
v Yes, the fence of the other cluster was probably caused by a different
problem than the SSA loop problem you were repairing. Repair those
problems first, then return to the SSA loop problems. Continue with the next
step.
v No, the fence of the other cluster was caused by a loop problem. Go to step 3 to
reset the other cluster fence before continuing to repair the SSA loop.
2. Repair non-SSA loop problems before returning to the repair of this SSA loop
problem.
a. Repair the problems whose ESC does NOT equal 12xx, Cxxx, Dxxx, or
Exxx.
b. When you have repaired all the non-SSA loop problems, return to the SSA
loop problem you were repairing. Follow the instructions for that problem.
3. This step will quiesce and then power off the alternate cluster; the following step
will power it on again.
a. Return to the service terminal and press F3 until the Main Service Menu is
displayed. From the Main Service Menu, select:
Repair Menu
Alternate Cluster Repair Menu
Quiesce the Alternate Cluster
Wait for processing to complete.
Select: Make resources not available for customer use.
Wait for: Quiesce was successful.
b. To power off the alternate cluster, press F3 once.
From the service terminal Alternate Cluster Repair Menu, select:
Power Off the Alternate Cluster
Power Off the cluster now.
Wait for: The cluster has been successfully powered off.
Continue with the next step.
4. To power on the alternate cluster, press F3 once.
From the service terminal Alternate Cluster Repair Menu, select:
Power On the Alternate Cluster
Power On the cluster now
Wait for: The alternate cluster has been powered on.
Wait for the Ready light to be turned on when the IML is complete.
Continue with the next step.
5. Return to the problem you were originally working on. You will now be able to
complete it.
Return to the service terminal and press F3 until the Main Service Menu is
displayed.
From the service terminal Main Service Menu, select:
Repair Menu
Show/Repair Problems Needing Repair
Select the original problem on which you were working.
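The ESC filter used in steps 1 and 2 can be sketched as a pattern match. This sketch assumes that an ESC is a four-character hexadecimal code and that each "x" in 12xx, Cxxx, Dxxx, and Exxx stands for any hexadecimal digit; the function name `non_ssa_loop_problems` is hypothetical.

```python
import re

# ESC patterns treated as SSA loop problems in this MAP:
# 12xx, Cxxx, Dxxx, Exxx (x assumed to be any hexadecimal digit).
_SSA_LOOP_ESC = re.compile(r"^(12[0-9A-F]{2}|[CDE][0-9A-F]{3})$",
                           re.IGNORECASE)

def non_ssa_loop_problems(escs):
    """Return the ESCs that do NOT match an SSA loop pattern.

    A non-empty result suggests the fence was caused by a different
    problem, which should be repaired first (step 2).
    """
    return [esc for esc in escs if not _SSA_LOOP_ESC.match(esc)]
```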
MAP 3650: Wrong, Missing, or Failing Bypass Card
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
In an SSA DASD drawer, where a bypass card should be plugged, one of the
following conditions is present:
v A different kind of card is plugged
v There is no card in that location
v The bypass card in that location is failing
v The controller card in that DDM bay is failing
v The DDM bay backplane is failing
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
1. Locate the bypass card listed under Possible FRUs to Replace on the service
terminal. See ″DDM Bay, Component Physical Location Codes″, ″SSA DASD
Drawer Component Physical Location Codes, Model 020 Drawer″, and ″SSA
DASD Drawer Component Physical Location Codes, Model 040 Drawer″, all in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
Is there a card plugged into that location?
v Yes, continue with the next step.
v No, select the bypass card from the Possible FRUs to Replace list on the service
terminal. Install a bypass card in that location and proceed through the
verification process.
Note: Be sure that the two jumpers on the bypass card are in the correct
positions. See the jumper figures in: ″Bypass Card Removal and
Replacement, 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2 book or ″Bypass and
Passthrough Card Removal and Replacement, DDM Bay″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2 book.
– If the verification ran without error, the problem is resolved. Return
to the service terminal and select Continue Repair Process, to
return the resources to the customer and cancel the problem.
– If the verification failed, continue with any problem displayed by the
verification process.
2. Look at the card(s) plugged into the bypass card position.
Is it a single card with two SSA connectors on it?
v Yes, there is a bypass card in this position, continue with the next step.
v No, the card in this position is a passthrough card instead of a bypass card.
Select the bypass card from the Possible FRUs to Replace list on the service
terminal. Install a bypass card in that location and proceed through the
verification process.
Note: Be sure that the two jumpers on the bypass card are in the correct
positions. See the jumper figures in: ″Bypass Card Removal and
Replacement, 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2 book.
– If the verification ran without error, the problem is resolved. Return
to the service terminal and select Continue Repair Process, to
return the resources to the customer and cancel the problem.
– If the verification failed, continue with the next step.
3. Select the bypass card from the Possible FRUs to Replace list on the service
terminal. Install a bypass card in that location and proceed through the
verification process.
Note: Be sure that the two jumpers on the bypass card are in the correct
positions. See the jumper figures in: ″Bypass Card Removal and
Replacement, 7133 Model 020/040″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2 book or ″Bypass and
Passthrough Card Removal and Replacement, DDM Bay″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2 book.
v If the verification ran without error, the problem is resolved. Return to
the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
v If the verification failed, continue with the next step.
4. Select the controller card from the Possible FRUs to Replace list on the service
terminal. Install a new controller card in that location and proceed through the
verification process. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2
book.
v If the verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
v If the verification failed, continue with the next step.
5. Select the frame from the Possible FRUs to Replace list on the service terminal.
Install a new frame in that location and proceed through the verification process.
v If the verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
v If the verification failed, continue with any problem displayed by the
verification process.
MAP 3652: Wrong, Missing, or Failing Passthrough Card
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
In an SSA DASD drawer, where a passthrough card should be plugged, one of the
following conditions is present:
v A different kind of card is plugged
v There is no card in that location
v The passthrough card in that location is failing
v The controller card in that DDM bay is failing
v Drawer model SSA DASD DDM bay
Isolation
1. Locate the passthrough card listed under Possible FRUs to Replace on the
service terminal. See ″DDM Bay, Component Physical Location Codes″ in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
Is there a card plugged into that location?
v Yes, continue with the next step.
v No, select the passthrough card from the Possible FRUs to Replace list on
the service terminal. Install a passthrough card in that location and proceed
through the verification process.
– If the verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
– If the verification failed, continue with any problem displayed by the
verification process.
2. Look at the card(s) plugged into the passthrough card position.
Is it a single card with two SSA connectors on it?
v Yes, the card in this position is a bypass card instead of a passthrough card.
Select the passthrough card from the Possible FRUs to Replace list on
the service terminal. Install a passthrough card in that location and proceed
through the verification process.
– If the verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
– If the verification failed, continue with any problem displayed by the
verification process.
v No, there is a passthrough card in this position, continue with the next step.
3. The passthrough card is failing. Select the passthrough card from the Possible
FRUs to Replace list on the service terminal. Install a passthrough card in that
location and proceed through the verification process.
v If the verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
v If the verification failed, continue with the next step.
4. Select the controller card from the Possible FRUs to Replace list on the service
terminal. Install a new controller card in that location and proceed through the
verification process. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
v If the verification failed, call your next level of support.
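Steps 1 and 2 of MAP 3650 and MAP 3652 apply the same physical check: a single card with two SSA connectors is a bypass card, and anything else in that position is a passthrough card. The sketch below restates that check. The function name `diagnose_card_position` and its return strings are hypothetical; the observations are made by eye at the drawer.

```python
def diagnose_card_position(expected, card_present,
                           single_card_two_connectors=False):
    """Classify what is found in a bypass or passthrough card position.

    expected                   -- "bypass" or "passthrough"
    card_present               -- True if any card is plugged in
    single_card_two_connectors -- True if it is a single card with
                                  two SSA connectors (a bypass card)
    """
    if expected not in ("bypass", "passthrough"):
        raise ValueError("expected must be 'bypass' or 'passthrough'")
    if not card_present:
        return "missing"
    installed = "bypass" if single_card_two_connectors else "passthrough"
    if installed != expected:
        return "wrong card type"
    # Right kind of card is installed, so the card itself is suspect.
    return "suspect failing card"
```

In each case the repair action is the same (select the expected card from the Possible FRUs to Replace list and install it); the classification only tells you which condition from the Description applies.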
MAP 3654: Bypass Card Jumpers Wrong
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
Description
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
One of the following conditions is present:
v A bypass card has one or both jumpers in the wrong position
v A controller card in that DDM bay is failing
v Drawer models, SSA DASD Model 020 or 040 drawer, or SSA DASD DDM bay
Isolation
1. Locate the bypass card listed under Possible FRUs to Replace on the service
terminal. See ″DDM Bay, Component Physical Location Codes″, ″SSA DASD
Drawer Component Physical Location Codes, Model 020 Drawer″, and ″SSA
DASD Drawer Component Physical Location Codes, Model 040 Drawer″, all in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3.
2. Select the bypass card from the Possible FRUs to Replace list on the service
terminal.
3. Remove the bypass card. Verify that the two jumpers on the bypass card are in
the correct positions. See the ″SSA DASD Model 020 and 040 Drawer Bypass
Card Jumper Settings″ figure in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Reinstall the bypass card and verify the repair:
v If the verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
v If the verification failed, continue with the next step.
4. Select the controller card from the Possible FRUs to Replace list on the service
terminal. Install a new controller card in that location and proceed through the
verification process. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification ran without error, the problem is resolved. Return to the
service terminal and select Continue Repair Process, to return the
resources to the customer and cancel the problem.
v If the verification failed, continue with any problem displayed by the
verification process.
MAP 3656: 20 MB SSA Cable Installed Where 40 MB Cable Expected
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
One of the following conditions exists:
v The SSA cable may be unplugged.
v A 20 MB SSA cable is plugged where a 40 MB SSA cable should be used.
Note: 20 MB SSA cables are grey and 40 MB SSA cables are blue.
v The bypass card at that location has failed.
v The controller card in that DDM bay has failed.
Drawer models: SSA DASD Model 040 drawer or SSA DASD DDM bay.
Isolation
1. Locate the bypass card listed under Possible FRUs to Replace on the service
terminal. See ″DDM Bay, Component Physical Location Codes″ in chapter 7 of
the Enterprise Storage Server Service Guide, Volume 3 and ″SSA DASD
Drawer Component Physical Location Codes, Model 040 Drawer″ in chapter 7
of the Enterprise Storage Server Service Guide, Volume 3. Determine the color
of the SSA cables connected to the bypass card.
Are both of the cables blue?
v Yes, continue with the next step.
v No, the wrong type of SSA cable is installed. Select the bypass card from
the Possible FRUs to Replace list on the service terminal. Do not replace the
bypass card. Replace any grey SSA cables with blue SSA cables. Proceed
through the verification process.
– If the verification ran without error, the problem is resolved. Go to step 8
on page 309.
– If the verification failed, continue with any problem displayed by the
verification process.
2. Are both of the SSA cables connected to the bypass card?
v Yes, continue with the next step.
v No, connect the cable that is not connected. Select the cable from the
Possible FRUs to Replace list on the service terminal. Do not replace the
cable. Proceed through the verification process.
– If the verification ran without error, the problem is resolved. Go to step 8.
– If the verification failed, continue with any problem displayed by the
verification process.
3. Select the bypass card from the Possible FRUs to Replace list on the service
terminal. Do not remove or replace the bypass card at this time.
4. Remove the two SSA cables from the bypass card and inspect the pins in each
connector.
Are there three pins in each connector?
v Yes, continue with the next step.
v No, replace the SSA cable with fewer than three pins. Connect the SSA cables
and continue through the verification process without replacing any other
FRUs.
– If the verification ran without error, the problem is resolved. Go to step 8.
– If the verification failed, continue with any problem displayed by the
verification process.
5. Inspect the SSA connectors for bent pins.
Do any of the pins need to be straightened?
v Yes, straighten the pins and replace the cables. Go through the verification
process without replacing any FRUs.
– If the verification ran without error, the problem is resolved. Go to step 8.
– If the verification failed, continue with any problem displayed by the
verification process.
v No, continue with the next step.
6. The bypass card may have a problem that causes it to report the wrong cable
speed. Replace the bypass card, then proceed through the verification process.
v If the verification ran without error, the problem is resolved. Go to step 8.
v If the verification failed, continue with the next step.
7. Select the controller card from the Possible FRUs to Replace list on the service
terminal. Install a new controller card in that location and proceed through the
verification process. See ″Controller Card Removal and Replacement, DDM
Bay″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v If the verification ran without error, the problem is resolved. Continue with the
next step.
v If the verification failed, continue with any problem displayed by the
verification process.
8. Return to the service terminal and select Continue Repair Process, to return
the resources to the customer and cancel the problem.
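The isolation steps above check the cheapest causes first (cable color, connection, pins) before any FRU is replaced. The following is a minimal sketch of that ordering only. It is not part of the service terminal software, and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CableInspection:
    """Hypothetical record of what was observed at the bypass card."""
    both_cables_blue: bool       # step 1: 40 MB cables are blue, 20 MB are grey
    both_cables_connected: bool  # step 2
    three_pins_each: bool        # step 4
    bent_pins: bool              # step 5


def next_action(obs: CableInspection) -> str:
    """Return the first corrective action that the steps of MAP 3656 call for."""
    if not obs.both_cables_blue:
        return "replace grey SSA cables with blue SSA cables"
    if not obs.both_cables_connected:
        return "connect the cable that is not connected"
    if not obs.three_pins_each:
        return "replace the SSA cable with fewer than three pins"
    if obs.bent_pins:
        return "straighten the pins"
    # Steps 6 and 7: replace FRUs in order, verifying after each one.
    return "replace the bypass card, then the controller card if verification fails"


print(next_action(CableInspection(True, True, True, False)))
```

In every branch the service terminal verification process still decides whether the problem is resolved; the sketch only shows which action comes first.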
MAP 3680: Isolating a Two DDMs Detect Over-Temperature Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The 2105 requires that the temperature of the room air entering it must not exceed
32°C (89.6°F). With a room temperature of less than 32°C (89.6°F), the base
casting temperature of the DDMs should not exceed 50°C (122°F). You have been
directed to this MAP because the base casting temperature on two DDMs has
exceeded 50°C (122°F).
This may be caused by:
v The air temperature surrounding the DDMs exceeding the maximum allowed
temperature.
v The air flow to the DDMs being restricted.
v The temperature sensing circuits on the DDMs being faulty.
v The DDMs being faulty and generating too much heat.
The repair strategy of this MAP is to first determine if the air supply to the DDMs is
too warm or is restricted. An over-temperature condition is not reported until two or
more DDMs have sensed an over-temperature. It is possible that one of the two
drives has been failing for some time and that the second DDM has just failed. If
the over-temperature condition cannot be corrected while examining the air supply,
you will be directed to replace the DDMs one at a time.
The DDMs reporting the over-temperature conditions are in DDM bays or SSA
DASD Model 040 drawers.
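The Celsius and Fahrenheit limits quoted in the Description can be cross-checked with the standard conversion formula. The sketch below is illustrative only. The constant and function names are hypothetical, and treating a reading equal to the limit as acceptable is an assumption based on the wording "must not exceed".

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


# Limits quoted in the Description of this MAP.
MAX_INTAKE_AIR_C = 32.0    # room air entering the 2105
MAX_BASE_CASTING_C = 50.0  # DDM base casting temperature


def over_limit(reading_c: float, limit_c: float) -> bool:
    """True when a reading is above the limit (equal to the limit is allowed)."""
    return reading_c > limit_c


for limit_c in (MAX_INTAKE_AIR_C, MAX_BASE_CASTING_C):
    print(f"{limit_c:.0f} C = {celsius_to_fahrenheit(limit_c):.1f} F")
```

The two printed values correspond to the 89.6°F and 122°F figures given in the text.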
Isolation
1. Record the Problem ID of this problem.
Look at the time stamp of the last occurrence. If it is more than 30 minutes old,
the problem is resolved and can be closed.
Was the last occurrence more than 30 minutes ago?
v Yes, go to step 19 on page 313.
v No, continue with the next step.
2. Determine the approximate temperature of the air at the front and rear of each
2105 Model Exx/Fxx and Expansion rack. Also check the approximate
temperature at the front (only) of 2105 Model 100 racks.
Note: The 2105 Model 100 racks contain 7133 drawers and exhaust air
through their rear covers, thus the air there will be warmer than intake
air.
Does the air temperature exceed 32°C (89.6°F)?
v Yes, contact the customer and have the temperature of the room lowered,
then go to step 16 on page 312.
v No, continue with the next step.
3. Look for other problems with the Failing Resource = rsuplnrsnsxxx or
rslplnrsnsxxx or ssaxxx.
Are there any problems as described above?
v Yes, repair all of these problems; this may lower the DDM temperatures.
Then return to this MAP and go to step 16 on page 312.
v No, continue with the next step.
4. Locate the DDMs shown in the Possible FRUs to Replace section of the
problem detail or your list from the temperature utility. Note the FRU Location
for the FRUs and refer to ″Locating a DDM Bay or SSA DASD Model 020 or
040 Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server
Service Guide, Volume 3.
5. Are one or both of these DDMs in DDM Bays?
v Yes, continue with the next step.
v No, go to step 8.
6. Open the rack cover adjacent to those drive locations and check whether
anything is interfering with the air flow between the DDMs and the covers.
Did you find anything interfering with the air flow to those drives?
v Yes, remove the interference to the air flow, then go to step 16 on page 312.
v No, continue with the next step.
7. For the DDMs that are in DDM Bays, ensure that the fans at the top of the
rack are all turning.
Note: You can hold a strip of paper over each of the fans to see whether the
fan is turning. For the location of these fans see ″2105 Model
Exx/Fxx and Expansion Enclosure Storage Cage Fan (Top) Location
Codes″ in chapter 7 of the Enterprise Storage Server Service Guide,
Volume 3.
Are all the fans turning?
v Yes, continue with the next step.
v No, replace the fans that are not turning, then go to step 16 on page 312.
8. Are one or more of these DDMs in 7133 drawers?
v Yes, continue with the next step.
v No, go to step 13 on page 312.
9. Go to the front of the rack(s) containing the 7133 drawer(s) and see if anything
is interfering with the air flow between any drawer and the front cover.
Was anything interfering with the air flow to/from that drawer?
v Yes, remove whatever is interfering with the air flow, then go to step 16 on
page 312.
v No, continue with the next step.
10. Go to the rear of the rack(s) containing the 7133 drawer(s) and see if anything
is interfering with the air flow between any drawer and the rear cover.
Was anything interfering with the air flow to/from that drawer?
v Yes, remove whatever is interfering with the air flow, then go to step 16 on
page 312.
v No, continue with the next step.
11. Go to the fans in the front of each of the 7133 drawers containing a listed
DDM, and pull them out, one at a time. When one fan is pulled out, the other
two fans should increase in speed. You can hear the speed increase of the
fans. See the 7133 locations chapter for the locations of the fans in the 7133
drawer.
Was there any fan for which the speed of the other two fans did NOT increase
when you pulled it out?
v Yes, replace that fan, then go to step 16 on page 312.
v No, continue with the next step.
12. This is a complex problem. Call your next level of support.
13. Have you already replaced the first of the two DDMs displayed on the service
terminal as Possible FRUs to Replace?
v Yes, go to step 15 and replace the second DDM displayed on the service
terminal.
v No, go to the next step and replace the first DDM displayed on the service
terminal.
14. Replace the first of the two DDMs displayed on the service terminal as
Possible FRUs to Replace, then verify the repair.
Did repair verification run without error?
v Yes, go to step 16 to determine if the over-temperature problem is resolved.
v No, repair the problems from the repair verification.
15. Replace the other DDM displayed on the service terminal under Possible FRUs
to Replace, then verify the repair.
Note: The service terminal will determine if the second DDM being replaced is
in the same array as the first DDM. If both DDMs are in the same array,
the service terminal will instruct you to wait for sparing to complete.
When sparing for the first DDM replacement completes, the second
DDM can be replaced.
Did repair verification run without error?
v Yes, go to the next step to determine if the over-temperature problem is
resolved.
v No, repair the problems from the repair verification.
16. Wait 15 minutes after the last action was performed that may have decreased
the DDM Temperatures. At the end of this time, press F3 until the Main
Service Menu is displayed.
From the service terminal Main Service Menu, select:
Utility Menu
Machine Test Menu
SSA Devices Temperature Test
At the top of the display there will be a Maximum Temperature = xx°C (yy°F).
Is the Maximum Temperature greater than 40°C?
v Yes, continue with the next step.
v No, the problem is resolved. Go to step 19 on page 313.
17. Look down the display and record the Locations of all of the DDMs whose
temperature is greater than 40°C. Then continue with the next step.
Is there only one DDM Location on your list?
v Yes, continue with the next step.
v No, go back to step 4 on page 311 and use the FRU Location List.
18. Replace the DDM. Press F3 until the Main Service Menu is displayed:
From the service terminal Main Service Menu, select:
Repair a FRU
DDM Bay or 7133 Drawer
Select DDM Bay or 7133 Drawer that contains the DDM.
Select the DDM you wish to replace.
Follow the instructions to replace the DDM, then go to step 16 on page 312.
19. Close the problem that you have just resolved, referencing the Problem ID
recorded in step 1 on page 310.
Press F3 until the Main Service Menu is displayed.
From the service terminal Main Service Menu, select:
Repair Menu
Close a Previously Repaired Problem
Select the Problem ID you recorded earlier. Follow the service terminal
instructions to see if all problems are resolved.
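Steps 16 through 18 of MAP 3680 form a recheck loop driven by the SSA Devices Temperature Test. As a minimal sketch under stated assumptions (the readings are supplied as a mapping from a DDM location to degrees Celsius, and the location strings and function name are hypothetical, not service terminal output), the branching can be written as:

```python
from typing import Mapping

RECHECK_LIMIT_C = 40.0  # threshold used in steps 16 and 17


def recheck(readings_c: Mapping[str, float]) -> str:
    """Pick the next step of MAP 3680 from temperature test readings."""
    # Step 17: list every DDM location that reads above 40 degrees C.
    hot = sorted(loc for loc, temp in readings_c.items() if temp > RECHECK_LIMIT_C)
    if not hot:
        # Step 16, "No" branch: the maximum temperature is not above 40 C.
        return "step 19: close the problem"
    if len(hot) == 1:
        return f"step 18: replace the DDM at {hot[0]}"
    return "step 4: repeat the air-flow checks for " + ", ".join(hot)


print(recheck({"bay1-ddm3": 38.0, "bay1-ddm4": 42.5}))
```

The 15-minute wait in step 16 is not modeled; the sketch assumes the readings were taken after that wait.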
MAP 3685: Isolating a Multiple DDMs Detect Over-Temperature Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the SSA DASD drawer unless instructed to do so.
This MAP is an entry point to the MAPs for the SSA DASD drawer. If you are not
familiar with these MAPs, read “Using the SSA DASD drawer Maintenance Analysis
Procedures (MAPs)” on page 108 first.
Description
The 2105 requires that the temperature of the room air entering it must not exceed
32°C (89.6°F). With a room temperature of less than 32°C (89.6°F), the base
casting temperature of the DDMs should not exceed 60°C (140°F). You have been
directed to this MAP because the base casting temperature on more than two
DDMs has exceeded 60°C (140°F). This may be caused by the air temperature
surrounding the 2105 exceeding the maximum allowed temperature or something
restricting the air flow to the DDMs.
The DDMs reporting the over-temperature conditions are in DDM bays or SSA
DASD Model 040 drawers.
Isolation
1. Record the Problem ID of this problem.
Look at the time stamp of the last occurrence. If it is more than 30 minutes old,
the problem is resolved and can be closed.
Was the last occurrence more than 30 minutes ago?
v Yes, go to step 16 on page 315.
v No, continue with the next step.
2. Determine the approximate temperature of the air at the front and rear of
each 2105 Model Exx/Fxx and Expansion rack. Also check the approximate
temperature at the front (only) of 2105 Model 100 racks.
Note: The 2105 Model 100 racks contain 7133 drawers and exhaust air
through their rear covers, thus the air there will be warmer than intake
air.
Does the air temperature exceed 32°C (89.6°F)?
v Yes, contact the customer and have the temperature of the room lowered,
then go to step 13 on page 315.
v No, continue with the next step.
3. Look for other problems with the Failing Resource = rsuplnrsnsxxx or
rslplnrsnsxxx or ssaxxx.
Are there any problems as described above?
v Yes, repair all of these problems; this may lower the DDM temperatures.
Then return to this MAP and go to step 13 on page 315.
v No, continue with the next step.
4. Locate the DDMs shown in the Possible FRUs to Replace section of the
problem detail or your list from the temperature utility. Note the FRU Location
for the FRUs and refer to ″Locating a DDM Bay or SSA DASD Model 020 or
040 Drawer in a 2105 Rack″ in chapter 7 of the Enterprise Storage Server
Service Guide, Volume 3.
5. Are one or more of these DDMs in DDM Bays?
v Yes, continue with the next step.
v No, go to step 8.
6. Open the rack cover adjacent to those drive locations and check whether
anything is interfering with the air flow between the DDMs and the covers.
Did you find anything interfering with the air flow to those drives?
v Yes, remove the interference to the air flow, then go to step 13 on page 315.
v No, continue with the next step.
7. For the DDMs that are in DDM Bays, ensure that the fans at the top of the
rack are all turning.
Note: You can hold a strip of paper over each of the fans to see whether the
fan is turning. For the location of these fans see ″2105 Model
Exx/Fxx and Expansion Enclosure Storage Cage Fan (Top) Location
Codes″ in chapter 7 of the Enterprise Storage Server Service Guide,
Volume 3.
Are all the fans turning?
v Yes, continue with the next step.
v No, replace the fans that are not turning, then go to step 13 on page 315.
8. Are one or more of these DDMs in 7133 drawers?
v Yes, continue with the next step.
v No, go to step 12 on page 315.
9. Go to the front of the rack(s) containing the 7133 drawer(s) and see if anything
is interfering with the air flow between any drawer and the front cover.
Was anything interfering with the air flow to/from that drawer?
v Yes, remove whatever is interfering with the air flow, then go to step 13 on
page 315.
v No, continue with the next step.
10. Go to the rear of the rack(s) containing the 7133 drawer(s) and see if anything
is interfering with the air flow between any drawer and the rear cover.
Was anything interfering with the air flow to/from that drawer?
v Yes, remove whatever is interfering with the air flow, then go to step 13.
v No, continue with the next step.
11. Go to the fans in the front of each of the 7133 drawers containing a listed
DDM, and pull them out, one at a time. When one fan is pulled out, the other
two fans should increase in speed. You can hear the speed increase of the
fans. See the 7133 locations chapter for the locations of the fans in the 7133
drawer.
Was there any fan for which the speed of the other two fans did NOT increase
when you pulled it out?
v Yes, replace that fan, then go to step 13.
v No, continue with the next step.
12. This is a complex problem. Call your next level of support.
13. Wait 15 minutes after the last action was performed that may have decreased
the DDM Temperatures. At the end of this time, press F3 until the Main
Service Menu is displayed.
From the service terminal Main Service Menu, select:
Utility Menu
Machine Test Menu
SSA Devices Temperature Test
At the top of the display there will be a Maximum Temperature = xx°C (yy°F).
Is the Maximum Temperature greater than 40°C?
v Yes, continue with the next step.
v No, the problem is resolved. Go to step 16.
14. Look down the display and record the Locations of all of the DDMs whose
temperature is greater than 40°C. Then continue with the next step.
Is there only one DDM Location on your list?
v Yes, continue with the next step.
v No, go back to step 4 on page 314 using your new list of DDM locations.
15. Replace the DDM. Press F3 until the Main Service Menu is displayed:
From the service terminal Main Service Menu, select:
Repair a FRU
DDM Bay or 7133 Drawer
Select DDM Bay or 7133 Drawer that contains the DDM.
Select the DDM you wish to replace.
Follow the instructions to replace the DDM, then go to step 13.
16. Close the problem that you have just resolved, referencing the Problem ID
recorded in step 1 on page 313.
Press F3 until the Main Service Menu is displayed.
From the service terminal Main Service Menu, select:
Repair Menu
Close a Previously Repaired Problem
Select the Problem ID you recorded earlier. Follow the service terminal
instructions to see if all problems are resolved.
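Step 1 of MAP 3685 treats the problem as resolved when the last occurrence is more than 30 minutes old. The following is a small sketch of that comparison, assuming the time stamp has already been read from the problem log into a `datetime` value. The function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(minutes=30)  # from step 1


def can_close(last_occurrence: datetime, now: datetime) -> bool:
    """True when the last occurrence is more than 30 minutes old."""
    return now - last_occurrence > STALE_AFTER


stamp = datetime(2000, 12, 1, 10, 0)
print(can_close(stamp, stamp + timedelta(minutes=45)))
```

A time stamp that is exactly 30 minutes old is not treated as resolved here, because the step asks for "more than 30 minutes".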
MAPs 4XXX: Cluster Bay Isolation Procedures
Procedures in the MAP 4XXX group of the Isolate chapter cover the cluster bay
area of the 2105 Model Exx/Fxx unit.
MAP 4020: Performing the SCSI Hard Drive Build Process
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: The FRUs and cables in this procedure are ESD-sensitive. Always wear
an ESD wrist strap during this isolation procedure. Follow the ESD procedures in
″Working with ESD-Sensitive Parts″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Description
This procedure is used:
v When the cluster is down and cannot boot from the SCSI hard drive.
v When the cluster is up and a problem with the SCSI interface, SCSI hard drive,
or SCSI CD-ROM is suspected.
v To test the CD-ROM Drive and the SCSI Hard Drive in the same way that a
diagnostic would.
v To load AIX and 2105 Model Exx/Fxx code on a new SCSI Hard Drive or when
the original code image was corrupted.
Note: Various types of CD-ROMs may be required by this procedure:
1. Two 2105 O/S VER.XXX CDs (AIX CD-ROM) (Volumes 1 and 2)
2. 2105 O/S Update CD-ROM (AIX PTF CD-ROM)
3. 2105 LIC - Licensed Internal Code CD-ROM (Functional Microcode)
Procedure
1. This MAP isolates a problem with the SCSI interface, SCSI hard drive, or SCSI
CD-ROM drive that may or may not prevent the cluster from coming ready.
2. Verify that the service terminal is connected to the operating cluster bay. See
″Service Terminal Setup″ in chapter 8 of the Enterprise Storage Server Service
Guide, Volume 3.
3. Quiesce the failing cluster bay using the alternate cluster repair menu options
from the operating cluster bay.
From the service terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair
Quiesce the Alternate Cluster
4. Make configuration diskette(s) from the operating cluster bay. (Multiple
diskettes will be needed if the configuration is large.)
Note: The diskettes must be made for the other cluster bay on this 2105
Model Exx/Fxx. Both cluster bays must be at the same E/C level of
code.
From the service terminal Main Service Menu, select:
Configurations Option Menu
Import/Export Configuration Data Menu
Export Configuration Data via Diskette
Follow the service terminal prompts and insert the diskette when
instructed.
Note: When the diskette(s) are removed, label them with a date and as a
configuration diskette. (If there are multiple configuration diskettes, mark
them in the order they were created.)
5. Make a customization diskette.
Note: The diskettes must be made for the other cluster bay on this 2105
Model Exx/Fxx. Both cluster bays must be at the same E/C level of
code.
From the service terminal Main Service Menu, select:
Utility Menu
Make A Customization Diskette
Follow the service terminal prompts and insert the diskette (new media for
/dev/rfd0) when prompted.
6. Are you using this MAP to replace a SCSI Hard Drive FRU?
v Yes, go to “MAP 4700: Replacing Cluster FRUs” on page 375.
Note: Ensure the SCSI hard drive jumpers were set correctly. See
″CD-ROM, SCSI Hard Drive, and Diskette Drive Removals and
Replacements, Cluster″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
v No, continue with the next step.
7. Are you using this MAP to isolate a problem that prevents the cluster bay from
booting from the SCSI Hard Drive?
v Yes, continue with the next step.
v No, exit this MAP and start the repair over.
8. Was the SCSI hard drive, SCSI CD-ROM drive, or SCSI cable FRU just
replaced?
v Yes, continue with the next step.
v No, go to step 10 on page 318.
9. Ensure the SCSI cable is properly connected to the I/O planar, SCSI hard
drive, and SCSI CD-ROM drive. Ensure the power cable is connected to both
drives. A boot failure may also be caused by a problem with the SCSI interface
termination. There are now two types of SCSI hard drives:
v Drives with internal SCSI terminators that require a SCSI cable without a
terminator block.
v Drives without internal SCSI terminators that require a SCSI cable with an
external terminator block.
Do you know if the SCSI termination is correct?
v Yes, continue with the next step.
v No, reference the SCSI Hard Drive replacement procedure in ″CD-ROM,
SCSI Hard Drive, and Diskette Drive Removals and Replacements, Cluster
(E10/E20)″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2 or the ″CD-ROM, SCSI Hard Drive, and Diskette Drive Removals
and Replacements, Cluster (F10/F20)″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2. Return here and continue with the
next step.
10. Observe the center LED indicator (cluster bay power output) on the front of
each of the three electronics cage power supplies.
Are the LED indicators off?
v Yes, continue with the next step.
v No, use the Alternate Cluster Repair menu options to power off the cluster.
Then continue with the next step.
11. Use the service processor (SP) Card System Management Service (SMS)
utilities to ensure the boot list devices are set to default values. Read and
understand the next three steps before actually doing the procedure. You will
need to move the service terminal connection quickly for the procedure to
work.
12. Power on the failing cluster bay using the Alternate Cluster Repair Menu option
and then immediately go to the next step.
13. Disconnect the service terminal interface cable from the operating cluster bay
and connect it to the S1 port of the failing cluster bay. Logically connect to the
failing cluster bay.
v If the cluster bay failed to power on and display progress codes, go to “MAP
4360: Isolation Using Codes Displayed by the Cluster Operator Panel” on
page 342.
v If the cluster bay began to power on and displayed progress codes, go to
the next step.
14. Watch the operator panel of the cluster bay being serviced. As the cluster bay
powers on, the firmware tests display EXXX progress codes. Keep logically
connecting the service terminal, by repeating Appendix step ″Logically Connect
the Service Terminal to the Cluster Bay″ in chapter 8 of the Enterprise Storage
Server Service Guide, Volume 3 on the service terminal, until progress code
E1FB is displayed. (During the cluster bay power on, the service terminal may
be logically disconnected one or more times.) Immediately look at the service
terminal for the display shown below.
As soon as the word Keyboard is displayed at the bottom of the screen,
immediately press the number 1 key on the service terminal. This will load the
SMS utilities from the service processor.
RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000
RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000
RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000 RS/6000
Memory
====>
Keyboard
15. After the text-based System Management Services starts, the following
screens appear.
System Management Services
1. Display Configuration
2. Multiboot
3. Utilities
4. Select Language
====>
*------*
|X=Exit|
*------*
Select 2. Multiboot, then the next screen appears.
Multiboot
1. Select Software
2. Software Default
3. Install From
4. Select Boot Devices
5. OK Prompt
6. Multiboot Startup <OFF> (or <ON>)
===>
*------*
|X=Exit|
*------*
Select 4. Select Boot Devices, then the next screen appears.
Select Boot Devices
1. Display Current Settings
2. Restore Default Settings
3. Configure 1st Boot Device
4. Configure 2nd Boot Device
5. Configure 3rd Boot Device
6. Configure 4th Boot Device
7. Configure 5th Boot Device
===>
*------*
|X=Exit|
*------*
Select option 2. Restore Default Settings, then continue with the next step.
16. Insert the customization diskette in the failing cluster bay diskette drive.
17. Insert the 2105 O/S VER. X.X.X. volume 1 CD in the failing cluster bay
CD-ROM Drive. Wait until the CD-ROM Drive LED stops blinking, then go to
the next step.
18. Use the X=Exit option four times to return to prior menus and quit SMS. The
SCSI Hard Drive code load process will automatically continue. (Many screen
lines of RS/6000 will be displayed.)
The cluster bay will begin loading code from the 2105 O/S VER. X.X.X. CD
and customization diskette to build the SCSI Hard Drive.
Ignore any error messages that may temporarily display as the status
messages scroll by. The final status screen will inform you if there were any
unexpected errors.
Follow the service terminal instructions for inserting/removing CD-ROMs and
diskettes as follows:
Note: Wait until the CD-ROM drive indicator stops blinking before removing or
inserting another CD-ROM into the drive. You must type 1 and press
enter to start the next action. Do not press the Enter key until you have
completed the instructions on the screen. Any errors occurring during
the load process will display recovery information on the service
terminal SMIT screen.
a. After a few minutes, a screen will instruct you to remove the 2105
Operating System volume 1 CD.
b. Insert the 2105 Operating System volume 2 CD. After the CD
comes ready, type 1 and press enter. This will reboot the cluster bay.
c. When prompted, remove the 2105 Operating System volume 2 CD.
d. Insert the 2105 OS Update CD (AIX PTF CD). After the CD comes
ready, type 1 and press enter.
e. When prompted, remove the 2105 OS Update/PTF CD (AIX PTF
CD).
f. Insert the 2105 LIC CD and the first configuration diskette. AIX will
prompt you for more diskettes if required. After the CD comes ready,
type 1 and press enter.
g. When the process is complete remove the CD and diskette.
h. Type 1 and press enter. This will reboot the cluster bay.
Note: The service terminal logical connection will be lost several times. Keep
logically reconnecting the service terminal so you do not miss seeing
the displayed information.
19. Wait the normal amount of time for the cluster to come ready and then attempt
to log in with the service terminal.
Was the service terminal able to log in to the cluster being repaired?
v Yes, continue with the next step.
v No, go to “MAP 4360: Isolation Using Codes Displayed by the Cluster
Operator Panel” on page 342.
20. Disconnect the service terminal from the S1 port and connect to the S2 port of
the same cluster bay. (The S2 port is once again the service login port after
the customization diskette has been loaded.) Check that you can log in and
display the main menu.
v If this fails, call the next level of support.
v If it works, go to the next step.
21. Connect the service terminal to the cluster bay not being repaired. Use the
Alternate Cluster Repair Menu option to resume the alternate cluster bay. Wait
for the operator panel Cluster Bay Ready Indicator LED to come on and then
go to the next step.
Note: The resume causes the cluster bay to reload the code again.
22. Go to “MAP 1500: Ending a Service Action” on page 68.
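The CD swaps in substeps a through h of step 18 follow a strict remove, insert, confirm pattern. The sketch below encodes that order and checks that the CD-ROM drive never holds two CDs at once. It is an illustration only: the media labels are shortened, the configuration diskettes are left out for brevity, and all names are hypothetical.

```python
from typing import List, Optional, Tuple

# (substep, action, media) from step 18; "confirm" means type 1 and press enter.
SEQUENCE: List[Tuple[str, str, Optional[str]]] = [
    ("a", "remove", "O/S volume 1 CD"),
    ("b", "insert", "O/S volume 2 CD"),
    ("b", "confirm", None),
    ("c", "remove", "O/S volume 2 CD"),
    ("d", "insert", "O/S Update CD"),
    ("d", "confirm", None),
    ("e", "remove", "O/S Update CD"),
    ("f", "insert", "LIC CD"),
    ("f", "confirm", None),
    ("g", "remove", "LIC CD"),
    ("h", "confirm", None),
]


def check_cd_sequence(initial_cd: str = "O/S volume 1 CD") -> bool:
    """True if no CD is inserted into a full drive or removed when absent."""
    in_drive: Optional[str] = initial_cd  # volume 1 was inserted in step 17
    for _substep, action, media in SEQUENCE:
        if action == "insert":
            if in_drive is not None:
                return False
            in_drive = media
        elif action == "remove":
            if in_drive != media:
                return False
            in_drive = None
    return in_drive is None  # the drive is empty when the build finishes


print(check_cd_sequence())
```

Each "confirm" entry corresponds to a point where the procedure says to wait for the CD to come ready before typing 1 and pressing enter.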
MAP 4030: CPI Hardware Version Mismatch
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
There are two versions of the CPI FRUs (IA card and host bay card). The two
versions are not compatible and must not be mixed within this 2105 Model Exx/Fxx.
Isolation
Two incompatible versions of the CPI FRUs have been detected. The FRU listed in
the problem log is not compatible with the version defined for this 2105 Model
Exx/Fxx.
Replace the CPI FRU in the problem log with a valid part numbered FRU.
Reference the parts manual to determine the valid part numbers for each version.
Use the service terminal to display the CPI version defined for this 2105 Model
Exx/Fxx.
MAP 4040: Entry MAP for CPI Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A CPI error has generated a problem log that is ready for repair. The error recovery
code has fenced (removed from customer use) a 4-slot bay, a cluster, or both a 4-slot
bay and a cluster.
There are four CPI diagnostic tests:
v IOA Test, tests the I/O Attachment Card in the cluster.
v IOA to Host Bay Planar Test, tests the interface between the I/O Attachment Card
in the cluster and the Host Bay Planar in the 4-slot bay.
v Host Bay Planar Test, tests the Host Bay Planar.
v Host Bay Planar PCI Bus Test, tests the PCI bus section of the Host Bay Planar
which is used for cluster to cluster communication. It is the common logic
between the CPI interface to each cluster. This test first uses the cluster to
cluster Ethernet communications to set up registers in both clusters before testing
the cluster to cluster CPI communications.
There are four conditions when the CPI diagnostics are run. These are listed in the
table below.
Table 23. CPI Diagnostics Overview
CPI Test                       Two Cluster IML,    Resume Cluster,     Resume Host Bay,    Resume Host Bay,
                               2105 Model          Host Bay            Both Clusters       One Cluster
                               Exx/Fxx Power On    Available, Fenced   Available           Fenced or
                                                   or Quiesced                             Quiesced
IOA Test                       Yes                 Yes                 No                  No
IOA to Host Bay Planar Test    Yes                 Yes                 Yes                 Yes
Host Bay Planar Test           Yes                 No                  Yes                 Yes
Host Bay Planar PCI Bus Test   Yes                 No                  Yes                 No
Isolation
1. Write down each FRU Name and FRU Location Description listed in the
problem log.
2. Write down the time stamp in the Last Occurrence field. This field is updated
with a new time stamp if the error is detected again during the isolation
procedures. CPI diagnostics will create a new problem log instead of updating a
problem log created by the functional code and customer activity.
v If only one FRU is listed, go to one of the following:
– For cluster FRUs, go to “MAP 4060: Replacement of Cluster FRUs for CPI
Problems” on page 326.
– For 4-slot bay FRUs, go to “MAP 4070: Replacement of Host Bay FRUs
for CPI Problems” on page 327.
Note: There is normally no need to use the diagnostics to recreate a CPI
error if only one FRU is listed. If you choose to determine if the CPI
error is still failing and can be detected by CPI diagnostics then go to
the next step in this procedure.
v If more than one FRU is listed, the CPI error can be isolated to the failing
FRU if the CPI diagnostics can detect the error. Running the diagnostics to
isolate the error takes more time than replacing the listed FRUs with no
further isolation. Do one of the following:
– To isolate to the failing FRU, go to “MAP 4050: Isolating a CPI Problem”.
– To replace FRUs with no isolation, go to one of the following:
- “MAP 4060: Replacement of Cluster FRUs for CPI Problems” on
page 326
- “MAP 4070: Replacement of Host Bay FRUs for CPI Problems” on
page 327
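The routing in step 2 above can be summarized as a short decision function. This is a minimal sketch, assuming each FRU in the problem log has already been classified by hand as a cluster FRU or a host bay FRU. The function name next_map and the strings "cluster" and "host_bay" are hypothetical and exist only for this illustration.

```python
def next_map(fru_kinds, isolate=False):
    """Pick the next MAP from the FRU kinds listed in the problem log.

    fru_kinds: list of "cluster" or "host_bay", one entry per listed FRU.
    isolate:   True if you choose to run diagnostics to isolate the failing
               FRU (only meaningful when more than one FRU is listed).
    """
    kinds = set(fru_kinds)
    if not kinds or not kinds <= {"cluster", "host_bay"}:
        raise ValueError("expected one or more 'cluster' or 'host_bay' FRUs")
    if len(fru_kinds) > 1 and isolate:
        return ["MAP 4050"]
    maps = []
    if "cluster" in kinds:
        maps.append("MAP 4060")   # Replacement of Cluster FRUs
    if "host_bay" in kinds:
        maps.append("MAP 4070")   # Replacement of Host Bay FRUs
    return maps
```

The sketch returns a list because a problem log with both cluster and host bay FRUs can lead to either replacement MAP.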
MAP 4050: Isolating a CPI Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The CPI problem can be solid or intermittent. A solid CPI problem will still be
present during the repair and will be detected by the CPI diagnostics. An
intermittent CPI problem most likely will not be detected by the diagnostics.
If the CPI diagnostics detect a failure, a new problem log may be created instead of
updating the original problem log. The CPI problem with multiple FRUs can be
isolated to the failing FRU by replacing one FRU at a time. Running the diagnostics
after each FRU replacement will then identify if the failure has been repaired.
If the diagnostics cannot detect an intermittent failure, the CPI problem can instead
be isolated by replacing one FRU at a time. After replacing the FRU, the original
problem is closed which returns the CPI resource to customer use. If during
customer use a new problem log is created, the next FRU should be replaced.
If the diagnostics cannot detect an intermittent failure, and there is not time to do
isolation, then all the FRUs can be replaced together.
The conditions needed to run the CPI diagnostics depend on the FRUs to be
tested. The CPI diagnostics cannot run concurrently with customer use. They are only
run during 2105 Model Exx/Fxx power on, cluster power on, or cluster or host bay
quiesce/resumes.
v To test all the possible CPI FRUs when the customer is not using the 2105
Model Exx/Fxx, power off and then power on the 2105 Model Exx/Fxx. This may
take up to 30 minutes depending on the installed features and configuration. If
the problem log FRU list contains only host bay FRUs, it may be faster to use
the service terminal to test only the host bay by resuming the host bay.
v To test the cluster FRUs while the customer is using the 2105 Model Exx/Fxx,
the cluster must be quiesced and then resumed. This causes control of the host
interfaces to this cluster to be failed over to the other cluster which allows the
customer uninterrupted access. The customer may have less performance while
the cluster is quiesced. This may take up to 30 minutes depending on the
installed features and configuration.
v To test the host bay FRUs, the host bay must be quiesced and resumed. This
may take up to 10 minutes. The customer will not have access through the host
interfaces cabled to the quiesced host bay. The host bay cannot be quiesced if
either cluster is fenced or quiesced.
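The time limits and the host bay restriction described above can be checked with a few lines of code when planning a service window. The sketch below is illustrative only. The action names, the state strings, and the helper names can_quiesce_host_bay and worst_case_minutes are assumptions made for this example. The minute values are the "up to" figures quoted in the bullets above.

```python
# Upper bounds quoted in the description above, in minutes.
MAX_MINUTES = {
    "power_cycle": 30,       # power the 2105 off and then on
    "cluster_resume": 30,    # quiesce and resume a cluster
    "host_bay_resume": 10,   # quiesce and resume a host bay
}


def can_quiesce_host_bay(cluster_states):
    """A host bay cannot be quiesced if either cluster is fenced or quiesced."""
    return all(state == "available" for state in cluster_states)


def worst_case_minutes(actions):
    """Add up the quoted upper bounds for a planned sequence of actions."""
    return sum(MAX_MINUTES[action] for action in actions)
```

For example, resuming one cluster and then one host bay gives worst_case_minutes(["cluster_resume", "host_bay_resume"]) of 40 minutes.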
Isolation
Note: This MAP may direct you to display fence and quiesce conditions and
resume a cluster or host bay. To do this you will need to use the service
terminal Utilities Menu, Resource Management Menu options unless
directed otherwise.
1. Ensure the CPI cables that attach to the listed FRUs are fully seated and are
properly connected. Each CPI cable is labeled with color to match a color label
on the sheet metal next to the FRU.
2. Refer to the list of FRUs written down from “MAP 4040: Entry MAP for CPI
Problems” on page 321.
3. Review the FRU list and then go to the step below with the matching FRUs:
v Only host bay FRUs, go to step 7 on page 325.
v Only cluster FRUs, go to step 6 on page 324.
v Cluster and host bay FRUs, continue with the next step.
v None of the above FRUs. You have a list of FRUs that this MAP is not
designed to isolate. Call the next level of support.
4. Will the customer allow you to power the 2105 Model Exx/Fxx off?
v Yes, go to step 5.
v No, go to step 8 on page 325.
5. The customer will allow the 2105 Model Exx/Fxx to be powered off and the FRU
list contains cluster and host bay FRUs.
Note: The easiest method to do a complete CPI test is to power the 2105
Model Exx/Fxx off then on. Powering on will run all three CPI functional
tests on both clusters to all host bays. All fence conditions will be reset
when the 2105 Model Exx/Fxx is powered off. Any fence conditions after
the power on were caused when CPI errors were detected and logged.
After both clusters have successfully completed the power on and code
load, Ready will be displayed on each cluster operator panel. Use the
service terminal to display problems needing repair. The diagnostics will
create a new related problem if a CPI failure is detected.
a. Power the 2105 Model Exx/Fxx Off using the operator panel Local Power
switch. This may take up to 3 minutes.
b. Power the 2105 Model Exx/Fxx On using the operator panel Local Power
switch. Wait for the operator panel Cluster Bay Ready indicator LED to
come on for one or both clusters.
c. Display problems needing repair. If the diagnostics detected a failure, a new
problem log may exist or the existing problem log Last Occurrence field will
have been updated. A CPI problem can be isolated to the failing FRU by
replacing the FRUs one at a time.
Determine if the diagnostics detected an error and do one of these:
v If the diagnostics detected an error, go to “MAP 4080: Powering the 2105
Model Exx/Fxx Off to Replace CPI FRUs” on page 329.
v If the diagnostics did not detect a failure, further isolation by diagnostics is
not possible. One or more of the FRUs listed in the original problem log
can be replaced now.
Note: It may be possible to isolate the problem by replacing one FRU
and then returning the 2105 Model Exx/Fxx to customer use. After
replacing the FRU, close the original problem and then use the
Start Repair Menu, End of Call option to ensure all the cluster and
host bay resources are no longer fenced or quiesced. Wait and
see if a new problem is created in the next few hours or days.
Continue replacing FRUs until new problems are no longer
created.
Go to “MAP 4080: Powering the 2105 Model Exx/Fxx Off to Replace CPI
FRUs” on page 329.
6. The FRU list contains only cluster FRUs.
a. Quiesce and Resume the cluster.
b. Display problems needing repair.
v If there is a new related problem or the existing problem Last Occurrence
field was updated, the diagnostics detected an error. Cluster FRUs may
be replaced one at a time to isolate the failing FRU. Go to “MAP 4060:
Replacement of Cluster FRUs for CPI Problems” on page 326.
v If the diagnostics did not detect an error, quiesce and resume the related
host bay.
– If there is a new related problem or the existing problem Last
Occurrence field was updated, the diagnostics detected an error. The
FRUs may be replaced one at a time to isolate the failing FRU. Use
one of these:
- For cluster FRU replacement, “MAP 4060: Replacement of Cluster
FRUs for CPI Problems” on page 326.
- For host bay FRU replacement, “MAP 4070: Replacement of Host
Bay FRUs for CPI Problems” on page 327.
– If the diagnostics did not detect an error, further isolation by
diagnostics is not possible. Go to “MAP 4060: Replacement of Cluster
FRUs for CPI Problems” on page 326 to replace FRUs from the
original FRU list.
7. The FRU list contains only host bay FRUs.
a. If a cluster is fenced, quiesce and resume that cluster first.
b. Quiesce and Resume the host bay.
c. Display problems needing repair.
v If there is a new related problem or the existing problem Last Occurrence
field was updated, the diagnostics detected an error. Host bay FRUs may
be replaced one at a time to isolate the failing FRU. Go to “MAP 4070:
Replacement of Host Bay FRUs for CPI Problems” on page 327.
v If the diagnostics did not detect an error, further isolation by diagnostics is
not possible. Replace the host bay FRUs. Go to “MAP 4070:
Replacement of Host Bay FRUs for CPI Problems” on page 327.
8. The FRU list contains host bay FRUs and FRUs from only one cluster. Continue
with this step.
a. Quiesce and Resume the cluster with the FRUs listed.
b. Display problems needing repair.
v If there is a new related problem or the existing problem Last Occurrence
field was updated, the diagnostics detected an error. Replace the cluster
FRUs. Go to “MAP 4060: Replacement of Cluster FRUs for CPI
Problems” on page 326.
v If the diagnostics did not detect an error, further isolation by diagnostics is
not possible.
c. Quiesce and Resume the host bay.
d. Display problems needing repair.
v If there is a new related problem or the existing problem Last Occurrence
field was updated, the diagnostics detected an error. Host bay FRUs may
be replaced one at a time to isolate the failing FRU. Go to “MAP 4070:
Replacement of Host Bay FRUs for CPI Problems” on page 327.
v If the diagnostics did not detect an error, further isolation by diagnostics is
not possible. Replace one or more FRUs. Go to either:
– For cluster FRU replacement, go to “MAP 4060: Replacement of
Cluster FRUs for CPI Problems” on page 326.
– For host bay FRU replacement, go to “MAP 4070: Replacement of
Host Bay FRUs for CPI Problems” on page 327.
9. The FRU list contains host bay FRUs and FRUs from both clusters.
Determine if either cluster is fenced.
v If a cluster is fenced:
Note:
a. Quiesce and resume that cluster.
b. Display problems needing repair.
– If there is a new related problem or the existing problem Last
Occurrence field was updated, the diagnostics detected an
error. Replace the FRUs for the cluster that was resumed. Go
to “MAP 4060: Replacement of Cluster FRUs for CPI
Problems”.
– If the diagnostics did not detect an error, continue.
c. Quiesce and Resume the host bay.
d. Display problems needing repair.
– If there is a new related problem or the existing problem Last
Occurrence field was updated, the diagnostics detected an
error. Replace one or more host bay FRUs to isolate the failing
FRU. Go to “MAP 4070: Replacement of Host Bay FRUs for
CPI Problems” on page 327.
– If the diagnostics did not detect an error, each cluster will need
to be tested. If a cluster was already quiesced and resumed
above, there is no need to test that cluster again.
v If a cluster is not fenced, quiesce and resume the cluster and then display
problems needing repair.
v If there is a new related problem or the existing problem Last Occurrence
field was updated, the diagnostics detected an error. FRUs for this cluster or
the host bay may be replaced one at a time. (Repeat this for the other cluster
if needed.) Go to one of the following:
– “MAP 4060: Replacement of Cluster FRUs for CPI Problems”
– “MAP 4070: Replacement of Host Bay FRUs for CPI Problems” on
page 327
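The branching in steps 3 and 4 of this MAP can be sketched as a single function that returns the step to start from. This is a minimal sketch under stated assumptions. The function name first_isolation_step and the tuple layout are hypothetical. The choice between step 8 and step 9 is an interpretation based on the opening sentence of each of those steps, because step 4 itself only names step 8.

```python
def first_isolation_step(frus, power_off_allowed):
    """Return the MAP 4050 step that matches the recorded FRU list.

    frus: list of (kind, cluster_id) tuples. kind is "cluster" or "host_bay";
          cluster_id identifies the cluster and is None for host bay FRUs.
    power_off_allowed: True if the customer allows the 2105 to be powered off.
    """
    kinds = {kind for kind, _ in frus}
    if not kinds or not kinds <= {"cluster", "host_bay"}:
        # This MAP is not designed to isolate other FRUs.
        return "call next level of support"
    if kinds == {"host_bay"}:
        return "step 7"
    if kinds == {"cluster"}:
        return "step 6"
    if power_off_allowed:
        return "step 5"
    # Assumption: step 8 covers one cluster, step 9 covers both clusters.
    clusters = {cluster_id for kind, cluster_id in frus if kind == "cluster"}
    return "step 8" if len(clusters) == 1 else "step 9"
```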
MAP 4060: Replacement of Cluster FRUs for CPI Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: The FRUs and cables in this procedure are ESD-sensitive. Always wear
an ESD wrist strap during this isolation procedure. Follow the ESD procedures in
″Working with ESD-Sensitive Parts″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Description
This MAP is used to replace cluster FRUs for CPI problems. These FRUs are the
I/O attachment card and I/O planar. When the cluster is removed from the 2105
Model Exx/Fxx, one or more FRUs may be replaced.
Only one cluster may be fenced or quiesced at a time. If a cluster is already fenced
or quiesced, replace the FRUs in that cluster first. If no cluster is fenced or
quiesced, you may quiesce either cluster and then replace its FRUs.
The CPI diagnostics are automatically run when the cluster is resumed. To test the
cluster to host bay interface, the 4-slot bay must also be quiesced during this
procedure. If the host bay is not quiesced, only the I/O Attachment Card will be
tested.
Isolation
1. Determine if a cluster is fenced.
From the service terminal Main Service Menu, select:
Utilities Menu
Resource Management Menu
Show Fenced Resources
Do one of the following:
v If a cluster is fenced, it is recommended to replace the FRUs in that cluster
first. Go to the next step.
v If a cluster is not fenced, you may select either cluster to replace FRUs in.
Go to the next step.
2. Replace the cluster FRU or FRUs, go to “MAP 4700: Replacing Cluster FRUs”
on page 375.
When that MAP directs you to go to “MAP 1500: Ending a Service Action” on
page 68, return here instead and continue with the next step.
3. Display problems needing repair.
v If the existing problem created by the diagnostic has been updated with a
new Last Occurrence date and time, the FRU just replaced did not repair the
problem. You may repeat this procedure to replace any remaining cluster
FRUs.
To replace host bay FRUs go to “MAP 4070: Replacement of Host Bay FRUs
for CPI Problems”.
v If a new related problem was created, the new FRU might be bad, the CPI
cable may not be seated properly or the host bay may not be seated
correctly in the 2105 Model Exx/Fxx. You can repeat this procedure with this
FRU or the original FRU to get back to the original failure and problem log.
Remember to write down the Last Occurrence date and time so you can
determine which problem log gets updated.
v If there is not a new problem and the original problem Last Occurrence date
and time did not change, then you have replaced the failing FRU. Go to
“MAP 1500: Ending a Service Action” on page 68.
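Step 3 above depends on comparing the Last Occurrence time stamp you wrote down with the one now shown in the problem log. The comparison can be sketched as follows. The function name repair_outcome, its parameters, and the returned strings are hypothetical. The time stamps are treated as plain text exactly as read from the service terminal, and the checks follow the order of the bullets in step 3.

```python
def repair_outcome(last_occurrence_before, last_occurrence_after,
                   new_related_problem):
    """Interpret the problem log after a cluster FRU has been replaced.

    last_occurrence_before: Last Occurrence value written down before the repair.
    last_occurrence_after:  Last Occurrence value displayed after the resume.
    new_related_problem:    True if a new related problem log was created.
    """
    if last_occurrence_after != last_occurrence_before:
        # The diagnostics detected the same failure again.
        return "not repaired: replace remaining FRUs"
    if new_related_problem:
        return "check new FRU, CPI cable seating, and host bay seating"
    return "repaired: go to MAP 1500"
```

Writing down the time stamp before each replacement is what makes the first check possible.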
MAP 4070: Replacement of Host Bay FRUs for CPI Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: The FRUs and cables in this procedure are ESD-sensitive. Always wear
an ESD wrist strap during this isolation procedure. Follow the ESD procedures in
″Working with ESD-Sensitive Parts″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Description
This MAP is used to replace and test host bay FRUs. The host bay is quiesced and
powered off, then the FRUs are replaced. The CPI diagnostics are run when the
host bay is resumed. A fenced or quiesced cluster prevents the CPI diagnostics
from running on that cluster. The Ultra SCSI Host Cards are also tested when these
CPI diagnostics are run.
Isolation
1. Determine if a cluster is fenced.
From the service terminal Main Service Menu, select:
Utilities Menu
Resource Management Menu
Show Fenced Resources
v If a cluster is not fenced, go to step 2.
v If a cluster is fenced, do the following:
– Quiesce the host bay. This will prevent a new error from being created
when the cluster fence is reset by the quiesce/resume.
From the service terminal Main Service Menu, select:
Utilities Menu
Resource Management Menu
Quiesce a Resource
Select the proper host bay.
– Quiesce the cluster using the Alternate Cluster Repair menu options.
Connect the service terminal to the cluster that is not fenced. From the
service terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair
Quiesce the Alternate Cluster
Quiesce the failing cluster then press F3 once to return to the
Alternate Cluster Repair Menu. Then resume the alternate,
failing, cluster.
Resume the Alternate Cluster
– Resume the cluster using the Alternate Cluster Repair menu options. The
resume causes the cluster to load code as if it were being powered on
and then fail-back its host bay resources from the other cluster. This can
take up to 30 minutes depending on the features and configuration of the
2105 Model Exx/Fxx.
Note: If the resume fails, that must be repaired before continuing with the
host bay FRU replacement. You may be able to start the new repair
with a visual symptom if the cluster hung with a code displayed in
the cluster operator panel. You may be able to use error
information displayed on the service terminal. If you cannot begin
the repair, call the next level of support.
2. Replace the host bay FRU or FRUs. Use the Replace a FRU option. It will
quiesce and power off the host bay, prompt you to replace the FRU, then power
on and resume the host bay.
From the service terminal Main Service Menu, select:
Repair Menu
Replace a FRU
Host Bay FRUs
After the FRU has been replaced and the host bay resumed, go to
the next step.
3. Display problems needing repair.
v If the diagnostics detected a CPI error, ensure the FRU and any attached
cables are properly connected. Then replace the remaining FRUs.
Note: The Last Occurrence date and time in the existing problem will have
been updated or a new related problem log will have been created. Go
to one of the following:
– “MAP 4060: Replacement of Cluster FRUs for CPI Problems” on page 326
–
v If the diagnostics did not detect a CPI error or a cluster error, go to the next
step.
4. Go to “MAP 1500: Ending a Service Action” on page 68.
MAP 4080: Powering the 2105 Model Exx/Fxx Off to Replace CPI FRUs
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
You have CPI FRUs to replace in the host bay and one or both clusters.
Powering the 2105 Model Exx/Fxx off resets all quiesce conditions. With the 2105
Model Exx/Fxx powered off, one or more CPI FRUs can be replaced. When the
2105 Model Exx/Fxx is powered on, all CPI diagnostics are run. If the CPI
diagnostics detect a failure, a new CPI problem log will be created. If an existing
CPI problem log is present, the Last Occurrence date and time field will be updated.
Isolation
1. Ensure the customer is not using the 2105 Model Exx/Fxx. Power off the 2105
Model Exx/Fxx using the operator panel Local Power switch.
2. Replace one or more FRUs. Refer to Chapter 4 for individual FRU replacement
procedures. Do only the steps necessary to physically replace the FRU. Return
here when the FRU has been replaced and continue with the next step.
3. Power on the 2105 Model Exx/Fxx using the operator panel Local Power switch.
4. Display problems needing repair. If the CPI diagnostics detect a failure, a new
CPI problem log will be created. If an existing CPI problem log is present, it will
be updated with the current time stamp in the Last Occurrence field.
v If the diagnostics detected a failure:
– Replace any remaining FRUs using this MAP.
– If all FRUs have been replaced, call the next level of support. The problem
may be in the backplanes.
v If the diagnostics did not detect a failure, go to “MAP 1500: Ending a Service
Action” on page 68.
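The procedure above is a loop: replace a FRU, power on, read the problem log, and repeat until the diagnostics pass or no FRUs remain. The sketch below shows that loop for the case where you replace one FRU per power cycle, which is only one of the options the MAP allows. The function name replace_until_clean and the callable diagnostics_detect_failure are hypothetical; the callable stands in for displaying problems needing repair on the service terminal.

```python
def replace_until_clean(frus, diagnostics_detect_failure):
    """Replace FRUs one per power cycle until the diagnostics pass.

    frus: FRU names in the order you intend to replace them.
    diagnostics_detect_failure: callable that takes the list of FRUs replaced
        so far and returns True if the power-on diagnostics still log a failure.
    """
    replaced = []
    for fru in frus:
        replaced.append(fru)  # power off, replace this FRU, power on
        if not diagnostics_detect_failure(replaced):
            return replaced, "go to MAP 1500"
    # All listed FRUs were replaced and the failure is still detected.
    return replaced, "call next level of support"
```

With three listed FRUs where the second one is the failing part, the loop stops after two replacements.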
MAP 4090: CPI Address Mismatch
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The CPI diagnostics check that each cluster bay IOA card CPI interface is cabled to
the proper host bay CPI interface. A diagnostic-detected CPI address mismatch
indicates a CPI address logic failure if only one error is detected. If two errors are
detected, then the most likely cause is two CPI cables being cross connected. The
CPI cables and adjacent sheet-metal are marked with matching color labels to
indicate proper connection.
Isolation
1. Determine if there are one or two problem logs related to CPI address
mismatch. Use the service terminal to display problems needing repair.
v There is only one related problem. Continue the repair using the problem log
and replace the listed FRU(s).
v There are two or more related problems. Go to the next step.
2. Two or more CPI cables are cross connected. Use the color labels on CPI
cables and adjacent sheet metal to determine which cables are crossed. Or use
the following tables to determine the proper connections for each CPI cable.
Go to the correct cluster bay model table:
v 2105 Models E10/E20, go to Table 24
v 2105 Models F10/F20, go to Table 25 on page 331
Note: Refer to ″Locating a CPI Cable Using Colored Labels″ in chapter 7
of the Enterprise Storage Server Service Guide, Volume 3.
Table 24. 2105 Models E10/E20 CPI Cable Connections
CPI Interface   Cluster Location   Host Bay Location   Color Code
CPI4 Local      R1-T1-I4/JB        R1-B1/JB            Green
CPI4 Remote     R1-T2-I4/JA        R1-B1/JA            Orange
CPI5 Local      R1-T2-I4/JB        R1-B3/JB            Red
CPI5 Remote     R1-T1-I4/JA        R1-B3/JA            Gray
CPI6 Local      R1-T1-I7/JB        R1-B2/JB            Yellow
CPI6 Remote     R1-T2-I7/JA        R1-B2/JA            Brown
CPI7 Local      R1-T2-I7/JB        R1-B4/JB            Blue
CPI7 Remote     R1-T1-I7/JA        R1-B4/JA            Violet
Table 25. 2105 Model F10/F20 CPI Cable Connections
CPI Interface   Cluster Location   Host Bay Location   Color Code
CPI4 Local      R1-T1-I5/JB        R1-B1/JB            Green
CPI4 Remote     R1-T2-I5/JA        R1-B1/JA            Orange
CPI5 Local      R1-T2-I5/JB        R1-B3/JB            Red
CPI5 Remote     R1-T1-I5/JA        R1-B3/JA            Gray
CPI6 Local      R1-T1-I8/JB        R1-B2/JB            Yellow
CPI6 Remote     R1-T2-I8/JA        R1-B2/JA            Brown
CPI7 Local      R1-T2-I8/JB        R1-B4/JB            Blue
CPI7 Remote     R1-T1-I8/JA        R1-B4/JA            Violet
3. Determine the end of each cable that is cross connected. Use the service
terminal Main Menu, Replace a FRU option to quiesce and power off the FRUs
the cables are connected to before correcting the cable connections.
v Use the Host Bay FRU option for that end of each CPI cable.
v Use the Cluster Bay FRU option for that end of each CPI cable.
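If you write down where each CPI cable is actually connected, the entries in Tables 24 and 25 can be used to find the crossed cables mechanically. The following Python sketch is illustrative only. The names E_MODEL, F_MODEL, and crossed_cables are assumptions made for this example. The location codes are copied from the two tables, and the observed connections must be collected by hand at the machine.

```python
# Expected cluster end -> host bay end, copied from Table 24 (Models E10/E20).
E_MODEL = {
    "R1-T1-I4/JB": "R1-B1/JB", "R1-T2-I4/JA": "R1-B1/JA",
    "R1-T2-I4/JB": "R1-B3/JB", "R1-T1-I4/JA": "R1-B3/JA",
    "R1-T1-I7/JB": "R1-B2/JB", "R1-T2-I7/JA": "R1-B2/JA",
    "R1-T2-I7/JB": "R1-B4/JB", "R1-T1-I7/JA": "R1-B4/JA",
}

# Table 25 (Models F10/F20) differs only in the card slots: I4 -> I5, I7 -> I8.
F_MODEL = {
    cluster_end.replace("-I4/", "-I5/").replace("-I7/", "-I8/"): host_end
    for cluster_end, host_end in E_MODEL.items()
}


def crossed_cables(observed, expected):
    """Return {cluster_end: (expected_host_end, observed_host_end)}
    for every cable whose host bay end does not match the table."""
    return {
        cluster_end: (expected.get(cluster_end), host_end)
        for cluster_end, host_end in observed.items()
        if expected.get(cluster_end) != host_end
    }
```

Two swapped cables show up as two entries in the result, which corresponds to the case of two related problem logs described in step 1.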
MAP 4100: Isolating a LIC Process Read/Display Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Isolation
Determine if the LIC installation will be from CD-ROM or diskette:
v If using a CD-ROM as the LIC installation media, go to “MAP 4600: Isolating a
CD-ROM Test Failure” on page 373.
v If using a diskette as the LIC installation media, go to “MAP 4620: Isolating a
Diskette Drive Failure” on page 374.
MAP 4120: Handling Unexpected Resources
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
This failure indicates that a resource has been detected (ESC = 1202) that has not
been properly installed in the 2105 Model Exx/Fxx.
Isolation
1. Is there another problem (ESC = 1201) indicating that a resource is missing?
v Yes, a FRU has been placed in a wrong location and needs to be moved, go
to step 6 on page 332.
v No, continue with the next step.
2. Look at the resource in the FRU list of the problem. The 2105 Model Exx/Fxx
has detected a resource that has not been properly installed.
Should this resource be installed in this machine?
v Yes, record the Problem ID number then continue with the next step to install
this resource.
v No, go to step 7.
3. Look at ″Install and Remove″ in chapter 5 of the Enterprise Storage Server
Service Guide, Volume 2. See if there is an installation procedure for this
resource.
Is there an installation procedure for this resource?
v Yes, continue with the next step and perform the installation.
v No, there is no installation process for this resource. Call the next level of
support for assistance.
4. Perform the installation as described in the Service Guide.
Were you able to complete the installation?
v Yes, continue with the next step to cancel original problem.
v No, contact your next level of support.
5. The problem is now resolved. Cancel the original problem. Press F3 until the Main
Service Menu is displayed.
From the service terminal Main Service Menu, select:
Repair Menu
Close a Previously Repaired Problem
Select the problem with the ID you recorded in step 2.
Scroll to the bottom of the display and select the line that starts with: Close
Problem .....
The problem is now closed and this repair is complete.
6. You are going to move the FRU to the correct location. Select the FRU in the
FRU list of the other problem which indicates the missing resource. When
directed to replace the FRU, move the FRU to the correct location. Continue
through Verification.
Does Verification run without a problem?
v Yes, the problem is resolved. Return to the service terminal and follow
directions to return the resource to the customer and close the problem.
v No, resolve the problem created by verification.
7. You will remove the resource from the system.
a. Select the FRU from the problem FRU list.
b. When you are directed to replace the FRU, follow the Remove/Replace
instructions to remove the FRU, but do not replace the FRU. Follow any
instructions for any reassembly required.
c. Go through the verification process.
Does Verification run without a problem?
v Yes, the problem is resolved. Return to the service terminal and follow
directions to return the resource to the customer and close the problem.
v No, resolve the problem created during the verification.
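The questions asked in steps 1 through 3 of this MAP reduce to three yes or no answers. The sketch below shows how they combine. The function name handle_unexpected_resource, its parameter names, and the returned strings are hypothetical and exist only for this illustration.

```python
def handle_unexpected_resource(missing_also_logged, should_be_installed,
                               install_procedure_exists=True):
    """Choose the action for an unexpected resource (ESC = 1202).

    missing_also_logged:      True if an ESC = 1201 problem also exists.
    should_be_installed:      True if the resource belongs in this machine.
    install_procedure_exists: True if the Service Guide has an installation
                              procedure for the resource.
    """
    if missing_also_logged:
        return "move the FRU to the correct location (step 6)"
    if not should_be_installed:
        return "remove the resource (step 7)"
    if not install_procedure_exists:
        return "call next level of support"
    return "install the resource, then close the original problem (steps 4 and 5)"
```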
MAP 4130: Handling a Missing or Failing Resource
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
This failure indicates that a resource has not been detected (ESC = 1201) that
should be in the 2105 Model Exx/Fxx. This may mean that the resource is not in
the expected location or the resource is failing in such a way that it cannot be
detected.
Isolation
1. Is there another problem (ESC = 1202) indicating that a resource is
unexpected?
v Yes, a FRU has been placed in a wrong location, continue with the next step
to move the FRU.
v No, go to step 3.
2. You are going to move the FRU to the correct location. Select the FRU in the
FRU list with either of the two problems. When directed to replace the FRU,
move the FRU to the correct location. Continue through verification.
Does Verification run without a problem?
v Yes, the problem is resolved. Return to the service terminal and follow
directions to return the resource to the customer and close the problem.
v No, resolve the problem created by verification.
3. You will add or replace the missing/failing resource.
a. Select the FRU from the problem FRU list.
b. When you are directed to replace the FRU, follow the remove/replace
instructions to remove the FRU.
Is there a FRU in that location?
v Yes, the FRU has failed. Remove the FRU and continue with the next
step.
v No, the FRU is missing. Add a FRU to that location and continue with the
next step.
4. Place a FRU in the specified location and follow the replace instructions through
verification.
Does Verification run without a problem?
v Yes, the problem is resolved. Return to the service terminal and follow
directions to return resources to the customer and close the problem.
v No, resolve the problem created during verification.
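The decision in steps 1 and 3 of this MAP depends on which ESCs are open and on whether a FRU is physically present in the reported location. A minimal sketch follows, assuming the open ESC values have been read from the problem logs by hand. The constant and function names are hypothetical.

```python
ESC_MISSING = 1201      # a resource was expected but not detected
ESC_UNEXPECTED = 1202   # a resource was detected but not properly installed


def handle_missing_resource(open_escs, fru_present):
    """Choose the action for a missing or failing resource (ESC = 1201).

    open_escs:   collection of ESC values from the problems needing repair.
    fru_present: True if a FRU is physically in the reported location.
    """
    if ESC_UNEXPECTED in open_escs:
        # A matching unexpected resource suggests the FRU is in the wrong slot.
        return "move the misplaced FRU to the correct location"
    if fru_present:
        return "FRU has failed: remove and replace it"
    return "FRU is missing: add a FRU to that location"
```

In every case the repair continues through verification before the problem is closed.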
MAP 4140: Isolating a LIC Activation Process Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the 2105 Model Exx/Fxx unless instructed to do so.
Description
v A Cluster SCSI Hard Drive is failing or data on it has been corrupted. MAP 4020:
SCSI Hard Drive Build Process is used as a diagnostic to test the SCSI
Hard Drive. It determines whether a hardware problem exists. If not, it reloads
the AIX operating system and functional code on the SCSI Hard Drive. The
LIC Activation should then be tried again and should be successful. If it is not,
call the next level of support.
Isolation
1. Test the Cluster SCSI Hard Drive, go to “MAP 4020: Performing the SCSI Hard
Drive Build Process” on page 316.
2. After completing the procedure, attempt the LIC Activation process again. If it
still fails, call the next level of support.
MAP 4240: Isolating a Blinking 888 Error on the Cluster Operator Panel
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: Do not power off the 2105 unless instructed to do so.
Description
v A blinking 888 number indicates that either a hardware or software problem has
been detected and a diagnostic message is ready to be read. Call the next level
of support, as they may have the additional information and access authority
needed to isolate and resolve the problem.
Isolation
1. Perform the following steps to record the information contained in the blinking
888 message and then call your next level of support.
a. Wait until the blinking 888 is displayed.
b. Record in sequence each code that is displayed after the blinking 888 goes
away. Stop recording when the blinking 888 reappears. Separate each code
recorded with a blank space.
c. Go to step 2.
2. Use the first code recorded and the following list to determine the next
step.
v Type 102, go to step 3.
v Type 103, go to step 4 on page 336.
3. Use the following steps and information to determine the content of the type 102
message. Crash and dump status codes are listed later in this step.
Note: A Type 102 message is generated when a software or hardware error
occurs while the system is running an application.
102 = Message type
RRR = Crash code, the three-digit code that immediately follows the 102, see
″Crash Codes″ on page 335.
SSS = Dump status code, the three-digit code that immediately follows the
Crash code, see ″Dump Progress Indicators (Dump Status Codes)″ on page
335.
Record the Crash code and the Dump Status from the message you recorded.
Are there additional codes following the Dump Status?
v Yes, this message also has a type 103 message included in it. To decipher
the SRN and FRU information in the Type 103 message, go to step 4 on
page 336.
v No, call your next level of support. The 2105 software on the cluster SCSI
Hard Drive has most likely been corrupted. You may be asked to use MAP
4020: SCSI Hard Drive Build to reload all the cluster software.
Note: There are no SRNs associated with message Type 102.
Crash Codes
The following crash codes are part of a Type 102 message.
000
Unexpected system interrupt.
200
Machine check because of a memory bus error.
201
Machine check because of a memory time-out.
202
Machine check because of a memory card failure.
203
Machine check because of an out-of-range address.
204
Machine check because of an attempt to write to ROS.
205
Machine check because of an uncorrectable address parity.
206
Machine check because of an uncorrectable ECC error.
207
Machine check because of an unidentified error.
208
Machine check due to an L2 uncorrectable ECC.
300
Data storage interrupt from the processor.
32x
Data storage interrupt because of an I/O exception from IOCC.
38x
Data storage interrupt because of an I/O exception from SLA.
400
Instruction storage interrupt.
500
External interrupt because of a scrub memory bus error.
501
External interrupt because of an unidentified error.
51x
External interrupt because of a DMA memory bus error.
52x
External interrupt because of an IOCC channel check.
53x
External interrupt from an IOCC bus timeout; x represents the IOCC
number.
54x
External interrupt because of an IOCC keyboard check.
558
There is not enough memory to continue the IPL.
700
Program interrupt.
800
Floating point is not available.
Dump Progress Indicators (Dump Status Codes)
The following dump progress indicators, or dump status codes, are part of a
Type 102 message.
Note: When a lowercase c is listed, it displays in the lower half of the
seven-segment character position. The leftmost position is blank on the
following codes.
0c0
The dump completed successfully.
0c1
The dump failed due to an I/O error.
0c2
A dump, requested by the user, is started.
0c3
The dump is inhibited.
0c4
The dump device is not large enough.
0c5
The dump did not start, or the dump crashed.
0c6
Dumping to a secondary dump device.
0c7
Reserved.
0c8
The dump function is disabled.
0c9
A dump is in progress.
0cc
Unknown dump failure.
4. Use the following steps and information to determine the content of the Type
103 message.
Note: A Type 103 message is generated when a hardware error is detected.
103 = Message type
XXX YYY = SRN (where XXX = the three-digit code following the 103 and
YYY is the three-digit code following the XXX three-digit code).
a. Record the SRN and FRU location codes from the recorded message.
b. Call the next level of support before continuing.
c. Find the SRN in the SRN Listing and do the indicated action. Go to
″Bus SRN to FRU Reference Table″ in chapter 9 of the Enterprise Storage
Server Service Guide, Volume 3.
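The message layout described in steps 3 and 4 can be sketched in code. The following Python fragment is only an illustration, not part of any ESS tool: the function names are hypothetical, the two tables are short excerpts of the lists above, the SRN values in the usage note are made up, and it assumes that a Type 103 message embedded after a Type 102 message starts with a literal 103.

```python
# Illustrative excerpts of the tables above; a real tool would carry every entry.
CRASH_CODES = {
    "000": "Unexpected system interrupt",
    "200": "Machine check because of a memory bus error",
    "300": "Data storage interrupt from the processor",
    "32x": "Data storage interrupt because of an I/O exception from IOCC",
    "53x": "External interrupt from an IOCC bus timeout (x = IOCC number)",
    "558": "There is not enough memory to continue the IPL",
}
DUMP_STATUS = {
    "0c0": "The dump completed successfully",
    "0c1": "The dump failed due to an I/O error",
    "0c4": "The dump device is not large enough",
    "0c9": "A dump is in progress",
}


def lookup_crash_code(code):
    """Return the crash-code text, trying an exact match before an 'NNx' pattern."""
    if code in CRASH_CODES:
        return CRASH_CODES[code]
    return CRASH_CODES.get(code[:2] + "x", "not in this excerpt")


def decode_888(recorded):
    """Decode the blank-separated codes recorded between two blinking 888 displays."""
    codes = recorded.split()
    if not codes:
        raise ValueError("no codes recorded")
    if codes[0] == "102":
        if len(codes) < 3:
            raise ValueError("a Type 102 message needs a crash code and a dump status")
        result = {
            "type": "102",
            "crash_code": codes[1],
            "crash_text": lookup_crash_code(codes[1]),
            "dump_status": codes[2],
            "dump_text": DUMP_STATUS.get(codes[2].lower(), "not in this excerpt"),
        }
        extra = codes[3:]
        if extra:
            # Assumption: an embedded Type 103 message starts with a literal 103.
            if extra[0] == "103":
                result["embedded"] = decode_888(" ".join(extra))
            else:
                result["embedded"] = extra
        return result
    if codes[0] == "103":
        if len(codes) < 3:
            raise ValueError("a Type 103 message needs both halves of the SRN")
        # SRN = XXX-YYY; anything after it is treated as FRU location codes.
        return {"type": "103", "srn": f"{codes[1]}-{codes[2]}", "fru_codes": codes[3:]}
    raise ValueError(f"unexpected message type {codes[0]!r}")
```

For example, decode_888("102 300 0c0") reports a data storage interrupt with a completed dump and no embedded Type 103 message, which corresponds to the "No" branch of step 3.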
MAP 4320: Isolating E1xx SCSI Hard Drive Code Boot Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: The FRUs and cables in this procedure are ESD-sensitive. Always wear
an ESD wrist strap during this isolation procedure. Follow the ESD procedures in
″Working with ESD-Sensitive Parts″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
Description
An E105 checkpoint may be displayed on the cluster bay operator panel for a
period of time while the boot image is loaded from the SCSI hard drive. If the
checkpoint code is displayed for more than 5 minutes there is a problem loading the
boot image. This may be a software or hardware problem. A checkpoint of STBY
displayed indicates that the cluster cannot find a boot image on the SCSI hard
drive.
An E105 hang can also occur if one of the CPU processors has a hardware
problem. Prior to the E105 status, only one processor is used. At E105, the other
three processors are brought online and if one is bad, the E105 may hang.
The cluster power on and code load process is hanging. There are six types of
failures:
v The device boot list in SMS is corrupted or incorrect so the SCSI hard drive is
not being accessed.
v A processor on one of the two CPU cards is failing.
v Power failure to the SCSI hard drive and most likely the CD-ROM drive (common
power cable).
v SCSI interface has failed and is preventing access to the SCSI hard drive and
possibly also the CD-ROM drive (common SCSI cable). There may also be a
problem with the SCSI interface termination. There are SCSI hard drives with
and without internal SCSI termination. There are SCSI cables with and without
SCSI termination. There must be only one termination at the device end of the
SCSI cable. For additional information on SCSI interface termination, see
″Additional SCSI Hard Drive Replacement Information, 2105 Model E10/E20″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2 or
″Additional SCSI Hard Drive Replacement Information, 2105 Model F10/F20″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
v SCSI hard drive hardware has failed. SCSI interface is functional to the CD-ROM
drive.
v SCSI hard drive, CD-ROM drive and SCSI interface are functional, the AIX boot
image is bad.
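The SCSI termination rule in the list above (only one termination at the device end of the SCSI cable) reduces to a simple check. The sketch below is illustrative only; the function and parameter names are hypothetical, and it models just the two cases the text describes: a drive with or without an internal terminator, and a cable with or without a terminator block.

```python
def scsi_termination_ok(drive_has_internal_terminator, cable_has_terminator_block):
    """True when exactly one terminator is present at the device end of the cable."""
    return drive_has_internal_terminator != cable_has_terminator_block


def describe_termination(drive_has_internal_terminator, cable_has_terminator_block):
    """Return a short text verdict for a drive and cable combination."""
    if scsi_termination_ok(drive_has_internal_terminator, cable_has_terminator_block):
        return "OK: one termination at the device end"
    if drive_has_internal_terminator and cable_has_terminator_block:
        return "Fault: double termination (drive and cable both terminate)"
    return "Fault: no termination at the device end"
```

A drive with an internal terminator on a cable that also carries a terminator block is reported as double termination, and a drive without one on a cable without a block is reported as unterminated.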
Isolation
1. This MAP assumes the cluster bay is powered on. The cluster bay is powered
on if the front middle indicator LED on each of the three electronics cage
power supplies above the cluster bay is on.
v If they are on, continue with the next step.
v If they are not on, connect the service terminal to the other cluster and use
the Repair Menu, Alternate Cluster Repair options to power the cluster on.
2. The cluster bay operator panel should be displaying E105.
v If it is, continue with the next step.
v If it is not, go to “MAP 4360: Isolation Using Codes Displayed by the Cluster
Operator Panel” on page 342.
3. A failing CPU card can cause the E105 hang condition. If the following steps
do not fix the problem, replace the CPU cards one at a time.
Note: Prior to the E105 status, only one processor is used. At E105, the other
three processors are brought online. If one processor is bad, the E105
may hang.
4. The device boot list may be corrupted or incorrect. See step 10 on page 318 to
step 15 on page 318 to display and reset the default boot list if necessary.
5. Press the cluster CD-ROM drive eject button.
Note: The SCSI hard drive and CD-ROM drive share a common power cable.
A power failure to the cable or in the cable should affect both.
Does the CD-ROM tray open?
v Yes, continue with the next step.
v No, go to step 13 on page 339.
6. Read the following explanation.
The drives are both assumed to be receiving power. This step will test the
SCSI interface, first to the CD-ROM drive and then to the SCSI hard drive. The
SCSI hard drive build process from CD-ROM and diskette will be used.
Go to “MAP 4020: Performing the SCSI Hard Drive Build Process” on
page 316 after reading the following bullets.
v If the load of AIX code from the CD does not get an error, then both drives
are functional and the problem does not require any FRUs to be replaced.
Continue with that MAP to complete the code load process. The process will
end by rebooting the cluster which will verify the boot records were built
properly.
v If the load of AIX code fails with an error that is not for the CD-ROM or
SCSI hard drive, then continue with “MAP 4020: Performing the SCSI Hard
Drive Build Process” on page 316.
v If the load of AIX code fails with an error for either the CD-ROM drive or the
SCSI hard drive, then return here and continue with the next step.
7. Did this failure begin after replacing the SCSI hard drive or SCSI cable?
v Yes, continue with the next step.
v No, go to step 9.
8. This may also be caused by a problem with the SCSI interface termination.
There are now two types of SCSI hard drives:
v Drives with internal SCSI terminators that require a SCSI cable without a
terminator block.
v Drives without internal SCSI terminators that require a SCSI cable with an
external terminator block.
Do you know if the SCSI termination is correct?
v Yes, continue with the next step.
v No, reference the SCSI Hard Drive replacement procedure in ″CD-ROM,
SCSI Hard Drive, and Diskette Drive Removals and Replacements, Cluster
(E10/E20)″ in chapter 4 of the Enterprise Storage Server Service Guide,
Volume 2 or the ″CD-ROM, SCSI Hard Drive, and Diskette Drive Removals
and Replacements, Cluster (F10/F20)″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2. Return here and continue with the
next step.
9. Ensure that the SCSI signal cable is fully seated at the SCSI hard drive,
CD-ROM drive and I/O planar (R1-Tx-P2-Z1.1).
Connect the service terminal to the other cluster. Use the Repair Menu,
Replace a FRU options to simulate replacing the CD-ROM drive.
v Yes, continue with the next step.
v No, go to step 8 on page 191.
10. Ensure the three CD-ROM jumpers are plugged correctly. If they are not, there
may be duplicate SCSI IDs.
Figure 146. CD-ROM Drive Jumpers (S008413l). Rear view of the CD-ROM drive
showing the three jumpers (labeled 1, 2, 4).
11. Unplug the SCSI signal cable from the SCSI hard drive. Power on the cluster.
Leave the CD and diskette inserted.
v If the hard drive load process begins and gets a SCSI hard drive error, it
means that the CD-ROM works when the SCSI hard drive is unplugged, but
fails when it is plugged. Replace the SCSI hard drive. Go to “MAP 4700:
Replacing Cluster FRUs” on page 375.
v If the hard drive load process fails trying to access the CD, go to the next
step.
12. Plug the SCSI signal cable back into the SCSI hard drive and unplug it from
the CD-ROM drive. Remove the CD and diskette. Power on the cluster.
v If the cluster powers on and loads code normally, the CD-ROM drive was
putting errors on the common SCSI interface. Replace the CD-ROM drive.
The SCSI hard drive code load was successful. Go to “MAP 4700:
Replacing Cluster FRUs” on page 375.
v If the cluster hangs with another 4 digit boot code problem, then the
problem is affecting both the CD-ROM drive and SCSI hard drive. Replace
the I/O planar and SCSI signal cable. Go to “MAP 4700: Replacing Cluster
FRUs” on page 375.
13. The CD-ROM drive and possibly the SCSI hard drive are not receiving power.
Connect the service terminal to the other cluster. Use the Repair Menu,
Replace a FRU options to simulate replacing the CD-ROM drive. When the
cluster bay is powered off, do the following:
v Ensure the power cable is fully plugged into both drives and the cluster
power planar.
v Ensure all cluster power planar cables are fully seated.
v If the cables were properly plugged, replace the following FRUs until the
CD-ROM drive has power. (Use the service terminal to continue the FRU
replace procedure to power up the cluster after each FRU replace.)
– Cluster power planar, electronics cage power planar,
– cluster power planar to SCSI HD and CD cable,
– cluster power planar to docking connector cable
Go to “MAP 4700: Replacing Cluster FRUs” on page 375.
Call the next level of support if the problem is not fixed by these FRUs.
MAP 4340: Isolating a E3xx Memory Test Hang Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
This section isolates an E3xx code memory hang problem during cluster bay power
on firmware tests.
The first memory card will function in either system planar slot M1 or M2. In this
product, it is always installed in the M1 slot. Two memory cards partially populated
with DIMM pairs will function; however, in this product the first memory card is
fully populated before the second memory card is added.
Memory card DIMMs must be installed in matched (size and speed) pairs. They
must be installed in matched memory card slots (example: P1 and P2, P3 and P4,
P5 and P6), see ″Cluster Bay, Memory Card, Memory Module Location Codes″ in
chapter 7 of the Enterprise Storage Server Service Guide, Volume 3, for DIMM slot
locations. It takes two DIMMs working together to store the full memory word. A first
DIMM pair will function in any matched pair of slots, however in this product they
are always installed in slots 1 and 2.
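The DIMM pairing rule above can be expressed as a small check. This is a minimal sketch under stated assumptions: the Dimm class, its size_mb and speed_ns fields, and the function name are hypothetical, and slots are keyed by the number in the P1, P2, ... slot names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Dimm:
    size_mb: int   # hypothetical unit for DIMM size
    speed_ns: int  # hypothetical unit for DIMM speed


def check_dimm_pairs(slots: Dict[int, Optional[Dimm]]) -> List[str]:
    """Check that slots (P1,P2), (P3,P4), ... hold matched DIMM pairs.

    slots maps a slot number to the installed DIMM, or None when empty.
    Returns a list of problem descriptions; an empty list means no problem found.
    """
    problems = []
    highest = max(slots) if slots else 0
    for odd in range(1, highest + 1, 2):
        first = slots.get(odd)
        second = slots.get(odd + 1)
        name = f"P{odd}/P{odd + 1}"
        if first is None and second is None:
            continue  # an empty pair of slots is allowed
        if first is None or second is None:
            problems.append(f"{name}: only one DIMM installed")
        elif first != second:
            problems.append(f"{name}: DIMMs differ in size or speed")
    return problems
```

With matched DIMMs in P1 and P2, mismatched sizes in P3 and P4, and a single DIMM in P5, the function reports the P3/P4 mismatch and the incomplete P5/P6 pair.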
Isolation
1. Use the alternate cluster repair menu to quiesce and then power off the cluster
being serviced.
Connect the service terminal to the cluster not being serviced. From the
service terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair Menu
Quiesce the Alternate Cluster
Power Off the Alternate Cluster
2. Remove the cluster and open the top to access the FRUs. Ensure the memory
cards in system planar slots M1 and M2 are properly seated. Ensure that all
the DIMMs are properly seated.
Was a problem found and repaired?
v Yes, go to step 9 on page 341.
v No, continue with the next step.
3. Remove the memory card in slot M2. The memory card in slot M1 is still
installed.
Note: There are two memory cards installed.
Does the cluster still hang at E3xx?
v Yes, go to step 5.
v No, continue with the next step.
4. Remove the memory card in slot M1 and move it to slot M2.
Does the cluster hang at E3xx?
v Yes, slot M2 is failing, replace the system planar. Then go to step 9 on
page 341 .
v No, the unplugged memory card is failing. Unplug the memory card in slot
M2. Plug the other memory card in slot M1. Then go to step 7.
5. Move the memory card in slot M1 to slot M2.
Does the cluster hang at E3xx?
v Yes, continue with the next step.
v No, the memory card is failing, go to step 7.
6. Remove the memory card in slot M2 and replace it with the other memory
card.
Does the cluster hang at E3xx?
v Yes, both slots M1 and M2 are failing. Replace the system planar. Then go
to step 9 on page 341.
v No, the memory card not plugged in is failing. Unplug the memory card in
slot M2. Plug the other memory card in slot M1. Then go to step 7.
7. Remove all the DIMM pairs except for the pair in slots P1 and P2. Ensure the
memory card is plugged in slot M1.
Does the cluster hang at E3xx?
v Yes, continue with the next step.
v No, one of the removed DIMM pairs is failing. Reinstall them one or more
pairs at a time to isolate and replace the failing DIMM. Then go to step 9 on
page 341.
8. Replace the DIMM pair in slots P1 and P2 with a DIMM pair that was
removed.
Does the cluster hang at E3xx?
v Yes, replace the memory card. Then go to step 9 on page 341.
v No, the memory DIMM pair just removed is failing. Isolate and replace the
memory DIMM that is failing. Then go to step 9 on page 341.
9. Reinstall all FRUs in their original locations. Power on the cluster.
10. Wait for the operator panel cluster bay Ready indicator LED to light. Then
attempt to log in to the cluster bay being repaired. This will ensure the
cluster is ready to be resumed.
11. Connect the service terminal to the cluster bay not being repaired. Use the
Alternate Cluster Repair menu option to resume the cluster. When the resume
is complete go to “MAP 1500: Ending a Service Action” on page 68.
MAP 4350: Isolating Cluster Code Load Counter=2
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The cluster attempted to IML two times and failed each time. A problem log was
created.
When a cluster powers on, it first loads the AIX operating system, then the
functional code and finally the RAS (maintenance package) code. The code load
counter is initially set to 0 and is incremented by 1 at the start of the code load. If
the code load is successful, the counter is reset to 0. If it is unsuccessful, the
counter is not reset to 0.
If the load of the functional code is not successful, the failing cluster creates an AIX
error log. A problem log is not created as the functional code and RAS code were
not able to be loaded yet.
The other cluster reboots the failing cluster to attempt to get past the error. If the
code load is successful, the code load counter is reset to 0. The AIX error log from
the prior unsuccessful attempt will not create a problem log as the error was
temporary.
If the second reboot attempt fails, a final reboot occurs. The AIX code is loaded, the
functional code load which would fail is bypassed, and the RAS code is loaded.
This leaves the failing cluster unable to do customer operations, but able to accept
a service terminal login for service actions.
The other cluster creates a problem log with an ESC=38F0 and uses this MAP for
further isolation. The problem log does not give the error that caused the code load
failures. The failing cluster should create a problem using the AIX error log from the
prior unsuccessful attempt. The problem should contain the repair action for the
error that caused the code load failures.
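The counter behavior described above can be traced with a small simulation. This is a sketch of the description only, not the cluster's actual code: the function name and the event strings are hypothetical, and each boolean stands for the outcome of one functional code load attempt.

```python
ESC_CODE_LOAD_FAILED = "38F0"  # ESC named in the description above


def simulate_code_loads(attempt_results):
    """Walk the code load counter through successive code load attempts.

    attempt_results: iterable of booleans, True when that attempt succeeds.
    Returns (final_counter, events).
    """
    counter = 0
    events = []
    for succeeded in attempt_results:
        counter += 1  # incremented at the start of each code load
        if succeeded:
            counter = 0  # a successful load resets the counter
            events.append("code load successful, counter reset to 0")
            break
        events.append(f"code load failed, counter = {counter}")
        if counter >= 2:
            # Second failure: final reboot that bypasses the functional code.
            events.append("final reboot: AIX and RAS code loaded, functional code bypassed")
            events.append(f"other cluster creates problem log ESC={ESC_CODE_LOAD_FAILED}")
            break
        events.append("other cluster reboots the failing cluster")
    return counter, events
```

A failure followed by a success ends with the counter at 0 and no ESC=38F0 problem log, matching the temporary-error case. Two failures in a row end with the counter at 2 and the ESC=38F0 problem log that leads to this MAP.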
Isolation
1. Read the description section above.
2. Display problems needing repair. Look for related problems that have cluster,
bay or power FRUs. (SSA or drawer problems are not related.)
Were related problems found?
v Yes, repair them.
v No, call the next level of support.
MAP 4360: Isolation Using Codes Displayed by the Cluster Operator
Panel
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The cluster operator panel displays various types of codes that indicate the status
of the cluster power on and code load. Some are normal status and progress
indications that change every few seconds. These same codes can indicate a
problem if the cluster appears to hang with the code still displayed. Other codes
indicate error conditions that will not prevent the code load from completing but will
create a problem log. Still other codes indicate conditions that prevent the cluster
from completing its power on or code load.
Notice that a ″Ready″ indication normally means the cluster is powered on and all
code is loaded. The ″Ready″ can become blank shortly after first appearing; this is
normal operation. However, the ready indicator on the 2105 Model Exx/Fxx operator
panel will stay lit.
Isolation
If a 2105 Model Exx/Fxx operator panel Cluster Message indicator is on, then
ensure you have already displayed problems needing repair before continuing with
this MAP.
Use the following table to determine your starting point. Find the symptom in the
table and then use the action to isolate and repair the problem.
Table 26. Cluster Boot or Down, Symptom Table
Symptom
Action
Symptom: Blank during power on and code load.
Action: Shortly after cluster power on, 4 digit Service Processor and System Firmware
progress codes should be displayed. Various other status codes will be displayed
until the power on and code load is complete, which is indicated by Ready being
displayed and the 2105 Model Exx/Fxx operator panel ready indicator being on.
To determine if the cluster is powered up, wait for 3 minutes after power on and
then press the eject button on the CD-ROM.
v If the disk tray opens, the cluster power is on. The failure to display any status
messages is caused by one of these FRUs, which must be replaced: the Cluster
Operator Panel, Cluster Operator Cable, I/O Planar, or Cluster Power Supply. If
replacing these FRUs does not correct the problem, call the next level of
support as there may be a problem in the backplane.
v If the disk tray does not open, go to “MAP 20A0: Cluster Not Ready” on
page 72.
Symptom: Went blank after displaying Ready
Action:
1. This is a normal indication at the end of a successful cluster power on and
code load. The cluster bay is ready for a service terminal login.
2. The Ready display can be overwritten at any time by an AIX operating system
or service terminal action that will cause it to be blank.
Symptom: Ready is displayed
Action:
1. This is a normal indication at the end of a successful cluster power on and
code load. The cluster bay is ready for a service terminal login.
2. The Ready display can be overwritten at any time by an AIX operating system
or service terminal action that will cause it to be blank.
Symptom: OK is displayed
Action: The Service Processor (SP) is ready. The cluster is waiting for power on.
v This is normal if the cluster was powered off for service by using the service
terminal Alternate Cluster Repair Menu options.
v This is not normal if the cluster was not powered off for service. The cluster
can power itself off during a startup process if the RPC card remote/local
switches are not set to the same positions on both cards. Ensure that the RPC
power select switches at the top of each card are set the same. Ensure that the
bottom two positions of the 4 position DIP switch at the bottom of each card
are set the same.
v If the 2105 Model Exx/Fxx is being powered on, OK should display for a few
seconds and then the cluster power on should begin. If the cluster hangs with
OK displayed, go to “MAP 20B0: Cluster Did Not Power On, OK Displayed” on
page 74.
Symptom: STBY is displayed
Action: The Service Processor (SP) is ready. The cluster was shut down by the cluster
operating system, AIX. Read the SP error log for possible fault indications and then
call the next level of support. See ″Service Processor Operations″ in Appendix A
of the Enterprise Storage Server Service Guide, Volume 3.
Symptom: Service Terminal connect problems.
Action: Check for these further symptoms.
v Connect problem to only one cluster. Go to “MAP 6060: Isolating a Service
Terminal Login Failure To One Cluster” on page 432.
v Connect problem to both clusters. Go to “MAP 6040: Isolating a Service
Terminal Login Failure To Both Clusters” on page 431.
Symptom: Cluster stops with 0005 displayed.
Action: The cluster unsuccessfully attempted to load code three times. The threshold
counter was exceeded and it stopped with 0005 displayed. AIX and the RAS
(maintenance package) code did load successfully. If the problem is due to
hardware, a problem record should have been created. Connect the service
terminal to the failing cluster and display problems needing repair. Repair any
related problems. A power on of the cluster bay will automatically reset the
threshold counter.
If there are no related problem records, the problem is due to a code problem.
Call your next level of support.
Symptom: Cluster stops with a 4-character code displayed
Note: If the cluster operator panel displays 2 sets of numbers (one above the
other), use the top set of numbers as the error code.
Action: Check the cluster operator panel:
v If the number displayed is E0xx (SP Checkpoint) or E1xx-EFFF (Firmware
Checkpoint), then go to ″Checkpoints″ in chapter 9 of the Enterprise Storage
Server Service Guide, Volume 3.
v For all other numbers, record SRN 101-xxx, where xxx is the last three digits of
the four-digit number displayed in the operator panel, then go to ″Service
Request Number List″ in chapter 9 of the Enterprise Storage Server Service
Guide, Volume 3.
Symptom: 4 character codes (0500-0999) are displayed
Action: Normal, these are Configuration Program Indicators that give configuration
progress status. Reference ″Configuration Program Indicators″ in chapter 9 of the
Enterprise Storage Server Service Guide, Volume 3.
Symptom: xxx-xxx, an SRN (Service Request Number), is displayed
Action: SRNs are created by the AIX operating system to report requests for service of
hardware and/or software problems. Go to ″Service Request Number List″ in
chapter 9 of the Enterprise Storage Server Service Guide, Volume 3.
Symptom: 8 character codes are displayed
Action: Record the error code, then go to ″Firmware/POST Error Codes″ in chapter 9 of
the Enterprise Storage Server Service Guide, Volume 3 for the repair.
Symptom: 10 character codes are displayed
Action: These are normal progress codes for the CPI initialize, CPI diagnostics and code
load. If the cluster hangs with one of these displayed for more than 5
minutes, call the next level of support. Go to ″9 and 10 Character Progress
Codes″ in chapter 9 of the Enterprise Storage Server Service Guide, Volume 3.
Symptom: The cluster appears to restart/reboot while displaying the E1xx system
firmware codes.
Action: If the service terminal is kept logically connected, this normally happens after the
cluster POST indicators are displayed. The term ″POST indicators″ refers to the
resource names that are listed after the multiple lines of RS/6000 are displayed.
They are ″memory keyboard network SCSI speaker″.
Go to “MAP 4320: Isolating E1xx SCSI Hard Drive Code Boot Problems” on
page 336.
Symptom: The cluster appears to restart/reboot when displaying the 10 character
codes. The cluster returns to the E1xx progress codes and begins the code load
sequence again. This may occur up to three times.
Action: There are certain error recovery sequences during code load at the time the CPI
interfaces are being initialized that will cause up to 3 code loads to be attempted.
A problem log will be created and the cluster message indicator on the 2105
Model Exx/Fxx operator panel will be on. Connect the service terminal to the
cluster with the message indicator on and use the Main Service Menu -> Start
Repair -> Show/Repair Problems Needing Repair option.
v If a related problem is found, repair it.
v If no related problem is found, then attempt to recreate the problem by power
cycling the cluster again. Connect the service terminal to the working cluster
and use the Repair Menu -> Alternate Cluster Repair Menu options to:
– Quiesce the Alternate Cluster
– Power Off the Alternate Cluster
– Power On the Alternate Cluster
– Observe the cluster operator panel during power on and code load. If it
loads normally, then use Resume Alternate Cluster to return the cluster to
customer use. If the cluster fails with a problem log created, repair it. If the
cluster fails with no problem created, call the next level of support.
Symptom: 888 is displayed followed by additional error codes.
Action: Go to “MAP 4240: Isolating a Blinking 888 Error on the Cluster Operator Panel”
on page 334.
Symptom: The cluster stops and POST indicators are displayed on the service
terminal session (if it had been kept logically connected since the cluster
power on). The term ″POST indicators″ refers to the resource names that are
listed after the multiple lines of RS/6000 are displayed. They are ″memory
keyboard network SCSI speaker″.
Action: Go to “MAP 4540: Isolating Problems on a Minimum Configuration Cluster” on
page 364.
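The symptom table above is essentially a classification of what the cluster operator panel shows. The following Python sketch illustrates that classification under stated assumptions: the function names and the returned label strings are hypothetical, SRNs are assumed to be two groups of three letters or digits joined by a hyphen, and only the display text is considered (not whether the cluster has hung).

```python
import re


def _classify_four(code):
    """Classify a 4-character code (already uppercased)."""
    if code == "0005":
        return ("code load threshold exceeded", "display problems needing repair")
    if code.startswith("E0"):
        return ("SP checkpoint", "Checkpoints")
    if re.fullmatch(r"E[1-9A-F][0-9A-F]{2}", code):  # E1xx through EFFF
        return ("firmware checkpoint", "Checkpoints")
    if code.isdigit() and 500 <= int(code) <= 999:
        return ("configuration program indicator", "Configuration Program Indicators")
    # All other numbers: SRN 101-xxx, xxx = last three characters displayed.
    return ("other 4-character code", f"SRN 101-{code[-3:]}")


def classify_panel_display(display):
    """Map an operator panel reading to (symptom, reference) per the table above."""
    text = display.strip()
    upper = text.upper()
    if not text:
        return ("blank", "normal after Ready; otherwise see the first table row")
    if upper == "READY":
        return ("Ready", "normal; ready for a service terminal login")
    if upper == "OK":
        return ("OK", "SP ready; cluster is waiting for power on")
    if upper == "STBY":
        return ("STBY", "read the SP error log, then call the next level of support")
    if text.startswith("888"):
        return ("blinking 888", "MAP 4240")
    if re.fullmatch(r"[0-9A-Z]{3}-[0-9A-Z]{3}", upper):
        return ("SRN", "Service Request Number List")
    if len(text) == 4:
        return _classify_four(upper)
    if len(text) == 8:
        return ("8-character code", "Firmware/POST Error Codes")
    if len(text) == 10:
        return ("10-character progress code", "9 and 10 Character Progress Codes")
    return ("unrecognized", "not covered by this sketch")
```

For instance, a display of E105 is classified as a firmware checkpoint, while 0223 falls through to the "all other numbers" rule and yields SRN 101-223.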
MAP 4370: Error Displaying Problems Needing Repair
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The process to display the problem logs first attempts to access the problem log file
on the cluster bay that the service terminal is connected to. If the file cannot be
read, an error message will be included in the service terminal problem display
screen for that cluster bay.
The process to display the problem logs then attempts to access the problem log
file on the other cluster bay. It attempts to communicate through the cluster to
cluster ethernet connection. If there is no response from the other cluster bay when
trying to read the problem log, then an error message will be included in the service
terminal problem display screen for that cluster bay.
Isolation
1. Use the service terminal Show / Repair Problems Needing Repair option.
Is there a problem displaying the problem logs for the cluster bay the service
terminal is connected to? (The problem logs for the other cluster bay are
displayed without error.)
v Yes, continue with the next step.
v No, go to step 3.
2. Use these steps to reload the code for the failing cluster bay and try the
operation again.
v Connect the service terminal to the other cluster bay (working cluster).
v Go to the Alternate Cluster Repair Menu options
v Quiesce the Alternate Cluster (failing cluster bay)
v Power Off the Alternate Cluster
v Power On the Alternate Cluster
v Resume the Alternate Cluster
v Connect the service terminal back to the failing cluster.
v Display the problems needing repair again.
Does it still fail?
v Yes, call the next level of support. (The cluster bay SCSI hard drive may
need the rebuild process to reload its code.)
v No, go to “MAP 1500: Ending a Service Action” on page 68.
3. There is a problem displaying the problem logs for the other cluster bay (the
cluster bay the service terminal is not connected to).
Connect the service terminal to the other cluster bay and attempt to login.
Is the Copyright and Login screen displayed?
v Yes, continue with the next step.
v No, the Copyright and Login screen is not displayed, go to “MAP 6060:
Isolating a Service Terminal Login Failure To One Cluster” on page 432.
4. Attempt to display problems needing repair.
Does it now fail to the cluster bay the service terminal is connected to? (This is
the same cluster bay that originally failed.)
v Yes, continue with the next step.
v No, if no error message is displayed for this cluster, the problem is with the
cluster bay to cluster ethernet connection. Go to “MAP 4390: Isolating a
Cluster to Cluster Ethernet Problem” on page 347.
5. Reload the cluster bay code and try the operation again by doing the following:
v Connect the service terminal to the other cluster bay.
v Go to the Alternate Cluster Repair Menu options
v Quiesce the Alternate Cluster (failing cluster)
v Power Off the Alternate Cluster
v Power On the Alternate Cluster
v Resume the Alternate Cluster
v Connect the service terminal back to the failing cluster bay.
v Display the problems needing repair.
Does it still fail?
v Yes, call the next level of support. (The cluster bay SCSI hard drive may
need the rebuild process to reload its code.)
v No, go to “MAP 1500: Ending a Service Action” on page 68.
MAP 4380: Isolating a Customer LAN Connection Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Note: This MAP is for clusters directly connected to a customer LAN. MAP 4440 is
for clusters connected to the IBM RISC private LAN.
The clusters communicate to each other through an ethernet connection for the
RAS (maintenance package) operations. A new 2105 Model Exx/Fxx comes from
the factory with a short ethernet jumper cable directly connecting the RJ-45
connector on each cluster. The factory provided TCP/IP settings are changed to
match those provided by the customer on the Communication Resources Work
Sheet. Then the jumper cable is disconnected and cables to the customer LAN are
connected. A copy of the work sheet should be in the 2105 Model Exx/Fxx
document enclosure.
The cluster to cluster ethernet communications are tested each time:
v the 2105 Model Exx/Fxx is powered on.
v a cluster is powered on.
v CPI diagnostics are run.
v the periodic diagnostics are automatically run each hour.
There are two types of customer LAN problems:
v cluster to cluster communications fail.
v cluster to cluster communications work, but the customer notification e-mails are
not received, or the customer cannot access a cluster with the WEB based tools.
Isolation
1. Test the cluster to cluster communication.
v Connect the service terminal to cluster 2. From the service terminal Main
Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
Cluster 1 problems are listed first. If cluster to cluster communications
are working, cluster 2 problems will be listed below cluster 1 problems.
If cluster to cluster communications are not working, the following error
message will be listed below the cluster 1 problems. ″The problems
from the other cluster are inaccessible. The service terminal must be
moved to the alternate cluster to display the problems on that cluster.″
(The cluster to cluster communication occurs even if no problems are
found.)
v If a communication error message is displayed, go to “MAP 4390:
Isolating a Cluster to Cluster Ethernet Problem”.
v If there is no communication error message, the customer LAN is
working well enough to allow cluster to cluster communication
through the customer ethernet hub. Go to the next step.
2. Send a test e-mail message from the cluster to the customer.
v Connect the service terminal to cluster 2. From the service terminal Main
Service Menu, select:
Machine Test Menu
Send Test Notification Menu
Customer Notification (via E-mail)
If the test says it passed, then have the customer determine if the
e-mail was received. It should go to the destination defined by the
Communications Work Sheets.
– If the e-mail was received, go to step 4.
– If the e-mail was not received, go to step 3.
3. Display the configured e-mail destinations to ensure they match the work
sheets. From the service terminal Main Service Menu, select:
Configuration Options Menu
Configure Communications Resources Menu
Configure E-Mail Menu
List Configured E-mail Destinations
– If the destinations are correct, then the problem appears to be
with the customer LAN and network. Notify the customer of
their problem.
– If the e-mail is received, go to the next step.
4. Have the customer ping each cluster TCP/IP address (AIX command issued
from a customer network console to test communication to a TCP/IP address).
The ping command will display the round trip communication delay in
milliseconds or will hang if the TCP/IP address does not respond. Enter CTRL/C
to stop the hang, otherwise it can slow down the customer LAN as it keeps
retrying the address.
v If the ping works, the customer network is able to access the 2105 Model
Exx/Fxx. Any remaining customer problems are most likely with the TCP/IP
addresses defined in the customer applications.
v If the ping hangs, the customer has a network problem.
5. Call the next level of support if the customer is not able to determine the
problem.
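Step 4 warns that an unanswered ping can hang and keep retrying the address. One way to avoid that is to send a fixed number of requests. The sketch below is an assumption-laden illustration, not part of the ESS tooling: the helper names are hypothetical, and it assumes the `-c` count flag of ping on AIX and other UNIX-like systems and the `-n` count flag on Windows. The return code is only a rough indicator, because ping implementations differ in how they report unreachable hosts.

```python
import platform
import subprocess
from typing import List


def build_ping_command(address: str, count: int = 4, system: str = "") -> List[str]:
    """Return a ping command that stops by itself after `count` requests."""
    system = system or platform.system()
    # Assumption: Windows uses -n for the request count, other systems use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), address]


def host_responds(address: str, count: int = 4, timeout_s: float = 30.0) -> bool:
    """Run a bounded ping and report whether it exited successfully."""
    try:
        result = subprocess.run(
            build_ping_command(address, count),
            capture_output=True,
            timeout=timeout_s,  # safety net so the call can never hang
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0
```

For example, `build_ping_command("9.113.24.123", 3, "AIX")` yields the argument list for `ping -c 3 9.113.24.123`. The address is an example value only.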
MAP 4390: Isolating a Cluster to Cluster Ethernet Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The clusters communicate to each other through an ethernet connection for the
RAS (maintenance package) operations. A new 2105 Model E10/E20 comes from
the factory with a short ethernet jumper cable directly connecting the RJ-45
connector on each cluster. This special jumper cable crosses signals within the
cable so an ethernet hub is not needed to direct connect the clusters to each other.
All 2105 Model Exx/Fxx leave the factory set with the same pair of TCP/IP
addresses, one for each cluster bay. Those TCP/IP addresses are changed when
connected to the ESSNet console or customer ethernet.
Isolation
1. Observe the 2105 Model E10/E20 operator panel Ready indicator LEDs (Light
Emitting Diode).
Is the Ready indicator LED on for each cluster?
v Yes, go to step 3.
v No, continue with the next step.
2. Connect the service terminal to the cluster with the Ready indicator LED off.
Attempt to login.
Was the login successful?
v Yes, continue with the next step.
v No, go to “MAP 20A0: Cluster Not Ready” on page 72.
3. Does each cluster bay have an ethernet cable connected?
v Yes, continue with the next step.
v No, each cluster bay must have access to the other cluster bay through an
ethernet connection. That connection can be through an ESSNet ethernet
hub or through the cluster interconnect ethernet cable that goes directly
between both cluster bays. Use the 2105 Model E10/E20 service guide
install procedures or the ESSNet install guide to make
the ethernet connections.
4. Connect the service terminal to cluster bay 1. Use the Repair Menu, Display /
Repair Problems Needing Repair option, which displays problems from both
clusters. If the cluster to cluster communication is not working, it will give an
error message for cluster bay 2.
Is there an error message for cluster bay 2?
v Yes, go to step 6.
v No, continue with the next step.
5. Connect the service terminal to cluster bay 2. Use the Repair Menu, Display /
Repair Problems Needing Repair option.
Is there an error message for cluster bay 1?
v Yes, continue with the next step.
v No, neither cluster bay is failing. Go to the service terminal Repair Menu,
End of Call Status option.
6. Each cluster needs its own TCP/IP address setting and the TCP/IP address
setting of the other cluster bay. If either of the settings in a cluster is
incorrect, the clusters will not be able to communicate.
Use the following two service terminal options while connected to each cluster
bay:
From the service terminal Main Service Menu, select:
Configuration Options Menu
Configure Communications Resources Menu
Change / Show TCP/IP Configuration
Review the following information:
v Minimum Configuration & Startup
Ensure the proper TCP/IP protocol (available network interface) is
selected. The entire network must use the same protocol.
v Configure Alternate Cluster IP Address and Hostname
Do the settings in each cluster bay match properly?
v Yes, continue with the next step.
v No, correct the settings and retry displaying problems that need repair.
7. Are the cluster bays connected to an ESSNet ethernet hub?
v Yes, go to step 10.
v No, continue with the next step.
8. Can both clusters be connected to an ESSNet at this time?
v Yes, use the ESSNet installation procedures to connect the clusters.
Reference ″Installation of the ESSNet and ESSNet Console″ in chapter 5 of
the Enterprise Storage Server Service Guide, Volume 2 book.
v No, continue with the next step.
9. One of the following FRUs is failing: I/O Planar card or Ethernet Cable in either
cluster bay, Cluster Interconnect Ethernet cable. Use the Repair Menu,
Replace a FRU options.
Note: The cluster interconnect ethernet cable can be plugged and unplugged
without using the service terminal.
10. Ensure the following ESSNet ethernet hub indications are present:
a. Power LED is on.
b. Error indicator LEDs are off. Reference the ethernet hub maintenance
documentation.
Are the hub indicators as listed above?
v Yes, continue with the next step.
v No, go to the ESSNet ethernet hub maintenance documentation to correct
the problem.
11. Are the ethernet cables from the cluster bays connected to the ESSNet
ethernet hub?
v Yes, continue with the next step.
v No, connect the cables and then repeat steps 4 and 5 on page 348. If the
failure still occurs, go to the next step.
12. Observe the ESSNet ethernet hub port indicators for the ports connected to
cluster bay 1 and cluster bay 2. The indicator is:
v Off, if the hub port cannot detect the cluster
v On, if the hub port can detect the cluster.
v Blinking, if the hub port is passing data to/from the cluster.
Find the condition you have:
a. Cluster bay 1 hub port On/blink, cluster bay 2 hub port On/blink. Go to step
13 on page 350.
b. Cluster bay 1 hub port On/blink, cluster bay 2 hub port Off. Go to step 14
on page 350.
c. Cluster bay 1 hub port Off, cluster bay 2 hub port On/blink. Go to step 14.
d. Cluster bay 1 hub port Off, cluster bay 2 hub port Off. Go to step 16.
13. Go to the ESSNet console and open a DOS window. At the command line,
enter a ping command with the cluster bay TCP/IP address. This will test the
communication from the ESSNet server to each cluster bay. The format is:
’ping 9.113.24.123’.
If the ping is successful, a line of information will be displayed each time data
is received back from the cluster:
For example: 64 bytes from 9.113.24.123: icmp_seq=0 ttl=252 time=4ms. If
the ping is not successful, the line of information will not display and the test
will appear to hang with no response.
Note: Do not leave the ping test running, as it will slow down all
communications through the hub. Press Ctrl/C to quit the ping test.
Find the condition you have:
v The ping test worked to both cluster bays. If the cluster to cluster
communications still fail, call the next level of support. If the communications
now work, go to the service terminal Repair Menu, End of Call Status
option.
v The ping test failed to one cluster bay. Go to step 17.
v The ping test failed to both cluster bays. The ESSNet console may not be
able to talk with the ethernet hub. Go to “MAP 4440: ESSNet Console to
Cluster Bay Problem” on page 352.
14. One ESSNet hub port indicator is on, the other is off. Swap the ethernet
cables between the two ESSNet hub ports.
Is the same ESSNet hub port indicator off?
v Yes, go to the ethernet hub maintenance manual with the symptom that one
port indicator does not come on. The hub may need to be reset or replaced.
v No, continue with the next step.
15. One ESSNet hub port indicator is on, the other is off. Swap the ethernet
cables between the two cluster bay ethernet ports. (Do not move the ESSNet
hub port ends of the cables.)
Is the same hub port indicator off?
v Yes, replace the ethernet cable connected to the hub port with the indicator
off.
v No, the cluster bay connected to the hub port with the indicator off is failing.
One of the following FRUs is failing: I/O Planar card or Ethernet Cable in
the cluster bay. Use the Repair Menu, Replace a FRU option.
16. Go to the ethernet hub maintenance manual with the symptom that more than
one port indicator that should be on is off. The hub may need to be reset or
replaced.
17. Ensure that the correct TCP/IP address was entered in the ping command. Use
the step 6 procedure to display the cluster bay TCP/IP address.
Was the TCP/IP address used correct?
v Yes, continue with the next step.
v No, correct the TCP/IP address and retest.
18. Swap the ethernet cables at the cluster bays. Do not swap the ethernet cables
at the ESSNet hub ports. Do the ping test to each cluster.
Find the condition you have:
v The original cluster bay still fails, the other cluster bay still works. One of the
following FRUs is failing: I/O Planar card or Ethernet Cable in the failing
cluster bay. Use the Repair Menu, Replace a FRU option.
v The original cluster bay works, the other cluster bay now fails. Continue with
the next step.
v Both clusters now work. Reseating the cables corrected the problem. Go to
the Repair Menu, End of Call Status option.
19. Swap the cluster bay ethernet cables at the hub ports. Do not swap the
ethernet cables at the cluster bays. Do the ping test to each cluster.
Find the condition you have:
v The original cluster bay fails, the other cluster bay works. The ethernet hub
port is failing. Go to the ESSNet ethernet hub maintenance documentation
to correct the problem.
v The original cluster bay works, the other cluster bay fails. Replace the
ethernet cable to the failing cluster bay and repeat the test. Then go to the
Repair Menu, End of Call Status option.
v Both clusters now work. Reseating the cables corrected the problem. Go to
the Repair Menu, End of Call Status option.
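The branching in step 12 of this MAP can be summarized as a small lookup. The Python sketch below is illustrative only; the function name and the string values used for the indicator states are hypothetical. It treats "on" and "blinking" the same way, as the step does, and returns the step number that step 12 sends you to.

```python
def next_step_for_hub_ports(bay1: str, bay2: str) -> int:
    """Return the MAP 4390 step selected by the step 12 conditions.

    bay1 and bay2 are the ESSNet hub port indicator states for the ports
    connected to cluster bay 1 and cluster bay 2: "off", "on", or "blinking".
    """
    for state in (bay1, bay2):
        if state not in ("off", "on", "blinking"):
            raise ValueError(f"unknown indicator state: {state!r}")

    bay1_detected = bay1 != "off"   # on or blinking
    bay2_detected = bay2 != "off"

    if bay1_detected and bay2_detected:
        return 13   # condition a: both ports detect a cluster, run the ping test
    if bay1_detected or bay2_detected:
        return 14   # conditions b and c: exactly one port indicator is off
    return 16       # condition d: both port indicators are off
```

For example, `next_step_for_hub_ports("blinking", "off")` selects step 14, where the cables are swapped between the two hub ports.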
MAP 4400: Displaying Cluster SMS Error Logs
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The SMS (System Management Services) includes an option to display SMS error
logs for problems that may not have created a problem log viewable with the
service terminal.
Displaying SMS error logs requires the cluster to be taken away from customer use.
Procedure
1. Access the SMS menu options with the service terminal, see ″Entry for Service
Terminal Activities″ in chapter 8 of the Enterprise Storage Server Service Guide,
Volume 3.
2. Display the error logs, see ″Utilities″ in Appendix B of the Enterprise Storage
Server Service Guide, Volume 3.
MAP 4420: Displaying I/O Planar UAA LAN Address
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The SMS (System Management Services) includes an option to Display
Configuration that includes the UAA of the integrated ethernet adapter.
Using SMS requires the cluster to be taken away from customer use.
Procedure
1. Access the SMS menu options with the service terminal, see ″Entry for Service
Terminal Activities″ in chapter 8 of the Enterprise Storage Server Service Guide,
Volume 3.
2. Display the configuration, see ″Display Configuration″ in Appendix B of the
Enterprise Storage Server Service Guide, Volume 3.
MAP 4440: ESSNet Console to Cluster Bay Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
ESSNet console to cluster bay problem.
Procedure
1. Ensure the following ESSNet ethernet hub indications are present:
a. Power LED is on.
b. Error indicator LEDs are off. Reference the ethernet hub maintenance
documentation.
Are the hub indicators as listed above?
v Yes, continue with the next step.
v No, go to the ESSNet ethernet hub maintenance documentation to correct
the problem.
2. Are the ethernet cables from the cluster bays connected to the ESSNet
ethernet hub?
v Yes, continue with the next step.
v No, connect the cables and then repeat steps 3 and 4. If the failure still
occurs go to the next step.
3. Observe the ESSNet ethernet hub port indicators for the ports connected to
cluster bay 1 and cluster bay 2.
The indicator is:
v Off, if the hub port cannot detect the cluster.
v On, if the hub port can detect the cluster.
v Blinking, if the hub port is passing data to/from the cluster.
Is the hub port indicator for each cluster On/Blinking?
v Yes, continue with the next step.
v No, go to “MAP 4390: Isolating a Cluster to Cluster Ethernet Problem” on
page 347.
4. Observe the ESSNet ethernet hub port indicator for the port connected to the
ESSNet console.
The indicator is:
v Off, if the hub port cannot detect the cluster.
v On, if the hub port can detect the cluster.
v Blinking, if the hub port is passing data to/from the cluster.
Is the hub port indicator On/Blinking?
v Yes, go to step 9.
v No, continue with the next step.
5. Ensure the ethernet cable from the ethernet hub to the ESSNet server is
connected.
Is it connected?
v Yes, continue with the next step.
v No, connect the cable and retest.
6. Connect the ESSNet console ethernet cable to another hub port.
Is the hub port indicator On/Blinking?
v Yes, the original hub port is failing. Use the ESSNet ethernet hub
documentation to correct the problem. The hub may need to be reset or
replaced.
v No, connect the cable and retest.
7. Replace the ESSNet console to hub ethernet cable and ensure it is plugged
into the original hub port.
Is the hub port indicator On/Blinking?
v Yes, go to step 9.
v No, continue with the next step.
8. Observe the ESSNet console ethernet port indicator.
Is it On/Blinking?
v Yes, the ethernet port is seeing the ethernet hub, even though that port
indicator is off. Use the ESSNet ethernet hub documentation to correct the
problem. The hub may need to be reset or replaced.
v No, use the ESSNet console documentation to ensure it is installed and
configured properly. Run any diagnostics as needed. If it still fails, call the
next level of support.
9. Go to the ESSNet console and open a DOS window. At the command line,
enter a ping command with the cluster bay TCP/IP address. This will test the
communication from the ESSNet console to each cluster bay. The format is:
’ping 9.172.31.1’.
If the ping is successful, a line of information will be displayed each time data
is received back from the cluster:
For example: 64 bytes from 9.113.24.123: icmp_seq=0 ttl=252 time=4ms. If
the ping is not successful, the line of information will not display and the test
will appear to hang with no response.
Note: Do not leave the ping test running, as it will slow down all
communications through the hub. Press Ctrl/C to quit the ping test.
Was the ping test to the cluster bay successful?
v Yes, the ESSNet is able to communicate with the cluster bay. Go to the
Repair Menu, End of Call Status option.
v No, continue with the next step.
10. Ensure that the TCP/IP minimum configuration and startup fields are set
correctly. Compare it to the customer provided worksheet. Ensure that the
correct TCP/IP protocol (network interface) is selected, en0 or et0. The entire
network must use the same protocol. Check it against the customer provided
TCP/IP address. Use the following service terminal option while connected to
the failing cluster bay:
From the service terminal Main Service Menu, select:
Configuration Options Menu
Configure Communications Resources Menu
Change/Show TCP/IP Configuration
Minimum Configuration & Startup.
Ensure that the correct TCP/IP protocol (network interface) is
selected, en0 or et0.
Does the TCP/IP address used in the ping command match the cluster bay
TCP/IP address displayed?
v Yes, call your next level of support.
v No, retry step 9 on page 353 with the correct TCP/IP address.
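The reply line shown in step 9 has a regular layout, so a captured ping session could be checked by a script rather than by eye. The following Python sketch assumes replies look exactly like the example line in this MAP; real ping programs format their output differently, so the pattern would need adjusting. The names `REPLY_PATTERN` and `parse_ping_reply` are hypothetical.

```python
import re
from typing import Dict, Optional, Union

# Pattern modeled on the example line in this MAP:
#   64 bytes from 9.113.24.123: icmp_seq=0 ttl=252 time=4ms
REPLY_PATTERN = re.compile(
    r"(?P<size>\d+) bytes from (?P<address>\d{1,3}(?:\.\d{1,3}){3}): "
    r"icmp_seq=(?P<seq>\d+) ttl=(?P<ttl>\d+) "
    r"time=(?P<time_ms>\d+(?:\.\d+)?)\s*ms"
)


def parse_ping_reply(line: str) -> Optional[Dict[str, Union[str, int, float]]]:
    """Return the fields of one reply line, or None if it is not a reply."""
    match = REPLY_PATTERN.search(line)
    if match is None:
        return None
    return {
        "address": match.group("address"),
        "seq": int(match.group("seq")),
        "ttl": int(match.group("ttl")),
        "time_ms": float(match.group("time_ms")),
    }
```

A session in which no line parses as a reply corresponds to the "No" path of step 9.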
MAP 4450: ESSNet Cluster Bay to Customer Network Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The 2105 Model Exx/Fxx cluster bay ethernet connections to the customer LAN
network are made through the ESSNet console direct attached ethernet hub. All the
TCP/IP settings including the ethernet protocol (en0 or et0) across the network
must be compatible.
Isolation
1. Ensure the following ESSNet ethernet hub indications are present:
a. Power LED is on.
b. Error indicator LEDs are off. Reference the ethernet hub documentation.
Are the hub indicators as listed above?
v Yes, continue with the next step.
v No, use the ESSNet ethernet hub documentation to correct the problem.
2. Observe the ESSNet ethernet hub port indicators for the ports connected to
cluster bay 1 and cluster bay 2.
The indicator is:
v Off, if the hub port cannot detect the cluster.
v On, if the hub port can detect the cluster.
v Blinking, if the hub port is passing data to/from the cluster.
Is the hub port indicator for the cluster bay On/Blinking?
v Yes, continue with the next step.
v No, go to “MAP 4390: Isolating a Cluster to Cluster Ethernet Problem” on
page 347.
3. Observe the ESSNet ethernet hub port indicator for the port connected to the
customer network.
The indicator is:
v Off, if the hub port cannot detect the cluster.
v On, if the hub port can detect the cluster.
v Blinking, if the hub port is passing data to/from the cluster.
Is the hub port indicator On/Blinking?
v Yes, continue with the next step.
v No, go to “MAP 4440: ESSNet Console to Cluster Bay Problem” on
page 352.
4. Observe the ESSNet ethernet hub port indicator for the port connected to the
customer LAN.
The indicator is:
v Off, if the hub port cannot detect the cluster.
v On, if the hub port can detect the cluster.
v Blinking, if the hub port is passing data to/from the cluster.
Is the hub port indicator On/Blinking?
v Yes, continue with the next step.
v No, go to step 9 on page 356.
5. Ensure that the TCP/IP minimum configuration and startup fields are set
correctly. Compare it to the customer provided worksheet. Ensure that the
correct TCP/IP protocol (network interface) is selected, en0 or et0. The entire
network must use the same protocol. Check it against the customer provided
TCP/IP addresses. Use the following service terminal option while connected
to the failing cluster bay:
From the service terminal Main Service Menu, select:
Configuration Options Menu
Configure Communications Resources Menu
Change / Show TCP/IP Configuration
Minimum Configuration & Startup
Ensure the correct TCP/IP protocol (network interface), en0 or
et0 is selected.
Are the fields set correctly?
v Yes, continue with the next step.
v No, correct the fields and retest the communications.
6. Go to the ESSNet console and open a DOS window. At the command line,
enter a ping command with the cluster bay TCP/IP address. This will test the
communication from the ESSNet console to the cluster bay. The format is:
ping 9.113.24.123 (use your TCP/IP address).
If the ping is successful, a line of information will be displayed each time data
is received back from the cluster:
For example: 64 bytes from 9.113.24.123: icmp_seq=0 ttl=252 time=4ms. If
the ping is not successful, the line of information will not display and the test
will appear to hang with no response.
Note: Do not leave the ping test running, as it will slow down all
communications through the hub. Press Ctrl/C to quit the ping test.
Was the ping test to the cluster bay successful?
v Yes, continue with the next step.
v No, go to “MAP 4440: ESSNet Console to Cluster Bay Problem” on
page 352.
7. Enter a ping command with the customer Nameserver TCP/IP address. Enter a
ping command with the customer Gateway TCP/IP address.
Was each ping test successful?
v Yes, continue with the next step.
v No, the ESSNet console and ethernet hub have proper indicators on and
have TCP/IP addresses that are listed on the customer worksheet. Work
with the customer to isolate the problem. The TCP/IP values on the
worksheet may no longer be correct. Have the customer ping the ESSNet
console and the cluster bay from their network.
8. Use the service terminal connected to the failing cluster to send a test e-mail.
Ensure the cluster is configured for e-mail notification.
From the service terminal Main Service Menu, select:
Machine Test Menu
Send Test Notification Menu
Customer Notification (via E-mail)
Did the customer receive the test e-mail?
v Yes, the cluster bay connection to the customer network is working fine. Go
to the Repair Menu, End of Call Status option.
v No, the ESSNet console was able to ping the cluster bay and the customer
network. If the cluster bay TCP/IP settings for the customer network are
correct, there should not be a problem. Work with the customer to resolve
the problem.
9. Ensure the ethernet cable from the ESSNet console to the ethernet hub is
properly connected.
Is the cable connected at both ends?
v Yes, continue with the next step.
v No, connect the cable and retry the test.
10. Have the customer ensure their ethernet hub is on and has no check
conditions for the hub or the port that is connected to the ESSNet ethernet
hub. Have the customer reset the hub if possible.
Is the customer ethernet hub on and error free?
v Yes, continue with the next step.
v No, have the customer correct the problem and then retest.
11. At the ESSNet ethernet hub, unplug the customer ethernet cable and plug it in
to a known good port.
Is the hub port indicator On/Blinking?
v Yes, the original hub port was not working. Use the hub documentation to
correct the problem. The hub may need to be reset or replaced.
v No, reconnect the cable to its original port. Go to the next step.
12. At the customer ethernet hub, have the customer unplug the customer ethernet
cable and plug it into a known good port.
Is the port indicator On/Blinking?
v Yes, the original hub port was not working. Have the customer correct the
problem. The hub may need to be reset or replaced.
v No, have the customer reconnect the cable to its original port. Continue with
the next step.
13. Have the customer test or replace the ethernet cable. Ensure the cable is the
proper type for the port speed and distance.
Is the port indicator on both ethernet hubs for this cable On/Blinking?
v Yes, the connection is now working. Retest the communication.
v No, call the next level of support.
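Step 5 of this MAP compares the configured TCP/IP fields with the customer provided worksheet. As a rough illustration of that comparison, the Python sketch below lists every field whose configured value differs from the worksheet. The function name and the field names used in the example (`interface`, `address`, `gateway`, `nameserver`) are hypothetical and are not the actual field labels on the service terminal screen.

```python
from typing import Dict, List


def worksheet_differences(
    configured: Dict[str, str], worksheet: Dict[str, str]
) -> List[str]:
    """List the fields where the cluster bay setting differs from the worksheet.

    Both arguments map a field name to its value. A field missing from
    `configured` is treated as an empty value.
    """
    differences = []
    for field in sorted(worksheet):
        expected = worksheet[field].strip()
        actual = configured.get(field, "").strip()
        if actual != expected:
            differences.append(
                f"{field}: configured {actual!r}, worksheet {expected!r}"
            )
    return differences
```

An empty result corresponds to the "Yes" answer to "Are the fields set correctly?"; any entry in the list identifies a field to correct before retesting.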
MAP 4480: Isolating a Cluster / RPC Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
This MAP isolates a problem log with a FRU list that contains both RPC cards, a
cluster I/O planar, and a service processor card (2105 Model E10/E20 only). The
service processor is part of the I/O planar for the 2105 Model F10/F20.
v Each RPC card has a separate status register for each cluster that can be read.
v The path from the cluster code to the RPC registers is:
– Cluster code
– I/O planar
– Service processor card (2105 Model E10/E20 only)
– Cluster power supply
– Power planar
– Electronics cage power planar
– Electronics cage power planar to sense card cable
– Electronics cage sense card
– RPC1 to electronics cage cable
– Rack power control card 1 (RPC card)
– RPC2 to electronics cage cable
– Rack power control card 2 (RPC card)
v The clusters compare the status they receive from the RPC cards. If the status is
not the same, the error recovery code will create a problem log and will fence
(remove from use) a cluster or an RPC Card. The resource fenced is the most
likely cause of the problem.
v There are four basic types of error conditions that are listed in the table below.
The fencing action for each type is shown. The fenced resource will normally
contain the FRU having the highest percent probability of fixing the error
condition. It should be replaced first.
Table 27. Conditions for Fencing
Condition                                             Fences a    Fences an
                                                      Cluster     RPC Card
Only one cluster reads bad status from both RPC       Yes         No
cards. The other cluster reads good information.
Only one cluster reads bad status from one RPC        No          Yes
card. The other cluster reads good information from
the same card.
An RPC card presents invalid status to one or both    No          Yes
clusters.
A cluster cannot read the status from one RPC card.   No          Yes
v When replacing a cluster FRU, the communication to both RPC Cards is only
tested if both RPC Cards are not fenced. If an RPC Card is fenced, it must be
quiesced and then resumed to test the communication from the cluster.
v When replacing an RPC Card, the cluster to cluster comparison of the RPC
status occurs only if both clusters are not fenced or quiesced. If a cluster is
fenced or quiesced, it must be resumed to run the cluster to cluster RPC status
comparison.
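The first two rows of Table 27 can be sketched in code. This is only an
illustration of the comparison logic, not the actual error recovery code. The
function name fence_target, the zero-based cluster and card indexes, and the
2 x 2 list of good/bad reads are assumptions made for the example.

```python
def fence_target(good):
    """Pick the resource to fence from a 2 x 2 matrix of status reads.

    good[c][r] is True when cluster c read good status from RPC card r.
    Indexes are zero-based here (an assumption for this sketch).
    Returns ("cluster", c), ("RPC card", r), or None when there are no bad
    reads or the pattern is not one of the first two rows of Table 27.
    """
    bad = [(c, r) for c in (0, 1) for r in (0, 1) if not good[c][r]]
    bad_clusters = {c for c, _ in bad}
    if len(bad_clusters) != 1:
        return None
    cluster = bad[0][0]
    if len(bad) == 2:
        # Only one cluster reads bad status from both RPC cards.
        return ("cluster", cluster)
    # Only one cluster reads bad status from one RPC card.
    return ("RPC card", bad[0][1])
```

For example, fence_target([[False, False], [True, True]]) selects the first
cluster, because only that cluster sees bad status from both cards.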
Isolation
1. Display the problem log details that sent you here and write down the
timestamp value in the last occurrence field. After the FRU replace you will
display this field again. If the value has been updated, then the same failure is
still occurring and additional FRUs will need to be replaced.
2. The FRU list contains both RPC cards and one or more cluster FRUs. It is
recommended to replace the FRU with the highest probability first.
v To replace a cluster FRU, go to step 3.
v To replace an RPC card FRU, go to step 9 on page 359.
3. Go to “MAP 4700: Replacing Cluster FRUs” on page 375 to replace the cluster
FRU.
Return here after the cluster FRU replacement is completed and the cluster
bay has come ready.
4. Display the problem logs to determine if a problem is still occurring.
v If the original problem log last occurrence timestamp value has been
updated, the problem is still occurring. Return to the beginning of this MAP
to replace the remaining FRUs or call the next level of support if all FRUs
have been replaced.
v If a new related problem log was created, repair that problem now. After that
repair is complete return to this MAP if the original problem is still occurring.
(The last occurrence timestamp field value of the original problem log was
updated during the last cluster bay power on.)
v If the original problem log was not updated and there is no new related
problem log, continue with the next step.
5. Quiesce and then Resume RPC-1. This will ensure that both cluster bays read
the status register from the RPC-1 card.
From the service terminal Main Service Menu, select:
Utility Menu
Resource Management Menu
Quiesce a Resource
Select the Rack Power Control Card to quiesce. Use the Resume a
Resource option to resume that RPC Card.
6. Display the problem logs to determine if a problem is still occurring.
v If the original problem log last occurrence timestamp value has been
updated, the problem is still occurring. Return to the beginning of this MAP
to replace the remaining FRUs. If all FRUs have been replaced, call
the next level of support.
v If a new related problem log was created, repair that problem now. After that
repair is complete return to this MAP if the original problem is still occurring.
(The last occurrence timestamp field value of the original problem log was
updated during the last cluster bay power on.)
v If the original problem log was not updated and there is no new related
problem log, continue with the next step.
7. Quiesce and then Resume RPC-2. This will ensure that both cluster bays read
the status register from the RPC-2 card.
From the service terminal Main Service Menu, select:
Utility Menu
Resource Management Menu
Quiesce a Resource
Select the Rack Power Control Card to quiesce. Use the Resume a
Resource option to resume that RPC Card.
8. Display the problem logs to determine if a problem is still occurring.
v If the original problem log last occurrence timestamp value has been
updated, the problem is still occurring. Return to the beginning of this MAP
to replace the remaining FRUs. If all FRUs have been replaced, call the
next level of support.
v If a new related problem log was created, repair that problem now. After that
repair is complete return to this MAP if the original problem is still occurring.
(The last occurrence timestamp field value of the original problem log was
updated during the last cluster bay power on.)
v If the original problem log was not updated and there is no new related
problem log, go to “MAP 1500: Ending a Service Action” on page 68.
9. Replace the RPC Card. Use the service terminal Replace A FRU option to
replace the RPC card. Then return here and continue with the next step.
10. Display the problem logs to determine if a problem is still occurring.
v If the original problem log last occurrence timestamp value has been
updated, the problem is still occurring. Return to the beginning of this MAP
to replace the remaining FRUs. If all FRUs have been replaced, call the
next level of support.
v If a new related problem log was created, repair that problem now. After that
repair is complete return to this MAP if the original problem is still occurring.
(The last occurrence timestamp field value of the original problem log was
updated during the last cluster bay power on.)
v If the original problem log was not updated and there is no new related
problem log, continue with the next step.
11. Determine if a cluster is fenced.
From the service terminal Main Service Menu, select:
Utilities Menu
Resource Management Menu
Show Fenced Resources
v If a cluster is fenced, continue with the next step.
v If no cluster is fenced, go to “MAP 1500: Ending a Service Action” on
page 68.
12. Quiesce the cluster bay using the Alternate Cluster Repair menu options.
Connect the service terminal to the cluster that is not fenced. From the service
terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair
Quiesce the Alternate Cluster
Resume the cluster using the Alternate Cluster Repair menu options. The
resume causes the cluster bay to load code as if it were being powered on. It
then does a fail-back of the resources from the other cluster.
13. Display the problem logs to determine if a problem is still occurring.
v If the original problem log last occurrence timestamp value has been
updated, the problem is still occurring. Return to the beginning of this MAP
to replace the remaining FRUs. If all FRUs have been replaced, call the
next level of support.
v If a new related problem log was created, repair that problem now. After that
repair is complete return to this MAP if the original problem is still occurring.
(The last occurrence timestamp field value of the original problem log was
updated during the last cluster bay power on.)
v If the original problem log was not updated and there is no new related
problem log, go to “MAP 1500: Ending a Service Action” on page 68.
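The problem log check that is repeated after each repair in this MAP follows
the same three-way decision every time. A minimal sketch of that decision is
shown below. The LogCheck fields and the returned strings are hypothetical
names chosen for the example. They are not fields or messages of the service
terminal.

```python
from dataclasses import dataclass


@dataclass
class LogCheck:
    recorded_last_occurrence: str  # timestamp written down in step 1
    current_last_occurrence: str   # timestamp displayed after the repair
    new_related_log: bool          # True if a new related problem log exists


def next_action(check):
    """Return the action for the 'display the problem logs' steps."""
    if check.current_last_occurrence != check.recorded_last_occurrence:
        # The original problem log was updated: the failure is still occurring.
        return "replace remaining FRUs or call next level of support"
    if check.new_related_log:
        return "repair the new related problem"
    return "continue"
```

The order of the tests matters. An updated last occurrence timestamp is
checked first because it shows that the original failure is still present.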
MAP 44F0: Electronics Cage Cooling Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The electronics cage sense card indicated that all four electronics cage cooling fans
were not turning. The power error recovery code powered off both host bays and
the cluster bay in this electronics cage. The customer resources were ″failed over″
to the working cluster. The power off was necessary to prevent any failures due to
over-temperature. This error is a failure of the fan sense card or the common 12
volt power to the fans.
When the electronics cage is powered on, the fan sense card gives status to both
RPC cards. If the status indicates all four fans are failing, the working cluster
microcode will tell the RPC cards to power off the failing electronics cage host bays
and cluster bays. This check occurs when the electronics cage is powered on,
during 2105 Model Exx/Fxx power on or while replacing FRUs with MAP 4790
below. This check is also active after the power on is complete.
If the failure is in the fan sensing, the fans will turn normally on power up until the
false fan failures power off the electronics cage. If the failure is in the 12 volts to the
fans, the fans will not turn at all.
Isolation
1. Ensure you have read the description above and these steps before going to
“MAP 4790: Repairing the Electronics Cage” on page 395 to replace the FRUs.
The MAP power on procedure can be used before replacing any FRUs to
isolate if the fans turn when the electronics cage is first powered on.
v If the fans turn, replace the Electronics Cage Sense Card.
v If the fans do not turn, they are not getting the 12 volts. See the next step.
2. The possible FRUs are:
v Electronics cage sense card
v Electronics cage power supply (one of the three may be shorting the common
output bus).
v Electronics cage power planar
v Cable Assembly - Fan/RPC to Upper Backplane
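The power off check and the isolation choice described above can be
summarized as a short sketch. The function names and the list-of-booleans
input are assumptions for illustration only. The FRU names are taken from the
list in step 2.

```python
def should_power_off_cage(fan_failing):
    """True when the sense card reports all four cooling fans as failing."""
    return len(fan_failing) == 4 and all(fan_failing)


def suspect_frus(fans_turn_at_power_on):
    """Return the FRUs to consider, based on what the fans do at power on."""
    if fans_turn_at_power_on:
        # The fans have 12 volts, so the fan sensing is suspect.
        return ["Electronics cage sense card"]
    # The fans are not getting the 12 volts.
    return [
        "Electronics cage sense card",
        "Electronics cage power supply",
        "Electronics cage power planar",
        "Cable Assembly - Fan/RPC to Upper Backplane",
    ]
```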
MAP 4500: Isolating an ESC=5xxx
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The failing cluster has created an AIX operating system SRN (Service Request
Number) that was then used to build a problem log with an ESC=5xxx (where xxx
are the first three digits of the SRN). The SRN will be 6 or 8 characters long. The
SRN will be looked up in an SRN reference table to determine the failing FRU or
further isolation procedures needed to determine the failing FRU.
Isolation
Use the following MAP steps to continue this repair action.
1. Ensure that the problem log is still displayed on the service terminal. Record the
values in the following fields:
v ESC
v SRN
v Description
v Additional Message
v Failing Cluster
v Reporting Cluster
v Ignore the information in the Failure Actions, Probable Cause, Failure Cause
and User Actions fields.
2. Look up each SRN listed in the problem log and read its description and action
information.
Then return here and continue at the next step.
v For a 6 digit SRN (XXX-XXX), go to ″Service Request Number List″ in chapter
9 of the Enterprise Storage Server Service Guide, Volume 3.
v For an 8 digit SRN (XXXXXXXX), go to ″Firmware/POST Error Codes″ in
chapter 9 of the Enterprise Storage Server Service Guide, Volume 3.
– If FRU(s) are listed with no further isolation needed, then go to “MAP
4700: Replacing Cluster FRUs” on page 375.
– If further isolation is needed, then go to the listed MAP to determine the
failing FRU.
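The relationship between an SRN, its ESC, and the lookup table to use can be
sketched as follows. This is an illustration only. The function name
classify_srn and the accepted character set (digits and upper case letters)
are assumptions. The ESC rule and the two table names come from the text
above.

```python
import re

SIX_DIGIT = re.compile(r"[0-9A-Z]{3}-[0-9A-Z]{3}")  # XXX-XXX
EIGHT_DIGIT = re.compile(r"[0-9A-Z]{8}")            # XXXXXXXX


def classify_srn(srn):
    """Return (ESC, lookup table name) for a 6 or 8 digit SRN."""
    s = srn.strip().upper()
    if SIX_DIGIT.fullmatch(s):
        table = "Service Request Number List"
    elif EIGHT_DIGIT.fullmatch(s):
        table = "Firmware/POST Error Codes"
    else:
        raise ValueError("SRN is not in XXX-XXX or XXXXXXXX form: " + srn)
    # ESC=5xxx, where xxx are the first three digits of the SRN.
    return "5" + s[:3], table
```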
MAP 4510: Isolating a Cluster to Cluster CPI Communication Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
This MAP is used for a cluster to cluster CPI communication timeout. After AIX is
loaded and while the functional code loads, cluster to cluster communication occurs
across the CPI interfaces (cluster 1 I/O Attachment Card to the Host Bay Planar Card
to the cluster 2 I/O Attachment Card). There are four CPI interfaces that may be used.
Once the cluster code is loaded, each cluster periodically sends a communication
message to the other cluster (heartbeat) and sets a timer waiting for the response.
If the timer expires with no response, the error recovery process will cause the
non-responding cluster to failover its resources to the originating cluster. The
non-responding cluster is then fenced (which removes customer use of that cluster).
The originating cluster attempts to power cycle the non-responding cluster to reload
its code in an attempt to recover it for customer use. A timer is set waiting for the
code load and failback to complete.
v If the non-responding cluster hangs loading the code, this becomes a cluster boot
or cluster down problem. This will cause the working cluster to have a
communication timeout and it will create a problem log with MAP 4510 for
isolation. The code load process normally leaves an error or progress code
displayed in the Cluster Operator Panel.
v The 2105 Model Exx/Fxx code will begin cluster to cluster communication testing
(heartbeats) during the code loading. It checks all 4 CPI paths. If any fail, the
cluster is power cycled up to two times to reload the code and attempt to clear
the condition. If the communication timeout is still present, the failing CPI path
will be fenced. If all 4 CPI paths are failing, the cluster will be fenced.
v If the cluster successfully loads the code, then the error recovery process will
attempt to failback the resources to their original cluster. If the failback is not
successful, this creates a communication timeout, which will create a problem log
with MAP 4510 for isolation.
v If the failback is successful, the error recovery timer is reset and a
communication timeout will not occur. The cluster that created the original
communication problem may still have created a problem log, even if the problem
was temporary, the cluster recovered, and the cluster Ready indicator on the 2105
Model Exx/Fxx operator panel is on.
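The CPI path check during code load (test all 4 paths, power cycle up to two
times, then fence) can be sketched as a small loop. This is not the 2105
microcode. The probe and power_cycle callables and the returned tuple are
hypothetical and exist only to show the order of the decisions.

```python
MAX_POWER_CYCLES = 2  # "power cycled up to two times"


def cpi_recovery(probe, power_cycle):
    """Model the CPI heartbeat check that runs while the code loads.

    probe() returns a list of four booleans (True = path responds).
    power_cycle() reloads the cluster code.
    Returns (action, failing path indexes, power cycles used).
    """
    results = probe()
    cycles = 0
    while not all(results) and cycles < MAX_POWER_CYCLES:
        power_cycle()
        cycles += 1
        results = probe()
    failing = [i for i, ok in enumerate(results) if not ok]
    if failing and len(failing) == len(results):
        return ("fence cluster", failing, cycles)
    if failing:
        return ("fence CPI paths", failing, cycles)
    return ("ok", [], cycles)
```

A single failing path that survives both power cycles results in only that
path being fenced. The cluster is fenced only when every path fails.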
Isolation
Use the following MAP steps to continue this repair action.
1. Ensure that the problem log is still displayed on the service terminal. Note the
following:
v Failing Cluster should be the other cluster (not the one the service terminal is
connected to).
v Reporting Cluster should be the cluster you are connected to.
v Ignore the information in the Failure Actions, Probable Cause, Failure Cause
and User Actions fields.
2. Observe the cluster bay Ready indicator LED for the failing cluster on the 2105
Model Exx/Fxx operator panel.
Is the Ready indicator LED on?
v Yes, the cluster has successfully completed the power on error recovery.
Display problems needing repair and repair any related problems. Then go to
“MAP 1500: Ending a Service Action” on page 68.
v No, the cluster did not successfully complete the power on error recovery.
Continue with the next step.
3. Observe the cluster bay operator panel.
Is the cluster hung displaying a code on the operator panel?
– Yes, go to “MAP 4360: Isolation Using Codes Displayed by the Cluster
Operator Panel” on page 342 and use the codes displayed on the cluster
bay operator panel.
– No, display problems needing repair and repair any related problems. If
there are none, call the next level of support.
MAP 4520: Pinned Data and/or Volume Status Unknown
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
Pinned Data can exist for DASD Fast Write, High Bandwidth Sequential Fast Write,
and Cache Fast Write Data. Pinned Data is caused by failures that prevent data
from being destaged to DASD. These are either DASD failures that make the
array/volume unavailable or failures that make cache and/or NVS data unavailable.
Pinned Data can only be freed or un-pinned by a successful retry of the destage
operation or by a request to discard the pinned data that is received from the host
or service interface.
Isolation
1. Use this step to collect the needed information and then call the next level of
support. Do not perform any repair unless directed by the next level of
support. If repairs are performed in the wrong sequence, customer data loss
can occur.
a. Determine all of the volumes with Pinned Data and/or Volume Status
Unknown.
From the service terminal Main Service Menu, select:
Utilities Menu
Pinned Data Menu/Volume Status Unknown
Display Pinned Data
Note: Volumes displayed have retryable pinned data, non-retryable pinned
data or FC (no global subsystem status). A volume can be listed with
more than one pinned data status. Pinned data status can be caused
by hardware problems which create problem logs. Retryable pinned
data is normally caused by DASD or SSA interface problems.
Non-retryable pinned data is normally caused by cluster problems.
FC status can be caused by either of the above problem types.
b. Display problems needing repair. From the service terminal Main Service
Menu, select:
Repair Menu
Show / Repair Problem Needing Repair
c. Continue with the next step.
2. Call your next level of support now. Have ready the information you gathered in
the last step. Your next level of support may need to login remotely and perform
additional problem analysis.
3. Your next level of support may direct you to do the following steps after they
have reviewed all of the information. They may change the order of the repairs.
Wait for them to guide you before continuing.
4. Are there any DASD or SSA interface related problem logs?
v Yes, repair the DASD or SSA interface problem logs. The repair may allow
retryable pinned data to destage. (An SSA loop with only one DDM failure will
not normally cause pinned data if the DDM is part of a RAID array.)
v No, continue with the next step.
5. Are there any cluster related problem logs?
v Yes, repair the cluster problem logs. The repair may allow pinned data to
destage so the retryable pinned data status is reset. The repair process may
require you to discard non-retryable pinned data before the FRUs are
replaced. This will cause customer data loss.
v No, continue with the next step.
6. After all related repairs have been completed, display the pinned data status.
Do any volumes still have retryable or non-retryable pinned data?
v Yes, inform the next level of support.
v No, continue with the next step.
7. Do any volumes have FC status (no valid global subsystem status available)?
v Yes, go to “MAP 4560: No Valid Subsystem Status Available” on page 370.
v No, go to the Repair Menu, End of Call Status option.
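The pinned data statuses and the final checks in steps 6 and 7 can be
summarized in a short sketch. It is an illustration only, and it does not
replace the direction of the next level of support. The status strings, the
dictionary layout, and the function names are assumptions made for the
example.

```python
# Problem log types that normally cause each status (from the Note in step 1).
LIKELY_LOG_TYPES = {
    "retryable": ("DASD or SSA interface",),
    "non-retryable": ("cluster",),
    "FC": ("DASD or SSA interface", "cluster"),
}
PINNED = {"retryable", "non-retryable"}


def likely_log_types(statuses):
    """Return the problem log types to look for, given a volume's statuses."""
    return sorted({t for s in statuses for t in LIKELY_LOG_TYPES[s]})


def after_repairs(volumes):
    """Steps 6 and 7: volumes maps a volume name to its set of statuses."""
    remaining = set().union(*volumes.values())
    if remaining & PINNED:
        return "inform the next level of support"
    if "FC" in remaining:
        return "go to MAP 4560"
    return "go to the Repair Menu, End of Call Status option"
```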
MAP 4540: Isolating Problems on a Minimum Configuration Cluster
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
This MAP isolates a defective FRU that prevents the cluster from loading code and
becoming Ready. The isolation procedure removes cluster FRUs not needed prior
to accessing the SCSI Hard Drive during power on.
v If the cluster still fails, the remaining FRUs are replaced one at a time until the
failing FRU is identified.
v If the cluster no longer fails, then the FRUs are reinstalled one at a time until the
failing FRU is identified.
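The isolation strategy described above can be sketched as follows. This is a
simplified model, not a service tool. The function name isolate, the boots_ok
callable (which stands for powering on and checking the operator panel), and
the returned tuples are assumptions for illustration.

```python
def isolate(minimum_frus, removed_frus, boots_ok):
    """Model the minimum configuration isolation strategy.

    minimum_frus: FRUs left installed in the minimum configuration.
    removed_frus: FRUs that were removed, in the order they are reinstalled.
    boots_ok(installed): True if the cluster does not fail with these FRUs.
    """
    installed = set(minimum_frus)
    if not boots_ok(frozenset(installed)):
        # The cluster still fails: the failing FRU is in the minimum set.
        return ("exchange remaining FRUs one at a time", None)
    for fru in removed_frus:
        installed.add(fru)
        if not boots_ok(frozenset(installed)):
            # The failure returned with the FRU that was just reinstalled.
            return ("failing FRU", fru)
    return ("no failure", None)
```

In the MAP steps that follow, more than one FRU may be reinstalled at a time.
The sketch adds them one at a time to keep the example short.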
Sometimes error conditions can be repaired by simply draining the I/O Planar
NVRAM which causes the settings to be reloaded on the next power on.
Note: Sometimes an error condition can be caused by the NVRAM settings being
corrupted. The error code may not even be related to the NVRAM or the I/O
Planar that contains the NVRAM and its battery backup. Once you get to
MAP step 4540-2 and have the cluster bay in the service position, you may
want to try draining the NVRAM and powering back on. To drain the
NVRAM, remove the I/O Planar battery and use a metal object to
momentarily touch the battery socket + and - contacts together. Then replace
the battery and power on the cluster bay to see if the problem has been
repaired. If it still fails, then you can proceed with the remaining MAP steps.
MAP Step 4540-1
This step removes the cluster from customer use, displays the SP error logs, and
changes the service processor reboot attempts setting from 3 to 0.
1. Quiesce the failing cluster.
Connect the service terminal to the working cluster. From the service terminal
Main Service Menu, select:
Repair Menu
Alternate Cluster Repair Menu
Quiesce Alternate Cluster
2. Check the SP error logs. The service processor may have recorded one or
more symptoms in its error log.
Note: If the error condition does not allow this, continue with step 3.
a. Power off the cluster. Use the Alternate Cluster Repair Menu option
to do this; refer to ″Cluster Bay Power Off Using the Service
Terminal″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2. Verify the cluster power off by pressing the
CD-ROM eject button. The CD-ROM tray should not open.
b. Connect the service terminal to the failing cluster and press enter to
display the service processor Main Menu.
From the service processor Main Menu, select:
System Information Menu
Read SP Error Logs
3. Change the service processor reboot attempts setting, using this step, then
continue with “MAP Step 4540-2” on page 366.
Change the service processor reboot attempts setting from 3 to 0 for this
isolation process.
Note: If the error condition does not allow this, continue with “MAP Step
4540-2” on page 366.
Note: Remember to reset the reboot attempts back to 3, after the isolation is
complete.
From the service processor Main Menu, select:
System Power Control Menu
Reboot/Restart Power-On Menu
Number of reboot attempts
MAP Step 4540-2
This step tests if the minimum configuration cluster is functional when the SCSI
hard drive, CD-ROM drive, most interface cables, I/O Attachment cards, NVS cards,
and SSA cards are unplugged.
Attention: The FRUs and cables in this procedure are ESD-sensitive. Always wear
an ESD wrist strap during this isolation procedure. Follow the ESD procedures in
″Working with ESD-Sensitive Parts″ in chapter 4 of the Enterprise Storage Server
Service Guide, Volume 2.
1. Slide the cluster bay into the service position.
Go to the correct cluster bay model repair procedure, in chapter 4 volume 2 of
this book, see:
v 2105 Model E10/E20, ″Cluster Bay Service Position (E10/E20)″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2 or
v 2105 Model F10/F20, ″Cluster Bay Service Position (F10/F20)″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2
2. Open the cluster bay top cover.
Go to the correct cluster bay model repair procedure, in chapter 4 volume 2 of
this book, see:
v 2105 Model E10/E20, ″Cluster Top Bay Servicing (E10/E20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2 or
v 2105 Model F10/F20, ″Cluster Top Bay Servicing (F10/F20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2
3. Record the slot location of each I/O Attachment Card, NVS card and SSA card.
Then remove them. They will be plugged back in the same slots when this
isolation procedure is complete.
4. Remove the processor card in system planar slot C2.
Note: The processor cards have very little clearance between them. Ensure
no components are broken off as the processor card is removed. It may
be easier to first remove the processor card in slot C1, then in slot C2,
then plug back in slot C1.
5. Remove the memory card in system planar slot M2 (if installed).
6. Remove the memory card in system planar slot M1. Record the position of the
memory DIMMs on the memory card from system planar slot M1. Remove all the
installed memory DIMM pairs except the first pair in slots J1 and J2. Reinstall
the memory card in system planar slot M1.
Note: A memory DIMM pair must be installed in slots that are next to each
other (example, J1 and J2 or J13 and J14). The width of a memory
word requires a pair of DIMMs.
7. Disconnect the SCSI signal cable from the I/O planar. (AIX location code 10-60
or physical location code R1-Ty-P2-Z1.1)
8. Disconnect the diskette drive signal cable from the I/O planar. (Physical
location code R1-Ty-P2-D1.1)
9. Disconnect both serial interface cables. (AIX location code 01-S1 / physical
location code R1-Ty-P2-S1.1 ) (AIX location code 01-S3 / physical location
code R1-Ty-P2-S3.1 )
10. Disconnect the parallel interface cable. (AIX location code 01-R1.)
11. Disconnect the ethernet cables. (AIX location code 10-80 / physical location
code R1-Ty-P2-E1.1 )
Note: Ensure that the Operator Panel Cable has not been disconnected, it is
needed to display the checkpoints. (Physical location code
R1-Ty-P2-L1.1)
12. Close the cluster bay top cover.
Go to the correct cluster bay model repair procedure, in chapter 4 volume 2 of
this book, see:
v 2105 Model E10/E20, ″Cluster Top Bay Servicing (E10/E20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2 or
v 2105 Model F10/F20, ″Cluster Top Bay Servicing (F10/F20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2
13. Slide the cluster bay into the operating position.
Go to the correct cluster bay model repair procedure, in chapter 4 volume 2 of
this book, see:
v 2105 Model E10/E20, ″Cluster Bay Service Position (E10/E20)″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2 or
v 2105 Model F10/F20, ″Cluster Bay Service Position (F10/F20)″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2
14. Connect the service terminal to the working cluster. Use the Alternate Cluster
Repair Menu options to power on the failing cluster, refer to ″Cluster Bay
Power On Using the Service Terminal″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2 or ″Cluster Bay Power Off Using the Service
Terminal″ in chapter 4 of the Enterprise Storage Server Service Guide, Volume
2.
Note: Use the Alternate Cluster Repair Menu options to power on or off the
cluster during this procedure when FRUs are changed.
15. Wait up to 3 minutes for the operator panel to stabilize at a status code.
Does the operator panel stabilize with code E1F2, E1F3, E1F7, STBY, 20EE000B,
or 4BA0830 (boot device problems)?
Note: This is expected if the failing FRU has been removed or unplugged.
v Yes, go to “MAP Step 4540-4” on page 368.
v No, go to “MAP Step 4540-3”.
Attention: In the following map steps, refer to the following previous steps for
replacing cluster FRUs:
Cluster Removal, see step 1 on page 366.
Cluster Replacement, see steps 12, 14, and 15.
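The decision in step 15 is a simple membership test on the code shown by the
operator panel. A minimal sketch follows. The function name next_map_step is
hypothetical. The codes are copied as printed in step 15.

```python
# Codes that indicate boot device problems, as listed in step 15.
BOOT_DEVICE_CODES = {"E1F2", "E1F3", "E1F7", "STBY", "20EE000B", "4BA0830"}


def next_map_step(panel_code):
    """Return the MAP step to go to after the operator panel stabilizes."""
    if panel_code.strip().upper() in BOOT_DEVICE_CODES:
        # Expected when the failing FRU has been removed or unplugged.
        return "MAP Step 4540-4"
    return "MAP Step 4540-3"
```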
MAP Step 4540-3
The minimum cluster configuration still includes the failing FRU.
1. Move the memory card from system planar slot M1 to M2.
Does the operator panel stabilize with code E1F7 or 20EE000B?
v Yes, the system planar slot M1 is failing. Replace the system planar, then go
to “MAP Step 4540-8” on page 369.
v No, continue with the next step.
2. Move the memory card from system planar slot M2 back to slot M1.
3. Exchange each of the following FRUs in order, until the operator panel stabilizes
with E1F7 or 20EE000B:
a. Processor card (Use second processor card if available.)
b. Memory DIMM pair (Not needed if second memory card is available. See
next item in this list.)
c. Memory card (Use second memory card if available. If it indicates that the
first memory card is failing, reinstall that memory card. Isolate the problem
by removing DIMM pairs, or using DIMM pairs from the known good memory
card. The failure may be a DIMM pair or the memory card.)
d. I/O planar
e. System planar
f. Service processor
g. I/O planar battery
Does the operator panel stabilize with code E1F7 or 20EE000B?
v Yes, the failing FRU has been replaced, go to “MAP Step 4540-8” on
page 369.
v No, replace the next FRU listed.
– If all the FRUs have been exchanged, call your next level of support.
– If the symptom has changed, check for loose cards, cables, and
obvious problems. If you do not find a problem, return to “MAP Step
4540-1” on page 365.
MAP Step 4540-4
No failure was detected with the current configuration. The cluster stabilized with
E1F7 or 20EE000B as expected because the SCSI Hard Drive interface cable is
still unplugged.
1. Reinstall one or more DIMM pair(s) at a time on the slot M1 memory card. Then
reinstall the slot M2 memory card (if present). Check the operator panel code
after each FRU(s).
Does the operator panel stabilize with code E1F7 or 20EE000B?
v Yes, repeat this step until all memory FRUs have been reinstalled, then go to
“MAP Step 4540-5”.
v No, replace the failing memory FRU. If the memory FRU does not repair the
failure, replace the following FRUs in the order listed. System planar, I/O
planar, cluster power planar and cluster power planar cables. After the repair
is successful, go to “MAP Step 4540-5”.
MAP Step 4540-5
1. Reinstall the processor card in system planar slot C2.
Does the operator panel stabilize with code E1F7 or 20EE000B?
v Yes, go to “MAP Step 4540-6” on page 369.
v No, replace the processor card in system planar slot C2. If it still does not
stabilize at E1F7 or 20EE000B, replace the System Planar, then go to “MAP
Step 4540-8” on page 369.
MAP Step 4540-6
1. Reconnect one or more of the following cables.
v Diskette drive signal cable. (Physical location code R1-Ty-P2-D1.1)
v Both serial interface cables. (AIX location code 01-S1 / physical location code
R1-Ty-P2-S1.1 ) (AIX location code 01-S3 / physical location code
R1-Ty-P2-S3.1 )
v Parallel interface cable. (AIX location code 01-R1.)
v Ethernet cables. (AIX location code 10-80 / physical location code
R1-Ty-P2-E1.1 )
Does the operator panel stabilize with code E1F7 or 20EE000B?
v Yes, if all the cables listed above have been connected, continue at the next
step.
v No, the cable(s) just connected is causing the failure. If the end of the cable
away from the I/O Planar is connected to a FRU (such as the diskette drive),
disconnect it and repeat the test.
– If it does not fail, replace the FRU that was disconnected.
– If it still fails, replace the following FRUs in the order listed. Cable, I/O
planar, cluster power planar. After the repair is successful, go to “MAP
Step 4540-8”.
MAP Step 4540-7
1. Reinstall one or more of the I/O Attachment cards, NVS cards, SSA cards.
Does the operator panel stabilize with code E1F7 or 20EE000B?
v Yes, go to “MAP Step 4540-9”.
v No, one of the FRUs just installed is failing and should be replaced. Isolate to
the failing FRU, then exit this MAP and replace the FRU using “MAP 4700:
Replacing Cluster FRUs” on page 375. Skip the steps that prepare the cluster
for service and power off. Begin with the step that replaces the failing cluster
bay FRU.
MAP Step 4540-8
A failing FRU has been replaced. The cluster now stabilizes with code E1F7 or
20EE000B.
1. Reinstall all remaining FRUs and reconnect all cables except the SCSI signal
cable (AIX location code 10-60 or physical location code R1-Ty-P2-Z1.1).
Does the operator panel stabilize with code E1F7 or 20EE000B?
v Yes, go to “MAP Step 4540-9”.
v No, one of the FRUs or cables just installed is causing a problem. Remove
them one at a time to isolate the failure and replace the failing FRU. If it still
fails, replace the following FRUs in the order listed: I/O planar, cluster power
planar. When the failing FRU has been replaced and the operator panel
stabilizes with code E1F7 or 20EE000B, go to “MAP Step 4540-9”.
MAP Step 4540-9
The prior MAP steps ensured that the cluster was ready to access the boot device.
This step reconnects the SCSI signal cable to the SCSI Hard Drive (boot device).
Note: Remember to reset the service processor reboot attempts setting to 3, if it
was set to 0.
1. Reconnect the SCSI signal cable. (AIX location code 10-60 or physical
location code R1-Ty-P2-Z1.1)
The cluster ready indicator on the 2105 Model E10/E20 operator panel is
on:
v Ready (may take 15 minutes). The cluster loaded code properly.
– The cluster FRU you have replaced may require an additional
preparation to ensure the cluster is ready for customer use.
Reference “MAP 4700: Replacing Cluster FRUs” on page 375.
Review the FRU replacement procedure in steps 11 on page 377 to
17 on page 380 for any additional actions. Then return here and
continue.
Note: Remember to reset the service processor reboot attempts
setting back to 3. See step 3 on page 365.
– Go to “MAP 1500: Ending a Service Action” on page 68.
v E1F7 or 20EE000B. The cluster cannot boot from the SCSI HD. There
is a SCSI HD problem or a SCSI interface problem. Go to “MAP 4320:
Isolating E1xx SCSI Hard Drive Code Boot Problems” on page 336.
v Other codes or symptoms. Go to “MAP 4360: Isolation Using Codes
Displayed by the Cluster Operator Panel” on page 342.
2. Exit this MAP and go to 12 on page 378 to close the cluster bay top
cover and complete the repair.
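The three outcomes listed in MAP Step 4540-9 amount to a simple lookup from the operator panel result to the next procedure. The following is a minimal sketch only; the function name and the string used for the ready state are hypothetical and are not part of the 2105 service terminal. It also leaves out the review of the FRU replacement steps that the Ready case calls for before ending the service action.

```python
# Illustrative sketch: map the operator panel result after the SCSI signal
# cable is reconnected to the follow-on procedure named in MAP Step 4540-9.
# The names below are hypothetical, not 2105 service terminal functions.
BOOT_FAILURE_CODES = {"E1F7", "20EE000B"}


def next_procedure(panel_state: str) -> str:
    """Return the follow-on procedure for a given operator panel state."""
    state = panel_state.strip().upper()
    if state == "READY":
        # The cluster loaded code properly.
        return "MAP 1500: Ending a Service Action"
    if state in BOOT_FAILURE_CODES:
        # The cluster cannot boot from the SCSI hard drive.
        return "MAP 4320: Isolating E1xx SCSI Hard Drive Code Boot Problems"
    # Any other code or symptom.
    return "MAP 4360: Isolation Using Codes Displayed by the Cluster Operator Panel"
```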
MAP 4550: NVS FRU Replacement
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
When a problem log calls this MAP, the NVS FRU(s) must be replaced as
described below.
Isolation
1. A problem log with one or more NVS FRUs sent you here. The NVS FRU
replacement kit contains two NVS memory cards and one NVS top card
crossover. Replace all three FRUs because they are a matched and tested
set.
Note: If the NVS FRU kit contains any written instructions, follow those
instructions. After you complete these instructions, return here and
continue with the next step.
2. Go to “MAP 4700: Replacing Cluster FRUs” on page 375.
MAP 4560: No Valid Subsystem Status Available
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
Global subsystem status (GSS) exists for each Logical Subsystem (LSS). Two
copies are kept, each on a separate array. If one copy becomes unavailable, a
problem log is created and a new second copy is created on a different array if
possible. It stays in this new location even after the repair is complete. Normally,
when a volume is unavailable, the array it is located on has status of offline or
unknown.
An LSS can operate on just one GSS copy. If both GSS copies are unavailable, the
LSS gives ’FC’ status to all ESCON host system requests to its volumes. The LSS
gives command rejects and check conditions of internal target failure to all SCSI
host system requests to its volumes. There can be one or more problem logs for
each GSS copy that is unavailable. It normally takes two or more failures to prevent
the fault tolerant RAID architecture from accessing a particular array (rank).
If access to the GSS copies was lost, but the data is still valid, then the repair
action should restore access. This will automatically reset the ″No Valid Subsystem
Status″ condition.
If both copies lost the actual GSS data, then the GSS status for that LSS will have
to be reset when determined by the next level of support. This can cause customer
data loss.
There is no one problem log that will identify the various combinations of failures
that created the condition. Each GSS copy has at least one problem log needing
repair. There may be other non-related problem logs needing repair also. An
example would be a problem log for a DDM replacement on an array and SSA loop
not part of the LSS with the condition. Therefore, the isolation procedure below
helps you determine the highest priority problem to repair first.
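The relationship between the two GSS copies and the host-visible status described above can be summarized in a few lines. This is a minimal sketch for illustration only; the function name, argument layout, and return strings are assumptions and do not represent 2105 microcode.

```python
# Illustrative sketch of the GSS rule described above: an LSS can operate
# on one GSS copy, and only the loss of both copies changes what hosts see.
# All names and return strings are hypothetical.
def lss_host_response(copy_available, interface):
    """copy_available: two booleans, one per GSS copy.
    interface: "ESCON" or "SCSI"."""
    if interface not in ("ESCON", "SCSI"):
        raise ValueError(f"unknown host interface: {interface!r}")
    if len(copy_available) != 2:
        raise ValueError("exactly two GSS copies are kept per LSS")
    if any(copy_available):
        # One valid copy is enough for the LSS to operate.
        return "normal"
    if interface == "ESCON":
        return "FC status"
    # SCSI hosts receive command rejects and check conditions.
    return "check condition: internal target failure"
```

Restoring either copy returns the sketch to the "normal" result, which matches the statement that the LSS can operate on just one GSS copy.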
Isolation
1. It is important you read the description section above before proceeding with
this isolation procedure.
2. Call your next level of support before going to the next step.
3. Display the pinned data status:
From the service terminal Main Service Menu, select:
Utilities Menu
Pinned Data Menu
Display Pinned Data
A volume is only displayed if it has pinned data status. The LUA/LSS and SSID
are shown for each volume displayed. The display groups volumes having
retryable pinned data, non-retryable pinned data and ’FC’ (no global subsystem
status).
v If a volume has ’FC’ status, go to the next step.
v If a volume has retryable or non-retryable pinned data go to “MAP 4520:
Pinned Data and/or Volume Status Unknown” on page 363.
4. Display the status of all arrays (ranks):
From the service terminal Main Service Menu, select:
Utilities Menu
Display Physical and Logical Configuration
List all Ranks
An array with status of offline or unknown may include one or both GSS
volumes. Record any arrays with this status then go to the next step.
5. Determine the SSA loop and DDM bay locations that the offline or unknown
arrays are part of:
From the service terminal Main Service Menu, select:
Utilities Menu
Display Physical and Logical Configuration
List Physical Disks in a Rank
At the Select A Rank Name display, find the rank (array) noted in the prior step.
Record the drawer and location fields for that rank. A rank can exist on more
than one drawer and may appear more than once in the list.
Determine the loop name (color) by observing the SSA cables connected to the
DDM bay at the location (physical) noted.
6. Display problems needing repair:
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
Display the problem details for each problem. Notice the physical location code
and/or SSA loop identified. Record the problem record ID for any problem
related to an array that is offline or unknown. Go to the next step.
7. If an array has more than one problem record related to it, use the following
priorities:
a. First repair a problem that includes an SSA card or SSA cable as a FRU or
an isolation procedure for these FRUs.
b. Next repair a problem that has an SRN of:
46000 (more than one DDM not available)
48900 (more than one DDM failed)
48950 (array build failed)
c. Repair any remaining related problems.
8. After each repair is complete, display the pinned data status. Restoring just one
of the two GSS copies will clear the No Valid Subsystem Status Available
condition.
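The repair priorities in step 7 can be expressed as a sort key. The sketch below is illustrative only: the dictionary fields (`id`, `srn`, `ssa_fru`) are hypothetical stand-ins for what the problem details display shows, and only the three SRNs are taken from the text.

```python
# Illustrative sketch of the step 7 repair priorities. The record layout
# is hypothetical; only the three SRN values come from the procedure.
PRIORITY_SRNS = {"46000", "48900", "48950"}


def repair_priority(problem: dict) -> int:
    """Lower value means repair earlier."""
    if problem.get("ssa_fru", False):
        # Problem includes an SSA card or SSA cable as a FRU, or an
        # isolation procedure for these FRUs.
        return 0
    if problem.get("srn") in PRIORITY_SRNS:
        return 1
    return 2


def order_repairs(problems):
    """Return the problems in repair order. The sort is stable, so problems
    with equal priority keep the order in which they were recorded."""
    return sorted(problems, key=repair_priority)
```

For example, a list holding a generic problem, an SRN 48900 problem, and an SSA cable problem is returned with the SSA cable problem first and the generic problem last.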
MAP 4580: Pinned Data In Single Cluster NVS
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
A problem caused both clusters to shut down. During the recovery power on and
code load, the problem caused one cluster to have pinned data in cache or NVS.
The repair of that cluster problem will not automatically reset the pinned data
condition because it occurred during the power on of both clusters. The only way to
reset the pinned data condition is to power on both clusters after the cluster repair
is complete.
Isolation
1. Display the pinned data status for each cluster.
From the service terminal Main Service Menu, select:
Utilities Menu
Pinned Data Menu
Display Pinned Data
A volume is only displayed if it has pinned data status. The LUA/LSS and SSID
are shown for each volume displayed. The display groups volumes having
retryable pinned data, non-retryable pinned data and ’FC’ (no global subsystem
status). A volume can be listed with more than one pinned data condition.
2. Are there any volumes with retryable or non-retryable pinned data?
v Yes, go to “MAP 4520: Pinned Data and/or Volume Status Unknown” on
page 363.
v No, continue with the next step.
3. Are there any volumes with ’FC’?
v Yes, go to “MAP 4560: No Valid Subsystem Status Available” on page 370.
v No, go to the Repair Menu, End of Call Status option.
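The two questions in steps 2 and 3 form a short decision sequence. A minimal sketch follows, assuming the three pinned data groups from the Display Pinned Data screen are available as booleans; the function and parameter names are hypothetical.

```python
# Illustrative sketch of the MAP 4580 isolation decisions (steps 2 and 3).
# Inputs are hypothetical booleans read from the Display Pinned Data screen.
def map_4580_next(has_retryable: bool, has_non_retryable: bool, has_fc: bool) -> str:
    """Return where the isolation procedure sends you next."""
    if has_retryable or has_non_retryable:
        # Step 2: pinned data is checked before 'FC' status.
        return "MAP 4520: Pinned Data and/or Volume Status Unknown"
    if has_fc:
        # Step 3: no pinned data, but no valid global subsystem status.
        return "MAP 4560: No Valid Subsystem Status Available"
    return "Repair Menu, End of Call Status option"
```

Because a volume can be listed with more than one condition, the order of the checks matters: pinned data is handled before 'FC' status.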
MAP 4600: Isolating a CD-ROM Test Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The CD-ROM drive in one of the clusters is failing.
Isolation
Retry the failing operation with another CD-ROM disk of the same type.
Note: Even though the service terminal CD diagnostic calls for a Test Pattern CD
to be used, any 2105 code/LIC CD disc may be used.
Is the CD-ROM still failing?
v Yes, go to “MAP 4700: Replacing Cluster FRUs” on page 375 and replace the
CD-ROM drive.
v No, discard the failing CD-ROM disk and replace it with a new one of the same
type.
MAP 4610: Cluster SP/System Firmware Down-level
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The cluster SP or System Firmware is down-level. This can happen when the SP
card (Ex0 only) or I/O planar FRUs are replaced and have down-level firmware. On
cluster power up, the down-level code is discovered and a problem log is created.
This occurs even before you have the chance to check and update the firmware per
the FRU Replace table in “MAP 4700: Replacing Cluster FRUs” on page 375.
Before firmware can be updated, all problem logs needing repair must be repaired
or cancelled.
Isolation
1. Cancel the problem log that sent you to this MAP.
From the service terminal Main Service Menu, select:
Utility Menu
Problem Log Menu
Change a Problem State
2. Repair all problem logs needing repair before going to the next step.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
3. Check and update to the latest level of LIC firmware for the I/O planar and SP.
From the service terminal Main Service Menu, select:
Licensed Internal Code Maintenance Menu
Multiple LIC Menu
Select one of the following:
v Concurrent or Nonconcurrent
Select one: a. Concurrent or b. Nonconcurrent.
v System Planar / Service Processor Menu
MAP 4620: Isolating a Diskette Drive Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The diskette drive in one of the clusters is failing.
Isolation
Retry the failing operation with a new diskette of the same type.
Is the diskette drive still failing?
v Yes, go to “MAP 4700: Replacing Cluster FRUs” on page 375 and replace the
diskette drive.
v No, discard the failing diskette and replace it with a new one of the same
type.
MAP 4630: Listed FRUs May Be Incomplete or Need Isolation
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
See isolation below.
Isolation
The list of FRU(s) in the problem log may only be the FRU(s) reporting the error.
The actual failing FRU(s) may not be listed. To determine what the additional FRUs
are, YOU MUST:
1. Determine the Service Request Number (SRN).
2. Locate the SRN in the ″Service Request Number List″ in chapter 9 of the
Enterprise Storage Server Service Guide, Volume 3.
3. Find the Failing Function Codes (FFC) listed with the SRN.
4. Locate the FFC in the ″Failing Function Code Table″ in chapter 9 of the
Enterprise Storage Server Service Guide, Volume 3. Use the information with
the FFC to identify additional FRUs.
5. Use the Description and Action column, in the Service Request Number List, to
determine any further isolation procedures.
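The lookup chain in steps 1 through 4 (SRN to failing function codes, then FFC to FRUs) can be sketched as two table lookups. The table contents below are invented placeholders, not real SRN or FFC values; the actual data is in the Service Request Number List and the Failing Function Code Table in Volume 3. All names are hypothetical.

```python
# Illustrative sketch of the SRN -> FFC -> FRU lookup chain.
# Every table entry here is a made-up placeholder, not real 2105 data.
SRN_TO_FFCS = {
    "EXAMPLE-SRN": ["FFC-1", "FFC-2"],
}
FFC_TO_FRUS = {
    "FFC-1": ["example FRU A"],
    "FFC-2": ["example FRU B", "example FRU A"],
}


def additional_frus(srn: str, listed_frus: list) -> list:
    """Return FRUs implied by the SRN that the problem log did not list,
    in table order and without duplicates."""
    seen = set(listed_frus)
    extra = []
    for ffc in SRN_TO_FFCS.get(srn, []):
        for fru in FFC_TO_FRUS.get(ffc, []):
            if fru not in seen:
                seen.add(fru)
                extra.append(fru)
    return extra
```

An unknown SRN returns an empty list here; in practice the Description and Action column of the Service Request Number List still has to be checked for further isolation procedures, as step 5 says.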
MAP 4700: Replacing Cluster FRUs
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A problem log or MAP isolation procedure has identified one or more cluster bay
FRUs for replacement. Following all steps in this MAP will ensure the FRU is
replaced and verified properly.
Procedure
1. Is there an existing problem log for the cluster bay FRU(s) being replaced?
v Yes. Display the problem log details and write down the time in the last
occurrence field. After the FRU has been replaced and the cluster has
powered on, you will display problems needing repair to determine if the
problem has been repaired. One of three conditions will exist:
a. No errors were detected. No new problem logs were created and the
timestamp in the last occurrence field of the existing problem log was
not updated.
b. The same error was detected. The timestamp in the last occurrence field
of the existing problem log was updated. A new problem log was not
created.
c. A different error was detected. A new problem log was created. The
timestamp in the last occurrence field of the existing problem log was
not updated.
v No. Go to the next step.
2. Connect the service terminal to the cluster bay that is not being repaired. See
″Service Terminal Setup″ in chapter 8 of the Enterprise Storage Server Service
Guide, Volume 3. You need to use the Alternate Cluster Repair menu options
from that cluster bay.
3. Quiesce the cluster bay being repaired using the service terminal Alternate
Cluster Repair menu option.
Note: If pinned data is detected during the quiesce, you will be sent to MAP
4520: Pinned Data or FC Status.
From the service terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair
Quiesce the Alternate Cluster
4. Was pinned data status detected during the cluster bay quiesce in the prior
step?
v Yes, ensure all the actions in MAP 4520: Pinned Data or FC Status were
attempted. Then quiesce the cluster bay this time using the Unconditionally
Quiesce the Alternate Cluster option instead of the Quiesce the Alternate
Cluster option. This will bypass the check for pinned data. When the
quiesce is complete, go to the next step.
v No, go to step 8.
5. Was the original pinned data status non-retryable?
v Yes, continue with the next step.
v No, go to step 8.
6. Is an NVS card FRU being replaced?
v Yes, continue with the next step.
v No, go to step 8.
7. The cluster bay must be prepared for the NVS to be repaired.
From the service terminal Main Service Menu, select:
Utility Menu
Pinned Data Menu
Pinned Data NVS Repair
Continue with the next step.
8. Power off the cluster bay being repaired using the service terminal Alternate
Cluster Repair menu option.
From the service terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair
Power Off the Alternate Cluster
9. Slide the cluster bay into the service position.
See the repair procedure for the correct cluster bay model in chapter 4,
volume 2 of this book:
v 2105 Model E10/E20, ″Cluster Bay Service Position (E10/E20)″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2 or
v 2105 Model F10/F20, ″Cluster Bay Service Position (F10/F20)″ in chapter 4
of the Enterprise Storage Server Service Guide, Volume 2
10. Open the cluster bay top cover.
Go to the repair procedure for the correct cluster bay model in chapter 4,
volume 2 of this book:
v 2105 Model E10/E20, ″Cluster Top Bay Servicing (E10/E20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2 or
v 2105 Model F10/F20, ″Cluster Top Bay Servicing (F10/F20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2
11. Replace the cluster bay FRU(s). Use the following list to reference the
replacement procedures in chapter 4, volume 2 of this book.
After the repair, return here and continue with the next step.
Note: If replacing more than one FRU, ensure you read and do all the actions
for each FRU before completing this MAP.
Go to the correct model cluster bay repair (Removal and Replacement)
procedure, 2105 Model E10/E20 or 2105 Model F10/F20, below:
v 2105 Model E10/E20 FRUs:
a. ″System, I/O, and Power Planars, Cluster Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
b. ″332 Mhz CPU Card, Cluster Bay″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
c. ″Memory Card, Cluster Bay″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
d. ″Memory Card, Memory Module, Cluster Bay″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
Attention: When replacing memory modules (DIMMs) on a 2105 Model
E10/E20, the DIMMs should be replaced in pairs to avoid a long service
action.
Note: If only one replacement DIMM is available, replacing one DIMM
has a 50 percent chance of a successful repair. If the verification
tests fail, repeat the repair after replacing the other DIMM in the
failing DIMM pair slot.
e. ″Service Processor Card, Cluster Bay (E10/E20)″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
f. ″Drives, Cluster Bay (E10/E20)″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
g. ″I/O Attachment Card, Cluster Bay (E10/E20)″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
h. ″NVS Memory Card and Top Card Crossover, Cluster Bay (E10/E20)″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
i. ″NVS Cache Module, Cluster Bay (E10/E20)″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
j. ″SSA Service Card, Cluster Bay (E10/E20)″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
k. ″SSA Device Card Dram Module, Cluster Bay (E10/E20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
l. ″Operator Panel, Cluster Bay (E10/E20)″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
m. ″I/O Planar Battery, Cluster Bay (E10/E20)″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
n. ″Cable, Cluster Bay (E10/E20)″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
v 2105 Model F10/F20 FRUs:
a. ″Cluster, Bay Fan″ in chapter 4 of the Enterprise Storage Server Service
Guide, Volume 2.
b. ″System, I/O, and Power Planars, Cluster Bay (F10/F20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
c. ″255 Mhz CPU Card, Cluster Bay (F10/F20)″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
d. ″Memory Card, Cluster Bay (F10/F20)″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
e. ″Memory Card, Memory Module, Cluster Bay (F10/F20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
Attention: When replacing memory modules (DIMMs) on a 2105 Model
F10/F20, the DIMMs should be replaced in pairs to avoid a long service
action.
Note: If only one replacement DIMM is available, a swapping action is
required to have a successful repair. When one DIMM fails, both
DIMMs in the pair are made unavailable. Each DIMM has a
unique internal serial number that is read at power up. Both
DIMMs in the pair will be made available ONLY when both DIMM
slots have a different DIMM serial number.
Perform the following repair actions:
1) Remove the defective DIMM from the indicated FRU location and
mark it as defective.
2) Remove the working DIMM from the other slot in the pair. Swap this
DIMM into the other slot that had the defective DIMM.
3) Install the new DIMM into the open slot.
f. ″Drives, Cluster Bay (F10/F20)″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
g. ″I/O Attachment Card, Cluster Bay (F10/F20)″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
h. ″NVS Memory Card and Top Card Crossover, Cluster Bay (F10/F20)″ in
chapter 4 of the Enterprise Storage Server Service Guide, Volume 2.
i. ″NVS Cache Module, Cluster Bay (F10/F20)″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
j. ″SSA Service Card, Cluster Bay (F10/F20)″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
k. ″SSA Service Card Dram Module, Cluster Bay (F10/F20)″ in chapter 4 of
the Enterprise Storage Server Service Guide, Volume 2.
l. ″Operator Panel, Cluster Bay (F10/F20)″ in chapter 4 of the Enterprise
Storage Server Service Guide, Volume 2.
m. ″I/O Planar Battery, Cluster Bay (F10/F20)″ in chapter 4 of the
Enterprise Storage Server Service Guide, Volume 2.
n. ″Cables, Cluster Bay (F10/F20)″ in chapter 4 of the Enterprise Storage
Server Service Guide, Volume 2.
12. Close the cluster bay top cover then slide the cluster bay into the operating
position.
13. Did you replace the I/O planar or I/O planar battery?
v Yes, go to step 14.
v No, power on the cluster bay being repaired using the Alternate Cluster
Repair menu option. Go to step 15 on page 379.
14. Replacing the I/O planar or I/O planar battery affects the NVRAM service
terminal connection serial port settings.
a. Power on the cluster bay being repaired using the Alternate Cluster
Repair menu option. As soon as the cluster bay begins to power up,
immediately continue with this procedure.
b. Connect the service terminal cable to the S1 port on the cluster bay being
repaired and then logically connect the service terminal. Each time the
service terminal logical connection drops, you must quickly reconnect it.
c. Respond to the message requesting you to enter a 1 to define this port as
the unused system console. (The prompt from the system firmware may
say CONSOLE, but for the 2105 Model E10/E20, you will use the service
terminal instead.) The cluster bay code load will then continue. You can
now connect the service terminal to the S2 port.
d. Go to step 15.
Note: When the I/O planar or I/O planar battery is replaced, the NVRAM
memory will be reset and the system console port will not be set.
Shortly after cluster bay power on, the S1 and S2 ports each will
attempt to display a prompt to allow that port to be defined as the
system console port. (The system console port is not used.
However if the port is not defined, each power on code load will
take one additional minute as it times out waiting for the port to be
defined. The cluster bay code load will complete successfully in
either case.) The prompt is only displayed if the service terminal is
already connected to the proper port. The service terminal must be
connected to the S1 port. The prompt will display and then a 1 will
be entered. After this, the NVRAM settings will use only port S2 for
the service terminal. If you do not respond quickly enough to define
the port, you can repeat the cluster bay power on to have another
chance.
15. Wait for the cluster bay to come ready. Connect the service terminal to the
cluster bay being repaired and attempt to log in.
Note: If there is still a problem, the Ready LED indicator may not come on. If
any of the automatic cluster firmware updates are needed, the cluster bay
will take longer to come ready for login.
Was the service terminal able to log in to the cluster being repaired?
v Yes, continue with the next step.
v No, wait for the cluster to come ready, see Note above.
– If the cluster hangs displaying a code, go to “MAP 4360: Isolation Using
Codes Displayed by the Cluster Operator Panel” on page 342.
– If the cluster still does not come ready, connect the service terminal to
the cluster not being repaired, then show and repair any new related
problems.
– If there are no new related problems, call the next level of support.
16. Determine if the repair was successful:
From the service terminal Main Service Menu, select:
Repair Menu
Show/Repair Problems Needing Repair
a. Review step 1 in this MAP to ensure you understand how to
determine if a failure is still occurring.
b. If there was an existing problem log, view the timestamp in the last
occurrence field and determine if it was updated during the cluster
bay power on. If it was, then replace any other FRUs called out. If
all FRUs have been replaced, call the next level of support.
c. If the existing problem log last occurrence timestamp was not
updated and there are no new related problems, the problem has
been repaired. Go to the next step.
d. If there is a new related problem log, repair it now. Then return to
this step.
17. The FRU(s) have been replaced, the cluster has powered up, and the code has
loaded with no problems. The original problem log was not updated and no
new related problems were created. Some FRUs need additional tests to
ensure they work properly. For each FRU replaced, go to Table 28 and do any
additional actions listed.
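The three conditions described in step 1 and checked in step 16 can be sketched as a comparison of the recorded and current last occurrence timestamps plus a count of new related problem logs. This is a hedged illustration; the function name, parameters, and return strings are hypothetical. The text does not cover the case where the timestamp is updated and a new log also exists, so the sketch reports the same error first as an assumption.

```python
from datetime import datetime


# Illustrative sketch of the repair check from steps 1 and 16.
# All names are hypothetical; the precedence for the combined case
# (timestamp updated and a new log created) is an assumption.
def classify_repair(recorded: datetime, current: datetime, new_related_logs: int) -> str:
    """recorded: last occurrence time written down before the repair.
    current: last occurrence time shown after the cluster bay power on.
    new_related_logs: number of new related problem logs."""
    if current > recorded:
        # The existing problem log was updated during power on.
        return "same error detected"
    if new_related_logs > 0:
        return "different error detected"
    return "no errors detected"
```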
Table 28. Cluster Bay FRU Replace Table
Cluster Bay FRU
Description and Action
v SCSI hard drive
Description: No additional verification needed.
Action: Go to “MAP 4020: Performing the SCSI Hard Drive Build Process”
on page 316.
v CD-ROM drive
v SCSI signal cable
Description: Verify CD-ROM drive.
Action:
a. Run CD-ROM drive diagnostics.
v Connect the service terminal to the cluster bay being repaired.
From the service terminal Main Service Menu, select:
Machine Test Menu
CD-ROM drive
Note: A test CD is part of the ship group and should be stored
in the document enclosure.
b. Go to step 18 on page 383.
v Diskette drive
v Diskette drive signal cable
Description: Verify diskette drive.
Action:
a. Run diskette drive diagnostics.
v Connect the service terminal to the cluster bay being repaired.
From the service terminal Main Service Menu, select:
Machine Test Menu
Diskette Drive
Note: A test diskette is part of the ship group and should be
stored in the document enclosure.
b. Go to step 18 on page 383.
v System planar
v CPU card
v Memory card
v 128B memory module
v Cluster bay power planar
Description: Verify processors and memory.
Action:
a. Display memory.
v Connect service terminal to the cluster bay being repaired.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Cluster Memory Menu
List Installed Cluster Memory
Ensure that both clusters list the same amount of Total
Installed and Available Memory. If not, then recheck the
cluster bay for loose or missing memory cards or memory
card modules.
b. Display CPUs.
v Connect service terminal to this cluster bay.
From the service terminal Main Service Menu, select:
Configurations Options Menu
Show Storage Facility Resources Menu
Show Storage Facility Resources
v Scroll down and ensure that resources proc0, proc1, proc2 and proc3
are all shown as Available. If not, then recheck the cluster bay for
loose or missing CPU cards.
c. Go to step 18 on page 383.
v I/O planar
Description: The time of day is automatically restored by the cluster bay
power on and code load when communication is established with the other
cluster bay. However additional verification tests are needed.
Action:
a. Verify processors and memory. Use procedure in Action column of this
table for the system planar FRU.
b. Verify the correct level of LIC firmware is on the I/O planar. Connect the
service terminal to the working cluster. Use the service terminal Main
Service Menu, Licensed Internal Code Maintenance Menu, Multiple LIC
Activation, Concurrent, SVP Service Processor / System Planar Activation
option to check and update the level if needed.
Note: The service processor function is integrated in the 2105 Model
F10/F20 I/O planar.
c. Verify customer e-mail notification. Use procedure in Action column of this
table for the Ethernet 10Base-T Cable.
d. Verify modem and expander connection (if installed). Use procedure in
Action column of this table for the serial interface cable (S3 port).
e. Go to step 18 on page 383.
v I/O planar battery
Description: The time of day is automatically restored by the cluster bay
power on and code load when communication is established with the other
cluster bay. No additional verification.
Action: Go to step 18 on page 383.
v I/O Attachment card, SSA card.
Description: No additional verification.
Action: Go to step 18 on page 383.
v NVS card, NVS cache module, NVS cache module battery, NVS top card
crossover.
Description: Verify NVS memory.
Action:
a. Connect service terminal to the cluster bay being repaired.
From the service terminal Main Service Menu, select:
Install/Remove Menu
Non-Volatile Storage (NVS) Menu
List Installed NVS
Ensure that both clusters list the same amount of NVS
memory. If not, then recheck the cluster bay for loose or
missing NVS cards or NVS cache modules.
b. The NVS card cache module battery installation date needs to be entered
into the functional code any time an NVS FRU containing these batteries
is installed. Use the service terminal Main Menu, Utility Menu, Battery
Menu, Update Battery Installation Date option for each battery on the NVS
card. A date for each of the three batteries on an NVS card must be
entered. (This is used to create error logs in the future to replace these
batteries before they exceed their expected life.)
c. Go to step 18 on page 383.
v Service Processor Card (2105
Model E10/E20 only)
Description: See action below.
Action:
a. Verify the correct level of LIC firmware is on the SP card. Use the service
terminal Main Service Menu, Licensed Internal Code Maintenance Menu,
Firmware LIC Menu, System Planar / Service Processor Menu options to
check and update the level if needed.
b. Go to step 18 on page 383.
v Cluster bay operator panel or
EEPROM
Description: No additional verification.
Action: The EEPROM on the operator panel has unique vital product data
(VPD) that includes the 2105 Model E10/E20 serial number and cluster ID.
The operator panel/EEPROMs cannot be swapped from cluster to cluster. The
EEPROM from the old operator panel should be moved to the new operator
panel FRU. If the new operator panel FRU still fails, then the old EEPROM
might have failed. Reinstall the EEPROM that came on the new operator panel
FRU. The new EEPROM will not have valid VPD. You must call the next level
of support for the procedure to enter the unique VPD for your cluster.
a. If the old EEPROM module was swapped to the new cluster bay Operator
panel, go to step 18 on page 383.
b. If the old EEPROM module was not swapped to the new cluster bay
Operator panel, call the next level of support for procedure to update the
Vital Product Data. After the VPD has been loaded, go to step 18 on
page 383.
v Serial interface cable (S1 and S2
ports)
Description: No additional verification. The S2 port has been tested while
using the service terminal connected to this cluster bay. The S1 port is not
used.
Action: Go to step 18 on page 383.
v Serial interface cable (S3 port)
Description: Verify the connection to the modem and expander (if installed).
Action:
a. Verify modem and expander connection.
v Connect the service terminal to the S2 port of this cluster bay.
From the service terminal Main Service Menu, select:
Machine Test Menu
Send Test Notification Menu
Service Notification (via modem)
b. Go to step 18.
v Ethernet 10Base-T cable
Description: Test the ethernet connection to the other cluster bay.
Action:
a. To test the ethernet connection to the other cluster bay:
v Connect the service terminal to the S2 port of this cluster bay.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
Ensure that the problem log status is displayed for both cluster bays.
b. Go to step 18.
v Ethernet AUI cable
Description: The AUI connection is not used for the 2105 Model E10/E20.
Action: None
v Front Cluster bay fan (2105 Model
F10/F20 only)
Description: No additional verification
Action: Go to step 18.
18. Ensure the cluster being repaired has come ready by connecting the service
terminal to the cluster and attempting to login. The time to come ready will be
increased if any cluster firmware updates are needed. The updates occur
automatically during the cluster IML.
Was the service terminal able to login to the cluster being repaired?
v Yes, continue with the next step.
v No, wait for the cluster to come ready. If the cluster hangs displaying a
code, go to “MAP 4360: Isolation Using Codes Displayed by the Cluster
Operator Panel” on page 342. If the cluster still does not come ready,
display and repair any new related problems or call the next level of
support.
19. Resume the cluster bay using the Alternate Cluster Repair Menu option.
Note: Resuming a cluster that is not yet ready could corrupt an automatic
firmware update that is in progress causing a long service action.
20. Close the problem log for the cluster FRU when the repair is complete.
From the service terminal Main Service Menu, select:
Repair Menu
Close a Previously Repaired Problem
Go to the next step.
21. If retryable pinned data was present during the original quiesce, display the
pinned data status again.
Is the retryable pinned data status still shown?
v Yes, repair related problem logs still needing repair. If there are no related
problem logs, call the next level of support.
v No, continue with the next step.
22. Go to “MAP 1500: Ending a Service Action” on page 68.
MAP 4710: Isolating a DDM LIC Update Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A failure was detected when new disk drive module (DDM) licensed internal code
was being downloaded to the DDMs.
Note: The term download means the same as update.
One of the following error conditions could have been detected:
v SSA card is not in the proper state.
v Unable to check the array status.
v Arrays are not in the proper state.
v DDM diagnostic failed for pdiskXX.
v Download failed for pdiskXX.
v The download process took too long and timed out.
The DDM code download process includes the following:
v The new DDM code is included on the 2105 LIC Code update CD-ROM.
v The LIC update process copies the code from the CD-ROM to the cluster bay.
v The DDM download process is started using the service terminal Disk Drive
Module (DDM) LIC Menu options. It automatically runs on one DDM at a time. It
runs the DDM diagnostics, then loads the new code, then runs the DDM
diagnostics again. If the diagnostics and code load are successful, the process is
repeated on the next DDM, until every DDM is complete.
v If a DDM diagnostic or DDM code update fails, a problem log is created. The
DDM that failed will also be recorded in the DDM code update status. The
remaining DDMs will not have been downloaded yet.
v After the DDM is repaired, the DDM download process needs to be started
again. The service terminal DDM Download Restart option will cause the cluster
to start with the first DDM and check each one until it finds the DDM that was
repaired. If the diagnostics and download are successful this time, the process
will continue to download the remaining DDMs, one at a time.
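The download and restart behavior described above can be sketched as a simple loop. This is an illustrative model only, not the cluster's actual code: the names `Ddm`, `download_all`, `run_diagnostics`, and `load_code` are hypothetical, and the sketch assumes that a DDM counts as complete only after the second diagnostic run passes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Ddm:
    """Hypothetical stand-in for one disk drive module (for example pdisk3)."""
    name: str
    updated: bool = False  # True once the new code has been loaded and verified


def download_all(
    ddms: List[Ddm],
    run_diagnostics: Callable[[Ddm], bool],
    load_code: Callable[[Ddm], bool],
) -> Optional[str]:
    """Update DDMs one at a time; return the name of the first DDM that fails.

    The loop always starts with the first DDM, as the restart option is
    described as doing, and skips DDMs that are already complete.
    """
    for ddm in ddms:
        if ddm.updated:
            continue  # already downloaded on an earlier pass
        # Diagnostics, then code load, then diagnostics again.
        if not run_diagnostics(ddm):
            return ddm.name
        if not load_code(ddm):
            return ddm.name
        if not run_diagnostics(ddm):
            return ddm.name
        ddm.updated = True
    return None  # every DDM is complete
```

Calling `download_all` a second time after the failing DDM has been repaired models the DDM Download Restart option: completed DDMs are passed over and the process picks up at the repaired one.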
Isolation
1. Read the description section above.
2. Use the service terminal to display problems needing repair. Look for related
problems (SSA or drawer FRUs).
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
v If there are no related problems, call the next level of support.
v If there are related problems, fix them and then return here and continue with
the next step.
3. Use the DDM Download Restart option to complete the DDM download process.
From the service terminal Main Service Menu, select:
Licensed Internal Code Maintenance Menu
Disk Drive Module (DDM) LIC Menu
DDM Download Restart
MAP 4720: Cluster or Host Bay Fails to Power Off
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
A power off request for a cluster or host bay from the service terminal or 2105
Model Exx/Fxx operator panel local power off switch failed. One of three conditions
occurred:
v The service terminal request for the bay to be powered off failed.
v The service terminal requested the bay to be powered off, but the bay indicated it
was already powered off.
v The service terminal utility menu options were used to attempt to power off a
cluster bay, but the other cluster bay was already powered off. Only one cluster
bay may be powered off at a time. Power on the other cluster bay. Then connect
the service terminal to the other cluster and use the Alternate Cluster Repair
menu option to power off this cluster bay.
A power off request does the following:
v A power off request for the cluster bay is sent to the RPC cards. The RPC cards
request the service processor to power off the cluster. If that does not occur,
the RPC cards force their cluster bay power enable line to the electronics cage
power supplies to off. If the power supply outputs to the cluster bay still do
not power off, a problem log is created.
v A power off request for the host bay is sent to the RPC cards. The RPC cards
request the electronics cage power supplies to power off the host bay. If the
host bay does not power off, a problem log is created.
Note: It takes two of the three power supplies to keep the bay powered on. A
single power supply cannot provide enough power.
Isolation
1. If you are sure the bay is still failing to power off, continue with the next step. If
you are not sure, do the following:
v Use the service terminal Repair Menu, FRU Replace option as a test. The
option will quiesce the bay, power it off, prompt you to change the FRU (do
not actually change it), power the bay on, and then resume it. If the power
off fails, continue with the next step. If the power off works, complete the
simulated FRU replace and then go to “MAP 1500: Ending a Service Action” on
page 68.
2. Observe the front LED indicators on the three electronics cage power supplies
above the bay that is failing to power off.
v The left LED indicator is for host bay 1 or 3.
v The middle LED indicator is for cluster bay 1 or 2.
v The right LED indicator is for host bay 2 or 4.
Use the description of the three LED indicators for the failing bay:
v All three indicators are on. Continue with the next step.
v One indicator is off, two are on. Continue with the next step.
v Two indicators are off, one is on. Replace the electronics cage power supply
that is stuck on. Use the service terminal Repair Menu, FRU Replace Menu
options. Return to the top of this map after the FRU is replaced.
v All three indicators are off. The bay is powered off, the original problem is no
longer occurring.
Ensure the problem log has been closed. Use the service terminal Repair
Menu, Close a Previously Repaired Problem. Then go to “MAP 1500: Ending
a Service Action” on page 68.
3. The RPC cards should be replaced one at a time using the service terminal
Replace a FRU Menu.
Connect the service terminal to the cluster bay that does not have the problem.
From the service terminal Main Service Menu, select:
Repair Menu
Replace FRU Menu
After replacing each FRU, attempt to power off the cluster bay.
v If it powers off, the problem is repaired, return to the original procedure or go
to “MAP 1500: Ending a Service Action” on page 68.
v If it still fails to power off, continue with the next step.
4. Replace the electronics cage power supplies one at a time until the failure no
longer occurs. Use the service terminal Repair Menu, Replace FRUs options.
Note: Attempt to power down the bay to determine if the problem is fixed.
v For a cluster bay, connect the service terminal to the cluster bay not
being repaired and use the Repair Menu, Alternate Cluster Repair
Menu options to first quiesce and then power it off.
v For a host bay, connect the service terminal to either cluster bay and
use the Repair Menu, FRU Replace Menu for the host bay planar. (Do
not actually replace the planar.)
If the bay powers down, ensure the problem log has been closed. Use the
service terminal Repair Menu, Close a Previously Repaired Problem. Then go to
“MAP 1500: Ending a Service Action” on page 68.
If all three power supplies have been replaced and the bay still fails to power
down, go to the next step.
5. The signal from both RPC cards must be received by each electronics bay
power supply to switch off the output to the bay. The remaining FRUs are:
v RPC to electronics cage cable
v Electronics cage sense card (only passes the signal through, no active
circuits for these signals)
v Electronics cage power planar
You may want to call the next level of support before changing these FRUs.
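The indicator check in step 2 can be summarized as a small decision function. This is a sketch for illustration only: the function name, the action labels, and the use of a list of three on/off values (one per electronics cage power supply) are assumptions, not part of the service terminal.

```python
from typing import Optional, Sequence, Tuple


def classify_bay_indicators(leds: Sequence[bool]) -> Tuple[str, Optional[int]]:
    """Classify the three power supply indicators for one bay.

    `leds` holds one on/off value per electronics cage power supply.
    Returns an action label and, when exactly one supply is stuck on,
    the index (0 to 2) of that supply in `leds`.
    """
    if len(leds) != 3:
        raise ValueError("expected one indicator per power supply (three in total)")
    lit = [index for index, is_on in enumerate(leds) if is_on]
    if len(lit) >= 2:
        # Two supplies are enough to keep the bay powered on; step 2 says to continue.
        return ("continue_with_next_step", None)
    if len(lit) == 1:
        # A single supply cannot power the bay, so its output is stuck on.
        return ("replace_stuck_supply", lit[0])
    return ("bay_is_powered_off", None)
```

For example, `classify_bay_indicators([False, True, False])` points at the second supply as the one to replace.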
MAP 4730: Isolating a Cluster Power Off Request Problem
Attention: This is not a stand-alone procedure. Perform it only at the direction of
the service terminal or other service guide procedures. Failure to follow this
attention can cause customer operations to be disrupted.
Description
A power off request for a cluster or host bay from the service terminal or 2105
Model Exx/Fxx operator panel local power off switch failed. The following
condition occurred:
v The service terminal utility menu options were used to attempt to power off a
cluster bay, but the other cluster bay was already powered off.
Isolation
v The service terminal utility menu options were used to attempt to power off a
cluster bay, but the other cluster bay was already powered off. Only one cluster
bay may be powered off at a time. Power on the other cluster bay. Then connect
the service terminal to the other cluster and use the Alternate Cluster Repair
menu option to power off this cluster bay.
MAP 4740: Fan Check Detected by I/O Planar, Model Exx Only
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The 2105 Model E10/E20 cluster bay I/O planar firmware monitors that the cooling
fan connectors are receiving rotation signals. The 2105 Model F10/F20 cluster bay
I/O planar firmware does not monitor the cooling fan connectors.
The 2105 Model Exx/Fxx electronics cage cooling fans are powered and monitored
by the electronics cage sense card and RPC cards. The RPC cards provide dummy
fan rotation signals to the I/O Planar through the electronics cage sense card. The
dummy signals from one RPC card are enough to prevent the I/O planar from giving
a false fan check. The signals from each RPC card are combined on the electronics
cage sense card and sent through cable to each of the four fan connectors on the
I/O planar. Under normal operation, a failing RPC card cannot cause this failure.
Isolation
Attention: This MAP is only for 2105 Model E10/E20. If this is a 2105 Model
F10/F20 call the next level of support.
1. If the fan check occurred when one RPC card was powered off, the other RPC
card is either not creating the fan rotation signal or the signal is not reaching the
electronics cage sense card. Replace that other RPC card or its RPC to
electronics cage cable to the cluster bay with the fan failure. The cable from that
RPC card to the electronics cage sense card may be failing or not connected
correctly. Use the service terminal Replace a FRU option.
2. Use the service terminal to display and repair any problem logs for the RPC
cards or electronics cage fans before continuing.
3. Ensure the RPC to electronics cage cables are connected into the electronics
cage sense card (at rear of the cluster bay) and the RPC cards.
Note: If the cable needs to be disconnected then connected, use the service
terminal Repair Menu, Replace a FRU, rack power cooling FRUs for the
rack power control card the cable is connected to.
4. Ensure the fan/RPC to upper backplane cable is connected to the electronics
cage sense card (at rear of the cluster bay).
Note: If the cable needs to be disconnected then connected, use the
instructions in “MAP 4790: Repairing the Electronics Cage” on page 395.
5. Ensure each of the four I/O planar fan signal connectors J12, J15, J17, and J19
have a cable connected to it. Ensure the other end of each cable is connected
to the cluster bay power planar.
The I/O planar system firmware reports the connectors as:
v J12 = Fan 1
v J17 = Fan 2
v J15 = Fan 3
v J19 = Fan 4
6. One of the following FRUs is failing. Use “MAP 4700: Replacing Cluster FRUs”
on page 375 for replacements: electronics cage sense card, fan/RPC to upper
back plane cable, electronics cage power planar, cluster bay power planar to
docking connector cable, cluster bay power planar, cluster drive power cable,
I/O planar.
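The connector numbering in step 5 is not sequential (Fan 2 is J17 and Fan 3 is J15), so a lookup table helps when matching a firmware fan number to a physical connector. The following is a minimal sketch; the dictionary and function names are hypothetical.

```python
# I/O planar fan signal connectors as reported by the system firmware (from step 5).
FAN_BY_CONNECTOR = {"J12": 1, "J17": 2, "J15": 3, "J19": 4}

# Reverse lookup: firmware fan number -> I/O planar connector.
CONNECTOR_BY_FAN = {fan: connector for connector, fan in FAN_BY_CONNECTOR.items()}


def connector_for_fan(fan_number: int) -> str:
    """Return the I/O planar connector that the firmware reports as this fan."""
    try:
        return CONNECTOR_BY_FAN[fan_number]
    except KeyError:
        raise ValueError(f"fan number must be 1 to 4, got {fan_number}") from None
```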
MAP 4750: Cluster Bay Power is Off, Had to Force it Off
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The cluster bay is powered off but the power had to be forced off. A power off
request to the service processor (SP) card to request the RPC cards to power off
the electronics cage power supply boundaries for the cluster bay failed. An error
recovery power off request directly to the RPC cards worked. This bypassed the
failing SP to RPC circuits.
Isolation
1. Verify that the cluster bay is powered off. Observe the three power indicators on
the front of each of the three electronics cage power supplies above the cluster
bay. The center indicator is for the cluster bay.
Is the center LED indicator on all three electronics cage power supplies for this
cluster bay off?
v Yes, continue with the next step.
v No, the cluster bay is not powered off. Attempt to power it off. Connect the
service terminal to the working cluster. Use the Repair Menu, Alternate
Cluster Repair Menu options. If it still fails, the existing problem log will have
the timestamp field in the problem details updated, or a new problem log will
be created. Attempt to repair the new problem log or call the next level of
support.
2. Replace the following FRUs until the problem no longer occurs when powering
off the cluster bay:
v SP Card, 2105 Model E10/E20 only
v I/O Planar
Use the following to do this:
v To power on or off the cluster bay, connect the service terminal to the working
cluster. Use the Repair Menu, Alternate Cluster Repair Menu options to
power off or on the cluster as needed.
v To check if the FRU replaced has corrected the problem, power off the
cluster. Then use the Problem Log Menu option to display this problem log.
Observe the last occurrence time-stamp field.
If it is updated after the latest cluster power off, the problem is still occurring.
Replace the next FRU. If all the FRUs have been replaced, call the next level
of support.
If it is not updated, the problem is repaired. Go to “MAP 1500: Ending a
Service Action” on page 68.
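The time-stamp check in step 2 amounts to comparing two times. The sketch below is illustrative only: `next_step` and its parameters are hypothetical names, and it assumes that a last occurrence time stamp at or after the latest power off means the problem log was updated.

```python
from datetime import datetime
from typing import List


def next_step(
    last_occurrence: datetime,
    power_off_time: datetime,
    frus_left: List[str],
) -> str:
    """Decide what to do after powering off the cluster bay.

    `frus_left` lists the FRUs from step 2 that have not been replaced yet.
    """
    if last_occurrence < power_off_time:
        # The problem log was not updated by this power off.
        return "problem repaired: go to MAP 1500"
    if frus_left:
        return "replace next FRU: " + frus_left[0]
    return "call the next level of support"
```

With `frus_left` set to `["SP card", "I/O planar"]`, an updated time stamp leads to the SP card being replaced first, which matches the order given in step 2 for a 2105 Model E10/E20.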
MAP 4760: Recovering from Corrupted Files or Functions
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A cluster file (dataset) or function is corrupted. If this has affected customer
operations, a separate problem log should have been created. In many cases,
customer operations will not be affected. Only processes and/or files used by the
RAS (maintenance package) processes may be affected.
There are three recommended actions:
v The cluster can be quiesced, powered off and on, then resumed. This reloads the
code into the cluster which might clear a hung process. If the failure is still
present, then the next action is needed.
v The code is reloaded onto the cluster SCSI Hard Drive. An important part of this
process is the saving and restoring of the configuration and customization files.
This allows the cluster to restore access to the customer data after the process is
complete. If the failure is still present, then the next action is needed.
v The next level of support is contacted. They can login through the modem and
do functions similar to that of an AIX system administrator.
Isolation
1. Read the description above.
2. Reload the cluster code by quiescing, powering off, powering on and then
resuming the cluster.
Connect the service terminal to the cluster that does not have the problem.
From the service terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair
Quiesce the Alternate Cluster
Note: If the resume fails, that may need to be repaired before continuing with
this MAP. You may need to call the next level of support if this happens.
3. Display problems needing repair.
The original problem may have been updated if the problem is still occurring.
The time stamp in the Last Occurrence field will be updated from the original
Last Occurrence. It is also possible that a new related problem may have been
created.
v If an error was not detected during the power up and resume, then the
original condition may be gone. If you are not sure, go to the next step to
rebuild the SCSI Hard Drive with new code. If you believe the problem is no
longer occurring, go to “MAP 1500: Ending a Service Action” on page 68.
v If an error was detected, continue with the next step.
4. Rebuild the cluster SCSI hard drive. Go to “MAP 4020: Performing the SCSI
Hard Drive Build Process” on page 316, then return here and continue when the
build process is complete.
5. Display problems needing repair.
The original problem may have been updated if the problem is still occurring.
The time stamp in the Last Occurrence field will be updated from the original
Last Occurrence. It is also possible that a new related problem may have been
created.
v If an error was not detected after the SCSI Hard Drive rebuild, then the
original condition has probably been corrected. If you believe the problem is
no longer occurring, go to “MAP 1500: Ending a Service Action” on page 68.
v If a related error was detected, continue with the next step.
6. Call the next level of support.
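The three recommended actions form an escalation ladder: each one is tried only if the previous one left the failure present. A minimal sketch of that ordering follows; the names `ESCALATION` and `next_recovery_action` are hypothetical and the action strings paraphrase the description above.

```python
# Recovery actions in the order this MAP recommends them.
ESCALATION = [
    "quiesce, power off, power on, and resume the cluster",
    "rebuild the cluster SCSI hard drive (MAP 4020)",
    "call the next level of support",
]


def next_recovery_action(failed_attempts: int) -> str:
    """Return the action to take after `failed_attempts` earlier actions failed.

    Once the list is exhausted, the answer stays at the last entry.
    """
    if failed_attempts < 0:
        raise ValueError("failed_attempts cannot be negative")
    index = min(failed_attempts, len(ESCALATION) - 1)
    return ESCALATION[index]
```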
MAP 4770: Isolating a E152 Cluster Hang
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
There is a PCI error condition that prevents the firmware from loading the AIX
operating system. When this occurs, the cluster bay operator panel will stop with
E152 displayed. The second line may display a location code of the I/O planar slot
that was being tested. This will speed the isolation of the failing FRU.
The error is normally due to one of the cards plugged into the I/O planar slots.
Isolation
1. Use the Alternate Cluster Repair menu to quiesce, power off and power on
the alternate cluster bay.
Connect the service terminal to the cluster bay that does not have the error.
From the service terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair
Does the cluster bay still stop with E152 displayed?
v Yes, go to step 2 on page 391.
v No, the failing condition has been cleared. Resume the cluster bay. Go to
“MAP 1500: Ending a Service Action” on page 68.
2. Observe the cluster bay operator panel for a location code displayed below the
E152.
Is a location code displayed?
v Yes, determine the position of the I/O planar card to replace.
Go to the correct cluster bay model locations in chapter 7, volume 3 of this
book, see:
– 2105 Model E10/E20, “Cluster Bay, System, I/O, and Power Planars
Location Codes (E10/E20)” in chapter 7 of the Enterprise Storage Server
Service Guide, Volume 3
– 2105 Model F10/F20, “Cluster Bay, System, I/O, and Power Planars
Location Codes (F10/F20)” in chapter 7 of the Enterprise Storage Server
Service Guide, Volume 3
Return here and continue with the next step.
v No, go to step 11 on page 392.
3. Replace the cluster bay FRUs, go to “MAP 4700: Replacing Cluster FRUs” on
page 375. If the cluster still hangs with a code of E152, return here and
continue with the next step.
4. Do one of the following:
v If this is a 2105 Model E10/E20, continue with the next step.
v If this is a 2105 Model F10/F20, go to step 7.
5. An E152 error could be caused by a host bay planar connected to an I/O
attachment card through the CPI interface.
Was the E152 location code for I/O Planar slot 4 or 7 (I/O Attachment cards
locations for 2105 Models E10/E20)?
v Yes, continue with step 6.
v No, replace the I/O planar, go to “MAP 4700: Replacing Cluster FRUs” on
page 375. If the cluster still hangs with a code of E152, call your next level
of support.
6. Replacing the I/O attachment card did not correct the E152 error. The error
could be caused by either host bay planar or CPI cable that is connected to
the I/O attachment card connectors. Use the table below to determine both
host bays that are cabled to the I/O attachment card. Then go to step 9 on
page 392.
Table 29. CPI Interface Locations
Failing Cluster Bay   I/O Planar Slot   IOA Card Connector   Host Bay
1                     4                 top                  3
1                     4                 bottom               1
1                     7                 top                  4
1                     7                 bottom               2
2                     4                 top                  1
2                     4                 bottom               3
2                     7                 top                  2
2                     7                 bottom               4
7. An E152 error could be caused by a host bay planar connected to an I/O
attachment card through the CPI interface.
Was the E152 location code for I/O Planar slot 5 or 8 (I/O Attachment cards
locations for 2105 Model F10/F20)?
v Yes, continue with the next step.
v No, replace the I/O planar, go to “MAP 4700: Replacing Cluster FRUs” on
page 375. If the cluster still hangs with a code of E152, call your next level
of support.
8. Replacing the I/O attachment card did not correct the E152 error. The error
could be caused by either host bay planar or CPI cable that is connected to
the I/O attachment card connectors. Use the table below to determine both
host bays that are cabled to the I/O attachment card. Then go to step 9.
Table 30. CPI Interface Locations
Failing Cluster Bay   I/O Planar Slot   IOA Card Connector   Host Bay
1                     5                 top                  3
1                     5                 bottom               1
1                     8                 top                  4
1                     8                 bottom               2
2                     5                 top                  1
2                     5                 bottom               3
2                     8                 top                  2
2                     8                 bottom               4
9. Replace the host bay planar or CPI cable. Use the service terminal Replace a
FRU option.
Connect the service terminal to the cluster bay that does not have the
problem. From the service terminal Main Service Menu, select:
Repair Menu
Replace a FRU
Host Bay FRUs
Continue with the next step.
10. Use the Alternate Cluster Repair menu options to power the failing cluster
bay off and then on:
v If the cluster bay powers on and displays READY, go to “MAP 1500: Ending
a Service Action” on page 68.
v If the cluster bay still stops at E152, continue with the next step.
v If the cluster bay has a different failure, go to “MAP 4360: Isolation Using
Codes Displayed by the Cluster Operator Panel” on page 342.
11. One of the cards in the I/O planar slots, or the I/O planar itself is causing the
problem. The failing card or planar will have to be isolated manually. Remove
or replace the card(s) until the failing cluster bay will power on with no E152
hang. The cluster bay may be powered off after passing the E152 hang point;
letting it attempt to complete the code load would either fail or create a new
problem log for the unplugged card(s). Go to “MAP 4700: Replacing Cluster
FRUs” on page 375. Return here when the E152 hang is fixed, or when all the
FRUs have been replaced and it still hangs at E152.
v If the cluster bay no longer stops at E152, go to “MAP 1500: Ending a
Service Action” on page 68.
v If the cluster bay still stops at E152, continue with the next step.
12. Call the next level of support before continuing with this step. Replace the host
bay planars one at a time until the failing cluster bay will power on with no
E152 hang.
Use the service terminal Replace a FRU option.
Connect the service terminal to the cluster bay that does not have the failure.
From the service terminal Main Service Menu, select:
Repair Menu
Replace a FRU
Bay Planars
Note: After each host bay planar is replaced, the failing cluster bay must be
powered off and then on to determine if it still stops at E152.
Use the Alternate Cluster Repair menu options.
v If the cluster bay now comes ready, go to “MAP 1500: Ending a Service
Action” on page 68.
v If the cluster bay still hangs at E152, call the next level of support.
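Tables 29 and 30 list the same CPI cabling for both model families; only the I/O attachment card slot numbers differ (4 and 7 for Model E10/E20, 5 and 8 for Model F10/F20). The lookup below is a sketch of steps 5 through 8 under that observation. The names `IOA_SLOTS`, `HOST_BAYS`, and `host_bays_for`, and the use of "E" and "F" as model family keys, are assumptions made for illustration.

```python
from typing import Dict

# I/O attachment card slots on the I/O planar, per model family.
IOA_SLOTS = {"E": (4, 7), "F": (5, 8)}

# Keyed by (failing cluster bay, position of the IOA slot: 0 = first, 1 = second).
# Values map the IOA card connector to the host bay it is cabled to.
HOST_BAYS = {
    (1, 0): {"top": 3, "bottom": 1},
    (1, 1): {"top": 4, "bottom": 2},
    (2, 0): {"top": 1, "bottom": 3},
    (2, 1): {"top": 2, "bottom": 4},
}


def host_bays_for(model_family: str, cluster_bay: int, slot: int) -> Dict[str, int]:
    """Return the host bays cabled to the IOA card in the given I/O planar slot."""
    slots = IOA_SLOTS[model_family]
    if slot not in slots:
        # Any other slot is not an I/O attachment card location for this model.
        raise ValueError(f"slot {slot} is not an IOA card slot for model family {model_family}")
    return HOST_BAYS[(cluster_bay, slots.index(slot))]
```

A `ValueError` here corresponds to the "No" branch of steps 5 and 7, where the location code is not an I/O attachment card slot and the I/O planar is replaced instead.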
MAP 4780: Isolating a Functional Code Not Running Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The cluster functional code was not loaded during the last cluster power on. Only
the AIX operating system and RAS (maintenance package) code was loaded. The
service terminal can login to the failing cluster because it only requires the RAS
code.
This most commonly occurs when both clusters are powering on and loading code,
and one cluster has an unrecoverable error. The other cluster powers the failing
cluster off then on in an attempt to recover from the error. This recovery action is
repeated up to two times. On the second attempt, the failing cluster is fenced with
its functional code not loaded. This can also occur if a fenced cluster is rebooted or
powered off and on without first being quiesced with the Alternate Cluster Repair
Menu.
If both clusters are in this condition, it is possible that both RPC cards are in an
incorrect logical state. Resetting the RPC cards may clear this condition.
Isolation
1. Use the service terminal to display problems needing repair.
Is there any other related problem log for the failing cluster bay?
v Yes, exit this MAP and repair the related problem.
v No, continue with the next step.
2. Do both cluster bays have a problem log that calls this MAP?
v Yes, continue with the next step.
v No, go to step 5 on page 394.
3. There may be a false error condition in the rack power control cards that can be
reset.
a. Power Off the 2105 Model Exx/Fxx.
b. Switch the System Power AC circuit breaker on both primary power supplies
to Off (down).
c. Wait until the green Power Control Good indicators on both rack power
control cards are off. It takes up to 30 seconds for the logic voltage supplied
to the rack power control cards to discharge.
d. Switch the System Power AC circuit breaker on both primary power supplies
to On (up).
e. Power On the 2105 Model Exx/Fxx, then continue with the next step.
4. Wait more than the normal amount of time for the customer operator panel
Cluster 1 and 2 Ready indicators to come on solid. A failing cluster may attempt
to load its code up to three times before it posts an error. Each code load
attempt may take 10 to 20 minutes.
v If both clusters come ready, go to “MAP 1500: Ending a Service Action” on
page 68.
v If a cluster hangs and displays a code on its operator panel, go to “MAP
4360: Isolation Using Codes Displayed by the Cluster Operator Panel” on
page 342.
v If a cluster does not come ready, attempt to log in, display, and repair any
new related problem logs. If there are no new related problem logs, call the
next level of support.
5. Only one cluster has a problem log that sent you to this MAP. Verify that the
other Cluster Ready indicator on the rack operator panel is On.
Is the Cluster Ready indicator for the other cluster On?
v Yes, continue with the next step.
v No, display and repair any problem log for the other cluster first. If there are
none, call the next level of support.
6. From the cluster that is ready, attempt to clear the failing cluster by quiescing,
powering off, and powering on the failing cluster:
Connect the service terminal to the cluster that is not failing. From the service
terminal Main Service Menu, select:
Repair Menu
Alternate Cluster Repair
Quiesce the Alternate Cluster
Power Off the Alternate Cluster
Power On the Alternate Cluster
7. Wait more than the normal amount of time for the customer operator panel
Cluster Ready indicator to come on. A failing cluster may attempt to load its
code up to three times before it posts an error.
v If the cluster comes ready, Resume the Alternate Cluster. Then use the
Repair Menu, Close a Previously Repaired Problem for this problem log.
Then go to “MAP 1500: Ending a Service Action” on page 68.
v If the cluster hangs displaying a code on its operator panel, go to “MAP 4360:
Isolation Using Codes Displayed by the Cluster Operator Panel” on page 342.
v If the cluster does not come ready, attempt to log in, display, and repair any
new related problem logs. If there are none, call the next level of support.
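The wait described in steps 4 and 7 can be roughly bounded from the figures given in step 4: up to three code load attempts, each taking 10 to 20 minutes. The following Python sketch only does that arithmetic. The function name and constants are illustrative assumptions and are not part of the 2105 service terminal tools.

```python
# Illustrative only: the figures come from step 4 of this MAP.
MAX_LOAD_ATTEMPTS = 3            # a failing cluster may try to load its code up to three times
MINUTES_PER_ATTEMPT = (10, 20)   # each code load attempt may take 10 to 20 minutes


def code_load_wait_window(attempts=MAX_LOAD_ATTEMPTS, per_attempt=MINUTES_PER_ATTEMPT):
    """Return (shortest, longest) total minutes for all code load attempts."""
    low, high = per_attempt
    return attempts * low, attempts * high


if __name__ == "__main__":
    shortest, longest = code_load_wait_window()
    print(f"All load attempts may take {shortest} to {longest} minutes in total.")
```

This is only a planning aid for how long to watch the Cluster Ready indicators before treating a cluster as not ready.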
MAP 4790: Repairing the Electronics Cage
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The electronics cage needs special preparation before replacing some of the power
and cooling FRUs. There is no single service terminal operation to do this.
Some errors, such as two failing fans, cause the power error recovery code to
power off the host bays and cluster bay before you start. It is still necessary to go
through the MAP steps to quiesce and then power off those resources. This puts
them in the proper state for being powered on.
Isolation
1. Connect the service terminal to the cluster that is not in the electronics cage
being repaired.
2. If you are repairing using a problem log, notice the time and date values in the
″last occurrence″ field. After the FRUs are replaced, you will view the problem
log again. If the ″last occurrence″ field was updated, the problem is still
occurring. Continue with the next step.
3. Quiesce the following resources in the electronics cage to be repaired. Use the
Utility Menu, Resource Management Menu, Quiesce a Resource option.
v Cluster Bay
v Host Bays (both of them)
v Electronics Cage Sense Card
Note: If a cluster bay or host bay is not in the list of resources to quiesce,
then it is already quiesced.
4. Power the host bays off. Use the Utility Menu, Host Bay Power Off/On option.
5. Power the cluster bay off. Use the Utility Menu, Cluster Power Off/On option.
6. Switch off each of the three electronics cage power supplies. Set the switch at
the rear of the power supply to off (Ο, down).
7. Unplug both power input cables to each of the three electronics cage power
supplies.
8. Replace the electronics cage FRUs. Refer to the Chapter 4 Remove and Replace
procedures in Volume 2 of this book.
9. Connect both power input cables to each of the three electronics cage power
supplies.
10. Set each power supply switch to on (|, up).
11. Power on the electronics cage by pressing the 2105 Model Exx/Fxx operator
panel local power switch momentarily to the on (up) position.
12. Resume the electronics cage sense card. Use the Utility Menu, Resource
Management Menu, Resume a Resource option.
The resume will check all power and cooling status conditions for this
electronics cage. Go to the next step to determine if the problem status is no
longer occurring.
13. Determine if the problem has been repaired. Use the service terminal Repair
Menu, Show / Repair Problem Needing Repair option. Check for both of the
following:
v View the original problem log (if it exists). Check if the ″last occurrence″
timestamp field was updated after you powered back on. If updated, the
problem has not been repaired. Replace the remaining FRUs or call the
next level of support.
v Fix any new problem log related to the electronics cage power or cooling
function.
v If the problem has been repaired, continue with the next step.
14. Resume both host bays and then the cluster bay. Use the Utility Menu,
Resource Management Menu, Resume a Resource option.
15. Return to the procedure that sent you here or go to “MAP 1500: Ending a
Service Action” on page 68.
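The check in steps 2 and 13 comes down to comparing the ″last occurrence″ value you noted before the repair with the value shown after power on. A minimal Python sketch of that comparison follows. The function name is hypothetical, and the timestamps are assumed to have been transcribed by hand from the problem log, because this guide does not describe a programmatic interface to the log.

```python
from datetime import datetime


def problem_still_occurring(noted_last_occurrence, current_last_occurrence):
    """Return True if the "last occurrence" field moved forward after the repair.

    Both arguments are datetime values copied from the problem log display:
    the first was noted in step 2, the second is read in step 13.
    """
    return current_last_occurrence > noted_last_occurrence


if __name__ == "__main__":
    noted = datetime(2000, 12, 1, 9, 30)        # example value noted before the repair
    after_repair = datetime(2000, 12, 1, 9, 30)  # unchanged after power on
    print(problem_still_occurring(noted, after_repair))
```

An unchanged timestamp only shows that the original problem has not recurred. Step 13 still requires you to check for new problem logs related to the electronics cage power or cooling function.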
MAP 4810: Unexpected Host Bay Power Off
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
A host bay can lose power in two ways:
v The electronics cage power supplies are operating correctly, but the power is not
reaching the host bay planar or the host bay planar is failing.
v The electronics cage power supplies are failing because they are not receiving a
signal to power on. Each RPC card has a power control line for each electronics
cage power boundary (each host bay and the cluster bay). The signal leaves the
RPC card, passes through the RPC to electronics cage cable, then into the
electronics cage sense card. The signals from each RPC card are combined
together on the sense card. The combined signal then passes through the
fan/RPC to upper backplane cable, the electronics cage backplane, and onto
each of the three electronics cage power supplies.
Note: If the electronics cage power supplies stop supplying power to a host bay, all
three may need to be reset before they can power up the host bay again.
The electronics cage cannot be used by the customer while you do this.
Isolation
1. Verify that the host bay is powered off. Observe the three LED indicators on
the front of each electronics cage power supply above the failing host bay. The
left LED indicator is for the left host bay, the right LED indicator is for the right
host bay.
Is the same host bay LED indicator off on all three electronics cage power
supplies?
v Yes, continue with the next step.
v No, replace the electronics cage power supply with the indicator that is off.
From the service terminal Main Service Menu, select:
Repair Menu
Replace a FRU
Electronics Cage Power Cooling FRUs
(Electronics Cage Power)
2. Ensure the host bay is fully seated. Attempt to power on the host bay. Use the
service terminal to simulate replacing the host bay planar FRU. This will do the
needed quiesce and power off prior to attempting to power on the host bay.
v If it powers on, continue with the verify and resume until the menu option is
complete.
v If it does not power on, connect the service terminal to the working cluster.
From the service terminal Main Service Menu, select:
Repair Menu
Replace a FRU Menu
Host Bay FRUs
Did the host bay power on?
v Yes, return to the procedure that sent you here, or to “MAP 1500: Ending a
Service Action” on page 68.
v No, continue with the next step.
3. The power supplies are not receiving a power on signal for the host bay. Show
and repair any RPC or electronics cage related problems.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair a Problem Needing Repair
Is there a related problem?
- Yes, exit this MAP and repair the problem, then return to the
beginning of this MAP and power on the host bay.
- No, continue with the next step.
4. Isolate if the RPC-1 card is causing the problem. Repeat the following
procedure for each RPC card.
a. Use the Replace a FRU menu option to quiesce and power off the RPC
card. The option will send you to the service guide for the power off
procedure.
From the service terminal Main Service Menu, select:
Repair Menu
Replace a FRU
Rack Power Cooling FRUs
(Rack Power Control Card)
b. Ensure the RPC card power green LED is off, then unplug the RPC to
electronics cage cable at the electronics cage sense card. Exit out from the
menu option leaving the RPC card quiesced and powered off. Continue
with the next step.
5. Attempt to power on the host bay.
From the service terminal Main Service Menu, select:
Utility Menu
Host Bay Power Off / On
Did the host bay power on (the electronics cage power supply indicator LEDs
for this host bay light)?
v Yes, the cable or RPC card that was disconnected is failing, continue with
the next step.
v No, go to step 11 on page 398.
6. Replace the failing FRU(s). Use the Replace a FRU menu option.
After the repair, return here and continue with the next step.
7. Power on the host bay.
From the service terminal Main Service Menu, select:
Utility Menu
Host Bay Power Off / On
8. Resume the host bay. Connect the service terminal to the working cluster.
From the service terminal Main Service Menu, select:
Utility Menu
Resource Management Menu
Resume a Resource
9. If the problem log that sent you here has not been closed, close it.
From the service terminal Main Service Menu, select:
Repair Menu
Close a Previously Repaired Problem
10. Ensure there are no additional problems needing repair and all resources have
been returned for customer use.
From the service terminal Main Service Menu, select:
Repair Menu
End of Call Status
11. Use the Replace a FRU menu option to return the RPC-1 card to customer
use.
12. Isolate if the RPC-2 card is causing the problem. Repeat the following
procedure on each RPC card.
a. Use the Replace a FRU menu option to quiesce and power off the RPC
card. The option will send you to the service guide for the power off
procedure.
From the service terminal Main Service Menu, select:
Repair Menu
Replace a FRU
Rack Power Cooling FRUs
(Rack Power Control Card)
b. Ensure the RPC card power green LED is off, then unplug the RPC to
electronics cage cable at the electronics cage sense card. Exit out from the
menu option leaving the RPC card quiesced and powered off. Continue
with the next step.
13. Attempt to power on the host bay.
From the service terminal Main Service Menu, select:
Utility Menu
Host Bay Power Off / On
Did the host bay power on (the electronics cage power supply indicator LEDs
for this host bay light)?
v Yes, the cable or the RPC card that was disconnected is failing. Go to step
6.
v No, the problem may be one of the three electronics cage power supplies
holding the power signal down. Continue with the next step.
14. This step requires taking both host bays and the cluster bay in this electronics
cage away from customer use.
Go to “MAP 4790: Repairing the Electronics Cage” on page 395, do only the
steps up to and including switching off the three electronics cage power
supplies. Then return here and continue with the next step.
Note: The three power output LED indicators on the front of each
electronics cage power supply should be off.
15. Do the following to one of the three electronics cage power supplies.
a. Unplug both power input cables.
b. Remove the retaining screws and remove the electronics cage power
supply. Inspect the power docking connectors for any visual damage.
c. Set the rear switch on the two electronics cage power supplies still installed
to up (on).
Did the failing host bay power on?
v Yes, the electronics cage power supply that is removed is failing. Exit this
MAP and return to “MAP 4790: Repairing the Electronics Cage” on page 395
to replace the FRU.
v No, reinstall the electronics cage power supply that is removed. Continue
with the next step.
16. Set the rear switch down (off) for all three electronics cage power supplies.
Have all three electronics cage power supplies been tested?
v Yes, continue with the next step.
v No, return to the previous step and repeat it for the next electronics cage power
supply.
17. The power up signal from the RPC cards is reaching the electronics cage
sense card but it is not reaching the electronics cage power supplies. One of
the following FRUs is failing:
v Electronics cage sense card
v Fan/RPC to upper backplane cable
v Electronics cage backplane
Go to “MAP 4790: Repairing the Electronics Cage” on page 395.
18. If the host bay still does not power up, call your next level of support.
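The isolation flow in steps 4 through 17 can be summarized as a mapping from what you observed to the FRUs this MAP suspects. The Python sketch below restates that flow under stated assumptions: the function name, argument names, and FRU label strings are hypothetical, and the power supply label is whatever identifier you use for the supply that was removed. It is a reading aid, not a replacement for the steps above.

```python
# Hypothetical helper: names and labels are illustrative, not 2105 service tool output.
SIGNAL_PATH_FRUS = [
    "Electronics cage sense card",
    "Fan/RPC to upper backplane cable",
    "Electronics cage backplane",
]


def suspect_frus(on_with_rpc1_unplugged, on_with_rpc2_unplugged, supply_removed_when_on=None):
    """Return the FRUs this MAP points to for a set of observations.

    on_with_rpc1_unplugged: host bay powered on in step 5 (RPC-1 cable unplugged).
    on_with_rpc2_unplugged: host bay powered on in step 13 (RPC-2 cable unplugged).
    supply_removed_when_on: label of the electronics cage power supply that was
        removed when the host bay powered on in step 15, or None if the host
        bay did not power on with any of the three supplies removed.
    """
    if on_with_rpc1_unplugged:
        return ["RPC-1 card", "RPC to electronics cage cable (RPC-1)"]
    if on_with_rpc2_unplugged:
        return ["RPC-2 card", "RPC to electronics cage cable (RPC-2)"]
    if supply_removed_when_on is not None:
        return [f"Electronics cage power supply {supply_removed_when_on}"]
    # Step 17: the signal reaches the sense card but not the power supplies.
    return list(SIGNAL_PATH_FRUS)


if __name__ == "__main__":
    print(suspect_frus(False, True))
```

The checks are evaluated in the same order as the MAP, so an earlier positive result hides later ones, just as the procedure exits to the repair step as soon as a failing FRU is identified.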
MAP 4820: Isolating a SCSI Card Configuration Timeout
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The SCSI card firmware load process did not complete on the first attempt, which
created the problem log that sent you here. That failure should have caused a reset
followed by a second firmware load attempt. If the card status is available, the
second firmware load attempt was successful.
Isolation
1. Repair any other problem logs for this SCSI Card.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
Were any other problem logs for this SCSI Card repaired?
v Yes, retry the firmware update load process. If it still fails, call the next level
of support.
v No, continue with the next step.
2. Read the description section above. Determine if SCSI card status is available.
From the service terminal Main Service Menu, select:
Utility Menu
Show Storage Facility Resources Menu
Show Storage Facility Resources
Use the left column to find the Engineering FRU Name listed in the problem log
and determine the status.
Is the status ’available’?
v Yes, continue with the next step.
v No, call the next level of support.
3. Close the problem log.
From the service terminal Main Service Menu, select:
Repair Menu
Close a Previously Repaired Problem
MAP 4840: CPI Diagnostic Communication Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The CPI diagnostics are run from both clusters to each host bay. The clusters
communicate with each other through the cluster to cluster ethernet connection.
Note: The problem may list the failing resource as a CPI interface. The CPI
interface shown is the CPI interface that was being tested when the
communication failure occurred. It is not the actual failing resource.
Isolation
1. Go to “MAP 4380: Isolating a Customer LAN Connection Problem” on page 346.
Return here and continue after the communication problem is repaired.
2. The communication failure stopped the diagnostics before all of the CPI
interfaces were tested.
3. Has the customer been using this 2105 Model Exx/Fxx after the problem was
logged?
v Yes, show and repair any related CPI problems. If there are none, use the
Repair Menu, Close a Previously Repaired Problem option for the problem
that sent you here. Then exit this MAP and go to “MAP 1500: Ending a
Service Action” on page 68.
v No, continue with the next step.
4. You can run the CPI diagnostics in two ways:
a. Power the 2105 Model Exx/Fxx off and on again. This tests all four CPI
interfaces.
b. Quiesce/resume a cluster bay and then each host bay. This tests each CPI
interface one at a time.
Connect the service terminal to the working cluster. From the service
terminal Main Service Menu, select:
Utility Menu
Resource Management Menu
Quiesce a Resource
Then display problems needing repair and look for new related problems.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
MAP 4970: Isolating a Software Problem
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The 2105 Model Exx/Fxx functional code detected a software problem that will
require the next level of support to correct. Powering off and then on the cluster or
reloading the SCSI Hard Drive code will not fix it. The next level of support may ask
you to provide them with the information displayed in one or more fields of the
problem. This will help identify the specific problem and the actions needed to
correct it.
This MAP is also called if a LIC feature license failure has been detected by the
2105 code. Another MAP isolates this problem.
Procedure
Is the ESC listed in the problem one of the following?
v 384B - License Failure, license out of sync on each cluster bay
v 384C - License Failure, PAV disabled
v 384D - License Failure, XRC disabled
v 384E - License Failure, PPRC disabled
v 384F - License Failure, Flash Copy disabled
v Yes, go to “MAP 4990: LIC Feature License Failure” on page 404.
v No, call your next level of support.
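The decision in this procedure is a lookup of the ESC against the five license failure codes listed above. A minimal Python sketch of that lookup follows. The dictionary and function names are hypothetical, and the ESC is assumed to be read from the problem log by hand.

```python
# ESC values and descriptions are taken from the list in this MAP.
LICENSE_FAILURE_ESCS = {
    "384B": "License Failure, license out of sync on each cluster bay",
    "384C": "License Failure, PAV disabled",
    "384D": "License Failure, XRC disabled",
    "384E": "License Failure, PPRC disabled",
    "384F": "License Failure, Flash Copy disabled",
}


def next_action(esc):
    """Return the action this MAP gives for the ESC shown in the problem."""
    if esc.strip().upper() in LICENSE_FAILURE_ESCS:
        return "Go to MAP 4990: LIC Feature License Failure"
    return "Call your next level of support"


if __name__ == "__main__":
    print(next_action("384d"))
```

Normalizing the case and surrounding spaces is a convenience for hand-typed values and is not something this guide specifies.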
MAP 4980: Customer Copy Services Problems
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
The customer is experiencing problems or has asked for assistance with ESS Web
Copy Services.
One of the following conditions may be present:
v The customer is unfamiliar with managing Copy Services using the ESS
Specialist
v The customer wants help in managing Copy Services
v ESS Web Copy Services is not properly configured
v The customer has asked you to restart Copy Services
v The customer is not seeing a complete LSS list at the host
Procedure
Use the following table to help determine the action needed to resolve the
customer’s problem. Find the Symptom in the table and then use the Action to
isolate and repair the problem.
Table 31. ESS Web Copy Services Problems
Symptom: The customer is unfamiliar with managing Copy Services with the ESS
Specialist.
Action: Familiarize yourself with the use of the ESS Specialist Copy Services
feature and do one of the following:
v Instruct the customer on how to perform the necessary operations.
v Use the ESS Specialist to manage Copy Services for the customer.
v Instruct the customer to refer to the IBM Enterprise Storage Server Web Users
Interface Guide book, SC26-7346.
Symptom: The customer wants help managing Copy Services.
Action: Use the Copy Services SMIT screen option Copy Services Menu under the
″Configurations Options Menu″ in chapter 8 of the Enterprise Storage Server
Service Guide, Volume 3.
Instruct the customer to refer to the IBM Enterprise Storage Server Web Users
Interface Guide book, SC26-7346.
Symptom: ESS Web Copy Services is not properly configured.
Action: Use the ″Configure Copy Services, with DNS″ in chapter 6 of the
Enterprise Storage Server Service Guide, Volume 2, or the ″Configure Copy
Services, without DNS″ in chapter 6 of the Enterprise Storage Server Service
Guide, Volume 2.
Symptom: The customer has asked you to restart Copy Services.
Action: From the service terminal Main Service Menu, select:
Configure Options Menu
Copy Services Menu
Copy Services Server Menu
Change Server Definitions
Select one of the following:
Reset to Primary
Restarts Copy Services with Primary Server as active server
Reset to Backup
Restarts Copy Services with Backup Server as active server
Symptom: The customer is not seeing a complete LSS list at the host terminal.
Action: Do one of the following:
1. If the customer has asked you to restart Copy Services, from the service
terminal Main Service Menu, select:
Configure Options Menu
Copy Services Menu
Copy Services Server Menu
Change Server Definitions
Select one of the following:
Reset to Primary
Restarts Copy Services with Primary Server as active server
Reset to Backup
Restarts Copy Services with Backup Server as active server
2. The network connecting the primary server to the backup server may be
down. Ask the customer to check the network.
3. The backup server may not be installed or configured.
Has the backup server been installed?
v Yes, the backup server is installed but may not be configured. Use the
″Configure Copy Services, with DNS″ in chapter 6 of the Enterprise Storage
Server Service Guide, Volume 2, or the ″Configure Copy Services, without
DNS″ in chapter 6 of the Enterprise Storage Server Service Guide, Volume 2.
v No, the backup server needs to be installed. A new ESS subsystem needs
to be installed or a Copy Services MES needs to be ordered and installed
on a currently installed backup server ESS subsystem.
Symptom: This MAP has not been able to resolve your problem.
Action: Contact your next level of support.
MAP 4990: LIC Feature License Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Some LIC features require the customer to buy a license. The service
representative enables the feature by loading a customized diskette written for this
2105's serial number. If there is a mismatch, a problem log will be created with an
ESC field that identifies the feature that is disabled.
Procedure
1. Display the problem details screen and identify the ESC and LIC feature that is
disabled.
v 384B - License Failure, license out of sync on each cluster bay
v 384C - License Failure, PAV disabled
v 384D - License Failure, XRC disabled
v 384E - License Failure, PPRC disabled
v 384F - License Failure, Flash Copy disabled
2. Display the LIC feature status screen.
Connect the service terminal to the working cluster. From the service terminal
Main Service Menu, select:
Licensed Internal Code Maintenance Menu
LIC Feature Menu
Display Active LIC Features
3. The LIC feature will be disabled if the Configured Capacity exceeds the Feature
Capacity Limit. If it does, do one of the following:
v The configured capacity must be reduced.
v The customer must purchase more LIC feature capacity. Then a
customized diskette enabling the added capacity must be installed.
4. The LIC feature will be disabled if the LIC Feature Control diskette has not been
created and installed. For more information on how to create the diskette,
refer to ″LIC Feature Control Record Extraction″ in chapter 5 of the
Enterprise Storage Server Service Guide, Volume 2 book.
Note: The LIC features are automatically reloaded as part of the SCSI hard
drive rebuild process.
5. The LIC feature capacities should be the same on both clusters. If they are not,
call the next level of support.
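The two capacity rules in steps 3 and 5 can be written as simple comparisons. The Python sketch below assumes the Configured Capacity and Feature Capacity Limit values are read from the Display Active LIC Features screen and entered by hand in the same unit. The function names are hypothetical and the example numbers are invented for illustration.

```python
def disabled_by_capacity(configured_capacity, feature_capacity_limit):
    """Step 3: the feature is disabled if configured capacity exceeds the limit."""
    return configured_capacity > feature_capacity_limit


def capacities_match(cluster1_limit, cluster2_limit):
    """Step 5: the feature capacities should be the same on both clusters."""
    return cluster1_limit == cluster2_limit


if __name__ == "__main__":
    # Invented example values, both in the same (unspecified) capacity unit.
    print(disabled_by_capacity(configured_capacity=420, feature_capacity_limit=400))
    print(capacities_match(400, 400))
```

A configured capacity equal to the limit is treated as acceptable here, because step 3 only names the case where the configured capacity exceeds the limit.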
MAPs 5XXX: Host Interface Isolation Procedures
Procedures in the MAP 5XXX group of the Isolate chapter cover the host interface
attached to the 2105 Model Exx/Fxx and the internal read/write data paths.
MAP 5000: ESS Specialist Cannot Access Cluster
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
ESS Specialist is accessed by using a web browser from the ESSNet console or
other customer console. The ESS Specialist software runs on each 2105 Model
Exx/Fxx cluster. Both the customer console and the ESSNet console access the
cluster through the ESSNet ethernet hub.
Isolation
1. Does ESS Specialist access work from the ESSNet console?
v Yes, continue with the next step.
v No, go to step 4.
2. Is access working from a customer console (if used)?
v Yes, ensure access works to both cluster bays before determining that the
problem is no longer occurring.
v No, continue with the next step.
3. ESS Specialist works from the ESSNet console but fails from the customer
console. The customer network accesses the cluster bay through an ethernet
connection at the ESSNet console ethernet hub. Check the following:
v Customer is using the proper Hostname for the cluster bay on an intranet.
v Customer is using the proper Hostname and domain name for the cluster bay
on internet.
v Have the customer try the TCP/IP address.
v Have the customer ping the TCP/IP address. If the ping is successful, the
problem is with the domain name server or is another customer or internet
problem.
v Verify that the ESSNet ethernet hub port indicator for the customer network
attachment is on or blinking. This means it is able to communicate with the
customer ethernet hub/connection. The problem is either a failing port on the
ESSNet ethernet hub or more likely a customer network problem. Go to
“MAP 4450: ESSNet Cluster Bay to Customer Network Problem” on
page 354.
4. Ensure that the cluster has ESS Specialist access enabled. When access is
enabled, the InfoServer status is shown as running.
From the service terminal Main Service Menu, select:
Configuration Options Menu
Configure Communications Resources Menu
ESS Specialist Menu
Show ESS Specialist Status
Continue with the next step.
5. Is the InfoServer running?
v Yes, go to “MAP 4440: ESSNet Console to Cluster Bay Problem” on
page 352.
v No, use the Enable / Disable ESS Specialist option to enable it.
MAP 5220: Isolating a SCSI Bus Error
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: To prevent electrostatic discharge, ensure you discharge all SCSI host
cables to the ESD discharge pad, before you plug them into the 2105 Model
Exx/Fxx. The ESD discharge pads are mounted on the front right and left corners of
the 2105 Model Exx/Fxx frame, next to each tailgate. See Figure 147 on page 409
for the location of the ESD discharge pads.
The SCSI bus has an error:
Description
SCSI bus errors can be detected by any SCSI bus card on the interface. The 2105
Model Exx/Fxx SCSI host card will most often detect errors in the signals it
receives. The customer host system SCSI card will most often detect errors in the
signals it receives. The SCSI cables seldom fail, but the SCSI cable connections
may cause errors if they are not properly seated. Errors can also be caused if there
are not terminators on each end of the SCSI cable. The 2105 Model Exx/Fxx SCSI
Host Adapter has a terminator on the card itself.
Isolation
1. Display and repair any 2105 Model Exx/Fxx reported SCSI adapter problems
that may be related to the failure. If none are found, continue with the next
step.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
2. Use the following checks to locate and repair the problem.
3. Check for a fenced condition:
Note: If SCSI parts have been replaced and the customer still does not have
access to some volumes, the original SCSI error could have fenced a
SCSI port.
a. Verify that the SCSI ports are not fenced:
Connect the service terminal to the cluster being serviced.
From the service terminal Main Service Menu, select:
Utility Menu
Resource Management Menu
Show Fenced Resources
b. Reset any fenced SCSI ports:
From the service terminal Main Service Menu, select:
Utility Menu
Resource Management Menu
Reset Fence For a Resource
4. Check that the SCSI host cable is properly connected at each SCSI card.
5. Check that the 2105 Model Exx/Fxx SCSI host card(s) is properly seated.
6. Check that the host system(s) SCSI card(s) is properly seated.
7. Check the termination of the SCSI Bus:
v A SCSI bus interface cable connects two or more SCSI cards. Connectors
at each end of the daisy-chain must be terminated. The 2105 Model
Exx/Fxx SCSI host card must be at one end of the SCSI cable. If two 2105
Model Exx/Fxx SCSI host cards are attached to a SCSI bus interface cable,
they must be at the opposite ends of the SCSI cable. The customer host
SCSI card(s) (one to four) must be in between. SCSI bus terminations are
internal to the 2105 Model Exx/Fxx SCSI Host Card.
v If two 2105 Model Exx/Fxx SCSI host cards are connected to the SCSI bus,
ensure that the host system SCSI card(s) are not configured to terminate
the SCSI bus when the host system is powered off.
8. Check the SCSI ID Settings. There must be no duplicates for the ports
connected to the same SCSI bus cable.
v If two 2105 Model Exx/Fxx SCSI ports are attached to the same SCSI
cable, verify that the SCSI ID assignments in each port are not in conflict.
v Verify that each host SCSI card attached to the SCSI bus is set to a unique
SCSI ID.
v Verify that the host SCSI card SCSI ID assignments are correctly
registered in the 2105 Model Exx/Fxx SCSI port configuration.
9. Check SCSI bus slot parameter settings:
Note: 2105 Model Exx/Fxx SCSI bus parameters are set according to the host
type configuration setting for each 2105 Model Exx/Fxx SCSI port.
These are recorded on the customer worksheets that were used to
install the 2105 Model Exx/Fxx.
a. Verify that the host type setting is correct for each 2105 Model Exx/Fxx
SCSI host card attached to the SCSI bus cable.
b. Verify that the SCSI bus parameter settings that have been configured into
each attached host system SCSI host card are in agreement with the 2105
Model Exx/Fxx SCSI bus parameter settings.
10. SCSI diagnostics:
v The 2105 Model Exx/Fxx has no SCSI diagnostics available to test the SCSI
interface. The customer host system may have SCSI diagnostics that can
be used to test the SCSI interface. Those same diagnostics may have
procedures available to recreate and isolate the problem. Those diagnostics
or procedures can be used now.
v If the problem is not yet isolated, the 2105 Model Exx/Fxx SCSI host card
can be replaced now.
Connect the service terminal to the cluster being serviced. From the service
terminal Main Service Menu, select:
Repair Menu
Replace a FRU
Host Bay FRUs
Follow the guided procedure.
11. Was a problem found and repaired?
v Yes, after the problem is repaired, go to “MAP 1500: Ending a Service
Action” on page 68.
v No, if no problem is found, and the failure still occurs, call the next level of
support.
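Two of the checks above, cable position (step 7) and unique SCSI IDs (step 8), can be expressed as small validation routines. The Python sketch below assumes you have written down, by hand, the order of the cards along the cable and the SCSI ID of each port. The function names, the "ess" and "host" labels, and the port names are hypothetical and do not come from any 2105 tool.

```python
def duplicate_scsi_ids(port_ids):
    """Return {scsi_id: [ports]} for every SCSI ID used by more than one port.

    port_ids maps a port name (any label you choose) to its SCSI ID on one
    SCSI bus cable. An empty result means there are no duplicates.
    """
    ports_by_id = {}
    for port, scsi_id in port_ids.items():
        ports_by_id.setdefault(scsi_id, []).append(port)
    return {sid: sorted(ports) for sid, ports in ports_by_id.items() if len(ports) > 1}


def ess_cards_at_ends(chain):
    """Check cable position rules from step 7.

    chain lists the cards in cable order, each "ess" (2105 SCSI host card)
    or "host" (customer host SCSI card). One 2105 card must be at an end of
    the cable, two 2105 cards must be at opposite ends, and there must be
    one to four customer host cards.
    """
    ess_positions = [i for i, kind in enumerate(chain) if kind == "ess"]
    host_count = sum(1 for kind in chain if kind == "host")
    last = len(chain) - 1
    if not 1 <= host_count <= 4:
        return False
    if len(ess_positions) == 1:
        return ess_positions[0] in (0, last)
    if len(ess_positions) == 2:
        return ess_positions == [0, last]
    return False


if __name__ == "__main__":
    print(duplicate_scsi_ids({"ess-a": 7, "host-1": 6, "host-2": 6}))
    print(ess_cards_at_ends(["ess", "host", "host", "ess"]))
```

These routines only restate the written rules. They do not check termination settings on the host cards or the bus parameter settings in step 9.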
[Figure: front view and top view showing Cluster 1, Cluster 2, the tailgate, and the ESD discharge pads]
Figure 147. 2105 Model Exx/Fxx ESD Discharge Pad Locations (S008339m)
MAP 5230: Isolating a Fixed Block Read Data Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: To prevent electrostatic discharge, ensure you discharge all SCSI host
cables to the ESD discharge pad, before you plug them into the 2105 Model
Exx/Fxx. The ESD discharge pads are mounted on the front right and left corners of
the 2105 Model Exx/Fxx frame, next to each tailgate. See Figure 147 for the
location of the ESD discharge pads.
Description
You are here to resolve a Data Check failure that has been logged with one of the
ESC values listed below. An action to repair hardware or microcode is necessary;
the action required may be to repair another problem record in the log.
This MAP isolates for the following ESCs:
v ESC 3490, customer data sequence number validation error with data LRC.
v ESC 34A0, customer data sequence number validation error without data LRC.
v ESC 34AF, third or later repeat of customer data sequence number validation
error on the same target LBA (Logical Block Address), track or volume.
v ESC 34B0, SCSI Send Diagnostic command initiated data transfer validation
process failure.
v ESC 4960, second occurrence of customer data sequence number validation
error on the same target LBA (Logical Block Address), track or volume.
Isolation
Refer to Table 32 on page 410 for the ESC that requires problem resolution.
Determine the necessary hardware or microcode repair action.
Table 32. SCSI Read Data Failure ESC Repair Table
ESC: 3490
Description: Customer Data Sequence Number validation error. Data transferred
from a DDM to cache memory is not from the expected Logical Block Address
(LBA). The Sequence Number in the received LBA does not match the expected
Sequence Number. Sequence number validation also detected LRC indicating that
the LBA data is defective.
Recommended Action: LRC failures are a higher priority symptom. If the problem
log contains a failure with ESC value 33XX (LRC failure), the recommended
action is to repair the ESC 33XX problem record. If a problem record with ESC
33XX does not exist, then the probable cause for this failure is a Microcode
Logic Error. The recommended action is to contact your next level of support
for fault isolation and repair assistance.
ESC: 34A0, 34AF, or 4960
Description: Customer Data Sequence Number validation error. Data transferred
from a DDM to cache memory is not from the expected Logical Block Address
(LBA). The Sequence Number in the received LBA does not match the expected
Sequence Number. An error has occurred during the reading or writing of data
from the track, volume or array. ESC 34AF indicates that additional Sequence
Number error events have been logged for the same target LBA, track or volume.
Customer repair action may be required to restore data after the hardware
problem has been resolved.
Recommended Action: The recommended action is to contact your next level of
support for fault isolation and repair assistance.
ESC: 34B0
Description: A SCSI Send Diagnostic command initiated data transfer validation
process failed. A write or read data transfer failure would be logged as
another error event and ESC. If no other error has been logged, then this
failure indicates that the data read did not match the test pattern data
written.
Recommended Action: This problem record should only be used to determine a
repair action if the problem log does not contain any other records for a
hardware failure that would be associated with this diagnostic failure SCSI
port, data path and target volume. If you are unable to identify another
hardware repair action, then the recommended action is to contact your next
level of support for fault isolation and repair assistance.
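The priority rule in Table 32 can be sketched as a small lookup. This is a minimal sketch under one reading of the table: the function name is hypothetical, the returned strings paraphrase the table, and the problem log is represented simply as a list of the ESC values it contains.

```python
def map_5230_action(esc, logged_escs):
    """Suggest the next action for a MAP 5230 ESC (paraphrasing Table 32).

    `logged_escs` lists the ESC values of the other records in the problem
    log (assumed input format).
    """
    esc = esc.upper()
    others = [e.upper() for e in logged_escs]
    if esc == "3490":
        # LRC failures (ESC 33XX) are the higher priority symptom.
        lrc = [e for e in others if e.startswith("33")]
        if lrc:
            return "Repair the ESC " + lrc[0] + " problem record."
        return "Probable microcode logic error: contact next level of support."
    if esc in ("34A0", "34AF", "4960"):
        return "Contact next level of support."
    if esc == "34B0":
        return ("Use another hardware problem record for the same SCSI port, "
                "data path and volume if one exists; otherwise contact next "
                "level of support.")
    raise ValueError("ESC " + esc + " is not isolated by MAP 5230")


print(map_5230_action("3490", ["33A1", "4960"]))
```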
MAP 5240: Isolating a Customer Data Check Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
You are here to resolve a Data Check failure that has been logged with one of the
ESC values listed below. An action to repair hardware or microcode is necessary.
This required action will be to repair another problem record in the log.
The failure has caused customer data to be unreadable. The customer must restore
the data after the hardware or microcode repair action is complete.
This MAP isolates for the following ESCs:
v ESC 4910, Customer data check, DDM medium error, single LBA.
v ESC 4920, Customer data check, DDM medium error, multiple LBAs.
v ESC 4930, Customer data check, data LRC, single LBA.
v ESC 4940, Customer data check, data LRC, multiple LBAs.
Isolation
Refer to Table 33 for the ESC that requires problem resolution. Determine the
necessary hardware or microcode repair action.
After the underlying hardware has been repaired, customer repair action will be
required to restore the track:
Fixed Block:
Refer to the Additional Information Message in the problem log for the failed
volume and first failing LBA on track information. Restore this data from
backup.
CKD:
A Media SIM for Media Maintenance Procedure 2 has been sent to the
host. Ask the customer to follow this procedure to return the track to usable
condition, then restore the customer data from backup. Media Maintenance
Procedure 2 is described in “Analyzing a Media SIM”.
If a hardware repair problem log record is not available for this failure, the failure
may be intermittent. If the data failure continues, call your next level of support for
assistance in isolating and repairing the problem.
Table 33. Customer Data Check Failure ESC Repair Table
ESC: 4910 or 4920
Description: Customer Data Check affecting one or more Logical Block Addresses
on the target volume. 4910 indicates one LBA, 4920 indicates more than one
LBA. The SSA device card reported a Medium Error during data transfer from DDM
to cache memory.
Recommended Action: Locate and repair the problem log record with ESC CXXX,
DXXX or EXXX that contains a repair action for the DDM or SSA device card that
is associated with this Data Check.
ESC: 4930 or 4940
Description: Customer Data Check affecting one or more Logical Block Addresses
on the target volume. 4930 indicates one LBA, 4940 indicates more than one
LBA. An LRC check, sequence number check or physical address check detected
during data transfer could not be recovered. Data has been marked defective on
the DDM. Subsequent attempts to read this data will fail.
Recommended Action: Locate and repair any problem log records with ESC 33XX or
34XX.
Analyzing a Media SIM
For information about correcting a failure that causes a media SIM, see the
following chapters in Maintaining IBM Storage Subsystem Media:
1. ″Tools and Techniques Used to Perform Media Maintenance″
2. ″Performing Media Maintenance on SIM Devices″
Note: Before the customer does a media maintenance procedure, the customer
may need to determine the address of the cylinder and head involved in the
failure. Use the SIM portion of an EREP system execution report to obtain
the address (cccchh).
2105 Model Exx/Fxx Media SIM Maintenance: Instruct the customer to perform
the media maintenance procedure indicated in “Media Sim Maintenance Procedure
2”. Also, look at the examples shown in “Example of Media Sim Maintenance
Procedure 2”.
Media Sim Maintenance Procedure 2: The first part of this procedure finds all
tracks with unrecoverable data and supplies information on the allocation of the
user data (for example, dataset names).
The second part of this procedure returns the indicated track to a usable condition.
Data on this track is no longer readable. All subsystem attempts at media
maintenance have been unsuccessful. All attempts to recover the data have been
unsuccessful.
1. Using ICKDSF Release 16 or higher, enter the following commands:
IODELAY SET MSEC(100)
ANALYZE <UNIT() |DDNAME()> NODRIVE SCAN
IODELAY adjusts ICKDSF to run concurrently with customer operations.
ANALYZE scans the volume for data that is not readable or usable.
2. See “Example of Media Sim Maintenance Procedure 2” for the location of the
ESC and addresses of the failing track and head (cccchh) in the Analyze sense
information.
3. For each track that reports an ESC of 49XX, issue the following command (all
on the same line):
INSPECT <UNIT() | DDNAME()> <VFY()|NOVFY>
ASSIGN NOCHECK NOPRESERVE TRACK(cccc,hh)
Warning: The above ICKDSF inspect command will result in the loss of all
customer data on that track.
The NOPRESERVE parameter must be specified for the 2105 Model Exx/Fxx.
The PRESERVE parameter is not valid for the 2105 Model Exx/Fxx. All previous
attempts by the subsystem to recover the data have not been successful.
Although the track will be returned to a usable state, all customer data on the
specified track will be lost when the INSPECT command is run.
Example of Media Sim Maintenance Procedure 2: Run the ICKDSF command
sequence below to locate all tracks with unrecoverable data, to obtain
information on the allocation of the user data, and to return such tracks to a
usable condition. ICKDSF must be at Release 16 or higher. The bold text in the
following example is defined in the note below.
ENTER INPUT COMMAND:
analyze unit(1290) nodrive scan
ANALYZE UNIT(1290) NODRIVE SCAN
ICK00700I DEVICE INFORMATION FOR 1290 IS CURRENTLY AS FOLLOWS:
PHYSICAL DEVICE = XXXX
STORAGE CONTROLLER = XXXX
STORAGE CONTROL DESCRIPTOR = CC
DEVICE DESCRIPTOR = 06
ICK04000I DEVICE IS IN SIMPLEX STATE
ICK01400I 1290 ANALYZE STARTED
ICK01408I 1290 DATA VERIFICATION TEST STARTED
ICK21776I DATAVER TEST: ERROR DURING DATA VERIFICATION
CSW = D07C88 0200FFFF CCW = DE000000 3000FFFF FILEMASK = 1E
SENSE = 80000000 9000010B 00000034 80000004 02007667 FB200F0B 000040E2 0003A401
ICK21401I 1290 SUSPECTED DRIVE PROBLEM
ICK01406I 1290 ANALYZE ENDED
ICK00001I FUNCTION COMPLETED, HIGHEST CONDITION CODE WAS 8
Note: In this example, the ESC is 0F0B and the failing track and head address
(cccchh) is 03A401. The cccc is 03A4 and the hh is 01.
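The field positions used in the note can be sketched in code. This is a minimal sketch whose offsets are inferred only from the single example above (ESC in the last four hex digits of the sixth sense word, cccchh in the last six hex digits of the eighth word); the function name is hypothetical, and the layout should be confirmed against the ICKDSF and sense documentation before relying on it.

```python
def parse_analyze_sense(sense_line):
    """Extract the ESC and the failing cylinder/head from a SENSE line.

    Offsets are assumptions taken from the one example in this guide.
    """
    words = sense_line.split("=", 1)[1].split()
    if len(words) != 8 or any(len(w) != 8 for w in words):
        raise ValueError("expected eight 8-digit hex words of sense data")
    esc = words[5][4:]       # last four hex digits of the sixth word
    cccchh = words[7][2:]    # last six hex digits of the eighth word
    return {"esc": esc, "cccc": cccchh[:4], "hh": cccchh[4:]}


sense = ("SENSE = 80000000 9000010B 00000034 80000004 "
         "02007667 FB200F0B 000040E2 0003A401")
fields = parse_analyze_sense(sense)
print(fields)
```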
Common ICKDSF Messages:
ICK31054I - Device not supported for specific function
Ensure that the parameters specified in the media maintenance procedure
are correct and rerun the ICKDSF media maintenance procedure.
ICK12155I - Parameter ignored for device type (parameter)
The parameter identified is not valid for the 2105 Model Exx/Fxx. This
parameter is ignored and processing continues. No action is needed.
MAP 5250: Isolating a Meta Data Check Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
You are here to resolve a Data Check failure that has been logged with one of the
ESC values listed below. An action to repair hardware or microcode is necessary.
This required action will be to repair another problem record in the log.
This MAP isolates for the following ESCs:
v ESC 4980, Meta data check, DDM medium error, single LBA.
v ESC 4990, Meta data check, DDM medium error, multiple LBA.
v ESC 49A0, Meta data check, data LRC, single LBA.
v ESC 49B0, Meta data check, data LRC, multiple LBA.
Isolation
Refer to Table 34 on page 414 for the ESC that requires problem resolution.
Determine the necessary hardware or microcode repair action.
Data will be recovered by internal microcode. No data repair action is required.
If a hardware repair problem log record is not available for this failure, the failure
may be intermittent. If the data failure continues, call your next level of support for
assistance in isolating and repairing the problem.
Table 34. Meta Data Check Failure ESC Repair Table
ESC: 4980 or 4990
Description: Meta Data Check affecting one or more Logical Block Addresses on
the target volume. 4980 indicates one LBA, 4990 indicates more than one LBA.
The SSA device card reported a Medium Error during data transfer from DDM to
cache memory.
Recommended Action: Locate and repair the problem log record with ESC CXXX,
DXXX or EXXX that contains a repair action for the DDM or SSA device card that
is associated with this Data Check.
ESC: 49A0 or 49B0
Description: Meta Data Check affecting one or more Logical Block Addresses on
the target volume. 49A0 indicates one LBA, 49B0 indicates more than one LBA.
An LRC check detected during data transfer from DDM to cache memory could not
be recovered.
Recommended Action: Locate and repair the problem log record with ESC 33XX
that contains a repair action for the DDM or SSA device card that is
associated with this data check.
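The eight data check ESCs handled by MAP 5240 and MAP 5250 follow a regular pattern, which the sketch below transcribes into a lookup. The dictionary and function names are hypothetical; the values paraphrase the ESC lists and the repair tables of the two MAPs.

```python
# ESC -> (data type, detected condition, extent), from MAP 5240 and MAP 5250.
DATA_CHECK_ESCS = {
    "4910": ("customer", "medium error", "single LBA"),
    "4920": ("customer", "medium error", "multiple LBAs"),
    "4930": ("customer", "data LRC", "single LBA"),
    "4940": ("customer", "data LRC", "multiple LBAs"),
    "4980": ("meta", "medium error", "single LBA"),
    "4990": ("meta", "medium error", "multiple LBAs"),
    "49A0": ("meta", "data LRC", "single LBA"),
    "49B0": ("meta", "data LRC", "multiple LBAs"),
}


def describe_data_check(esc):
    """Summarize where to look next for a 49XX data check ESC."""
    kind, condition, extent = DATA_CHECK_ESCS[esc.upper()]
    if condition == "medium error":
        related = "CXXX, DXXX or EXXX"   # DDM or SSA device card records
    elif kind == "customer":
        related = "33XX or 34XX"
    else:
        related = "33XX"
    return {
        "map": "5240" if kind == "customer" else "5250",
        "extent": extent,
        "related_escs": related,
        # Customer data must be restored by the customer; meta data is
        # recovered by internal microcode.
        "customer_restore_needed": kind == "customer",
    }


print(describe_data_check("49a0"))
```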
MAP 5300: ESCON Link Fault
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Fiber Optic Cable Handling Precautions
CAUTION:
Do not look into the end of a fiber optic cable or into a fiber optic receptacle.
Eye injury can result. To verify the continuity of a fiber optic cable, use an
optical light source and a power meter. Although shining light into one end
and looking into the other end of a disconnected optical fiber to verify the
continuity of optic fibers may not injure the eye, this procedure is potentially
dangerous. Therefore, verifying the continuity of optical fibers by shining light
into one end and looking into the other end is not recommended. (1061)
Note: This notice is translated into selected languages. See ″Translation of
Cautions and Danger Notices″ in chapter 11 of the Enterprise Storage Server
Service Guide, Volume 3.
Attention: Fiber optic cables are easily damaged from fiber breakage. The cable
connectors also must be clean to perform correctly. Observe the following
precautions to prevent damage when you handle fiber optic cables:
1. Save all the plastic connector covers for later use. These covers can be used to
protect the ESCON cable connectors when you remove the ESCON adapter
card or when you store the cables.
2. Do not remove the protective cover plugs from the connector ends until you are
ready to insert the connector into a card. You may have to remove the cover to
feed the cable through the tailgate.
3. Before you insert the connector into a card, ensure that you clean the connector
end faces. Use the fiber optic cleaning procedure specified in the fiber optic
connector cleaning kit (New P/N 46G6844 or Old P/N 5453521).
4. Do not pull on the connector.
5. Do not bend the cable to a radius smaller than 12mm (0.5 in).
Description
Link incidents are problems that are not automatically detected, isolated, and
reported by any one single node on the ESCON link. They occur on an interface
and may cause multiple nodes to detect different types of link incidents. Each node
detecting and reporting a link incident will generate its own link incident.
Link incidents detected by the storage facility may be displayed from the error log.
They are also available in the EREP Event History and Detail Edit reports.
Fault isolation of link incidents requires the combined use of product and
system documentation:
v Enterprise Systems Link Fault Isolation book, form number SY22-9533
v Maintenance Information for S/390 Fiber Optic Links (ESCON, FICON, Coupling
Links, and Open System Adapters) book, form number SY27-2597.
Ensure that both documents are available for problem determination.
Isolation
1. Display and repair any 2105 Model Exx/Fxx reported ESCON adapter problems
that may be related to the failure. If none are found, continue with the next step.
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
v If a problem is found, continue with the next step.
v If no problem is found, and the failure still occurs, call the next level of
support.
2. Obtain the link incident reports from either the ES Connection Analyzer output
or the EREP Link Maintenance Information Event History Report. Start the
problem determination using MAP 0100 in the Enterprise Systems Link Fault
Isolation book, form number SY22-9533. Use this MAP to determine the most
probable failing part of the link.
v If MAP 0100 finds that the control unit node IS the most probable FRU,
continue in MAP 0100.
v If the control unit node is NOT the most probable FRU, continue with the next
step.
3. Check that the ESCON cable is properly connected at each ESCON card.
v If it IS properly connected, continue with the next step.
v If it is NOT connected correctly, reconnect it then continue with the next step.
4. Run the 2105 Model Exx/Fxx optical wrap tests on the failing link:
From the service terminal Main Service Menu, select:
Machine Test Menu
Interface Cards Menu
ESCON Host Ports Menu
ESCON Port Optical Wrap Test
Select the SA interface to be tested, and follow the instructions on the screen to
run the test.
Did the test run successfully?
v Yes, continue with the next step.
v No, use the repair process to replace the FRU:
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
Repair any ESCON adapter problems shown.
5. Check the optical transmitter output level. Go to “MAP 5320: ESCON Optical
Power Measurement” on page 418, and return here when that test has been
completed.
Was the optical transmitter output correct?
v Yes, continue with the next step.
v No, use the repair process to replace the FRU:
Repair Menu
Replace a FRU
Host Bay FRUs
Select the host bay containing the ESCON card.
Select the ESCON card.
Replace the ESCON card with this procedure.
6. Check that the optical receiver is receiving a correct signal level. Go to “MAP
5320: ESCON Optical Power Measurement” on page 418 and return here when
that test has been completed.
Was the optical receiver input level correct?
v Yes, optical power testing is complete. Continue with the next step.
v No, reconnect the link. Additional problem determination is needed to isolate
the fault. Return to MAP 0120 in the Enterprise Systems Link Fault Isolation
book, form number SY22-9533.
7. Are you working on a Bit Error Rate (BER) incident?
v Yes, go to “MAP 5310: ESCON Bit Error Validation” to do bit error validation
testing.
v No, additional link problem determination is needed.
a. Ensure that all optical link cables are reconnected.
b. Return to MAP 0120 in the Enterprise Systems Link Fault Isolation book,
form number SY22-9533.
MAP 5310: ESCON Bit Error Validation
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
Bit Error Rate Threshold incidents are caused by specific conditions at an interface
or along a line which can cause bits to be received or interpreted incorrectly. These
bit errors are counted, and when a specific number is reached (threshold
exceeded), the link is operating in a degraded mode.
Bit errors are counted by each node attached on a link. You must determine which
node(s) in a link have detected a threshold exceeded condition to identify the link or
nodes causing the incident.
Isolation
1. Determine what type of error was reported by the customer.
Was the customer-reported error a ″Bit Error Threshold Exceeded″ (BER)
detected at the ATTACHED node?
v Yes, continue with step 4.
v No, display problems using the following service panel options:
From the service terminal Main Service Menu, select:
Repair Menu
Show / Repair Problems Needing Repair
Are there any bit error rate problems (ESC=356A) for the failing link?
– Yes, continue with the next step.
– No, additional link problem determination is needed. Ensure that all optical
link cables are reconnected, then return to MAP 0120 in the Enterprise
Systems Link Fault Isolation book, form number SY22-9533.
2. Test the bit error rate:
v Reconnect the optical link cables to the subsystem.
v Run the Bit Error Rate Test on the failing link:
From the service terminal Main Service Menu, select:
Machine Test Menu
Interface Cards Menu
ESCON Host Ports Menu
ESCON Port Optical Bit-Error-Rate Test
Select the SA interface to be tested, and follow the instructions on the screen
to run the test.
Did the test run successfully?
– Yes, cancel any outstanding Bit Error Rate problems logged for this link
and resume any quiesced links. The call is complete.
– No, continue with the next step.
3. Determine how many times the ESCON Port Optical Bit-Error-Rate Test has
been run.
Has this test been run only one time?
v Yes, clean the fiber optic connectors and run this test again. Use the fiber
optic cleaning procedure specified in the fiber optic connector cleaning kit
(New P/N 46G6844 or Old P/N 5453521).
v No, cancel any outstanding Bit Error Rate problems logged for this link,
resume any quiesced links, then go to MAP 0120 in the Enterprise Systems
Link Fault Isolation book, form number SY22-9533.
4. Test the bit error rate:
v Install the optical wrap tool in the link connector for the failing link addresses.
v Run the Bit Error Rate Test on the failing link:
From the service terminal Main Service Menu, select:
Machine Test Menu
Interface Cards Menu
ESCON Host Ports Menu
ESCON Port Optical Bit-Error-Rate Test
Did the test run successfully?
– Yes, ensure that all optical link cables are reconnected, cancel any
outstanding Bit Error Rate problems (ESC=356A) logged for this link,
resume any quiesced links, then go to MAP 0120 in the Enterprise
Systems Link Fault Isolation book, form number SY22-9533.
– No, continue with the next step.
5. Determine how many times the ESCON Port Optical Bit-Error-Rate Test has
been run.
Has this test been run only one time?
v Yes, clean the fiber optic connectors and run this test again. Use the fiber
optic cleaning procedure specified in the fiber optic connector cleaning kit
(New P/N 46G6844 or Old P/N 5453521).
v No, use the repair process to replace the FRU.
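The retry logic of steps 4 and 5 above (wrap tool installed) can be sketched as follows. This is only an illustration of the decision flow: `run_test` and `clean_connectors` are hypothetical callables standing in for the service terminal test and the manual cleaning procedure, and the returned strings paraphrase the MAP.

```python
def ber_wrap_test_outcome(run_test, clean_connectors):
    """Follow steps 4 and 5: test, clean once on failure, then retest.

    `run_test` returns True when the Bit-Error-Rate test runs successfully.
    """
    for attempt in (1, 2):
        if run_test():
            return ("reconnect cables, cancel ESC 356A problems, "
                    "resume quiesced links, go to MAP 0120")
        if attempt == 1:
            # First failure: clean the fiber optic connectors and rerun.
            clean_connectors()
    return "replace the FRU"


# Hypothetical run: the test fails once, then passes after cleaning.
results = iter([False, True])
outcome = ber_wrap_test_outcome(lambda: next(results), lambda: None)
print(outcome)
```

Steps 2 and 3 (cables reconnected, no wrap tool) use the same clean-and-retry pattern but end differently: a passing test completes the call, and a second failure leads to MAP 0120 rather than to FRU replacement.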
MAP 5320: ESCON Optical Power Measurement
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Attention: To prevent severe disruption of customer operations, ensure that the
cluster is quiesced. Also verify that each affected ESCON CHPID is configured off
or access is blocked to each affected ESCON port before you run this test.
Ensure that you read the “Fiber Optic Cable Handling Precautions” on page 414
before you run this test.
Description
This MAP contains two procedures, “Isolation Procedure 1: Optical Transmitter
Measurement” on page 419 and “Isolation Procedure 2: Optical Receiver
Measurement” on page 420. These procedures measure the optical power at the
2105 Model Exx/Fxx ESCON card and the customer’s ESCON port cable using the
optical power meter (P/N 18F7005). The coupler and test cable are part of the fiber
optic test support kit (P/N 18F6953).
Figure 148. Measuring Optical Transmit Power (S008185m)
Isolation Procedure 1: Optical Transmitter Measurement: This procedure
measures the optical power transmitted from the 2105 Model Exx/Fxx ESCON card
through a short test cable (P/N 18F6948).
Note: Clean the fiber optic connectors as described in the cleaning instructions in
the fiber optic cleaning kit (New P/N 46G6844 or Old P/N 5453521) before
connecting or reconnecting the fiber optic cables.
1. Ensure that the host bay containing the 2105 Model Exx/Fxx ESCON card is
powered on.
2. Disconnect the fiber optic cable connector from the duplex connector on the
2105 Model Exx/Fxx ESCON card.
3. Connect the duplex connector of the optical power meter test cable to the 2105
Model Exx/Fxx ESCON card duplex connector (see Figure 148).
If the optical power meter has not been previously turned on, zeroed, and set to
the correct scale, set the meter using “Optical Power Meter Setup” on page 421.
After the meter is set, insert the black biconic connector of the test cable, P/N
18F6948, into the receptacle on the top of the power meter.
4. Use the optical power meter to obtain a reading. The power reading should be
at least -21 dBm. (Note that -20 dBm is more than -21 dBm. A reading of
-22 dBm, for example, indicates that the transmitter is failing.)
Record the actual measurement value for possible use during the link fault
isolation procedures.
5. Disconnect the test cable from the 2105 Model Exx/Fxx ESCON card.
6. Return to the procedure that sent you here.
Figure 149. Measuring Optical Receive Power (s008186n)
Isolation Procedure 2: Optical Receiver Measurement: This procedure
measures the power received at the end of the customer’s ESCON link cable (input
into optical receiver).
Note: Always clean the fiber optic connectors as described in the cleaning
instructions in the fiber optic cleaning kit (New P/N 46G6844 or Old P/N
5453521) before connecting or reconnecting the fiber optic cables.
1. Ensure that the device on the other end of the link is powered on.
2. Disconnect the fiber optic cable connector from the duplex connector on the
2105 Model Exx/Fxx ESCON card.
3. Connect the duplex connector of the customer’s fiber optic cable (the duplex
connector that was removed from the 2105 Model Exx/Fxx ESCON card) into
one side of the duplex-to-duplex test coupler, P/N 18F6952 (see Figure 149).
4. Connect the duplex connector of the optical power meter test cable into the
other side of the duplex-to-duplex test coupler.
If the optical power meter has not been previously turned on, zeroed, and set to
the correct scale, set the meter using “Optical Power Meter Setup” on page 421.
After the meter is set, insert the black biconic connector of the test cable, P/N
18F6948, into the receptacle on the top of the power meter.
5. Use the optical power meter to obtain a reading. The power reading should be
at least -29.0 dBm (-28.0 dBm is more than -29.0 dBm).
Record the actual measurement value for possible use later during the link fault
isolation procedures.
6. Disconnect the customer fiber optic channel cable from the coupler and
reconnect the cable to the 2105 Model Exx/Fxx ESCON card.
7. Return to the procedure that sent you here.
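Because dBm readings are negative, "at least" comparisons are easy to get backwards. The sketch below states the two pass criteria from Isolation Procedures 1 and 2 as code; the constant and function names are hypothetical, and only the two threshold values come from this MAP.

```python
MIN_TRANSMIT_DBM = -21.0   # Isolation Procedure 1: ESCON card transmitter
MIN_RECEIVE_DBM = -29.0    # Isolation Procedure 2: end of customer link cable


def optical_level_ok(reading_dbm, minimum_dbm):
    """Return True when the reading is at least the minimum.

    A less negative value means more power, so -20 dBm passes a -21 dBm
    minimum and -22 dBm fails it.
    """
    return reading_dbm >= minimum_dbm


print(optical_level_ok(-20.0, MIN_TRANSMIT_DBM))
print(optical_level_ok(-22.0, MIN_TRANSMIT_DBM))
print(optical_level_ok(-28.0, MIN_RECEIVE_DBM))
```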
Optical Power Meter Setup: Use this procedure only to do the initial setup of the
optical power meter (P/N 18F7005):
1. Turn the power meter on.
2. Set the meter to 1300 nanometers (nm).
3. Zero the meter.
4. Set the meter to display the dBm scale.
Note: Do not hold down a push-button for more than one-half second. When held
down for more than approximately three seconds, the push-button generates
results different from those needed.
1. Ensure that the black cap is over the biconic receptacle at the top of the power
meter.
2. Press Power On/Off. AUTO OFF will be displayed and the meter will turn off if
no push-button is pressed in ten minutes. Allow a two minute warm-up period.
3. If the meter does not display 1300 nm, press the λ (lambda) push-button
repeatedly until 1300 nm is displayed.
4. Press ZERO. Two displays will be seen:
v A value between 0.30 and 0.70 nW (nanowatts).
v ZERO will blink after a short time, indicating that the meter is properly set to
zero.
If the above indicators do not display and Hi or Lo is displayed after pressing
ZERO, press ZERO again. Using a small screwdriver, adjust the trim pot that is
next to the biconic receptacle at the top of the meter until a value between
0.30 and 0.70 nW is displayed. Set the value as close to 0.50 nW as possible.
Press ZERO again to zero the meter.
5. The meter must also display dBm (decibel, based on one milliwatt). If nW is
displayed, press dBm/Watt.
Continue with one of the following:
v “Isolation Procedure 1: Optical Transmitter Measurement” on page 419
v “Isolation Procedure 2: Optical Receiver Measurement” on page 420
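The meter setup uses two scales, nanowatts for zeroing and dBm for measurement. The relationship between them is the standard definition of dBm (decibels relative to one milliwatt), sketched below with hypothetical function names. It is given only to relate the units and is not part of the service procedure.

```python
import math


def nw_to_dbm(power_nw):
    """Convert nanowatts to dBm; 1 nW is one millionth of 1 mW (-60 dBm)."""
    return 10.0 * math.log10(power_nw) - 60.0


def dbm_to_nw(power_dbm):
    """Convert dBm back to nanowatts."""
    return 10.0 ** ((power_dbm + 60.0) / 10.0)


print(nw_to_dbm(1.0))
print(round(dbm_to_nw(-21.0)))
```

On this scale the -21 dBm transmitter minimum corresponds to roughly 7900 nW, several orders of magnitude above the 0.30 to 0.70 nW shown while zeroing the capped meter.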
MAP 5340: CKD Read Data Failure
Attention: This is not a stand-alone procedure.
Customer disruption may occur if microcode and power boundaries are not in the
proper conditions for this service action. Ensure that you start all service activities in
“Chapter 2: Entry MAP for All Service Actions” on page 29 in the Enterprise Storage
Server Service Guide, Volume 1.
Description
You are here to resolve a Data Path failure that has been logged with one of the
ESC values listed below. An action to repair hardware or microcode is necessary.
The action may require the repair of another problem record in the log. The failure
may have caused customer data to be unreadable. If this occurs the customer must
restore the data after the hardware or microcode repair action is complete.
This MAP isolates for the following ESCs:
v ESC 334B, physical address validation error.
v ESC 334C, third or later repeat of physical address validation error on the same
physical address.
v ESC 4970, second occurrence of physical address validation error on the same
physical addr