AssuredSAN 3000 Series Service Guide

Service Guide
Nexio® Farad™ 2300 Series
April 2014
175-100431-01
Delivering the Moment
Publication Information
© 2014 Imagine Communications Corp. Proprietary and Confidential.
Imagine Communications considers this document and its contents to be proprietary and confidential. Except for
making a reasonable number of copies for your own internal use, you may not reproduce this publication, or any
part thereof, in any form, by any method, for any purpose, or in any language other than English without the
written consent of Imagine Communications. All other uses are illegal.
This publication is designed to assist in the use of the product as it exists on the date of publication of this manual,
and may not reflect the product at the current time or an unknown time in the future. This publication does not in
any way warrant description accuracy or guarantee the use for the product to which it refers. Imagine
Communications reserves the right, without notice, to make such changes in equipment, design, specifications,
components, or documentation as progress may warrant to improve the performance of the product.
Trademarks
6800+™, ADC™, CCS Navigator™, Channel ONE™, ChannelView™, ClipSync™, Delay™, D-Series™, D-Series DSX™,
Deliver the Moment™, Delivering the Moment™, FAME™, Farad™, G8™, G-Scribe™, HView™, IconMaster™,
IconLogo™, IconStation™, IconKey™, InfoCaster™, InfoCaster Creator™, InfoCaster Manager™, InfoCaster Player™,
InstantOnline™, Invenio®, Live-Update™, mCAPTURE™, Magellan™, Magellan CCS Navigator™, Magellan Q-SEE™,
MultiService SDN™, NetPlus™, NetVX™, NewsForce™, Nexio® G8™, Nexio AMP® ChannelView™, Nexio® Channel
ONE™, Nexio® ClipSync™, Nexio® Delay™, Nexio® Digital Turnaround Processor™, Nexio® Farad™, Nexio® G-Scribe™,
Nexio® IconKey™, Nexio® IconLogo™, Nexio® IconMaster™, Nexio® IconStation™, Nexio® InfoCaster™,
Nexio® InfoCaster Creator™, Nexio® InfoCaster Manager™, Nexio® InfoCaster Player™, Nexio® InfoCaster Traffic™,
Nexio® InstantOnline™, Nexio® mCAPTURE™, Nexio® NewsForce™, Nexio® NXIQ™, Nexio® Playlist™, Nexio®
Remote™, Nexio® RTX Net™, Nexio® TitleMotion™, Nexio® TitleOne™, Nexio® Velocity ESX™, Nexio® Velocity
PRX™, Nexio® Velocity XNG™, Nexio® Volt™, OPTO+™, Panacea™, Platinum™, Playlist™, Predator II-GRF™, Predator
II-GX™, Punctuate™, Remote™, RTX Net™, QuiC™, Q-SEE™, SD-STAR™, Selenio™, Selenio 6800+™, SelenioNext™,
Selenio X50™, Selenio X85™, Selenio X100™, TitleMotion™, TitleOne™, Velocity ESX™, Velocity PRX™, Velocity
XNG™, Versio™, Videotek® SD-STAR™, X50™, and X85™ are trademarks of Imagine Communications or its
subsidiaries.
Altitude Express®, Connectus®, Enabling PersonalizedTV®, ICE® Broadcast System, ICE Illustrate®, ICE-Q®
algorithms, ICEPAC®, Imagine ICE®, Inscriber®, Inscriber® Connectus®, Invenio®, NEO®, Nexio®, Nexio AMP®,
PersonalizedTV®, RouterWorks®, Videotek®, Videotek® ASI-STAR®, Videotek® GEN-STAR®, and Videotek® HD-STAR®
are registered trademarks of Imagine Communications or its subsidiaries.
Microsoft® and Windows® are registered trademarks of Microsoft Corporation. HD-BNC is a trademark of
Amphenol Corporation. Some products are manufactured under license from Dolby Laboratories. Dolby and the
double-D symbol are registered trademarks of Dolby Laboratories. DTS Neural audio products are manufactured
under license from DTS Licensing Limited. DTS and the Symbol are registered trademarks & the DTS Logos are
trademarks of DTS, Inc. © 2008-2010 DTS, Inc. All other trademarks and trade names are the property of their
respective companies.
Contact Information
Imagine Communications has office locations around the world. For locations and contact information see:
http://www.imaginecommunications.com/contact-us/
Support Contact Information
For support contact information see:
Support Contacts: http://www.imaginecommunications.com/services/technical-support/
eCustomer Portal: http://support.imaginecommunications.com
Contents
About this guide . . . . . . . . . . . . . . . . . . . . 11
Intended audience . . . . . . . . . . . . . . . . . . . . 11
Prerequisites . . . . . . . . . . . . . . . . . . . . 11
Related documentation . . . . . . . . . . . . . . . . . . . . 11
Document conventions and symbols . . . . . . . . . . . . . . . . . . . . 12
1 Fault isolation methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Basic steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Options available for performing basic steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Performing basic steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2 Troubleshooting using RAIDar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Problems using RAIDar to access a storage system . . . . . . . . . . . . . . . . . . . . 15
Determining storage-system status . . . . . . . . . . . . . . . . . . . . 15
Viewing information about all vdisks . . . . . . . . . . . . . . . . . . . . 16
Viewing information about a vdisk . . . . . . . . . . . . . . . . . . . . 16
Vdisk properties . . . . . . . . . . . . . . . . . . . . 16
Disk properties . . . . . . . . . . . . . . . . . . . . 17
Viewing information about an enclosure . . . . . . . . . . . . . . . . . . . . 18
Enclosure properties . . . . . . . . . . . . . . . . . . . . 18
Disk properties . . . . . . . . . . . . . . . . . . . . 18
Power supply properties . . . . . . . . . . . . . . . . . . . . 19
Controller module properties . . . . . . . . . . . . . . . . . . . . 19
Controller module: network port properties . . . . . . . . . . . . . . . . . . . . 20
Controller module: host port properties . . . . . . . . . . . . . . . . . . . . 20
Controller module: expansion port properties . . . . . . . . . . . . . . . . . . . . 20
Controller module: CompactFlash properties . . . . . . . . . . . . . . . . . . . . 20
Drive enclosure: I/O Module properties . . . . . . . . . . . . . . . . . . . . 20
I/O Module: In port properties . . . . . . . . . . . . . . . . . . . . 21
I/O Module: Out port properties . . . . . . . . . . . . . . . . . . . . 21
Viewing the system event log . . . . . . . . . . . . . . . . . . . . 21
Clearing disk metadata (not supported) . . . . . . . . . . . . . . . . . . . . 22
Isolating faulty disk drives . . . . . . . . . . . . . . . . . . . . 23
Identifying a faulty disk drive . . . . . . . . . . . . . . . . . . . . 23
Reviewing the event logs . . . . . . . . . . . . . . . . . . . . 23
Reconstructing a vdisk . . . . . . . . . . . . . . . . . . . . 24
Problems scheduling tasks (not supported) . . . . . . . . . . . . . . . . . . . . 24
Effect of changing the date and time . . . . . . . . . . . . . . . . . . . . 24
Errors associated with scheduling tasks . . . . . . . . . . . . . . . . . . . . 25
Correcting enclosure IDs . . . . . . . . . . . . . . . . . . . . 25
Problems after power-on or restart . . . . . . . . . . . . . . . . . . . . 25
3 Troubleshooting using the CLI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Isolating data path faults . . . . . . . . . . . . . . . . . . . . 27
Changing fault isolation settings . . . . . . . . . . . . . . . . . . . . 27
Resetting expander error counters . . . . . . . . . . . . . . . . . . . . 27
Disabling or enabling a PHY . . . . . . . . . . . . . . . . . . . . 27
Disabling or enabling PHY isolation . . . . . . . . . . . . . . . . . . . . 27
Isolating internal data path faults . . . . . . . . . . . . . . . . . . . . 28
Checking PHY status . . . . . . . . . . . . . . . . . . . . 28
Resolving PHY faults . . . . . . . . . . . . . . . . . . . . 28
Isolating external data path faults . . . . . . . . . . . . . . . . . . . . 29
Resolving external data path faults . . . . . . . . . . . . . . . . . . . . 29
Resetting a host channel on an FC storage system . . . . . . . . . . . . . . . . . . . . 29
Command reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Viewing help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
abort scrub (not supported). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
clear cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
clear disk-metadata (not supported) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
clear expander-status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
clear events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
rescan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
reset host-link . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
restart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
restore defaults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
set debug-log-parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
set expander-fault-isolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
set expander-phy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
set led . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
set protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
show debug-log-parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
show events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
show expander-status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
show frus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
show host-parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
show protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
show redundancy-mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
show sensor-status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
show vdisk-statistics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
show volume-statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4 Troubleshooting using event logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Events and event messages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Viewing the event log in the CLI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Viewing the event log in RAIDar. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Viewing an event log saved from RAIDar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Reviewing event logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Saving log information to a file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5 Troubleshooting using system LEDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Using enclosure status LEDs – front panel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Using disk drive module LEDs – front panel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Using controller module host port LEDs – rear panel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Using the controller module expansion port status LED – rear panel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Using controller module network port LEDs – rear panel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Using controller module status LEDs – rear panel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Using power supply module LEDs – rear panel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Using expansion module LEDs – rear panel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Diagnostic steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Is the enclosure front panel “Fault/Service Required” LED amber? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Is the controller-back-panel “FRU OK” LED lit? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Is the controller-back-panel “Fault/Service Required” LED amber?. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Are both disk-drive-module LEDs off? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Is the disk-drive-module “Fault” LED amber? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Is a connected host port’s “Host Link Status” LED lit?. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Is a connected port’s “Expansion Port Status” LED lit? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Is a connected port’s “Network Port Link Status” LED lit? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Is the power supply’s “Input Power Source” LED lit?. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Is the “Voltage/Fan Fault/Service Required” LED amber? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Isolating a host-side connection fault . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Isolating a controller module expansion port connection fault . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6 Troubleshooting and replacing FRUs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
ESD. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Preventing ESD . . . . . . . . . . . . . . . . . . . . 73
Grounding methods to prevent ESD . . . . . . . . . . . . . . . . . . . . 73
Replacing chassis FRU components . . . . . . . . . . . . . . . . . . . . 73
Replacing a controller module or expansion module . . . . . . . . . . . . . . . . . . . . 74
Before you begin . . . . . . . . . . . . . . . . . . . . 74
Configuring PFU . . . . . . . . . . . . . . . . . . . . 75
Verifying component failure . . . . . . . . . . . . . . . . . . . . 75
Shutting down a controller module . . . . . . . . . . . . . . . . . . . . 75
Removing a controller module or expansion module . . . . . . . . . . . . . . . . . . . . 76
Installing a controller module or expansion module . . . . . . . . . . . . . . . . . . . . 78
Verifying component operation . . . . . . . . . . . . . . . . . . . . 78
Swapping controllers in the same enclosure . . . . . . . . . . . . . . . . . . . . 79
Updating firmware . . . . . . . . . . . . . . . . . . . . 79
Replacing a disk drive module . . . . . . . . . . . . . . . . . . . . 80
Air management modules (not supported) . . . . . . . . . . . . . . . . . . . . 80
Before you begin . . . . . . . . . . . . . . . . . . . . 80
Verifying component failure . . . . . . . . . . . . . . . . . . . . 80
Removing a disk drive module . . . . . . . . . . . . . . . . . . . . 81
Installing a disk drive module . . . . . . . . . . . . . . . . . . . . 81
Determine if a disk is missing . . . . . . . . . . . . . . . . . . . . 83
Verifying component operation . . . . . . . . . . . . . . . . . . . . 83
Replacing a power supply module . . . . . . . . . . . . . . . . . . . . 83
Before you begin . . . . . . . . . . . . . . . . . . . . 84
Verifying component failure . . . . . . . . . . . . . . . . . . . . 84
PSUs . . . . . . . . . . . . . . . . . . . . 84
Removing a PSU . . . . . . . . . . . . . . . . . . . . 85
Installing a PSU . . . . . . . . . . . . . . . . . . . . 86
Connecting a power cable . . . . . . . . . . . . . . . . . . . . 86
Verifying component operation . . . . . . . . . . . . . . . . . . . . 87
Replacing an FC transceiver . . . . . . . . . . . . . . . . . . . . 87
Before you begin . . . . . . . . . . . . . . . . . . . . 87
Verifying component failure . . . . . . . . . . . . . . . . . . . . 87
Removing an SFP module . . . . . . . . . . . . . . . . . . . . 88
Installing an SFP module . . . . . . . . . . . . . . . . . . . . 88
Verifying component operation . . . . . . . . . . . . . . . . . . . . 89
Replacing a controller enclosure chassis . . . . . . . . . . . . . . . . . . . . 89
Before you begin . . . . . . . . . . . . . . . . . . . . 89
Verifying component failure . . . . . . . . . . . . . . . . . . . . 89
Preparing to remove a damaged storage enclosure chassis . . . . . . . . . . . . . . . . . . . . 90
Removing a damaged storage enclosure chassis from the rack . . . . . . . . . . . . . . . . . . . . 91
Installing the replacement storage enclosure chassis in the rack . . . . . . . . . . . . . . . . . . . . 91
Completing the process . . . . . . . . . . . . . . . . . . . . 91
Verifying component operation . . . . . . . . . . . . . . . . . . . . 92
7 Voltage and temperature warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Resolving voltage and temperature warnings . . . . . . . . . . . . . . . . . . . . 93
Sensor locations . . . . . . . . . . . . . . . . . . . . 93
Power supply sensors . . . . . . . . . . . . . . . . . . . . 93
Cooling fan sensors . . . . . . . . . . . . . . . . . . . . 93
Temperature sensors . . . . . . . . . . . . . . . . . . . . 94
Power supply module voltage sensors . . . . . . . . . . . . . . . . . . . . 94
A Event descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Events and event messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Event format in this appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Resources for diagnosing and resolving problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Event descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Troubleshooting steps for leftover disk drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Using the trust command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Power supply faults and recommended actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Events sent as indications to SMI-S clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
B System LEDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
12-disk enclosure front panel LEDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Disk drive LEDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Controller enclosure: Rear panel layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
NEXIO Farad 2300 Controller Modules: Rear panel LEDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Power supply LEDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
C Available FRUs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Product overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
FRUs addressing 12-drive enclosures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Enclosure bezel for 12-drive model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6
Contents
Figures
1 Disengaging a controller module
2 Extracting a controller module
3 Removing a controller module
4 Inserting a controller module
5 Disengaging a disk drive module or blank
6 Removing a disk drive module or blank
7 Installing a disk drive module or blank
8 AC PSU
9 Removing a PSU
10 Orienting a PSU
11 Sample SFP connector
12 Disconnect fibre-optic interface cable from SFP
13 Flip SFP actuator upwards
14 Flip SFP actuator downwards
15 Connect fibre-optic interface cable to SFP
16 Farad 2300 Front Bezel
17 Farad 2300 Without Bezel
18 Disk Drive
19 NEXIO Farad 2300 Controller Modules
20 PSUs
21 Controller enclosure exploded view
22 Controller enclosure assembly
23 Controller enclosure exploded view
24 Partial controller enclosure assembly showing bezel alignment (2U12)
25 Controller enclosure assembly with bezel installed
26 Enclosure architecture - internal components sub-assembly
Tables
1 Related Documentation
2 Document conventions
3 Problems using RAIDar to access a storage system
4 Errors Associated with Scheduling Tasks
5 Diagnostics LED status: Front panel “Fault/Service Required”
6 Diagnostics LED status: Rear panel “FRU OK”
7 Diagnostics LED status: Rear panel “Fault/Service Required”
8 Diagnostic LED status: Disk drive module
9 Diagnostics LED status: Disk drive “Fault” LED (LFF and SFF modules)
10 Diagnostics LED status: Rear panel “Host Link Status”
11 Diagnostics LED status: Rear panel “Expansion Port Status”
12 Diagnostics LED status: Rear panel “Network Port Link Status”
13 Diagnostics LED status: Rear panel power supply “Input Power Source”
14 Diagnostics LED status: Rear panel power supply “Voltage/Fan Fault/Service Required”
15 Power supply faults and recommended actions
16 PSU LED descriptions
17 Removing and replacing a controller enclosure chassis and its FRUs
18 Power supply sensor descriptions
19 Cooling fan sensor descriptions
20 Controller module temperature sensor descriptions
21 Power supply temperature sensor descriptions
22 Voltage sensor descriptions
23 Disk error conditions and recommended actions
24 Power supply faults and recommended actions
25 Events and corresponding SMI-S indications
26 LEDs: Enclosure Front Panel
27 LEDs - Disk drive
28 Disk Drive LEDs
29 LEDs: Vdisk
30 LEDs: NEXIO Farad Controller modules - rear panel
31 LEDs: PSUs - rear panel
32 NEXIO Farad 2300 Series product components
About this guide
This guide describes how to maintain and troubleshoot NEXIO Farad 2300 Series products.
Intended audience
This guide is intended for storage system administrators and service personnel.
Prerequisites
Prerequisites for using this product include knowledge of:
• Network administration
• Storage system configuration
• SAN management and DAS
• FC, SAS, and Ethernet protocols
• RAID technology
Before you begin to follow procedures in this guide, you must have already installed enclosures and learned of any
late-breaking information related to system operation, as described in the NEXIO Farad 2300 Setup Guide and
Release Notes.
Related documentation
Table 1 Related Documentation
For information about                                               See
Enhancements, known issues, and late-breaking information           Release Notes
not included in product documentation
Overview of product shipkit contents and setup tasks                NEXIO Farad Quick Start Guide
Regulatory compliance and safety and disposal information           AssuredSan_PRCS_83-00005172-10-01.PDF
Using a rackmount bracket kit to install an enclosure into a rack   DH_RBKI_83-00004524-13-01-B.PDF
Product hardware setup and related troubleshooting                  NEXIO Farad 2300 Series Setup Guide
Using the web interface to configure and manage the product         NEXIO Farad 2300 Series RAIDar User Guide
Using the CLI to configure and manage the product                   NEXIO Farad 2300 Series CLI Reference Guide
Document conventions and symbols
Table 2 Document conventions
Convention               Element
Blue text                Cross-reference links and e-mail addresses
Blue, underlined text    Web site addresses
Bold font                • Key names
                         • Text typed into a GUI element, such as into a box
                         • GUI elements that are clicked or selected, such as menu and list items,
                           buttons, and check boxes
Italics font             Text emphasis
Monospace font           • File and directory names
                         • System output
                         • Code
                         • Text typed at the command line
Monospace, italic font   • Code variables
                         • Command-line variables
Monospace, bold font     Emphasis of file and directory names, system output, code, and text typed at
                         the command line
CAUTION: Indicates that failure to follow directions could result in damage to equipment or data.
IMPORTANT: Provides clarifying information or specific instructions.
NOTE: Provides additional information.
TIP: Provides helpful hints and shortcuts.
1 Fault isolation methodology
NEXIO Farad 2300 Series storage systems provide many ways to isolate faults. This section presents the basic
methodology used to locate faults within a storage system, and to identify the pertinent FRUs affected.
Use RAIDar to configure and provision the system upon completing the hardware installation. As part of this
process, configure and enable event notification so the system will notify you when a problem occurs that is at or
above the configured severity. With event notification configured and enabled, you can follow the recommended
actions in the notification message to resolve the problem, as further discussed in the options presented below.
Basic steps
The basic fault isolation steps are listed below:
• Gather fault information, including using system LEDs (see Gather fault information)
• Determine where in the system the fault is occurring (see Determine where the fault is occurring)
• Review event logs (see Review the event logs)
• If required, isolate the fault to a data path component or configuration (see Isolate the fault)
Options available for performing basic steps
When performing fault isolation and troubleshooting steps, select the option or options that best suit your site
environment. Four options are described below, and they are not mutually exclusive. For example, you can use
RAIDar to check the health icons and values for the system and its components, either to confirm that everything
is operating normally or to examine a problem component. If you discover a problem, both RAIDar and the CLI
provide recommended-action text online. The options are listed in order of frequency of use:
• Use RAIDar
• Use the CLI
• Monitor event notification
• View the enclosure LEDs
Use RAIDar
RAIDar enables you to monitor the health of the system and its components, using health icons to show OK,
Degraded, Fault, or Unknown status. If any component has a problem, the system health will be Degraded, Fault,
or Unknown. Use RAIDar to find each component that has a problem, and follow the actions in the component's
Health Recommendation field to resolve the problem.
Use the CLI
As an alternative to using RAIDar, you can run the show system command in the CLI to view the health of the
system and its components. If any component has a problem, the system health will be Degraded, Fault, or
Unknown, and those components will be listed as Unhealthy Components. Follow the recommended actions in the
component Health Recommendation field to resolve the problem.
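The idea of listing unhealthy components can be sketched in a few lines of Python. This is an illustration only: it does not call the CLI or parse its output, the `unhealthy_components` helper and the component names are hypothetical, and it handles only the four health values named above (the N/A value shown for some components is not covered).

```python
# Health values named in this guide for the system and its components.
HEALTH_VALUES = ("OK", "Degraded", "Fault", "Unknown")


def unhealthy_components(component_health):
    """Return (name, health) pairs for every component whose health is not OK.

    component_health is a hypothetical mapping of component name to health
    value, filled in by hand or by your own tooling.
    """
    unhealthy = []
    for name, health in component_health.items():
        if health not in HEALTH_VALUES:
            raise ValueError(f"unexpected health value for {name}: {health!r}")
        if health != "OK":
            unhealthy.append((name, health))
    return unhealthy
```

Any component returned by this helper is one whose Health Recommendation field should be reviewed.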
Monitor event notification
With event notification configured and enabled, you can view event logs to monitor the health of the system and
its components. If a message tells you to check whether an event has been logged, or to view information about an
event in the log, you can do so using either RAIDar or the CLI. Using RAIDar, view the event log and then click
the event message to see details about that event. Using the CLI, run the show events detail command (with
additional parameters to filter the output) to see the details for an event. Events are listed in reverse
chronological order, with the most recent messages at the top of the list. RAIDar displays only the most recent
100 events.
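The listing behavior described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the `Event` record, its fields, and the `recent_events` helper are hypothetical names, and only the reverse chronological ordering and the 100-event display limit come from this guide.

```python
from dataclasses import dataclass
from datetime import datetime


# Hypothetical event record for illustration; real log entries carry more fields.
@dataclass(frozen=True)
class Event:
    timestamp: datetime
    code: int
    message: str


DISPLAY_LIMIT = 100  # RAIDar displays only the most recent 100 events


def recent_events(events, limit=DISPLAY_LIMIT):
    """Return events newest first, truncated to the display limit."""
    ordered = sorted(events, key=lambda event: event.timestamp, reverse=True)
    return ordered[:limit]
```

If an event you are looking for is older than the 100 most recent entries, it will not appear in a list built this way, which is one reason to filter the output when using the CLI.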
View the enclosure LEDs
You can view the LEDs on the hardware (while referring to System LEDs for your enclosure model) to identify
component status. If a problem prevents access to either RAIDar or the CLI, this is the only option available.
However, monitoring/management is often done at a management console using storage management interfaces
rather than relying on line-of-sight to LEDs of racked hardware components.
Performing basic steps
You can use any of the available options described above in performing the basic steps comprising the fault
isolation methodology.
Gather fault information
When a fault occurs, it is important to gather as much information as possible. Doing so will help you determine
the correct action needed to remedy the fault.
Begin by reviewing the reported fault:
• Is the fault related to an internal data path or an external data path?
• Is the fault related to a hardware component such as a disk drive module, controller module, or PSU?
By isolating the fault to one of the components within the storage system, you will be able to determine the
necessary corrective action more quickly.
Determine where the fault is occurring
Once you understand the reported fault, review the enclosure LEDs. The enclosure LEDs are designed to alert
users to system faults immediately, and might be what drew attention to the fault in the first place.
When a fault occurs, the Fault ID Status LED on an enclosure’s right ear illuminates (see the diagram pertaining to
your product’s front panel components in System LEDs). Check the LEDs on the back of the enclosure to narrow
the fault to a FRU, connection, or both. The LEDs also help you identify the location of a FRU reporting a fault.
Use RAIDar to verify any faults found while viewing the LEDs. RAIDar is also useful in determining where the
fault is occurring if the LEDs cannot be viewed due to the location of the system. RAIDar provides you with a
visual representation of the system and where the fault is occurring. It can also provide more detailed information
about FRUs, data, and faults.
Review the event logs
The event logs record all system events. Each event has a numeric code that identifies the type of event that
occurred, and has one of the following severities:
• Critical. A failure occurred that may cause a controller to shut down. Correct the problem immediately.
• Error. A failure occurred that may affect data integrity or system stability. Correct the problem as soon as
  possible.
• Warning. A problem occurred that may affect system stability, but not data integrity. Evaluate the problem and
  correct it if necessary.
• Informational. A configuration or state change occurred, or a problem occurred that the system corrected. No
  immediate action is required.
See Event descriptions for information about specific events.
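The four severities, together with the rule that notification applies to events at or above the configured severity, can be modeled as an ordered enumeration. The sketch below is illustrative only: the numeric values and the `should_notify` helper are assumptions made for this example, not values used by the storage system.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Event severities from this guide; the numeric ranks are assumed."""
    INFORMATIONAL = 0
    WARNING = 1
    ERROR = 2
    CRITICAL = 3


def should_notify(event_severity, configured_minimum):
    """Return True when an event is at or above the configured severity."""
    return event_severity >= configured_minimum
```

With the minimum set to `Severity.ERROR`, for example, Error and Critical events qualify for notification while Warning and Informational events do not.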
It is very important to review the logs, not only to identify the fault, but also to search for events that might have
caused the fault to occur. For example, a host could lose connectivity to a vdisk if a user changes channel settings
without taking the storage resources assigned to it into consideration. In addition, the type of fault can help you
isolate the problem to either hardware or software.
Isolate the fault
Occasionally, it might become necessary to isolate a fault. This is particularly true with data paths, due to the
number of components comprising the data path. For example, if a host-side data error occurs, it could be caused
by any of the components in the data path: controller module, cable, or data host.
2 Troubleshooting using RAIDar
Problems using RAIDar to access a storage system
The following table lists problems you might encounter when using RAIDar to access a storage system.
Table 3 Problems using RAIDar to access a storage system
Problem: You cannot access RAIDar in a web browser.
Solution:
• Verify that you entered the correct IP address for the controller’s network port, and do not include a leading
  zero in the address (for example, enter 10.1.4.33 not 10.1.4.033).
• If the system has two controllers, enter the IP address of the partner controller’s network port.
• Ask the administrator whether the system is configured to allow WBI access via HTTP or HTTPS. Depending on
  the answer, enter either http://ip-address/index.html or https://ip-address/index.html.
Problem: You cannot sign in to RAIDar.
Solution:
• Verify that you are entering the correct user name and password.
• Ask the system administrator to verify that WBI access is enabled for the user account.
Problem: You cannot navigate beyond RAIDar’s Sign In page.
Solution:
• Set the browser’s local-intranet security option to medium or medium-low. For Internet Explorer 8, adding each
  controller’s network IP address as a trusted site can avoid access issues.
• Verify that the browser is set to allow cookies at least for the IP addresses of the storage-system network
  ports.
Problem: RAIDar pages do not display properly.
Solution:
• Use a color monitor and set its color quality to the highest setting.
• Prevent RAIDar pages from being cached by disabling web page caching in your browser.
• Check whether another user has changed the system configuration or user-account preferences.
Problem: Menu options are not available.
Solution:
• The options are not relevant to the system’s current state. For example, if no volumes have been created,
  options to map and unmap volumes are not available.
• The user account has a monitor (view only) role and therefore cannot access panels where system settings can
  be changed.
Problem: You cannot access online help.
Solution:
• Ensure pop-up windows are enabled.
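The address checks in the first row of the table can be automated before a URL is entered in the browser. The following Python sketch is a convenience under stated assumptions: the helper names `normalize_ipv4` and `raidar_url` are hypothetical, and only the leading-zero rule and the http://ip-address/index.html and https://ip-address/index.html forms come from this guide.

```python
def normalize_ipv4(address):
    """Strip leading zeros from each octet, for example 10.1.4.033 -> 10.1.4.33."""
    octets = address.strip().split(".")
    if len(octets) != 4:
        raise ValueError(f"expected four octets: {address!r}")
    cleaned = []
    for octet in octets:
        if not (octet.isascii() and octet.isdigit()):
            raise ValueError(f"octet is not a decimal number: {octet!r}")
        value = int(octet)  # int() discards any leading zeros
        if value > 255:
            raise ValueError(f"octet out of range: {octet!r}")
        cleaned.append(str(value))
    return ".".join(cleaned)


def raidar_url(address, use_https=False):
    """Build the RAIDar address for a controller network port."""
    scheme = "https" if use_https else "http"
    return f"{scheme}://{normalize_ipv4(address)}/index.html"
```

Whether to pass `use_https=True` depends on how the administrator has configured WBI access.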
Determining storage-system status
The storage system can have the following health values:
OK. The system is operating normally.
Degraded. At least one component is degraded.
Fault. At least one component has a fault.
Unknown. Health status is not available.
If the system or a physical subcomponent has a fault or is degraded, the Degraded or Fault icon is displayed to the
left of that component in the Configuration View panel. If you see either of these icons, select the component and
in its overview panel look for the Health Reason value. This value gives a short explanation of the health problem.
Viewing information about all vdisks
In the Configuration View panel, right-click Vdisks and select View > Overview. The Vdisks Overview table
shows the overall health of existing vdisks. For each vdisk, the Vdisks Overview table shows the following health
fields; other fields that display are described in RAIDar documentation.
• Health.
  OK
  Degraded
  Fault
  Unknown
A second table shows the following status fields; other fields that display are described in RAIDar documentation.
• Status.
  • CRIT: Critical. The vdisk is online but is not fault tolerant because some of its disks are down.
  • FTDN: Fault tolerant with down disks. The vdisk is online and fault tolerant, but some of its disks are
    down.
  • FTOL: Fault tolerant and online.
  • OFFL: Offline. Either the vdisk is using offline initialization, or its disks are down and data may be lost.
  • QTCR: Quarantined critical. The vdisk is offline and quarantined because at least one disk is missing;
    however, the vdisk could be accessed. For instance, one disk is missing from a mirror or RAID-5.
  • QTDN: Quarantined with down disks. The vdisk is offline and quarantined because at least one disk is
    missing; however, the vdisk could be accessed and would be fault tolerant. For instance, one disk is
    missing from a RAID-6.
  • QTOF: Quarantined offline. The vdisk is offline and quarantined because multiple disks are missing and
    user data is incomplete.
  • STOP: The vdisk is stopped.
  • UNKN: Unknown.
  • UP: Up. The vdisk is online and does not have fault-tolerant attributes.
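When vdisk status is recorded in scripts or reports, the status codes above can be kept in a small lookup table. The Python sketch below is illustrative: the helper names are hypothetical, the descriptions are shortened from the list above, and the online and quarantined groupings simply restate what each description says.

```python
# Short descriptions of the vdisk status codes listed in this guide.
VDISK_STATUS = {
    "CRIT": "Critical",
    "FTDN": "Fault tolerant with down disks",
    "FTOL": "Fault tolerant and online",
    "OFFL": "Offline",
    "QTCR": "Quarantined critical",
    "QTDN": "Quarantined with down disks",
    "QTOF": "Quarantined offline",
    "STOP": "Stopped",
    "UNKN": "Unknown",
    "UP": "Up",
}

# Codes whose descriptions state that the vdisk is online.
ONLINE_CODES = frozenset({"CRIT", "FTDN", "FTOL", "UP"})

# Codes whose descriptions state that the vdisk is quarantined.
QUARANTINED_CODES = frozenset({"QTCR", "QTDN", "QTOF"})


def describe(code):
    """Return the short description for a status code."""
    return VDISK_STATUS.get(code, f"Unrecognized status code: {code}")


def is_online(code):
    return code in ONLINE_CODES


def is_quarantined(code):
    return code in QUARANTINED_CODES
```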
Viewing information about a vdisk
In the Configuration View panel, right-click a vdisk and select View > Overview. The Vdisk Overview table
shows the health of the selected vdisk and the disks in that vdisk.
Vdisk properties
When you select the vdisk component, the Properties for Vdisk table shows the following health and status fields;
other fields that display are described in RAIDar documentation.
• Health.
  OK
  Degraded
  Fault
  Unknown
• Health Reason. If a vdisk’s health is not OK, this entry lists the reasons for the health state. If a vdisk’s health
  is OK, no information is displayed.
• Health Recommendation. If a vdisk’s health is not OK, this entry lists recommendations for correcting the
  health. If a vdisk’s health is OK, no information is displayed.
• Status.
  • CRIT: Critical. The vdisk is online but is not fault tolerant because some of its disks are down.
  • FTDN: Fault tolerant with down disks. The vdisk is online and fault tolerant, but some of its disks are
    down.
  • FTOL: Fault tolerant and online.
  • OFFL: Offline. Either the vdisk is using offline initialization, or its disks are down and data may be lost.
  • QTCR: Quarantined critical. The vdisk is offline and quarantined because at least one disk is missing;
    however, the vdisk could be accessed. For instance, one disk is missing from a mirror or RAID-5.
  • QTDN: Quarantined with down disks. The vdisk is offline and quarantined because at least one disk is
    missing; however, the vdisk could be accessed and would be fault tolerant. For instance, one disk is
    missing from a RAID-6.
  • QTOF: Quarantined offline. The vdisk is offline and quarantined because multiple disks are missing and
    user data is incomplete.
  • STOP: The vdisk is stopped.
  • UNKN: Unknown.
  • UP: Up. The vdisk is online and does not have fault-tolerant attributes.
• Current Job.
  • Disk Scrub: Not supported.
  • Expand: Not supported.
  • Initialize: Not supported.
  • Reconstruct: The vdisk is being reconstructed.
  • Verify: Not supported.
  • Media Scrub: Not supported.
A second table displays information about unhealthy components. If all components are healthy, this table
displays the text, “There is no data for your selection”.
Disk properties
When you select the Disks component, a Disk Sets table and enclosure view appear. The enclosure view table has
two tabs. The Tabular tab shows the following health and status fields; other fields that display are described in
RAIDar documentation.
• Health. Shows whether the disk is healthy or has a problem.
  OK
  Degraded
  Fault
  N/A
  Unknown
  If the disk’s health is not OK, select it in the Configuration View panel to view details about it.
• State. Shows how the disk is used:
  • If the disk is in a vdisk, its RAID level
  • AVAIL: Available
  • FAILED: The disk is unusable and must be replaced. Reasons for this status include: excessive media
    errors; SMART error; disk hardware failure; unsupported disk.
  • SPARE: Spare assigned to a vdisk
  • GLOBAL SP: Global spare
  • LEFTOVR: Leftover
  Also shows any job running on the disk:
  • DRSC: Not supported.
  • EXPD: Not supported.
  • INIT: The vdisk is being initialized.
  • RCON: Not supported.
  • VRFY: Not supported.
  • VRSC: Not supported.
• Status. Up (operational) or Not Present.
The Graphical tab shows the locations of the vdisk's disks in system enclosures and each disk’s health and state.
Viewing information about an enclosure
In the Configuration View panel, right-click an enclosure and select View > Overview. You can view information
about the enclosure and its components in a front or rear graphical view, or in a front or rear tabular view.
• Front Graphical. Shows a graphical view of the front of each enclosure and its disks.
• Front Tabular. Shows a tabular view of each enclosure and its disks.
• Rear Graphical. Shows a graphical view of components at the rear of the enclosure.
• Rear Tabular. Shows a tabular view of components at the rear of the enclosure.
In any of these views, select a component to see more information about it. Components vary by enclosure model.
If any components are unhealthy, a table at the bottom of the panel identifies them. When a disk is selected, you
can view properties or historical performance statistics.
Enclosure properties
When you select an enclosure, a table shows the following health fields; other fields that display are described in
RAIDar documentation.
• Health.
  OK
  Degraded
  Fault
  Unknown
• Health Reason. If Health is not OK, this field shows the reason for the health state.
• Health Recommendation. If Health is not OK, this field shows recommended actions to take to resolve the
  health issue.
Disk properties
When you select a disk, a table shows the following health and status fields; other fields that display are described
in RAIDar documentation.
• Health.
  OK. The disk is operating normally.
  Degraded
  Fault
  N/A
  Unknown
• Health Reason. If Health is not OK, this field shows the reason for the health state.
• Health Recommendation. If Health is not OK, this field shows recommended actions to take to resolve the
  health issue.
• Status.
  • Up: The disk is present and is properly communicating with the expander.
  • Spun Down: Not supported.
  • Warning: The disk is present but the system is having communication problems with the disk LED
    processor. For disk and midplane types where this processor also controls power to the disk, power-on
    failure will result in Error status.
  • Error: The disk is present but is not detected by the expander.
  • Unknown: Initial status when the disk is first detected or powered on.
  • Not Present: The disk slot indicates that no disk is present.
• How Used.
  Two values are listed together: the first is How Used and the second is Current Job. For example, a disk used
  in a vdisk (VDISK) that is being scrubbed (VRSC) is listed as VDISKVRSC.
  • How used:
    • AVAIL: Available.
    • FAILED: The disk is unusable and must be replaced. Reasons for this status include: excessive media
      errors; SMART error; disk hardware failure; unsupported disk.
    • GLOBAL SP: Global spare.
    • LEFTOVR: Leftover.
    • VDISK: Used in a vdisk.
    • VDISK SP: Spare assigned to a vdisk.
  • Current Job.
    • DRSC: Not supported.
    • EXPD: Not supported.
    • INIT: The vdisk is being initialized.
    • RCON: The vdisk is being reconstructed.
    • VRFY: Not supported.
    • VRSC: Not supported.
• Transfer Rate. The data transfer rate in Gbit/s.
NOTE: Some 6-Gbit/s disks might not consistently support a 6-Gbit/s transfer rate. If this happens, the
controller automatically adjusts transfers to those disks to 3 Gbit/s, increasing reliability and reducing error
messages with little impact on system performance. This rate adjustment persists until the controller is
restarted or power-cycled.
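Because the How Used field lists two values together (for example, VDISKVRSC), a script that reads the field needs to separate them. The Python sketch below shows one way to do this. It is an assumption-laden illustration: the `split_how_used` helper is hypothetical, and the only combined form confirmed by this guide is the VDISKVRSC example, so the handling of values that contain a space is a guess.

```python
# How Used and Current Job codes listed in this guide.
HOW_USED = ("AVAIL", "FAILED", "GLOBAL SP", "LEFTOVR", "VDISK", "VDISK SP")
JOBS = ("DRSC", "EXPD", "INIT", "RCON", "VRFY", "VRSC")


def split_how_used(value):
    """Split a combined value such as 'VDISKVRSC' into (how_used, job).

    The job is None when no Current Job code follows the How Used code.
    """
    text = value.strip()
    # Try the longest codes first so that 'VDISK SP' is matched before 'VDISK'.
    for usage in sorted(HOW_USED, key=len, reverse=True):
        if text.startswith(usage):
            job = text[len(usage):].strip()
            if job == "":
                return usage, None
            if job in JOBS:
                return usage, job
    raise ValueError(f"unrecognized How Used value: {value!r}")
```

For the example given above, `split_how_used("VDISKVRSC")` returns `("VDISK", "VRSC")`.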
Power supply properties
When you select a power supply, a table shows the following health and status fields; other fields that display are
described in RAIDar documentation.
• Health.
  OK
  Degraded
  Fault
  Unknown
• Health Reason. If Health is not OK, this field shows the reason for the health state.
• Health Recommendation. If Health is not OK, this field shows recommended actions to take to resolve the
  health issue.
• Status.
Controller module properties
When you select a controller module, a table shows the following health and status fields; other fields that display
are described in RAIDar documentation.
•
Health.
OK
Fault
Unknown
•
Health Reason. If Health is not OK, this field shows the reason for the health state.
•
Health Recommendation. If Health is not OK, this field shows recommended actions to take to resolve the
health issue.
•
Status.
Controller module: network port properties
When you select a network port, a table shows the following health fields; other fields that display are described in
RAIDar documentation.
•
Health.
OK. The port is operating normally.
Degraded. The port’s operation is degraded.
•
Health Reason. If Health is not OK, this field shows the reason for the health state.
Controller module: host port properties
When you select a host port, a table shows the following health and status fields; other fields that display are
described in RAIDar documentation.
•
Health.
OK
Degraded
Fault
N/A
•
Health Reason. If Health is not OK, this field shows the reason for the health state.
•
Status.
Controller module: expansion port properties
When you select an expansion (Out) port, a table shows the following health and status fields; other fields that
display are described in RAIDar documentation.
•
Health.
OK
Degraded
Fault
N/A
Unknown
•
Health Reason. If Health is not OK, this field shows the reason for the health state.
•
Health Recommendation. If Health is not OK, this field shows recommended actions to take to resolve the
health issue.
•
Status.
Controller module: CompactFlash properties
When you select a CompactFlash card in the Rear Tabular view, a table shows the following health and status fields;
other fields that display are described in RAIDar documentation.
•
Health.
OK
Fault
•
Health Reason. If Health is not OK, this field shows the reason for the health state.
•
Health Recommendation. If Health is not OK, this field shows recommended actions to take to resolve the
health issue.
•
Status.
Drive enclosure: I/O Module properties
When you select an IOM, a table shows the following health and status fields; other fields that display are
described in RAIDar documentation.
•
Health.
OK
Degraded
Fault
Unknown
•
Health Reason. If Health is not OK, this field shows the reason for the health state.
•
Status.
I/O Module: In port properties
When you select an In port, a table shows the following health and status fields; other fields that display are
described in RAIDar documentation.
•
Health.
OK
Degraded
Fault
N/A
Unknown
•
Health Reason. If Health is not OK, this field shows the reason for the health state.
•
Health Recommendation. If Health is not OK, this field shows recommended actions to take to resolve the
health issue.
•
Status.
I/O Module: Out port properties
When you select an Out port, a table shows the following health and status fields; other fields that display are
described in RAIDar documentation.
•
Health.
OK
Degraded
Fault
N/A
Unknown
•
Health Reason. If Health is not OK, this field shows the reason for the health state.
•
Health Recommendation. If Health is not OK, this field shows recommended actions to take to resolve the
health issue.
•
Status.
Viewing the system event log
NOTE: If you are having a problem with the system or a vdisk, review events as described below before calling
technical support. Event information might enable you to resolve the problem.
In the Configuration View panel, right-click the system and select View > Event Log. The System Events panel
shows the 100 most recent events that have been logged by either controller. All events are logged, regardless of
event-notification settings. Click the buttons above the table to view all events, or only critical, warning, or
informational events.
The event log table shows the following information:
•
Severity.
Critical. A failure occurred that may cause a controller to shut down. Correct the problem immediately.
Error. A failure occurred that may affect data integrity or system stability. Correct the problem as soon
as possible.
Warning. A problem occurred that may affect system stability but not data integrity. Evaluate the
problem and correct it if necessary.
Informational. A configuration or state change occurred, or a problem occurred that the system
corrected. No action is required.
•
Time. Date and time when the event occurred, shown as year-month-day hour:minutes:seconds. Time stamps
have one-second granularity.
•
Event ID. An identifier for the event. The prefix A or B identifies the controller that logged the event.
•
Code. An event code that helps you and support personnel diagnose problems. For event-code descriptions
and recommended actions, see Event descriptions.
•
Message. Brief information about the event. Click the message to show or hide additional information and
recommended actions.
When reviewing events, do the following:
1. For any critical, error, or warning events, click the message to view additional information and recommended
actions. This information also appears in the Event descriptions.
Identify the primary events and any that might be the cause of the primary event. For example, an
over-temperature event could cause a disk failure.
2. View the event log and locate other critical/error/warning events in the sequence for the controller that
reported the event.
Repeat this step for the other controller if necessary.
3. Review the events that occurred before and after the primary event.
During this review you are looking for any events that might indicate the cause of the critical/error/warning
event. You are also looking for events that resulted from the critical/error/warning event, known as secondary
events.
4. Review the events following the primary and secondary events.
You are looking for any actions that might have already been taken to resolve the problems reported by the
events.
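The review steps above can be sketched in Python for an event log that has been exported and parsed. This is a minimal illustration only: the dictionary keys (id, severity, time), the sample event IDs used with it, and the 30-minute window are assumptions and do not come from RAIDar. The sketch relies on two facts stated in this section: the Event ID prefix (A or B) identifies the controller, and time stamps use the form year-month-day hour:minutes:seconds.

```python
from datetime import datetime, timedelta

ACTIONABLE = {"critical", "error", "warning"}


def parse_time(stamp):
    # Time format described in this guide: year-month-day hour:minutes:seconds.
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")


def primary_candidates(events):
    # Step 1: only critical, error, and warning events need follow-up.
    return [e for e in events if e["severity"].lower() in ACTIONABLE]


def related_events(events, primary, window_minutes=30):
    """Return events from the same controller logged near the primary event.

    `events` is a list of dicts with the hypothetical keys 'id',
    'severity', and 'time'. The first character of 'id' names the controller.
    """
    controller = primary["id"][0].upper()
    center = parse_time(primary["time"])
    window = timedelta(minutes=window_minutes)
    nearby = [
        e for e in events
        if e is not primary
        and e["id"][0].upper() == controller
        and abs(parse_time(e["time"]) - center) <= window
    ]
    # Chronological order makes cause and effect easier to follow.
    return sorted(nearby, key=lambda e: parse_time(e["time"]))
```

Events returned before the primary event are candidates for the cause; events returned after it are candidates for secondary events or corrective actions.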
Clearing disk metadata (not supported)
CAUTION:
•
Executing the Clear Disk Metadata function will result in a temporary interruption of 10 to 15 seconds to
activity on the stack. The Clear disk metadata function is not supported.
•
Only use this command when all vdisks are online and leftover disks exist. Improper use of this command may
result in data loss.
•
Do not use this command when a vdisk is offline and one or more leftover disks exist.
Each disk in a vdisk has metadata that identifies the owning vdisk, the other members of the vdisk, and the last
time data was written to the vdisk. The following situations cause a disk to become a leftover:
•
Vdisk members’ timestamps do not match so the system designates members having an older timestamp as
leftovers.
•
A drive that had previously failed or been used on another system would become a leftover. This could also
happen if a drive was failed via RAIDar or the CLI. Reusing a previously failed drive is not supported.
•
A disk is not detected during a rescan, then is subsequently detected.
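The timestamp rule in the first situation can be illustrated with a short Python sketch. It assumes the last-written timestamps have already been read from each member's metadata into a mapping of disk ID to a comparable value; both the mapping and the integer timestamps are hypothetical and only serve to show the comparison.

```python
def find_leftovers(members):
    """Return IDs of vdisk members whose timestamp is older than the newest.

    `members` maps a disk ID to the last-written timestamp stored in its
    metadata (an assumption made for illustration).
    """
    if not members:
        return []
    newest = max(members.values())
    return sorted(disk for disk, stamp in members.items() if stamp < newest)
```

With members 1.1, 1.2, and 1.3 carrying timestamps 100, 120, and 120, only disk 1.1 would be designated a leftover.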
When a disk becomes a leftover, the following changes occur:
•
The disk’s health becomes Degraded and its How Used state becomes LEFTOVR.
•
The disk is automatically excluded from the vdisk, causing the vdisk’s health to become Degraded or Fault,
depending on the RAID level.
•
The disk’s Fault LED is illuminated amber.
If spares are available, and the health of the vdisk is Degraded, the vdisk will use them to start reconstruction.
When reconstruction is complete, you can replace the failed drive with a new drive and configure it as a global
spare.
If spares are not available to begin reconstruction, replace the failed drive with a new drive and configure it as a
global spare. Once this is done, reconstruction will begin.
This command clears metadata from leftover disks only. If you specify disks that are not leftovers, the disks are
not changed.
To clear metadata from leftover disks
(This function is not supported.)
Isolating faulty disk drives
When a drive fault occurs, basic troubleshooting actions are:
•
Identify the faulty drive
•
Review the drive error statistics
•
Review the event log
•
Replace the faulty drive
•
Reconstruct the associated vdisk
Identifying a faulty disk drive
The identification of a faulty disk drive involves confirming the drive fault and identifying the physical location of
the drive.
To confirm a drive fault, use the basic troubleshooting steps in Determining storage-system status on page 15. You
can also right-click on the System and select View > Event Log. Look for any notifications pertaining to a disk
drive fault.
When you have confirmed a drive fault, record the drive’s enclosure number and slot number.
To identify the physical location of a faulty drive:
1. Select Physical in the Configuration View.
2. Select the Enclosure indicated in the drive fault error.
3. Click the Front Graphical tab in the Enclosure Overview.
This displays a graphical view of the enclosure. The faulty disk drive is indicated with the appropriate health
icon as described in Viewing information about all vdisks on page 16.
A disk having the status FAILED cannot be reused in the storage system and must be replaced.
For more information about viewing disk information, see RAIDar documentation.
Reviewing the event logs
If all the steps in Identifying a faulty disk drive on page 23 have been performed, you have determined the
following:
•
A disk drive has encountered a fault
•
The location of the disk drive
•
What fault occurred
The next step is to review the event logs to determine if there were any events that led to the fault. If you skip this
step, you could replace the faulty drive and then encounter another fault.
To view the event logs from any page, right-click the system and select View > Event Log. See Viewing the
system event log on page 21 for more information about the Event Log.
Reconstructing a vdisk
Vdisk reconstruction does not require I/O to be stopped, so the vdisk can continue to be used while the
Reconstruct utility runs. Vdisk reconstruction starts automatically if:
•
One or more disks fail in a fault-tolerant vdisk (RAID 1, 3, 5, 6, 10, or 50),
•
The vdisk is still operational, and
•
Compatible spares are available.
The storage system automatically uses the spares to reconstruct the vdisk. A compatible spare is one whose
capacity is equal to or greater than the smallest disk in the vdisk. A compatible spare has enough capacity to
replace the failed disk and is the same type (linear SAS). If no compatible spares are available, reconstruction does
not start automatically. To start reconstruction manually, replace each failed disk and then add each new disk as a
global spare.
Remember that a global spare might be taken by a different critical vdisk than the one you intended. When a
global spare replaces a disk in a vdisk, the global spare’s icon in the enclosure view changes to match the other
disks in that vdisk.
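The compatible-spare rule described above reduces to two checks: the spare is the same disk type, and its capacity is at least that of the smallest disk in the vdisk. A minimal Python sketch follows; the function name, parameter names, and the type strings passed to it are assumptions for illustration.

```python
def is_compatible_spare(spare_capacity, spare_type, vdisk_capacities, vdisk_type):
    """Apply the compatibility rule stated in this section.

    Capacities may use any unit, provided it is the same for all disks.
    `vdisk_capacities` must contain at least one value.
    """
    return (spare_type == vdisk_type
            and spare_capacity >= min(vdisk_capacities))
```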
RAID 6 reconstruction behaves as follows:
•
During online initialization, if one disk fails, initialization continues and the resulting vdisk will be degraded
(FTDN status). After initialization completes, the system can use a compatible spare to reconstruct the vdisk.
•
During online initialization, if two disks fail, initialization stops (CRIT status). The system can use two
compatible spares to reconstruct the vdisk.
•
During vdisk operation, if one disk fails and a compatible spare is available, the system begins to use that
spare to reconstruct the vdisk. If a second disk fails during reconstruction, reconstruction continues until it is
complete, regardless of whether a second spare is available. If the spare fails during reconstruction,
reconstruction stops.
•
During vdisk operation, if two disks fail and only one compatible spare is available, the system waits five
minutes for a second spare to become available. After five minutes, the system begins to use that spare to
reconstruct one disk in the vdisk (referred to as “fail 2, fix 1” mode). If the spare fails during reconstruction,
reconstruction stops.
•
During vdisk operation, if two disks fail and two compatible spares are available, the system uses both spares
to reconstruct the vdisk. If one of the spares fails during reconstruction, reconstruction proceeds in “fail 2, fix
1” mode. If the second spare fails during reconstruction, reconstruction stops.
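The three "during vdisk operation" cases above can be summarized as a small decision function. The Python sketch below only restates the documented behavior; it does not model initialization, spare failures during reconstruction, or any timing, and the returned strings are illustrative labels rather than system messages.

```python
def raid6_response(failed_disks, compatible_spares):
    """Summarize the described RAID 6 behavior during vdisk operation."""
    if failed_disks not in (1, 2):
        raise ValueError("this sketch covers one or two failed disks only")
    if compatible_spares < 0:
        raise ValueError("compatible_spares cannot be negative")
    if compatible_spares == 0:
        # Without a compatible spare, reconstruction does not start automatically.
        return "no automatic reconstruction"
    if failed_disks == 1:
        return "reconstruct with one spare"
    if compatible_spares == 1:
        return "wait five minutes, then fail 2, fix 1"
    return "reconstruct with both spares"
```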
When a disk fails, its Fault LED illuminates amber. When a spare is used as a reconstruction target, its Activity
LED blinks green.
NOTE: Reconstruction can take hours or days to complete, depending on the vdisk RAID level and size, disk
speed, utility priority, and other processes running on the storage system. You can stop reconstruction only by
deleting the vdisk.
Problems scheduling tasks (not supported)
Effect of changing the date and time
Resetting the storage system date or time might affect scheduled tasks. Because the schedule begins with the start
time, no schedules will run until the date and time are set. If the system is configured to use NTP, and if an NTP
server is available, the system time and date are obtained from the NTP server. To manually change the date or
time, see the NEXIO Farad 2300 Series RAIDar User Guide or the Farad Infrastructure Commissioning Checklist
(v4).
Errors associated with scheduling tasks
The following table describes error messages associated with scheduling tasks.
Table 4 Errors Associated with Scheduling Tasks
Error Message              Solution
Task Already Exists        Select a different name for the task.
Schedule Already Exists    Select a different name for the schedule.
Correcting enclosure IDs
When installing a system with enclosures attached, the enclosure IDs might differ from the physical cabling order.
This is because the controller might have been previously attached to some of the same enclosures and it attempts
to preserve the previous enclosure IDs if possible. To correct this condition, you can perform a rescan.
To rescan disk channels
1. Verify that both controllers are operating normally.
2. In the Configuration View panel, right-click the system and select Tools > Rescan Disk Channels.
3. Click Rescan.
Problems after power-on or restart
After powering on the storage system or restarting the MC or SC, the processors take about 45 seconds to boot up,
and the system takes an additional minute or more to become fully functional and able to process commands from
RAIDar or the CLI. The time to become fully functional depends on many factors such as the number of
enclosures, the number of disk drives, the number of vdisks, and the amount of I/O running at the time of the
restart. During this time, some RAIDar or CLI commands might fail and some RAIDar pages may not be
available. If this occurs, wait a few minutes and try again.
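If management commands are issued from a script shortly after power-on or restart, the "wait and try again" advice can be automated. The following Python sketch is one possible approach. The function name, the attempt count, and the 60-second delay are assumptions chosen for illustration, and `command` stands for any callable that raises an exception when the system is not yet ready.

```python
import time


def run_with_retry(command, attempts=5, delay_seconds=60, sleep=time.sleep):
    """Call `command` until it succeeds or the attempts are used up.

    The attempt count and delay are illustrative defaults, not values
    taken from the product documentation.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return command()
        except Exception as error:  # broad on purpose for a sketch
            last_error = error
            if attempt < attempts - 1:
                sleep(delay_seconds)  # give the controllers time to finish booting
    raise last_error
```

The `sleep` parameter is injectable so that the helper can be exercised in a test without actually waiting.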
3
Troubleshooting using the CLI
Isolating data path faults
When isolating data path faults, you must first isolate the fault to an internal data path or an external data path.
This will help to target your troubleshooting efforts.
Internal data paths include the following:
•
Controller to disk connectivity
•
Controller to controller connectivity
•
Controller ingress (incoming signals from drive enclosures)
•
Controller egress (outgoing signals to drive enclosures)
External data paths consist of the connections between the storage system and data hosts.
Changing fault isolation settings
By default, the EC in each I/O module performs fault-isolation analysis of SAS expander PHY statistics. When
one or more error counters for a specific PHY exceed the built-in thresholds, the PHY is disabled to maintain
storage system operation.
You can use these commands to help isolate PHY errors.
Resetting expander error counters
You can clear the counters and status for SAS expander lanes. Use the clear expander-status command on page 32
to reset error counters.
CAUTION:
For use by or with direction from a service technician.
Disabling or enabling a PHY
If a PHY continues to accumulate errors you can disable it. Use the set expander-phy command on page 38 to
disable or enable a specific PHY.
CAUTION:
For use by or with direction from a service technician.
Disabling or enabling PHY isolation
You can change an expander’s PHY Isolation setting to enable or disable fault monitoring and isolation for all
PHYs in that expander. While troubleshooting a storage system problem, use the set expander-fault-isolation
command on page 37 to temporarily disable fault isolation for a specific EC in a specific enclosure.
CAUTION:
For use by or with direction from a service technician.
Isolating internal data path faults
Fault isolation firmware monitors hardware PHYs for problems.
PHYs are tested and verified before shipment as part of the manufacturing and qualification process.
Subsequent problems in a PHY cause symptoms such as:
•
A host or controller continually rescans drives.
This can disrupt I/O or cause I/O errors. I/O errors can result in a failed drive, causing a vdisk to become
critical or causing complete loss of a vdisk if more than one fails.
•
Bad cables connecting enclosures, damaged controller connectors, and other physical damage.
This can cause continual errors, which the fault isolation firmware can often trace to a single problematic PHY.
The fault isolation firmware recognizes the large number and rapid rate of these errors and disables this PHY
without user intervention. This disabling, sometimes referred to as PHY fencing, eliminates the I/O errors and
enables the system to continue operation without suffering performance degradation. To avoid these problems,
problem PHYs are identified and disabled, if necessary, and status information is transmitted to the controller
so that each action can be reported in the event log. Problem PHY ID and status information is reported in
RAIDar, but disabled PHYs are only reported through event messages.
•
PHY errors when powering on an enclosure, when removing or inserting a controller, and when connecting or
disconnecting an enclosure.
An incompletely connected or disturbed cable might also generate a PHY error. These errors are usually not
significant enough to disable a PHY, so the fault isolation firmware analyzes the number of errors and the error
rate. If errors for a particular PHY increase at a slow rate, the PHY is usually not disabled. Instead the errors
are accumulated and reported.
When the firmware recognizes a large number and rapid rate of these errors, it disables the indicated PHY without
user intervention. This disabling eliminates the I/O errors and enables the system to continue operation.
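The distinction drawn above, between a large number of errors arriving rapidly and errors that accumulate slowly, can be illustrated with a short Python sketch. This is not the fault isolation firmware's algorithm. The built-in thresholds are not documented in this guide, so the threshold values and the function name below are placeholders.

```python
def should_disable_phy(error_count, interval_seconds,
                       max_errors=1000, max_rate_per_second=10.0):
    """Return True when both the error count and the error rate are high.

    The threshold defaults are placeholders; the real built-in
    thresholds are not documented in this guide.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rate = error_count / interval_seconds
    return error_count > max_errors and rate > max_rate_per_second
```

Under these placeholder thresholds, 5000 errors in 10 seconds would lead to the PHY being disabled, while the same 5000 errors spread over 100000 seconds would only be accumulated and reported.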
If a PHY becomes disabled, the event log entry helps to determine which enclosure or enclosures and which
controller (or controllers) are affected. See Troubleshooting using event logs on page 59 for more information about
viewing and interpreting logs.
To enable a disabled PHY, reset the affected controller or power cycle the enclosure. Before doing so, it may be
necessary to replace a defective cable or FRU. See Troubleshooting and replacing FRUs on page 73 for more
information about replacing defective FRUs.
Checking PHY status
PHY status can be checked from the CLI. See show expander-status on page 44 for more information about how
to check the PHY lanes.
CAUTION:
For use by or with direction from a service technician.
Resolving PHY faults
WARNING! Resetting a controller may cause temporary disruption of data delivery. Do not reset a controller
without the advice and consent of Customer Support.
1. Ensure that the cables are securely connected. If they are not, tighten the connectors.
2. Reset the affected controller or power-cycle the enclosure.
3. If the problem persists, replace the affected FRU or enclosure.
4. Periodically run the show expander-status command to see if the fault isolation firmware disables the same PHY
again. If it does:
a. Replace the appropriate cable.
b. Reset the affected controller or power-cycle the enclosure.
Isolating external data path faults
To troubleshoot external data path faults, perform the following steps:
1. Use the show host-parameters command as described on page 48 to display information about host port status.
2. To target the cause of the link failure, review the details in the output.
The details include:
•
Port information - Selected controller and port number
•
Media - Host link type
•
Target ID
•
Status - Condition of the link, Up or Disconnected.
In Fibre Channel storage systems, the details also include:
•
Topology - Port connection type
•
Speed - Both actual and configured link speed, 2 Gbit/s, 4 Gbit/s, 8 Gbit/s, or Auto
•
Primary ID
Resolving external data path faults
Review the output for host ports with a status of Disconnected. This can be caused by one or more of the following
conditions:
•
A faulty HBA in the host
•
A faulty cable
•
A disconnected cable
•
A faulty port in the host interface module
Resetting a host channel on an FC storage system
For an FC system using loop topology, you might need to reset a host port (channel) to fix a host connection or
configuration problem.
Use the reset host-link command on page 33 to reset a host port.
Command reference
This section provides information about CLI commands commonly used for debugging and includes syntax,
parameters and usage examples. This list is not inclusive of all CLI commands. Additional information about
commands and command syntax can be found in the NEXIO Farad 2300 Series CLI Reference Guide.
Command references that include a See Also section include two types of references. Commands in blue with a
page number are commands referenced in this guide. All other references, in black, can be found in the NEXIO
Farad 2300 Series CLI Reference Guide.
See Related documentation on page 11 for information about additional reference material.
Viewing help
To view brief descriptions of all commands that are available to the user level you logged in as, enter:
help
To view help for a command and then return to the command prompt, enter:
help command-name
To view information about command syntax, enter:
help syntax
To view information about command completion, editing, and history, enter:
help help
abort scrub (not supported)
Description Aborts the scrub vdisk operation for specified vdisks.
Syntax abort scrub vdisk vdisks
Parameters vdisk vdisks
Names or serial numbers of the vdisks to stop scrubbing. For vdisk syntax, see Command
Syntax in the NEXIO Farad 2300 Series CLI Reference Guide.
Example Abort scrubbing vdisk vd1:
# abort scrub vdisk vd1
Info: Scrub was aborted on vdisk vd1. (vd1)
Success: Command completed successfully. (2012-01-20 15:42:08)
See also •
scrub vdisk
•
show vdisks
clear cache
Description Clears unwritable cache data from both controllers. This data cannot be written to disk because
it is associated with a volume that no longer exists or whose disks are not online. If the data is
needed, the volume's disks must be brought online. If the data is not needed it can be cleared, in
which case it will be lost and data will differ between the host and disk. Unwritable cache is also
called orphan data.
You can clear unwritable cache data for a specified volume or for all volumes.
Syntax clear cache [volume volume]
Parameters volume volume
Optional. Name or serial number of the volume whose cache data should be cleared. For volume
syntax, see Command Syntax in the NEXIO Farad 2300 Series CLI Reference Guide. If this
parameter is omitted, the command clears any unneeded orphaned data for volumes that are no
longer online or that no longer exist.
Example Clear unwritable cache data for volume V1 from both controllers:
# clear cache volume v1
Success: Command completed successfully - If unwritable cache data
existed, it has been cleared. (2012-01-18 14:21:11)
clear disk-metadata (not supported)
Description Clears metadata from leftover disks.
CAUTION:
•
Executing the clear disk-metadata function will result in a 10 to 15 second interruption of the
activity on the stack. This function is not supported.
•
Only use this command when all vdisks are online and leftover disks exist. Improper use of
this command may result in data loss.
•
Do not use this command when a vdisk is offline and one or more leftover disks exist.
If you are uncertain whether to use this command, contact technical support for further
assistance.
Each disk in a vdisk has metadata that identifies the owning vdisk, the other members of the
vdisk, and the last time data was written to the vdisk. The following situations cause a disk to
become a leftover:
•
Vdisk members' timestamps do not match so the system designates members having an older
timestamp as leftovers.
•
A drive that had previously failed or been used on another system would become a leftover.
This would also happen if a drive was failed via RAIDar or the CLI. Reusing a previously failed
drive is not supported.
•
A disk is not detected during a rescan, then is subsequently detected.
When a disk becomes a leftover, the following changes occur:
•
The disk's health becomes Degraded and its How Used state becomes LEFTOVR.
•
The disk is automatically excluded from the vdisk, causing the vdisk's health to become
Degraded or Fault, depending on the RAID level.
•
The disk's Fault LED is illuminated amber.
If spares are available, and the health of the vdisk is Degraded, the vdisk will use them to start
reconstruction. When reconstruction is complete, you can replace the failed drive with a new
drive and configure it as a global spare.
If spares are not available to begin reconstruction, replace the failed drive with a new drive and
configure it as a global spare. Then reconstruction will begin.
This command clears metadata from leftover disks only. If you specify disks that are not
leftovers, the disks are not changed.
Syntax clear disk-metadata disks
Parameters disks
IDs of the leftover disks to clear metadata from. For disk syntax, see Command Syntax in the
NEXIO Farad 2300 Series CLI Reference Guide.
Example Show disk usage:
# show disks
Location ... How Used ...
----------------------...
1.1      ... LEFTOVR  ...
1.2      ... VDISK    ...
Clear metadata from a leftover disk:
# clear disk-metadata 1.1
Info: Updating disk list...
Info: Disk disk_1.1 metadata was cleared. (2012-01-18 10:35:39)
Success: Command completed successfully. - Metadata was cleared.
(2012-01-18 10:35:39)
Try to clear metadata from a disk that is not leftover:
# clear disk-metadata 1.2
Error: The specified disk is not a leftover disk. (1.2) - Metadata was
not cleared for one or more disks. (2012-01-18 10:32:59)
clear expander-status
CAUTION:
For use by or with direction from a service technician.
Description Clears the counters and status for SAS expander lanes. Counters and status can be reset to a good
state for all enclosures, or for a specific enclosure whose status is Error as shown by the show
expander-status command.
NOTE: If a rescan is in progress, the clear operation will fail with an error message saying that
an EMP does exist. Wait for the rescan to complete and then retry the clear operation.
Syntax clear expander-status [enclosure ID]
Parameters enclosure ID
Optional. The enclosure number.
Example Clear the expander status for the first enclosure:
# clear expander-status enclosure 0
Success: Command completed successfully. - Expander status was cleared.
(2012-01-18 14:18:53)
See also •
show expander-status on page 44
clear events
CAUTION:
For use by or with direction from a service technician.
Description Clears the event log for controller A, B, or both.
Syntax clear events [a|b|both]
Parameters a|b|both
Optional. The controller event log to clear. If this parameter is omitted, both event logs are
cleared.
Example Clear the event log for controller A:
# clear events a
Success: Command completed successfully. - The event log was
successfully cleared. (2012-01-18 10:40:13)
See also •
show events on page 42
rescan
Description This command forces rediscovery of attached disks and enclosures. If both SCs are online and
able to communicate with both expansion modules in each connected enclosure, this command
also reassigns enclosure IDs based on controller A’s enclosure cabling order. A manual rescan
may be needed after system power-up to display enclosures in the proper order.
A manual rescan is not required to detect when disks are inserted or removed; the controllers do
this automatically. When disks are inserted they are detected after a short delay, which allows
the disks to spin up.
When you perform a manual rescan, it temporarily pauses all I/O processes, then resumes
normal operation.
Syntax rescan
Example Scan for device changes and re-evaluate enclosure IDs:
# rescan
Success: Command completed successfully. (2012-01-21 12:20:57)
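The example responses in this section begin with Info:, Success:, or Error: and usually end with a timestamp in parentheses. When CLI output is captured to a file for later review, a small Python sketch such as the following can classify a response. It is based only on the example output shown in this guide; the function name and return shape are assumptions, and the CLI may produce other formats.

```python
import re

# Timestamp as shown in the examples, for instance (2012-01-21 12:20:57).
STAMP = re.compile(r"\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\)")


def summarize_response(text):
    """Return (status, timestamps) for captured CLI output.

    status is 'Success', 'Error', or None when neither prefix is found.
    An Error line takes precedence over a Success line.
    """
    status = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Error:"):
            status = "Error"
            break
        if line.startswith("Success:"):
            status = "Success"
    return status, STAMP.findall(text)
```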
reset host-link
Description Resets specified controller host ports (channels).
For an FC host port configured to use FC-AL topology, a LIP is issued.
For SAS, resetting a host port issues a COMINIT/COMRESET sequence and might reset other
ports.
Syntax reset host-link
ports ports
[controller a|b]
Parameters ports ports
A controller host port ID, a comma-separated list of IDs, a hyphenated range of IDs, or a
combination of these. A port ID is a controller ID and port number, and is not case sensitive. Do
not mix controller IDs in a range.
controller a|b
Optional. The controller ID, either A or B.
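A script that builds the ports value can validate it first. The Python sketch below expands a specification into individual port IDs and rejects ranges that mix controller IDs, as the parameter description requires. It assumes that a range is written with a full port ID at each end (for example, A1-A2); that notation, the function name, and the error messages are assumptions and should be checked against the NEXIO Farad 2300 Series CLI Reference Guide.

```python
import re

# A port ID is a controller ID (A or B, not case sensitive) and a port number.
PORT = re.compile(r"^([ABab])(\d+)$")


def expand_ports(spec):
    """Expand a port specification such as 'A1-A2,B1' into a list of IDs."""
    ports = []
    for item in spec.split(","):
        item = item.strip()
        ends = [PORT.match(part.strip()) for part in item.split("-")]
        if len(ends) > 2 or not all(ends):
            raise ValueError(f"invalid port item: {item!r}")
        first, last = ends[0], ends[-1]
        controller = first.group(1).upper()
        if controller != last.group(1).upper():
            raise ValueError(f"range mixes controller IDs: {item!r}")
        low, high = int(first.group(2)), int(last.group(2))
        if low > high:
            raise ValueError(f"range is reversed: {item!r}")
        ports.extend(f"{controller}{n}" for n in range(low, high + 1))
    return ports
```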
Example Reset the host link on port A1:
# reset host-link ports A1
Success: Command completed successfully. - Reset Host Link(s) on port(s)
A1 from current controller. (2012-01-21 11:36:28)
See also •
show ports
restart
Description Restarts the SC or MC in a controller module.
If you restart an SC, it attempts to shut down with a proper failover sequence, which includes
stopping all I/O operations and flushing the write cache to disk, and then the controller restarts.
The MC is not restarted so it can provide status information to external interfaces.
If you restart an MC, communication with it is lost until it successfully restarts. If the restart
fails, the partner MC remains active with full ownership of operations and configuration
information.
CAUTION: If you restart both controller modules, you and users lose access to the system and
its data until the restart is complete. This action should be taken only with the advice and
consent of Customer Support.
NOTE: When an SC is restarted, live performance statistics that it recorded will be reset;
historical performance statistics are not affected. Disk statistics may be reduced but will not be
reset to zero, because disk statistics are summed between the two controllers. For more
information, see help for commands that show statistics.
Syntax restart
sc|mc
[a|b|both]
[noprompt]
Parameters sc|mc
The controller to restart:
•
sc: SC
•
mc: MC
a|b|both
Optional. The controller module containing the controller to restart. If this parameter is omitted,
the command affects the controller being accessed.
noprompt
Optional in console format; required for XML API format. Suppresses the confirmation prompt,
which requires a yes or no response. Specifying this parameter allows the command to proceed
without user interaction.
Example Restart the MC in controller A, which you are logged in to:
# restart mc a
During the restart process you will briefly lose communication with the
specified Management Controller(s).
Continue? yes
Info: Restarting the local MC (A)...
Success: Command completed successfully. (2012-01-21 11:38:47)
From controller A, restart the SC in controller B:
# restart sc b
Success: Command completed successfully. - SC B was restarted.
(2012-01-21 11:42:10)
Restart both SCs:
# restart sc both
Restarting both controllers can cause a temporary loss of data
availability.
Do you want to continue? yes
Success: Command completed successfully. - Both SCs were restarted.
(2012-01-21 13:09:52)
See also •
shutdown
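When the restart command is issued from a script, it can help to validate the arguments against the syntax above before sending anything to the controller. The following Python sketch is illustrative only: build_restart_command is a hypothetical helper, not part of the product, and it only assembles the command string.

```python
def build_restart_command(controller, module=None, noprompt=False):
    """Return a 'restart' command string that follows the documented syntax.

    build_restart_command is a hypothetical helper, not part of the product.
    """
    if controller not in ("sc", "mc"):
        raise ValueError("controller must be 'sc' or 'mc'")
    parts = ["restart", controller]
    if module is not None:
        if module not in ("a", "b", "both"):
            raise ValueError("module must be 'a', 'b', or 'both'")
        parts.append(module)
    if noprompt:
        # Suppresses the yes/no confirmation prompt.
        parts.append("noprompt")
    return " ".join(parts)


print(build_restart_command("mc", "a"))
print(build_restart_command("sc", "both", noprompt=True))
```

Because restarting both controller modules interrupts access to data, a script would normally still require an explicit operator decision before sending the "both" form.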
restore defaults
CAUTION:
For use by or with direction from a service technician.
Description Restores the default configuration to the controllers. For details about which settings are
restored see the NEXIO Farad 2300 Series CLI Reference Guide.
CAUTION: This command changes how the system operates and might require some
reconfiguration to restore host access to volumes.
Syntax restore defaults
[noprompt]
[prompt yes|no]
Parameters noprompt
Optional in console format; required for XML API format. Suppresses the confirmation prompt,
which requires a yes or no response. Specifying this parameter allows the command to proceed
without user interaction.
prompt yes|no
Optional. Specifies an automatic response to the confirmation prompt:
•
yes: Allow the command to proceed.
•
no: Cancel the command.
If this parameter is omitted, you must manually reply to the prompt.
Example Restore the controllers’ default configuration:
# restore defaults
WARNING: The configuration of the array controller will be re-set to
default settings. The Management Controller will restart once this is
completed. Are you sure? yes
Success: Command completed successfully. - Device default configuration
was restored.
See also •
restart on page 34
set debug-log-parameters
CAUTION:
For use by or with direction from a service technician.
Description Sets the types of debug messages to include in the SC debug log.
Syntax set debug-log-parameters message-type+|- [...]
Parameters message-type+|-
One of the following message types, followed by a plus (+) to enable or a minus (-) to disable
inclusion in the log:
•
awt: Auto-write-through cache triggers debug messages. Disabled by default.
•
bkcfg: Internal configuration debug messages. Enabled by default.
•
cache: Cache debug messages. Enabled by default.
•
capi: Internal CAPI debug messages. Enabled by default.
•
capi2: Internal CAPI tracing debug messages. Disabled by default.
•
disk: Disk interface debug messages. Enabled by default.
•
emp: EMP debug messages. Enabled by default.
•
fo: Failover and recovery debug messages. Enabled by default.
•
fruid: FRU ID debug messages. Enabled by default.
•
hb: Not used.
•
host or host-dbg: Host interface debug messages. Enabled by default.
•
init: Not used.
•
ioa: I/O interface driver debug messages (standard). Enabled by default.
•
iob: I/O interface driver debug messages (resource counts). Disabled by default.
•
ioc: I/O interface driver debug messages (upper layer, verbose). Disabled by default.
•
iod: I/O interface driver debug messages (lower layer, verbose). Disabled by default.
•
mem: Internal memory debug messages. Disabled by default.
•
misc: Internal debug messages. Enabled by default.
•
msg: Inter-controller message debug messages. Enabled by default.
•
mui: Internal service interface debug messages. Enabled by default.
•
ps: Not used.
•
raid: RAID debug messages. Enabled by default.
•
rcm: Removable-component manager debug messages. Disabled by default.
•
res2: Internal debug messages. Disabled by default.
•
resmgr: Reservation Manager debug messages. Disabled by default.
Example Include RAID and cache messages, exclude EMP messages, and leave other message types
unchanged:
# set debug-log-parameters raid+ cache+ emp-
Success: Command completed successfully. - Debug-log parameters were
changed. (2012-01-21 11:58:38)
See also •
show debug-log-parameters on page 41
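The message-type+|- notation is easy to mistype when several types are changed at once. As a rough sketch, the command can be generated from a mapping of message type to desired state. The helper name build_debug_log_command is an assumption made for illustration; the type names are those listed in the Parameters section.

```python
# Message types listed for set debug-log-parameters in this section.
MESSAGE_TYPES = {
    "awt", "bkcfg", "cache", "capi", "capi2", "disk", "emp", "fo", "fruid",
    "hb", "host", "host-dbg", "init", "ioa", "iob", "ioc", "iod", "mem",
    "misc", "msg", "mui", "ps", "raid", "rcm", "res2", "resmgr",
}


def build_debug_log_command(changes):
    """Build the command from {message_type: True to enable, False to disable}.

    build_debug_log_command is a hypothetical helper, not part of the product.
    """
    if not changes:
        raise ValueError("at least one message type is required")
    args = []
    for name, enabled in changes.items():
        if name not in MESSAGE_TYPES:
            raise ValueError("unknown message type: " + name)
        # A trailing plus enables the type; a trailing minus disables it.
        args.append(name + ("+" if enabled else "-"))
    return "set debug-log-parameters " + " ".join(args)


print(build_debug_log_command({"raid": True, "cache": True, "emp": False}))
```

Types that are not named in the mapping are left out of the command, so their current setting is unchanged.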
set expander-fault-isolation
CAUTION:
For use by or with direction from a service technician.
Description By default, the EC in each I/O module performs fault-isolation analysis of SAS expander PHY
statistics. When one or more error counters for a specific PHY exceed the built-in thresholds, the
PHY is disabled to maintain storage system operation.
While troubleshooting a storage system problem, a service technician can use this command to
temporarily disable fault isolation for a specific EC in a specific enclosure.
NOTE: If fault isolation is disabled, be sure to re-enable it before placing the system back into
service. Serious problems can result if fault isolation is disabled and a PHY failure occurs.
Syntax set expander-fault-isolation
[wwn wwn]
encl enclosure-ID
controller a|b|both
enabled|disabled|on|off
Parameters wwn wwn
Optional. The WWN of the PHY.
encl enclosure-ID
The enclosure ID of the enclosure containing the PHY.
controller a|b|both
The I/O module containing the EC whose setting you want to change: A, B, or both.
enabled|disabled|on|off
Whether to enable or disable PHY fault isolation.
Example Disable PHY fault isolation for EC A in an enclosure:
# set expander-fault-isolation encl 0 controller a disabled
Success: Command completed successfully. - Expander fault isolation was
disabled. (2012-01-21 12:05:41)
Re-enable PHY fault isolation for EC A in the same enclosure:
# set expander-fault-isolation encl 0 controller a enabled
Success: Command completed successfully. - Expander fault isolation was
enabled. (2012-01-21 12:05:51)
See also •
set expander-phy on page 38
•
show expander-status on page 44
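Because fault isolation must be re-enabled before the system returns to service, a scripted troubleshooting session can pair the two commands so that the re-enable step runs even if an intermediate step fails. The sketch below assumes a caller-supplied run_cli function that sends one command string to the CLI; run_cli and fault_isolation_disabled are hypothetical names, and how the session is opened (SSH, Telnet, or serial) is outside the sketch.

```python
from contextlib import contextmanager


@contextmanager
def fault_isolation_disabled(run_cli, enclosure_id, controller):
    """Disable expander fault isolation, then always re-enable it.

    run_cli is a hypothetical callable that sends one command string to the CLI.
    """
    base = "set expander-fault-isolation encl {} controller {}".format(
        enclosure_id, controller
    )
    run_cli(base + " disabled")
    try:
        yield
    finally:
        # Runs even if the troubleshooting steps raise an exception.
        run_cli(base + " enabled")


# Usage with a stand-in that only records the commands it is given.
sent = []
with fault_isolation_disabled(sent.append, 0, "a"):
    sent.append("show expander-status")
print(sent)
```

The try/finally pairing is a design choice of this sketch, not documented product behavior; it only guarantees that the enable command is attempted.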
set expander-phy
CAUTION:
For use by or with direction from a service technician.
Description Disables or enables a specific PHY.
Syntax set expander-phy
[encl enclosure-ID]
controller a|b|both
[type
drive|inter-exp|sc|sc-0|sc-1|ingress|ingress-0|ingress-1|egress
|egress-0|egress-1]
[phy phy-ID]
enabled|disabled|on|off
[wwn wwn]
Parameters encl enclosure-ID
Optional. The enclosure ID of the enclosure containing the PHY.
controller a|b|both
The I/O module containing the PHY to enable or disable: A, B, or both.
type drive|inter-exp|sc|sc-0|sc-1|ingress|ingress-0|ingress-1|egress
|egress-0|egress-1
Optional. The PHY type:
•
drive: PHY connected to a disk drive.
•
inter-exp: communication between multiple expanders in the same enclosure.
•
sc: PHY in the ingress bus to the SC.
•
sc-0: PHY in the ingress bus to the local SC.
•
sc-1: PHY in the ingress bus to the partner SC.
•
ingress: PHY in an ingress port.
•
ingress-0: PHY in an ingress port.
•
ingress-1: PHY in an ingress port.
•
egress: PHY in an egress port.
•
egress-0: PHY in an egress port.
•
egress-1: PHY in an egress port.
phy phy-ID
Optional. The logical PHY number.
enabled|disabled|on|off
Whether to enable or disable the specified PHY.
wwn wwn
Optional. The WWN of the PHY.
Example Disable the first egress PHY in controller A, and check the resulting status:
# set expander-phy encl 0 controller a type egress phy 0 disabled
Success: Command completed successfully. - Disabled PHY 0 on controller
a in enclosure 0. (PHY type: egress) (2012-01-21 12:07:36)
# show expander-status
Encl Ctlr Phy Type   Status   Elem Status Disabled Reason
---------------------------------------------------------------------
0    A    0   Egress Disabled Disabled    Disabled PHY control
---------------------------------------------------------------------
Success: Command completed successfully. (2012-01-21 12:03:42)
Enable the PHY for disk 5 in controller B, and check the resulting status:
# set expander-phy encl 0 controller b type drive phy 5 enabled
Success: Command completed successfully. - Enabled PHY 5 on controller b
in enclosure 0. (PHY type: drive) (2012-01-21 12:07:50)
# show expander-status
Encl Ctlr Phy Type  Status          Elem Status Disabled Reason
----------------------------------------------------------------------
0    B    5   Drive Enabled-Healthy OK          Enabled
----------------------------------------------------------------------
Success: Command completed successfully. (2012-01-21 12:03:42)
See also •
set expander-fault-isolation on page 37
•
show expander-status on page 44
set led
Description Changes the state of the ID LED on a specified disk or enclosure. For a disk this affects the fault
LED. For an enclosure this affects the unit locator LED.
Syntax To set a disk LED:
set led
disk ID
enable|disable|on|off
To set an enclosure LED:
set led
enclosure ID
enable|disable|on|off
Parameters disk ID
The disk to locate.
enclosure ID
The enclosure to locate.
enable|disable|on|off
Specifies to set or unset the LED.
Example Identify disk 5 in the first enclosure:
# set led disk 0.5 on
Success: Command completed successfully. - Enabling identification LED
for disk 0.5... (2012-01-21 12:23:18)
Stop identifying the first enclosure:
# set led enclosure 0 off
Success: Disabling identification LED for enclosure 0... (2012-01-21
12:24:03)
set protocols
Description Enables or disables management services and protocols.
Syntax set protocols
[debug enabled|disabled|on|off]
[ftp enabled|disabled|on|off]
[http enabled|disabled|on|off]
[https enabled|disabled|on|off]
[ses enabled|disabled|on|off]
[smis enabled|disabled|on|off]
[snmp enabled|disabled|on|off]
[ssh enabled|disabled|on|off]
[telnet enabled|disabled|on|off]
[usmis enabled|disabled|on|off]
Parameters debug enabled|disabled|on|off
Optional. Enables or disables the Telnet debug port. This is disabled by default.
ftp enabled|disabled|on|off
Optional. Enables or disables the expert interface for updating firmware. This is enabled by
default.
http enabled|disabled|on|off
Optional. Enables or disables the standard RAIDar web server. This is enabled by default.
https enabled|disabled|on|off
Optional. Enables or disables the secure RAIDar web server. This is enabled by default.
ses enabled|disabled|on|off
Optional. Enables or disables the in-band SES interface. This is enabled by default.
smis enabled|disabled|on|off
Optional. Enables or disables the secure SMI-S interface. This option allows SMI-S clients to
communicate with each controller’s embedded SMI-S provider via HTTPS port 5989. HTTPS
port 5989 and HTTP port 5988 cannot be enabled at the same time, so enabling this option will
disable port 5988. This is enabled by default.
snmp enabled|disabled|on|off
Optional. Enables or disables the SNMP interface. Disabling this option disables all SNMP
requests to the MIB and disables SNMP traps. To configure SNMP traps use the set
snmp-parameters command. This is enabled by default.
ssh enabled|disabled|on|off
Optional. Enables or disables the SSH CLI. This is enabled by default.
telnet enabled|disabled|on|off
Optional. Enables or disables the standard CLI. This is enabled by default.
usmis enabled|disabled|on|off
Optional. Enables or disables the unsecure SMI-S interface. This option allows SMI-S clients to
communicate with each controller's embedded SMI-S provider via HTTP port 5988. HTTP port
5988 and HTTPS port 5989 cannot be enabled at the same time, so enabling this option will
disable port 5989. On NEXIO Farad NXS2300, this should be enabled so Farad Storage
Manager can auto-discover Farad stacks.
Example Disable unsecure HTTP connections and enable FTP:
# set protocols http disabled ftp enabled
Success: Command completed successfully. (2012-01-21 14:46:55)
See also •
show protocols on page 50
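The interaction between the smis and usmis options can be modeled as a small state update: enabling either option turns the other one off, while disabling an option leaves the other untouched. The Python sketch below captures only that rule as described in the parameter text; apply_smis_setting is a hypothetical name and the dictionary is an illustrative stand-in for the controller's actual settings.

```python
def apply_smis_setting(state, option, enabled):
    """Return a new {'smis': bool, 'usmis': bool} state after one change.

    Models only the documented rule: HTTPS port 5989 (smis) and HTTP port
    5988 (usmis) cannot both be enabled, so enabling one disables the other.
    """
    if option not in ("smis", "usmis"):
        raise ValueError("option must be 'smis' or 'usmis'")
    new_state = dict(state)
    new_state[option] = enabled
    if enabled:
        other = "usmis" if option == "smis" else "smis"
        new_state[other] = False
    return new_state


before = {"smis": True, "usmis": False}
print(apply_smis_setting(before, "usmis", True))
```

In this model, enabling usmis on a system where smis is on leaves only the unsecure interface enabled.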
show debug-log-parameters
CAUTION:
For use by or with direction from a service technician.
Description Shows which debug message types are enabled (On) or disabled (Off) for inclusion in the SC
debug log.
Syntax show debug-log-parameters
Output •
host: Host interface debug messages. Enabled by default.
•
disk: Disk interface debug messages. Enabled by default.
•
mem: Internal memory debug messages. Disabled by default.
•
fo: Failover and recovery debug messages. Enabled by default.
•
msg: Inter-controller message debug messages. Enabled by default.
•
ioa: I/O interface driver debug messages (standard). Enabled by default.
•
iob: I/O interface driver debug messages (resource counts). Disabled by default.
•
ioc: I/O interface driver debug messages (upper layer, verbose). Disabled by default.
•
iod: I/O interface driver debug messages (lower layer, verbose). Disabled by default.
•
misc: Internal debug messages. Enabled by default.
•
rcm: Removable-component manager debug messages. Disabled by default.
•
raid: RAID debug messages. Enabled by default.
•
cache: Cache debug messages. Enabled by default.
•
emp: EMP debug messages. Enabled by default.
•
capi: Internal CAPI debug messages. Enabled by default.
•
mui: Internal service interface debug messages. Enabled by default.
•
bkcfg: Internal configuration debug messages. Enabled by default.
•
awt: Auto-write-through cache triggers debug messages. Disabled by default.
•
res2: Internal debug messages. Disabled by default.
•
capi2: Internal CAPI tracing debug messages. Disabled by default.
•
fruid: FRU ID debug messages. Enabled by default.
•
resmgr: Reservation Manager debug messages. Disabled by default.
•
init: Not used.
•
ps: Not used.
•
hb: Not used.
Example Show debug log parameters:
# show debug-log-parameters
Debug Log Parameters
--------------------
host: On
disk: On
mem: Off
Success: Command completed successfully. (2012-01-18 14:59:52)
See also •
set debug-log-parameters on page 36
show events
Description Shows events logged by each controller in the storage system. A separate set of event numbers
is maintained for each controller. Each event number is prefixed with a letter identifying the
controller that logged the event.
Events are listed from newest to oldest, based on a timestamp with one-second granularity;
therefore the event log sequence matches the actual event sequence within about one second.
For further information about diagnosing and resolving problems, see:
•
The troubleshooting chapter and the LED descriptions appendix in the NEXIO Farad 2300
Series Setup Guide
•
Verifying component failure on page 75
Syntax To show a certain number of events:
show events
[detail]
[last #]
[a|b|both|error]
To show events by time:
show events
[detail]
[from timestamp]
[to timestamp]
[a|b|both|error]
To show events by ID:
show events
[detail]
[from-event event-ID]
[to-event event-ID]
[a|b|both|error]
Parameters detail
Optional. Shows additional information and recommended actions for displayed events. This
information is also in the Service Guide.
last #
Optional. Shows the latest specified number of events. If this parameter is omitted, all events are
shown.
from timestamp
Optional. Shows events including and after a timestamp specified with the format
MMDDYYhhmmss. For example, 043011235900 represents April 30 2011 at 11:59:00 p.m. This
parameter can be used with the to parameter or the to-event parameter.
to timestamp
Optional. Shows events before and including a timestamp specified with the format
MMDDYYhhmmss. For example, 043011235900 represents April 30 2011 at 11:59:00 p.m. This
parameter can be used with the from parameter or the from-event parameter.
from-event event-ID
Optional. Shows events including and after the specified event ID. If this number is smaller than
the ID of the oldest event, events are shown from the oldest available event. Events are shown
only for the controller that the event ID specifies (A or B). This parameter can be used with the
to parameter or the to-event parameter.
to-event event-ID
Optional. Shows events before and including the specified event ID. If this number is larger than
the ID of the newest event, events are shown up to the latest event. Events are shown only for the
controller that the event ID specifies (A or B). This parameter can be used with the from
parameter or the from-event parameter.
a|b|both|error
Optional. Specifies to filter the event listing:
•
a: Shows events from controller A only. Do not use this parameter with the from-event
parameter or the to-event parameter.
•
b: Shows events from controller B only. Do not use this parameter with the from-event
parameter or the to-event parameter.
•
both: Shows events from both controllers. Do not use this parameter with the
from-event parameter or the to-event parameter.
•
error: Shows Warning, Error, and Critical events.
Output •
Date and time when the event was logged
•
Event code identifying the type of event to help diagnose problems; for example, [181]
•
Event ID prefixed by A or B, indicating which controller logged the event; for example,
#A123
•
Model, serial number, and ID of the controller module that logged the event
•
Severity:
•
CRITICAL: A failure occurred that may cause a controller to shut down. Correct the
problem immediately.
•
ERROR: A failure occurred that may affect data integrity or system stability. Correct the
problem as soon as possible.
•
WARNING: A problem occurred that may affect system stability but not data integrity.
Evaluate the problem and correct it if necessary.
•
INFORMATIONAL: A configuration or state change occurred, or a problem occurred
that the system corrected. No action is required.
•
Event-specific message giving details about the event
Example Show the last two events:
# show events last 2
Show the last three non-Informational events:
# show events last 3 error
Show all events from April 30 2011 at 11:59:00 p.m. through May 2 2011 at 11:59:00 a.m.:
# show events from 043011235900 to 050211115900
Show a range of events logged by controller A:
# show events from-event a100 to-event a123
Show detailed output for a specific event:
# show events from-event A2264 to-event A2264 detail
See also •
clear events
•
set snmp-parameters
•
show snmp-parameters
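The from and to parameters take timestamps in MMDDYYhhmmss form, with the hour on a 24-hour clock (23 for 11 p.m.). A minimal Python sketch for converting to and from that form is given below; the function names are hypothetical and chosen only for illustration.

```python
from datetime import datetime

# MMDDYYhhmmss expressed in strftime/strptime codes.
EVENT_TIMESTAMP_FORMAT = "%m%d%y%H%M%S"


def to_event_timestamp(moment):
    """Format a datetime as MMDDYYhhmmss for the from/to parameters."""
    return moment.strftime(EVENT_TIMESTAMP_FORMAT)


def from_event_timestamp(text):
    """Parse an MMDDYYhhmmss string back into a datetime."""
    return datetime.strptime(text, EVENT_TIMESTAMP_FORMAT)


start = to_event_timestamp(datetime(2011, 4, 30, 23, 59, 0))
end = to_event_timestamp(datetime(2011, 5, 2, 11, 59, 0))
print("show events from {} to {}".format(start, end))
```

The printed command matches the time-range example in this section.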
show expander-status
CAUTION:
For use by or with direction from a service technician.
Description Shows diagnostic information relating to SAS EC physical channels, known as PHY lanes. For
each enclosure, this command shows status information for PHYs in I/O module A and then I/O
module B.
Syntax show expander-status
Output Encl
Enclosure that contains the SAS expander.
Ctlr
I/O module that contains the SAS expander.
Phy
Identifies a PHY’s logical location within a group based on the PHY type. Logical IDs are 0–23
for drive PHYs; 0–1 for SC PHYs; and 0–3 for other PHYs. If the PHY's controller module or
expansion module is not installed, this field shows “--”.
Type
•
Drive: 1-lane PHY that communicates between the expander and a disk drive.
•
Egress: 4-lane PHY that communicates between the expander and an expansion port or
SAS Out port.
•
SC-1: (Controller module only) 2-lane PHY that communicates between the expander and
the partner’s expander.
•
SC-0: (Controller module only) 4-lane PHY that communicates between the expander and
the SC.
•
Ingress: (Expansion module only) 4-lane PHY that communicates between the expander
and an expansion port.
•
Inter-Exp: (Expansion module only) Communicates between the expander and the
partner’s expander.
•
Undefined: No status information is available.
•
Unused: The PHY exists in the expander but is not connected, by design.
Status
•
Enabled - Healthy: The PHY is enabled and healthy.
•
Enabled - Degraded: The PHY is enabled but degraded.
•
Disabled: The PHY has been disabled by a user or by the system.
Elem Status
A standard SES status for the element:
•
Disabled: Critical condition is detected.
•
Error: Unrecoverable condition is detected. Appears only if there is a firmware problem
related to PHY definition data.
•
Non-critical: Non-critical condition is detected.
•
Not Used: Element is not installed in enclosure.
•
OK: Element is installed and no error conditions are known.
•
Unknown: Either:
•
Sensor has failed or element status is not available. Appears only if an I/O module
indicates it has fewer PHYs than the reporting I/O module, in which case all additional
PHYs are reported as unknown.
•
Element is installed with no known errors, but the element has not been turned on or set
into operation.
Disabled
•
Enabled: PHY is enabled.
•
Disabled: PHY is disabled.
Reason
•
Blank if Elem Status is OK.
•
Error count interrupts: PHY disabled because of error-count interrupts.
•
Phy control: PHY disabled by a SES control page as a result of action by an SC or user.
•
Not ready: PHY is enabled but not ready. Appears for SC-1 PHYs when the partner I/O
module is not installed. Appears for Drive, SC-1, or Ingress PHYs when a connection
problem exists such as a broken connector.
•
Drive removed: PHY disabled because drive slot is empty.
•
Unused - disabled by default: PHY is disabled by default because it is not used.
•
Excessive Phy changes: PHY is disabled because of excessive PHY change counts.
Example Show expander status for a single-enclosure system with an empty disk slot:
# show expander-status
Encl Ctlr Phy Type   Status           Elem Status  Disabled Reason
----------------------------------------------------------------------
0    A    0   Drive  Enabled-Healthy  OK           Enabled
0    A    1   Drive  Enabled-Degraded Non-critical Enabled  Not ready
0    A    23  Drive  Disabled         OK           Disabled No Drive
0    A    0   SC-1   Enabled-Healthy  OK           Enabled
0    A    1   SC-1   Enabled-Healthy  OK           Enabled
0    A    0   SC-0   Enabled-Healthy  OK           Enabled
0    A    3   SC-0   Enabled-Healthy  OK           Enabled
0    A    0   Egress Enabled-Healthy  OK           Enabled
0    A    3   Egress Enabled-Healthy  OK           Enabled
----------------------------------------------------------------------
Encl Ctlr Phy Type   Status           Elem Status  Disabled Reason
----------------------------------------------------------------------
0    B    0   Drive  Enabled-Healthy  OK           Enabled
0    B    1   Drive  Enabled-Degraded Non-critical Enabled  Not ready
0    B    23  Drive  Disabled         OK           Disabled No Drive
0    B    0   SC-1   Enabled-Healthy  OK           Enabled
0    B    1   SC-1   Enabled-Healthy  OK           Enabled
0    B    0   SC-0   Enabled-Healthy  OK           Enabled
0    B    0   Egress Enabled-Healthy  OK           Enabled
0    B    3   Egress Enabled-Healthy  OK           Enabled
----------------------------------------------------------------------
Success: Command completed successfully. (2012-01-18 15:02:13)
See also •
clear expander-status
•
set expander-fault-isolation on page 37
•
set expander-phy on page 38
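When expander status is collected for later analysis, each data row can be split into the eight output fields described above. The following Python sketch is a rough illustration under stated assumptions: each PHY is reported on one whitespace-separated line, Status is written without spaces (for example Enabled-Healthy), and Elem Status is a single word. Rows with a multi-word Elem Status such as Not Used would need fixed-width column handling instead. The function name parse_expander_row is hypothetical.

```python
def parse_expander_row(line):
    """Split one show expander-status data row into a dict of its fields.

    Assumes a one-line, whitespace-separated row with a one-word Elem Status.
    """
    fields = line.split(None, 7)
    if len(fields) < 7:
        raise ValueError("expected at least 7 fields: " + line)
    encl, ctlr, phy, phy_type, status, elem_status, disabled = fields[:7]
    # Reason may be blank, and may contain spaces (for example "Not ready").
    reason = fields[7].strip() if len(fields) > 7 else ""
    return {
        "encl": encl,
        "ctlr": ctlr,
        "phy": phy,  # kept as text because the field can show "--"
        "type": phy_type,
        "status": status,
        "elem_status": elem_status,
        "disabled": disabled,
        "reason": reason,
    }


row = parse_expander_row("0 A 23 Drive Disabled OK Disabled No Drive")
print(row["type"], row["phy"], row["disabled"], row["reason"])
```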
show frus
Description Shows FRU information for the storage system. Some information is for use by service
technicians.
Syntax show frus
Output FRU fields:
Name
•
CHASSIS_MIDPLANE: 2U chassis and midplane circuit board
•
RAID_IOM: Controller module
•
IOM: Expansion module
•
POWER_SUPPLY: Power supply module
Description
FRU description
Part Number
FRU part number
Serial Number
FRU serial number
Revision
Hardware revision level
Dash Level
FRU template revision number
FRU Shortname
Short description
Manufacturing Date
Date and time in the format year-month-day hour:minutes:seconds when a PCBA
was programmed or a power supply module was manufactured
Manufacturing Location
City, state/province, and country where the FRU was manufactured
Manufacturing Vendor ID
JEDEC ID of the manufacturer
FRU Location
Location of the FRU in the enclosure:
•
MID-PLANE SLOT: Chassis midplane
•
UPPER IOM SLOT: Controller module or expansion module A
•
LOWER IOM SLOT: Controller module or expansion module B
•
LEFT PSU SLOT: Power supply module on the left, as viewed from the back
•
RIGHT PSU SLOT: Power supply module on the right, as viewed from the back
Configuration SN
Configuration serial number
FRU Status
•
Absent: Component is not present
•
Fault: One or more subcomponents has a fault
•
OK: All subcomponents are operating normally
•
Not Available: Status is not available
Original SN
For a power supply module, the original manufacturer serial number; otherwise, N/A.
Original PN
For a power supply module, the original manufacturer part number; otherwise, N/A.
Original Rev
For a power supply module, the original manufacturer hardware revision; otherwise, N/A.
show host-parameters
Description Shows information about host ports on both controllers. This command shows the same
information as the show ports command.
Syntax show host-parameters
Output Ports
Controller ID and port number
Media
•
FC(L): FC-AL (public or private)
•
FC(P): FC Point-to-Point
•
FC(-): FC disconnected
•
SAS: SAS
Target ID
Port WWN
Status
•
Up: Port is cabled and has an I/O link.
•
Disconnected: Either no I/O link is detected or the port is not cabled.
Speed (A)
Actual link speed in Gbit/s. Blank if not applicable.
Speed (C)
Configured host-port link speed. Does not display for SAS.
•
FC: Auto, 8Gb, 4Gb, or 2Gb (Gbit/s)
•
Blank if not applicable
Health
•
OK
•
Degraded
•
Fault
•
N/A
Health Reason
If Health is not OK, this field shows the reason for the health state.
Health Recommendation
If Health is not OK, this field shows recommended actions to take to resolve the health issue.
Topo (C)
FC only. Configured topology.
Width
SAS only. Number of PHY lanes in the SAS port. Displays instead of PID.
PID
FC only. Primary ID, or blank if not applicable. Displays instead of Width.
Example Show port information for a system with two FC ports:
# show host-parameters
Ports Media Target ID Status       Speed(A) Speed(C) Health Health Reason       Recommendation Topo(C) PID
----------------------------------------------------------------------
A0    FC(L) WWPN      Up           8Gb      Auto     OK     OK                                 Loop    0
A1    FC(-) WWPN      Disconnected          Auto     N/A    No host connection. No action      Loop    0
----------------------------------------------------------------------
Success: Command completed successfully. (2012-01-18 15:03:24)
Show port information for a system with two FC ports:
# show host-parameters
Ports Media Target ID Status       Speed(A) Speed(C) Health Health Reason       Recommendation Topo(C) PID
----------------------------------------------------------------------
A0    FC(L) WWPN      Up           8Gb      Auto     OK     OK                                 Loop    0
A1    FC(-) WWPN      Disconnected          Auto     N/A    There is no host.   No action      Loop    0
----------------------------------------------------------------------
Success: Command completed successfully. (2012-01-18 15:03:24)
See also •
set host-parameters
•
show ports
show protocols
Description Shows which management services and protocols are enabled or disabled.
Syntax show protocols
Example Show the status of service and security protocols:
# show protocols
Service and Security Protocols
------------------------------
Web Browser Interface (HTTP): Enabled
Secure Web Browser Interface (HTTPS): Enabled
Command Line Interface (Telnet): Enabled
Secure Command Line Interface (SSH): Enabled
Storage Management Initiative Specification (SMI-S): Enabled
Unsecure Storage Management Initiative Specification (SMI-S 5988):
Disabled
File Transfer Protocol (FTP): Disabled
Simple Network Management Protocol (SNMP): Enabled
Service Debug (Debug): Disabled
In-band SES Management (SES): Enabled
Success: Command completed successfully. (2012-01-18 15:13:23)
See also •
set protocols on page 40
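Console output of show protocols can be turned into a lookup table when a script needs to confirm a setting before continuing. The Python sketch below assumes the "label: Enabled|Disabled" layout seen in the example, including a long label whose value wraps onto the following line. The function name parse_protocols is hypothetical; for programmatic access the XML API format is likely a better fit than parsing console text.

```python
def parse_protocols(output):
    """Map each protocol label to True (Enabled) or False (Disabled)."""
    states = {}
    pending = None  # label whose value wrapped onto the next line
    for raw in output.splitlines():
        line = raw.strip()
        if pending is not None and line in ("Enabled", "Disabled"):
            states[pending] = line == "Enabled"
            pending = None
            continue
        pending = None
        label, sep, value = line.rpartition(":")
        if not sep:
            continue  # title or separator line
        value = value.strip()
        if value in ("Enabled", "Disabled"):
            states[label.strip()] = value == "Enabled"
        elif not value:
            pending = label.strip()
    return states


sample = """Secure Command Line Interface (SSH): Enabled
Unsecure Storage Management Initiative Specification (SMI-S 5988):
Disabled
File Transfer Protocol (FTP): Disabled"""
for label, enabled in parse_protocols(sample).items():
    print(label, "->", enabled)
```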
show redundancy-mode
Description Shows the redundancy status of the system.
Syntax show redundancy-mode
Output Controller Redundancy Mode
Shows the system’s operating mode, also called the cache redundancy mode:
•
Independent Cache Performance Mode: Controller failover is disabled and data
in a controller’s write-back cache is not mirrored to the partner controller. This improves
write performance at the risk of losing unwritten data if a controller failure occurs while
there is data in controller cache.
•
Active-Active ULP: Both controllers are active using ULP. Data for volumes
configured to use write-back cache is automatically mirrored between the two controllers to
provide fault tolerance.
•
Fail Over: Operation has failed over to one controller because its partner is not
operational. The system has lost redundancy.
•
Down: Neither controller is operational.
Controller Redundancy Status
•
Redundant with independent cache: Both controllers are operational but are not
mirroring their cache data to each other.
•
Redundant: Both controllers are operational.
•
Operational but not redundant: In active-active mode, one controller is
operational and the other is offline. In single-controller mode, the controller is operational.
•
Down: This controller is not operational.
•
Unknown: Status information is not available.
Controller ID Status
•
Operational: The controller is operational.
•
Down: The controller is installed but not operational.
•
Not Installed: The controller is not installed.
Controller ID Serial Number
•
Controller module serial number
•
Not Available: The controller is down or not installed.
Example From either controller, show the redundancy status where both controllers are operating:
# show redundancy-mode
System Redundancy
-----------------
Controller Redundancy Mode: Active-Active ULP
Controller Redundancy Status: Redundant
Controller A Status: Operational
Controller A Serial Number: SN
Controller B Status: Operational
Controller B Serial Number: SN
Success: Command completed successfully. (2012-01-18 11:02:36)
From either controller, show the redundancy status where controller B is down:
# show redundancy-mode
System Redundancy
-----------------
Controller Redundancy Mode: Fail Over
Controller Redundancy Status: Operational but not redundant
Controller A Status: Operational
Controller A Serial Number: SN
Controller B Status: Down
Controller B Serial Number: SN
Success: Command completed successfully. (2012-02-01 11:03:39)
From either controller, show the redundancy status where both controllers are down:
# show redundancy-mode
System Redundancy
-----------------
Controller Redundancy Mode: Down
Controller Redundancy Status: Down
Controller A Status: Down
Controller A Serial Number: SN
Controller B Status: Down
Controller B Serial Number: SN
Success: Command completed successfully. (2012-02-01 11:03:39)
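A monitoring script can use the Controller Redundancy Status field to decide whether the system needs attention. The sketch below is an illustration under assumptions: it reads the "name: value" lines of console output, and it treats any status other than Redundant or Redundant with independent cache (including a missing status) as needing attention. Both function names are hypothetical.

```python
REDUNDANT_STATUSES = ("Redundant", "Redundant with independent cache")


def parse_redundancy(output):
    """Collect the 'Controller ...: value' lines into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key.startswith("Controller"):
            fields[key] = value.strip()
    return fields


def needs_attention(fields):
    """Return True unless both controllers are reported as operational."""
    return fields.get("Controller Redundancy Status") not in REDUNDANT_STATUSES


sample = """Controller Redundancy Mode: Fail Over
Controller Redundancy Status: Operational but not redundant
Controller A Status: Operational
Controller B Status: Down"""
fields = parse_redundancy(sample)
print(fields["Controller Redundancy Mode"], needs_attention(fields))
```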
show sensor-status
Description Shows the status of each environmental sensor in each enclosure.
Information shown only for a controller enclosure: on-board temperature, disk controller
temperature, memory controller temperature, supercapacitor voltage and charge, overall unit
(enclosure) status.
Information shown for all enclosures: temperature, voltage, and current for each IOM
(controller module or expansion module); temperature, voltage, and current for each PSU.
Normal and error ranges for temperature and voltage are specified in the NEXIO Farad 2300
Series Setup Guide.
Syntax show sensor-status
Output Encl
Enclosure number.
Sensor Name
Sensor name and location.
Value
•
For a sensor, its value.
•
For overall unit status, one of the status values below.
Status
•
OK: The sensor is present and detects no error condition.
•
Warning: The sensor detected a non-critical error condition. Temperature, voltage, or
current is between the warning and critical thresholds.
•
Error: The sensor detected a critical error condition. Temperature, voltage, or current
exceeds the critical threshold.
•
Unavailable: The sensor is present with no known errors, but has not been turned on or
set into operation because it is initializing. This typically occurs during controller startup.
•
Unrecoverable: The EMP cannot communicate with the sensor.
•
Unknown: The sensor is present but status is not available.
•
Not Installed: The sensor is not present.
•
Unsupported: Status detection is not implemented.
Example Show sensor status for a system that includes a controller enclosure and a drive enclosure:
# show sensor-status
Encl  Sensor Name                      Value   Status
---------------------------------------------------
0     On-Board Temperature 1-Ctlr A    55 C    OK
0     On-Board Temperature 1-Ctlr B    54 C    OK
0     On-Board Temperature 2-Ctlr A    76 C    OK
0     On-Board Temperature 2-Ctlr B    69 C    OK
0     On-Board Temperature 3-Ctlr A    53 C    OK
0     On-Board Temperature 3-Ctlr B    55 C    OK
0     Disk Controller Temp-Ctlr A      31 C    OK
0     Disk Controller Temp-Ctlr B      30 C    OK
0     Memory Controller Temp-Ctlr A    71 C    OK
0     Memory Controller Temp-Ctlr B    76 C    OK
0     Capacitor Pack Voltage-Ctlr A    8.20    OK
0     Capacitor Pack Voltage-Ctlr B    8.12    OK
0     Capacitor Cell 1 Voltage-Ctlr A  2.04    OK
0     Capacitor Cell 1 Voltage-Ctlr B  2.02    OK
0     Capacitor Cell 2 Voltage-Ctlr A  2.04    OK
0     Capacitor Cell 2 Voltage-Ctlr B  2.08    OK
0     Capacitor Cell 3 Voltage-Ctlr A  2.03    OK
0     Capacitor Cell 3 Voltage-Ctlr B  2.02    OK
0     Capacitor Cell 4 Voltage-Ctlr A  2.08    OK
0     Capacitor Cell 4 Voltage-Ctlr B  2.00    OK
0     Capacitor Charge-Ctlr A          100%    OK
0     Capacitor Charge-Ctlr B          100%    OK
0     Overall Unit Status              OK      OK
0     Temperature Loc: upper-IOM A     42 C    OK
0     Temperature Loc: lower-IOM B     38 C    OK
0     Temperature Loc: left-PSU        33 C    OK
0     Temperature Loc: right-PSU       36 C    OK
0     Voltage 12V Loc: upper-IOM A     11.92   OK
0     Voltage 5V Loc: upper-IOM A      5.08    OK
0     Voltage 12V Loc: lower-IOM B     11.86   OK
0     Voltage 5V Loc: lower-IOM B      5.08    OK
0     Voltage 12V Loc: left-PSU        11.93   OK
0     Voltage 5V Loc: left-PSU         5.11    OK
0     Voltage 3.3V Loc: left-PSU       3.48    OK
0     Voltage 12V Loc: right-PSU       12.04   OK
0     Voltage 5V Loc: right-PSU        5.13    OK
0     Voltage 3.3V Loc: right-PSU      3.49    OK
0     Current 12V Loc: upper-IOM A     4.33    OK
0     Current 12V Loc: lower-IOM B     4.42    OK
0     Current 12V Loc: left-PSU        5.17    OK
0     Current 5V Loc: left-PSU         7.24    OK
0     Current 12V Loc: right-PSU       5.80    OK
0     Current 5V Loc: right-PSU        7.15    OK
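When the output of show sensor-status is captured to a text file, the sensors that need attention can be picked out by script. The following is a minimal sketch, assuming each sensor occupies one line with the Encl, Sensor Name, Value, and Status columns in that order. The function name, the regular expression, and the Warning row in the sample are illustrative assumptions, not part of the CLI or of this guide's example.

```python
import re

# Status values documented for show sensor-status.
KNOWN_STATUSES = (
    "Not Installed", "Unrecoverable", "Unavailable", "Unsupported",
    "Unknown", "Warning", "Error", "OK",
)

# Assumed line layout: "<encl> <sensor name> <value> <status>".
# A value is one token, optionally followed by " C" for temperatures.
_STATUS_RE = "|".join(re.escape(s) for s in KNOWN_STATUSES)
LINE_RE = re.compile(rf"^(\d+)\s+(.+?)\s+(\S+(?: C)?)\s+({_STATUS_RE})$")


def non_ok_sensors(output: str):
    """Return (enclosure, sensor, value, status) for each sensor not reporting OK."""
    flagged = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(4) != "OK":
            encl, name, value, status = match.groups()
            flagged.append((int(encl), name, value, status))
    return flagged


# Hypothetical capture; the Warning row is invented for illustration.
SAMPLE = """\
Encl Sensor Name Value Status
0 On-Board Temperature 1-Ctlr A 55 C OK
0 Memory Controller Temp-Ctlr B 91 C Warning
0 Capacitor Pack Voltage-Ctlr A 8.20 OK
0 Overall Unit Status OK OK
"""

print(non_ok_sensors(SAMPLE))
```

Header and separator lines are skipped because they do not start with an enclosure number.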
show vdisk-statistics
Description Shows live or historical performance statistics for vdisks. You can view live statistics for all or
specified vdisks, or historical statistics for a specified vdisk. The system samples
disk-performance statistics every quarter hour and retains performance data for 6 months.
The historical option allows you to specify a time range or a number (count) of data samples to
include. It is not recommended to specify both the time-range and count parameters; if
both parameters are specified, and more samples exist for the specified time range, the samples'
values will be aggregated to show the required number of samples.
For each vdisk these statistics quantify destages, read-aheads, and host reads that are cache
misses. For example, each time data is written from a volume’s cache to disks in the vdisk that
contains the volume, the vdisk's statistics are adjusted.
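The sampling interval determines how much time a given number of historical samples covers. The arithmetic below is a small sketch using only figures stated for this command (one sample every quarter hour, and the default of 100 samples described under Parameters); the constant and function names are illustrative.

```python
SAMPLE_INTERVAL_MIN = 15      # the system samples "every quarter hour"
DEFAULT_SAMPLE_COUNT = 100    # default number of samples displayed

# Number of samples recorded per day.
samples_per_day = 24 * 60 // SAMPLE_INTERVAL_MIN

# Hours of history covered by the default 100 samples.
default_window_hours = DEFAULT_SAMPLE_COUNT * SAMPLE_INTERVAL_MIN / 60


def samples_in_range(hours: float) -> int:
    """Number of quarter-hour samples that fall within a range of the given length."""
    return int(hours * 60 // SAMPLE_INTERVAL_MIN)


print(samples_per_day, default_window_hours, samples_in_range(48))
```

A two-day range therefore holds more samples than the 100-sample display limit, which is the case in which specifying both time-range and count leads to aggregated values.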
Syntax To show live statistics:
show vdisk-statistics [vdisks]
To show historical statistics:
show vdisk-statistics vdisk historical [time-range "date/time-range"] [count number-of-data-samples] [all]
Parameters vdisks
Optional. Identifies one or more vdisks to show live statistics for. If this parameter is omitted,
statistics will be shown for all vdisks.
vdisk
Identifies one vdisk to show historical statistics for. For vdisk syntax, see the NEXIO Farad
2300 Series CLI Reference Guide.
historical
Optional. Specifies to show historical statistics. If this parameter is omitted, live statistics will
be shown.
time-range "date/time-range"
Optional. Specifies the date/time range of historical statistics to show, in the format "start
yyyy-mm-dd hh:mm [AM|PM] end yyyy-mm-dd hh:mm [AM|PM]". If the start
date/time is specified but no end date/time is specified, the current date/time will be used as the
end date/time. The system will return the oldest sample taken after the start time and the latest
sample taken before the end time. If the specified start date/time is earlier than the oldest
sample, that sample will be used as the start date/time. If you specify this parameter, do not
specify the count parameter. If this parameter is omitted, the most recent 100 data samples will
be displayed.
count number-of-data-samples
Optional. Specifies the number of data samples to display, from 1–100. Each sample will be
shown as a separate row in the command output. If this parameter is omitted, 100 samples will
be shown. If you specify this parameter, do not specify the time-range parameter.
all
Optional. Specifies to show the full set of performance metrics. If this parameter is omitted, the
default set of performance metrics will be shown.
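The parameter rules above can be checked before a command is sent to the CLI. The following is a minimal sketch of a helper that assembles the historical form of the command. The function name and its arguments are hypothetical; it assumes that a 24-hour time without the optional AM or PM suffix is accepted, and it conservatively rejects an end time given without a start time because that case is not described here.

```python
from datetime import datetime
from typing import Optional


def build_historical_command(vdisk: str,
                             start: Optional[datetime] = None,
                             end: Optional[datetime] = None,
                             count: Optional[int] = None,
                             show_all: bool = False) -> str:
    """Assemble a 'show vdisk-statistics ... historical' command line."""
    if start is not None and count is not None:
        # The parameter descriptions say not to combine time-range and count.
        raise ValueError("specify either a time range or a count, not both")
    if end is not None and start is None:
        raise ValueError("an end time requires a start time")
    if count is not None and not 1 <= count <= 100:
        raise ValueError("count must be from 1 to 100")

    parts = ["show vdisk-statistics", vdisk, "historical"]
    if start is not None:
        fmt = "%Y-%m-%d %H:%M"  # assumed 24-hour form of yyyy-mm-dd hh:mm
        time_range = f"start {start.strftime(fmt)}"
        if end is not None:
            time_range += f" end {end.strftime(fmt)}"
        parts.append(f'time-range "{time_range}"')
    if count is not None:
        parts.append(f"count {count}")
    if show_all:
        parts.append("all")
    return " ".join(parts)


print(build_historical_command("VD2", count=10, show_all=True))
print(build_historical_command("VD2",
                               start=datetime(2012, 1, 19, 8, 0),
                               end=datetime(2012, 1, 19, 12, 30)))
```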
Output Name
Vdisk name.
Serial Number
Vdisk serial number.
Bytes per second
Data transfer rate calculated over the interval since these statistics were last requested or reset.
This value will be zero if it has not been requested or reset since a controller restart.
IOPS
I/O operations per second, calculated over the interval since these statistics were last requested
or reset. This value will be zero if it has not been requested or reset since a controller restart.
Number of Reads
Number of read operations since these statistics were last reset or since the controller was
restarted.
Number of Writes
Number of write operations since these statistics were last reset or since the controller was
restarted.
Data Read
Amount of data read since these statistics were last reset or since the controller was restarted.
Data Written
Amount of data written since these statistics were last reset or since the controller was restarted.
I/O Resp Time
Average response time in microseconds for read and write operations, calculated over the
interval since these statistics were last requested or reset.
Read Resp Time
Average response time in microseconds for all read operations, calculated over the interval since
these statistics were last requested or reset.
Write Resp Time
Average response time in microseconds for all write operations, calculated over the interval
since these statistics were last requested or reset.
Reset Time
Date and time, in the format year-month-day hour:minutes:seconds, when these
statistics were last reset, either by a user or by a controller restart.
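Response times are reported in microseconds, so it can help to convert them and to see how the combined figure relates to the read and write figures. The sketch below uses the VD1 and MyVdisk values from this command's live-statistics example. In those sample values, I/O Resp Time agrees with an average of the read and write response times weighted by the number of reads and writes; that is an observation about the sample only, not a documented formula, and the function names are illustrative.

```python
def weighted_response_time_us(reads: int, read_rt_us: int,
                              writes: int, write_rt_us: int) -> float:
    """Average of read and write response times, weighted by operation counts."""
    total_ops = reads + writes
    if total_ops == 0:
        return 0.0
    return (reads * read_rt_us + writes * write_rt_us) / total_ops


def to_milliseconds(microseconds: float) -> float:
    """Convert a response time from microseconds to milliseconds."""
    return microseconds / 1000.0


# Values copied from the live-statistics example for VD1 and MyVdisk.
vd1 = weighted_response_time_us(6179839, 12699, 10507038, 240665)
my_vdisk = weighted_response_time_us(4872260, 16405, 9913102, 109815)

print(round(vd1), round(my_vdisk))   # compare with I/O Resp Time: 156240 and 79033
print(to_milliseconds(156240))       # VD1 I/O Resp Time in milliseconds
```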
Historical output
Name
Vdisk name.
Serial Number
Vdisk serial number.
Data Transferred
Total amount of data read and written since the last sampling time.
Data Read
Shown by the all parameter. Amount of data read since the last sampling time.
Data Written
Shown by the all parameter. Amount of data written since the last sampling time.
Total Bps
Data transfer rate, in bytes per second, since the last sampling time. This is the sum of Read
Bps and Write Bps.
Read Bps
Shown by the all parameter. Data transfer rate, in bytes per second, for read operations since
the last sampling time.
Write Bps
Shown by the all parameter. Data transfer rate, in bytes per second, for write operations since
the last sampling time.
Sample Time
Date and time, in the format year-month-day hour:minutes:seconds, when the data
sample was taken.
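Because Total Bps is defined as the sum of Read Bps and Write Bps, and Data Transferred as the total of data read and written, a captured historical sample can be sanity-checked. The following is a minimal sketch using the values from this command's historical example; the function names are illustrative, and unit suffixes are only compared, not converted, since this guide does not state whether they are binary or decimal.

```python
import math
import re


def split_quantity(text: str):
    """Split a value such as '44.8GB' into (44.8, 'GB')."""
    match = re.fullmatch(r"([0-9.]+)([A-Za-z]+)", text)
    if match is None:
        raise ValueError(f"unrecognized quantity: {text!r}")
    return float(match.group(1)), match.group(2)


def sums_match(total: str, read: str, written: str, tolerance: float = 0.1) -> bool:
    """Check that total equals read + written, allowing for display rounding."""
    (t, t_unit), (r, r_unit), (w, w_unit) = map(split_quantity, (total, read, written))
    if not (t_unit == r_unit == w_unit):
        raise ValueError("mixed units; convert before comparing")
    return math.isclose(t, r + w, abs_tol=tolerance)


# Data Transferred vs. Data Read + Data Written from the VD2 example.
print(sums_match("44.8GB", "22.4GB", "22.4GB"))
# Total Bps vs. Read Bps + Write Bps from the VD2 example.
print(sums_match("49.8MB", "24.9MB", "24.9MB"))
```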
Example Show live statistics for vdisks VD1 and MyVdisk:
# show vdisk-statistics VD1,MyVdisk
Name     Serial Number  Bytes per second  IOPS  Number of Reads
  Number of Writes  Data Read  Data Written  I/O Resp Time  Read Resp Time
  Write Resp Time  Reset Time
----------------------------------------------------------------------
VD1      SN             22.0MB            82    6179839
  10507038          478.8GB    1024.4GB      156240         12699
  240665           2011-01-17 08:15:01
MyVdisk  SN             22.1MB            78    4872260
  9913102           539.3GB    1044.1GB      79033          16405
  109815           2012-01-17 21:01:20
----------------------------------------------------------------------
Success: Command completed successfully. (2012-01-19 16:25:26)
Show all historical statistics (the latest 100 samples) for vdisk VD2:
# show vdisk-statistics VD2 historical all
Name  Serial Number
------------------------------------------
VD2   SN
  Data Transferred  Data Read  Data Written  Total Bps  Read Bps  Write Bps  Sample Time
  ----------------------------------------------------------------------
  44.8GB            22.4GB     22.4GB        49.8MB     24.9MB    24.9MB     2012-01-19 11:30:00
------------------------------------------
Success: Command completed successfully. (2012-01-19 12:35:06)
See also
• reset all-statistics
• reset vdisk-statistics
• show controller-statistics
• show disk-statistics
• show host-port-statistics
• show vdisks
• show volume-statistics on page 57
show volume-statistics
Description Shows live performance statistics for all or specified volumes. For each volume these statistics
quantify I/O operations between hosts and the volume. For example, each time a host writes to a
volume’s cache, the volume’s statistics are adjusted.
Properties shown only in XML API format are described in the NEXIO Farad 2300 Series CLI
Reference Guide.
Syntax show volume-statistics [volumes]
Parameters volumes
Optional. Names or serial numbers of the volumes to show information about. If this parameter
is omitted, information is shown for all volumes.
Output Name
Volume name.
Serial Number
Volume serial number.
Bytes per second
Data transfer rate calculated over the interval since these statistics were last requested or reset.
This value will be zero if it has not been requested or reset since a controller restart.
IOPS
I/O operations per second, calculated over the interval since these statistics were last requested
or reset. This value will be zero if it has not been requested or reset since a controller restart.
Number of Reads
Number of read operations since these statistics were last reset or since the controller was
restarted.
Number of Writes
Number of write operations since these statistics were last reset or since the controller was
restarted.
Data Read
Amount of data read since these statistics were last reset or since the controller was restarted.
Data Written
Amount of data written since these statistics were last reset or since the controller was restarted.
Reset Time
Date and time, in the format year-month-day hour:minutes:seconds, when these
statistics were last reset, either by a user or by a controller restart.
Example Show statistics for volume vd1_v0001:
# show volume-statistics vd1_v0001
Name       Serial Number  Bytes per second  IOPS  Number of Reads
  Number of Writes  Data Read  Data Written  Reset Time
----------------------------------------------------------------------
vd1_v0001  SN             5696.0KB          236   44091454
  60342344          1133.0GB   1378.9GB      2012-01-20 10:14:54
----------------------------------------------------------------------
Success: Command completed successfully. (2012-01-20 12:44:50)
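Number of Reads and Number of Writes are cumulative counters, so an average rate over any period of interest can be derived from two captures. The following is a minimal sketch of that technique. The class and function names are hypothetical, the first snapshot reuses the counter values and completion time from the example, and the second snapshot is invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class VolumeSnapshot:
    taken_at: datetime
    reads: int    # "Number of Reads" column
    writes: int   # "Number of Writes" column


def average_iops(earlier: VolumeSnapshot, later: VolumeSnapshot) -> float:
    """Average I/O operations per second between two counter snapshots."""
    elapsed = (later.taken_at - earlier.taken_at).total_seconds()
    if elapsed <= 0:
        raise ValueError("snapshots must be in chronological order")
    delta = (later.reads - earlier.reads) + (later.writes - earlier.writes)
    if delta < 0:
        # Counters restart after a statistics reset or a controller restart.
        raise ValueError("counters went backwards; statistics were probably reset")
    return delta / elapsed


first = VolumeSnapshot(datetime(2012, 1, 20, 12, 44, 50), 44091454, 60342344)
# Hypothetical second capture taken 60 seconds later.
second = VolumeSnapshot(datetime(2012, 1, 20, 12, 45, 50), 44097454, 60350504)

print(average_iops(first, second))
```

Comparing the Reset Time of both captures is a simple way to confirm that no reset occurred between them.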
See also
• reset all-statistics
• reset volume-statistics
• show controller-statistics
• show disk-statistics
• show host-port-statistics
• show vdisk-statistics on page 53
• show volumes
4 Troubleshooting using event logs
Events and event messages
When an event occurs in a storage system, an event message is recorded in the system’s event log and, depending
on the system’s event notification settings, may also be sent to users (via email), to host-based applications (via
SNMP or SMI-S), or to a syslog server.
Each event has a numeric code that identifies the type of event that occurred, and has one of the following
severities:
• Critical. A failure occurred that may cause a controller to shut down. Correct the problem immediately.
• Error. A failure occurred that may affect data integrity or system stability. Correct the problem as soon as possible.
• Warning. A problem occurred that may affect system stability but not data integrity. Evaluate the problem and correct it if necessary.
• Informational. A configuration or state change occurred, or a problem occurred that the system corrected. No immediate action is required. In this guide, this severity is abbreviated as “Info.”
An event message may specify an associated error code or reason code, which provides additional detail for
technical support. Error codes and reason codes are outside the scope of this guide.
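The four severities form an ordered scale, which is convenient when filtering exported events by urgency. The following is a minimal sketch of such a filter. The enum, the tuple layout, and the function name are illustrative assumptions and do not correspond to any CLI or RAIDar interface; the two sample events reuse codes and messages that appear in this chapter's saved-log example.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Event severities, ordered from least to most urgent."""
    INFORMATIONAL = 0
    WARNING = 1
    ERROR = 2
    CRITICAL = 3


def at_least(events, threshold: Severity):
    """Keep events whose severity is at or above the threshold."""
    return [event for event in events if event[0] >= threshold]


# (severity, event code, message)
events = [
    (Severity.INFORMATIONAL, 33, "Time/date has been changed"),
    (Severity.CRITICAL, 65, "Uncorrectable ECC error in buffer memory address 0x0 on bootup"),
]

for severity, code, message in at_least(events, Severity.WARNING):
    print(severity.name, code, message)
```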
Viewing the event log in the CLI
Use the show events command to view the event log in the CLI. See show events for more information.
Viewing the event log in RAIDar
In the Configuration View panel, right-click the system and select View > Event Log. The System Events panel
shows the 100 most recent events that have been logged by either controller. All events are logged, regardless of
event-notification settings. Click the buttons above the table to view all events, or only critical, warning, or
informational events.
The event log table shows the following information:
• Severity.
  Critical. A failure occurred that may cause a controller to shut down. Correct the problem immediately.
  Error. A failure occurred that may affect data integrity or system stability. Correct the problem as soon as possible.
  Warning. A problem occurred that may affect system stability but not data integrity. Evaluate the problem and correct it if necessary.
  Informational. A configuration or state change occurred, or a problem occurred that the system corrected. No action is required.
• Time. Date and time when the event occurred, shown as year-month-day hour:minutes:seconds. Time stamps have one-second granularity.
• Event ID. An identifier for the event. The prefix A or B identifies the controller that logged the event.
• Code. An event code that helps you and support personnel diagnose problems. For event-code descriptions and recommended actions, see Event descriptions.
• Message. Brief information about the event. Click the message to show or hide additional information and recommended actions.
NOTE: If you are having a problem with the system or a vdisk, check the event log before calling technical
support. Event messages might enable you to resolve the problem.
When reviewing events, do the following:
1. For any critical, error, or warning events, click the message to view additional information and recommended
actions. This information also appears in Event descriptions.
Identify the primary events and any that might be the cause of the primary event. For example, an
over-temperature event could cause a disk failure.
2. View the event log and locate other critical/error/warning events in the sequence for the controller that
reported the event.
Repeat this step for the other controller if necessary.
3. Review the events that occurred before and after the primary event.
During this review you are looking for any events that might indicate the cause of the critical/error/warning
event. You are also looking for events that resulted from the critical/error/warning event, known as secondary
events.
4. Review the events following the primary and secondary events.
You are looking for any actions that might have already been taken to resolve the problems reported by the
events.
Viewing an event log saved from RAIDar
You can save event log data to a file on your network as described in Saving log information to a file on page 61.
CAUTION: Do not save event logs from RAIDar or the CLI without the advice and consent of Customer Support.
NOTE: Managed logs are not supported.
The managed logs feature monitors the following controller-specific log files:
• SC debug-log records. Contain date/time stamps of the form mm/dd hh:mm:ss.
• SC crash logs (diagnostic dumps). Produced if the firmware fails. Upon restart, such logs are available, and the restart boot log is also included. The four most recent crash logs are retained in the storage system.
• EC debug logs. EC revision data and SAS PHY statistics are also provided.
• MC debug logs. The MC debug logs transferred by the managed logs feature are for five internal components: appsv, mccli, logc, web, and snmpd. The contained files are log-file segments for these internal components and are numbered sequentially.
The file lists up to 400 events for both controllers. The events are listed in chronological order; that is, the most
recent event is at the bottom of a section. In the event log sections, the following information appears:
• Event ID – Event Serial Number. The prefix (A or B) indicates which controller logged the event. This corresponds to the Event Serial Number column in RAIDar.
• Date/Time – Year, month, day, and time when the event occurred.
• Code – Event code that assists service personnel when diagnosing problems. This corresponds to the Event Code column in RAIDar.
• Severity – Informational; Warning; Error; Critical. This corresponds to the Severity Level column in RAIDar.
• Message – Information about the event. This corresponds to the Message column in RAIDar.
For example:
Event SN  Date/Time       Code  Severity  Controller  Description
A29856    08-06 09:35:07  33    I         A           Time/date has been changed
A29809    08-04 12:12:05  65    C         A           Uncorrectable ECC error in buffer memory address 0x0 on bootup
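Rows of a saved event log can be split into fields for sorting or searching. The following is a minimal sketch, assuming each event occupies one line with the six columns in the order shown in the example. The regular expression and names are illustrative; the letters I and C appear in the example, while W and E for Warning and Error are assumptions.

```python
import re
from typing import NamedTuple, Optional

# I and C appear in the example; W and E are assumed abbreviations.
SEVERITY_NAMES = {"I": "Informational", "W": "Warning", "E": "Error", "C": "Critical"}


class LoggedEvent(NamedTuple):
    serial: str       # Event SN, for example "A29856"
    controller: str   # controller that logged the event, "A" or "B"
    timestamp: str    # date/time text as it appears in the file
    code: int         # event code
    severity: str     # severity name
    message: str      # description text


ROW_RE = re.compile(
    r"^(?P<serial>[AB]\d+)\s+(?P<ts>\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<code>\d+)\s+(?P<sev>[IWEC])\s+(?P<ctlr>[AB])\s+(?P<msg>.+)$"
)


def parse_row(line: str) -> Optional[LoggedEvent]:
    """Parse one event row; return None for headers and other non-event lines."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    return LoggedEvent(match["serial"], match["ctlr"], match["ts"],
                       int(match["code"]), SEVERITY_NAMES[match["sev"]], match["msg"])


rows = [
    "Event SN Date/Time Code Severity Controller Description",
    "A29856 08-06 09:35:07 33 I A Time/date has been changed",
    "A29809 08-04 12:12:05 65 C A Uncorrectable ECC error in buffer memory address 0x0 on bootup",
]
parsed = [event for event in map(parse_row, rows) if event is not None]
for event in parsed:
    print(event.serial, event.severity, event.code)
```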
Reviewing event logs
When reviewing events, do the following:
1. Review the critical/warning events.
Identify the primary events and any that might be the cause of the primary event. For example, an
over-temperature event could cause a drive failure.
2. Review the event log for the controller that reported the critical/warning event by viewing the event log by
controller. Locate the critical/warning events in the sequence.
Repeat this step for the other controller if necessary.
3. Review the events that occurred before and after the primary event.
During this review you are looking for any events that might indicate the cause of the critical/warning event.
You are also looking for events that resulted from the critical/warning event, known as secondary events.
4. Review the events following the primary and secondary events.
You are looking for any actions that might have already been taken to resolve the problems reported by the
events.
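The review steps above can be sketched in code: group events by the controller that logged them, then collect the events immediately before and after each non-informational event. This is a minimal illustration only. The type, the function, the window size, the severity letters, and all sample events are hypothetical and are not taken from a real log.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    serial: str      # for example "A101"; the prefix names the controller
    severity: str    # "I", "W", "E", or "C" (letter codes are an assumption)
    message: str


def review_context(events: List[Event], window: int = 2):
    """For each non-informational event, collect neighbors from the same controller.

    events must already be in chronological order.
    """
    report = []
    for controller in ("A", "B"):
        own = [event for event in events if event.serial.startswith(controller)]
        for index, event in enumerate(own):
            if event.severity == "I":
                continue
            before = own[max(0, index - window):index]     # possible causes
            after = own[index + 1:index + 1 + window]      # possible secondary events
            report.append((event, before, after))
    return report


# Hypothetical events, oldest first.
log = [
    Event("A100", "I", "configuration change"),
    Event("B50", "I", "configuration change"),
    Event("A101", "W", "over-temperature condition"),
    Event("A102", "C", "disk failure"),
]

for primary, before, after in review_context(log):
    print(primary.serial, [e.serial for e in before], [e.serial for e in after])
```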
Saving log information to a file
To help service personnel diagnose a system problem, you might be asked to provide system log data. Using
RAIDar, you can save log data to a compressed zip file. The file will contain the following data:
• Device status summary, which includes basic status and configuration data for the system
• Each controller’s event log
• Each controller’s debug log
• Each controller’s boot log, which shows the startup sequence
• Critical error dumps from each controller, if critical errors have occurred
• CAPI traces from each controller
NOTE: The controllers share one memory buffer for gathering log data and for loading firmware. Do not try to
perform more than one save-logs operation at a time, or to perform a firmware-update operation while performing
a save-logs operation.
To save logs
CAUTION: Saving Logs can result in a short loss of I/O on the affected stack. Save logs only with the advice
and consent of Customer Support. The best way to capture the most recent event log data shown in RAIDar is to
take a screenshot.
1. In the Configuration View panel, right-click the system and select Tools > Save Logs.
2. In the main panel:
a. Enter your name, email address, and phone number so support personnel will know who provided the log
data.
b. Enter comments, describing the problem and specifying the date and time when the problem occurred.
This information helps service personnel when they analyze the log data. Comment text can be 500 bytes
long.
3. Click Save Logs.
NOTE: In Microsoft Internet Explorer if the download is blocked by a security bar, select its Download
File option. If the download does not succeed the first time, return to the Save Logs panel and retry the
save operation.
Log data is collected, which takes several minutes.
4. When prompted to open or save the file, click Save.
• If you are using Firefox and have a download directory set, the file store.zip is saved there.
• Otherwise, you are prompted to specify the file location and name. The default file name is store.zip. Change the name to identify the system, controller, and date.
NOTE: Because the file is compressed, you must uncompress it before you can view the files it contains. To
examine diagnostic data, first view store_yyyy_mm_dd__hh_mm_ss.logs.
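Renaming and unpacking the saved file can be scripted on the management host. The following is a minimal sketch using Python's standard zipfile module. The naming scheme, the function names, and the system name farad01 are hypothetical, and the demonstration creates a small placeholder archive in a temporary directory rather than using a real store.zip.

```python
import tempfile
import zipfile
from datetime import date
from pathlib import Path


def descriptive_name(system: str, controller: str, when: date) -> str:
    """Build a file name that identifies the system, controller, and date."""
    return f"{system}_ctlr{controller}_{when:%Y-%m-%d}.zip"


def extract_logs(archive: Path, dest: Path):
    """Uncompress the archive and return the .logs files to examine first."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    return sorted(dest.rglob("store_*.logs"))


# Demonstration with a placeholder archive (not real log data).
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    archive = tmp_path / descriptive_name("farad01", "A", date(2014, 4, 1))
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("store_2014_04_01__10_00_00.logs", "placeholder")
    found = [path.name for path in extract_logs(archive, tmp_path / "unpacked")]

print(found)
```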
5 Troubleshooting using system LEDs
Check the controller enclosure status LEDs periodically, or after you have received an error notification. If an
LED is illuminated amber, the enclosure has experienced a fault or failure.
More than one LED may display a fault condition at the same time. For example, if a disk drive failed due to an
exceedingly high ambient temperature, both the Temperature Fault and Fault/Service Required LEDs indicate the
fault. This functionality can help you determine the cause of a fault in a FRU.
For descriptions of LED statuses, see the component diagram and table within the System LEDs section that
pertains to your specific enclosure model.
Using enclosure status LEDs – front panel
Enclosure status LEDs are located on the front of the controller enclosure. See 12-disk enclosure front panel LEDs
on page 141 for the enclosure front view pertaining to your model.
Normal operation
• During normal operation, the FRU OK and Temperature Fault LEDs are green, and the other status LEDs are off.
Other LED behaviors
• If the FRU OK LED is off, the enclosure is not powered on.
  • If the enclosure should be powered on, verify that its PSUs are properly cabled to active power sources; see Connecting a power cable on page 86.
• If the Fault/Service Required LED is amber, an enclosure-level fault occurred, and service action is required. See Diagnostic steps on page 66.
• If the Temperature Fault LED is amber, the enclosure temperature is above threshold.
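The front panel rules above amount to a small lookup from observed LED states to findings. The following is a minimal sketch of that mapping for use in a checklist or inspection script. The function name and the lowercase state strings ("green", "amber", "off") are illustrative assumptions; the findings restate the behaviors listed above.

```python
def front_panel_findings(fru_ok: str, fault_service: str, temperature_fault: str):
    """Translate observed front panel LED states into findings.

    Each argument is the observed state of one LED: "green", "amber", or "off".
    An empty list means the LEDs show normal operation.
    """
    findings = []
    if fru_ok == "off":
        findings.append("Enclosure is not powered on; check PSU cabling and power sources.")
    if fault_service == "amber":
        findings.append("Enclosure-level fault; service action is required.")
    if temperature_fault == "amber":
        findings.append("Enclosure temperature is above threshold.")
    return findings


# Normal operation: FRU OK and Temperature Fault green, other status LEDs off.
print(front_panel_findings("green", "off", "green"))
# More than one LED can report a fault at the same time.
print(front_panel_findings("green", "amber", "amber"))
```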
Using disk drive module LEDs – front panel
• Disk drive module LEDs are located on the front of the controller enclosure.
• See Disk drive LEDs on page 142 for the disk drive type used by your controller enclosure model.
Normal operation
During normal operation, the OK to Remove LED is off and the Power/Activity/Fault LED is green (steady or
blinking).
Other LED behaviors
• If the Power/Activity/Fault LED is off, the disk drive is not powered on.
  • If the disk drive should be powered on, verify that it is fully inserted and latched in place, and that the enclosure is powered on.
• If the Power/Activity/Fault LED is steady amber, one of the following conditions exists. See Diagnostic steps on page 66.
  • The disk has experienced a fault or has failed.
  • The associated vdisk is critical and no spare is available. This LED is illuminated for all disk drives in the vdisk.
  • The associated vdisk is initializing or reconstructing. This LED is illuminated for all disk drives in the vdisk. No action is required.
• If the OK to Remove LED is blue, the disk drive module is prepared for removal. However, if the disk drive has failed, and the failure is such that the controller cannot communicate with the disk drive, the LED is off.
CAUTION: Do not remove a disk drive that is reconstructing. Doing so may terminate the current operation and
cause data loss.
Using controller module host port LEDs – rear panel
Normal operation
• When a controller module host port is connected to a data host, the port’s host Link Status LED and host Link Activity LEDs are green.
• If there is I/O activity, the host Activity LED blinks green.
• See the host port LED descriptions pertaining to your controller enclosure model(s) within System LEDs for more information.
Other LED behaviors
• If hosts are having trouble accessing the storage system, check the following:
  • In RAIDar’s Configuration View panel, right-click on the system, and select View > Overview. The System Overview table displays. From the menu bar, select Configuration > System Settings > Host Interfaces and verify the settings pertaining to the host interface(s) for your model. Modify these settings if necessary.
• If a connected host port’s link status LED is off, the link is down.
  • In RAIDar, select View > Event Log. The System Events table displays. Review the event logs for indicators of a specific fault in a host data path component.
  • If you cannot locate a specific fault, or you cannot access the event logs, see Diagnostic steps on page 66 to isolate the fault.
Using the controller module expansion port status LED – rear panel
Normal operation
When a controller module’s expansion port is connected to another enclosure, the expansion port status LED is
green. See the Expansion Port Status LED description pertaining to your enclosure model within System LEDs
for more information.
Other LED behaviors
• If the connected port’s LED is off, the link is down.
• In RAIDar, select View > Event Log. The System Events table displays. Review the event logs for indicators of a specific fault.
• If you cannot locate a specific fault, or you cannot access the event logs, see Diagnostic steps on page 66 to isolate the fault.
Using controller module network port LEDs – rear panel
Normal operation
• When a controller module’s network port is connected, its Ethernet Link Status LED is green.
• If there is I/O activity, the Ethernet Activity LED blinks green. See the Network Port LED descriptions pertaining to your enclosure model within System LEDs for more information.
Other LED behaviors
If a connected port’s Ethernet Link Status LED is off, the link is down. Use standard networking troubleshooting
procedures to isolate faults on the network.
Using controller module status LEDs – rear panel
Normal operation
• The FRU OK LED is green; the Cache Status LED can be green, blinking, or off; and the other controller module status LEDs (Unit Locator, OK to Remove, Fault/Service Required) are off.
• See the OK to Remove, Unit Locator, FRU OK, Fault/Service Required, and Cache Status LED descriptions pertaining to your enclosure model within System LEDs for more information.
Other LED behaviors
• If the FRU OK LED is off, either:
  • The controller module is not powered on. If it should be powered on, verify that it is fully inserted and latched in place, and that the controller enclosure is powered on.
  • The controller module has failed. In RAIDar, select View > Event Log. The System Events table displays. Review the event logs for indicators of a specific fault.
• If the Cache Status LED is blinking green, a CompactFlash flush or cache self-refresh is in progress, indicating cache activity. No action is required (See also Diagnostic steps on page 66).
  • If the LED is blinking evenly, a cache flush is in progress. When a controller module loses power and write cache is dirty (contains data that has not been written to disk), the super-capacitor pack provides backup power to flush (copy) data from write cache to CompactFlash memory. When cache flush is complete, the cache transitions into self-refresh mode.
  • If the LED is blinking momentarily slowly, the cache is in a self-refresh mode. In self-refresh mode, if primary power is restored before the backup power is depleted (3 – 30 minutes, depending on various factors), the system boots, finds data preserved in cache, and writes it to disk. This means the system can be operational within 30 seconds, and before the typical host I/O time-out of 60 seconds, at which point system failure would cause host-application failure. If primary power is restored after the backup power is depleted, the system boots and restores data to cache from CompactFlash, which can take about 90 seconds.
  NOTE: The cache flush and self-refresh mechanism is an important data protection feature; essentially four copies of user data are preserved: one in each controller’s cache and one in each controller’s CompactFlash.
• If the Fault/Service Required LED is steady amber, a fault occurred or a service action is required. See Diagnostic steps on page 66.
• If the Fault/Service Required LED is blinking amber, one of the following errors occurred. See Diagnostic steps on page 66.
  • Hardware-controlled power up error
  • Cache flush error
  • Cache self-refresh error
• If the OK to Remove LED is blue, the controller module is prepared for removal.
• If the controller has failed or does not start, see Diagnostic steps on page 66.
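The two cache recovery paths described for the Cache Status LED depend on whether primary power returns before the backup power is depleted. The following is a minimal sketch of that decision, using the approximate figures given above (about 30 seconds from cache, about 90 seconds from CompactFlash, and a typical host I/O time-out of 60 seconds). The function name and arguments are hypothetical, and the actual backup duration on a given system (3 to 30 minutes) depends on factors this sketch does not model.

```python
HOST_IO_TIMEOUT_S = 60  # typical host I/O time-out cited in this section


def recovery_path(outage_minutes: float, backup_minutes: float):
    """Return (data source, approximate seconds to recover, within host time-out).

    backup_minutes is how long the backup power lasts on this system.
    """
    if outage_minutes < backup_minutes:
        # Power returned while the cache was still in self-refresh mode.
        source, seconds = "cache (self-refresh)", 30
    else:
        # Backup power was depleted; data is restored from CompactFlash.
        source, seconds = "CompactFlash", 90
    return source, seconds, seconds < HOST_IO_TIMEOUT_S


print(recovery_path(outage_minutes=5, backup_minutes=10))
print(recovery_path(outage_minutes=45, backup_minutes=10))
```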
Using power supply module LEDs – rear panel
Normal operation
• The Input Source Power Good LED is green.
• See Power supply LEDs on page 145 for power supply descriptions, and refer to the PSU (AC or DC) included with your controller enclosure.
Other LED behaviors
• If the AC Power Good LED is off, the module is not receiving adequate power.
• Verify that the power cable is properly connected, and check the power source it is connected to (see Connecting a power cable on page 86).
• When isolating faults in a power supply module, remember that the fans in both modules receive power through a common bus on the midplane, so if a PSU fails, the fans continue to operate normally.
Using expansion module LEDs – rear panel
Normal operation
• When the expansion module is connected to a controller module or host, the SAS In port status LED is green.
• If the SAS Out port is connected to another module, the SAS Out port status LED is also green.
• The FRU OK LED is green and the other expansion module status LEDs (Unit Locator, OK to Remove, Fault/Service Required) are off.
• See the LED descriptions pertaining to your enclosure model(s) within System LEDs for more information.
Other LED behaviors
• If a connected port’s status LED is off, the link is down. In RAIDar, select View > Event Log. The System Events table displays. Review the event logs for indicators of a specific fault in a host data path component.
• If the FRU OK LED is off, one of the following conditions exists. See Diagnostic steps on page 66.
  • The expansion module is not powered on. If it should be powered on, verify that it is fully inserted and latched in place, and that the enclosure is powered on.
  • The expansion module has failed. In RAIDar, select View > Event Log. The System Events table displays. Review the event logs for indicators of a specific fault.
• If the Fault/Service Required LED is steady amber, a fault occurred or a service action is required. See Diagnostic steps on page 66.
• If the Fault/Service Required LED is blinking amber, one of the following errors occurred. See Diagnostic steps on page 66.
  • Hardware-controlled power up error
  • Cache flush error
  • Cache self-refresh error
• If the OK to Remove LED is blue, the controller module is prepared for removal.
Diagnostic steps
This section describes possible reasons and actions to take when an LED indicates a fault condition during initial
system setup.
As previously discussed, in addition to monitoring LEDs via line-of-sight observation of the racked hardware
components when performing diagnostic steps, you can also monitor the health of the system and its components
using management interfaces. Bear this in mind when reviewing the Actions column in the following diagnostics
tables, and when reviewing the step procedures provided later in this chapter.
Is the enclosure front panel “Fault/Service Required” LED amber?
Table 5 Diagnostics LED status: Front panel “Fault/Service Required”
Answer: No
Possible reasons: System functioning properly.
Actions: No action required.
Answer: Yes
Possible reasons: A fault condition exists/occurred.
• If installing an IOM FRU, the module has gone online and likely failed its self-test.
Actions:
• Check the LEDs on the back of the controller to narrow the fault to a FRU, connection, or both.
• Check the event log for specific information regarding the fault; follow any Recommended Actions.
• If installing an IOM FRU, try removing and reinstalling the new IOM, and check the event log for errors.
• If the above actions do not resolve the fault, isolate the fault and contact an authorized service provider for assistance. Replacement may be necessary.
Is the controller-back-panel “FRU OK” LED lit?
Table 6
Diagnostics LED status: Rear panel “FRU OK”
Answer: Yes
Possible reasons: System functioning properly.
Actions: No action required.
Answer: Yes (blinking)
Possible reasons: System is booting.
Actions: Wait for system to boot.
Answer: No
Possible reasons: The controller module is not powered on. The controller module has failed.
Actions:
• Check that the controller module is fully inserted and latched in place, and that the enclosure is powered on.
• Check the event log for specific information regarding the failure.
Is the controller-back-panel “Fault/Service Required” LED amber?
Table 7
Diagnostics LED status: Rear panel “Fault/Service Required”
Answer: No
Possible reasons: System functioning properly.
Actions: No action required.
Answer: Yes (blinking)
Possible reasons: One of the following errors occurred:
• Hardware-controlled power-up error
• Cache flush error
• Cache self-refresh error
Actions:
• Restart this controller from the other controller using RAIDar or the CLI.
• If the above action does not resolve the fault, remove the controller and reinsert it.
• If the above action does not resolve the fault, contact an authorized service provider for assistance. It may be necessary to replace the controller.
Are both disk-drive-module LEDs off?
Table 8
Diagnostic LED status: Disk drive module
Answer: Yes
Possible reasons:
• There is no power
• The disk is offline
Actions: Check that the disk is fully inserted and latched in place, and that the enclosure is powered on.
Is the disk-drive-module “Fault” LED amber?
Table 9
Diagnostics LED status: Disk drive “Fault” LED (LFF and SFF modules)
Answer: Yes, and the online/activity LED is off.
Possible reasons: The disk is offline. An event message may have been received for this device.
Actions:
• Check the event log for specific information regarding the fault.
• Isolate the fault.
• Contact an authorized service provider for assistance.
Answer: Yes, and the online/activity LED is blinking.
Possible reasons: The disk is active, but an event message may have been received for this device.
Actions:
• Check the event log for specific information regarding the fault.
• Isolate the fault.
• Contact an authorized service provider for assistance.
Is a connected host port’s “Host Link Status” LED lit?
Table 10
Diagnostics LED status: Rear panel “Host Link Status”
Answer: Yes
Possible reasons: System functioning properly.
Actions: No action required.
Answer: No
Possible reasons: The link is down.
Actions: See also Isolating a host-side connection fault on page 69.
• Check cable connections and reseat if necessary.
• If the above action does not resolve the fault, inspect cable for damage. Replace cable if necessary.
• If the above action does not resolve the fault, swap cables to determine if fault is caused by a defective cable. Replace cable if necessary.
• If the above action does not resolve the fault, verify that the switch, if any, is operating properly. If possible, test with another port.
• If the above action does not resolve the fault, verify that the HBA is fully seated, and that the PCI slot is powered on and operational.
• If the above action does not resolve the fault, review event logs for indicators of a specific fault in a host data path component.
• If the above action does not resolve the fault, contact an authorized service provider for assistance.
Is a connected port’s “Expansion Port Status” LED lit?
Table 11
Diagnostics LED status: Rear panel “Expansion Port Status”
Answer: Yes
Possible reasons: System functioning properly.
Actions: No action required.
Answer: No
Possible reasons: The link is down.
Actions: See also Using controller module network port LEDs – rear panel on page 64.
• Check cable connections and reseat if necessary.
• If the above action does not resolve the fault, inspect cable for damage. Replace cable if necessary.
• If the above action does not resolve the fault, swap cables to determine if fault is caused by a defective cable. Replace cable if necessary.
• If the above action does not resolve the fault, verify that the switch, if any, is operating properly. If possible, test with another port.
• If the above action does not resolve the fault, verify that the HBA is fully seated, and that the PCI slot is powered on and operational.
• If the above action does not resolve the fault, review event logs for indicators of a specific fault in a host data path component.
• If the above action does not resolve the fault, contact an authorized service provider for assistance.
Is a connected port’s “Network Port Link Status” LED lit?
Table 12
Diagnostics LED status: Rear panel “Network Port Link Status”
Answer: Yes
Possible reasons: System functioning properly.
Actions: No action required.
Answer: No
Possible reasons: The link is down.
Actions:
• Swap cables between the A and B controllers to isolate the fault.
• Use standard networking troubleshooting procedures to isolate faults on the network.
Is the power supply’s “Input Power Source” LED lit?
Table 13
Diagnostics LED status: Rear panel power supply “Input Power Source”
Answer: Yes
Possible reasons: System functioning properly.
Actions: No action required.
Answer: No
Possible reasons: The power supply is not receiving adequate power.
Actions:
• Verify that the power cord is properly connected, and check the power source to which it connects.
• If the above action does not resolve the fault, check that the power supply FRU is firmly locked into position.
• If the above action does not resolve the fault, check the event log for specific information regarding the fault.
• If the above action does not resolve the fault, isolate the fault and contact an authorized service provider for assistance.
Is the “Voltage/Fan Fault/Service Required” LED amber?
Table 14
Diagnostics LED status: Rear panel power supply “Voltage/Fan Fault/Service Required”
Answer: No
Possible reasons: System functioning properly.
Actions: No action required.
Answer: Yes
Possible reasons: The PSU or a fan is operating at an unacceptable voltage or RPM level, or has failed. When isolating faults in the power supply, remember that the fans in both modules receive power through a common bus on the midplane, so if a PSU fails, the fans continue to operate normally.
Actions:
• Verify that the power supply FRU is firmly locked into position.
• If the above action does not resolve the fault, verify that the power cable is connected to a power source.
• If the above action does not resolve the fault, verify that the power cable is connected to the enclosure’s PSU.
• If the above action does not resolve the fault, FRU replacement may be necessary; see Troubleshooting and replacing FRUs.
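The diagnostics tables above follow one pattern: an LED state maps to an ordered list of actions, each tried only if the previous one did not resolve the fault. As an illustration only, the following Python sketch encodes Table 13 that way. The names INPUT_POWER_SOURCE_ACTIONS and next_action are hypothetical; they are not part of RAIDar or the CLI.

```python
# Hypothetical encoding of Table 13 (power supply "Input Power Source" LED).
# Key: whether the LED is lit. Value: the ordered actions from the table.
INPUT_POWER_SOURCE_ACTIONS = {
    True: ["No action required."],
    False: [
        "Verify that the power cord is properly connected, and check the power source.",
        "Check that the power supply FRU is firmly locked into position.",
        "Check the event log for specific information regarding the fault.",
        "Isolate the fault and contact an authorized service provider.",
    ],
}


def next_action(led_lit: bool, failed_attempts: int) -> str:
    """Return the action to try after `failed_attempts` earlier actions did not help."""
    actions = INPUT_POWER_SOURCE_ACTIONS[led_lit]
    # Once the list is exhausted, stay on the final (escalation) action.
    return actions[min(failed_attempts, len(actions) - 1)]


print(next_action(False, 0))
```

The same structure could hold any of the other tables by swapping in their action lists.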
Isolating a host-side connection fault
During normal operation, when a controller module host port is connected to a data host, the port’s host link status
LED and host link activity LED are green. If there is I/O activity, the host activity LED blinks green. If data hosts
are having trouble accessing the storage system, and you cannot locate a specific fault or cannot access the event
logs, use the following procedure.
This procedure requires scheduled downtime.
IMPORTANT: Do not perform more than one step at a time. Changing more than one variable at a time can
complicate the troubleshooting process.
Host-side connection troubleshooting featuring FC host ports
The procedure below pertains to NEXIO Farad 2300 Series controller enclosures employing SFP transceiver
connectors in 2/4/8 Gbit FC host interface ports.
1. Check the host activity LED.
2. Inspect the cable for damage.
3. Reseat the SFP and host cable.
Is the host link status LED on?
•
Yes – Monitor the status to ensure that there is no intermittent error present. If the fault occurs again, clean
the connections to ensure that a dirty connector is not interfering with the data path.
•
No – Proceed to the next step.
4. Move the SFP and host cable to a port with a known good link status.
This step isolates the problem to the external data path (SFP, host cable, and host-side devices) or to the
controller module port.
Is the host link status LED on?
•
Yes – You now know that the SFP, host cable, and host-side devices are functioning properly. Return the
SFP and cable to the original port. If the link status LED remains off, you have isolated the fault to the
controller module’s port. Replace the controller module.
•
No – Proceed to the next step.
5. Swap the SFP with the known good one.
Is the host link status LED on?
•
Yes – You have isolated the fault to the SFP. Replace the SFP.
•
No – Proceed to the next step.
6. Re-insert the original SFP and swap the cable with a known good one.
Is the host link status LED on?
•
Yes – You have isolated the fault to the cable. Replace the cable.
•
No – Proceed to the next step.
7. Verify that the switch, if any, is operating properly. If possible, test with another port.
8. Verify that the HBA is fully seated, and that the PCI slot is powered on and operational.
9. Replace the HBA with a known good HBA, or move the host side cable and SFP to a known good HBA.
Is the host link status LED on?
•
Yes – You have isolated the fault to the HBA. Replace the HBA.
•
No – It is likely that the controller module needs to be replaced.
10. Move the cable and SFP back to their original port.
Is the host link status LED on?
•
No – The controller module’s port has failed. Replace the controller module.
•
Yes – Monitor the connection for a period of time. It may be an intermittent problem, which can occur with
SFPs, damaged cables, and HBAs.
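The decision logic in steps 3 through 9 can be summarized as a short function. This is a simplified sketch, not a diagnostic tool: isolate_host_side_fault and its step labels are hypothetical, the manual checks in steps 7 and 8 are omitted, and the outcome when the link returns on the original port in step 4 is an assumption, since the procedure does not state it.

```python
from typing import Callable


def isolate_host_side_fault(link_led_on: Callable[[str], bool]) -> str:
    """Walk the checks in order; link_led_on(step) reports the host link status LED after a step."""
    if link_led_on("reseat SFP and cable"):
        return "monitor for intermittent error; clean connections if it recurs"
    if link_led_on("move SFP and cable to known good port"):
        # The external data path is good; check the original port again.
        if not link_led_on("return SFP and cable to original port"):
            return "replace controller module"
        return "monitor for intermittent error"  # assumption: not stated in the procedure
    if link_led_on("swap SFP with known good SFP"):
        return "replace SFP"
    if link_led_on("restore original SFP, swap cable with known good cable"):
        return "replace cable"
    if link_led_on("replace HBA with known good HBA"):
        return "replace HBA"
    return "controller module likely needs replacement"


# Example: the link comes up only after the SFP is swapped.
print(isolate_host_side_fault(lambda step: step == "swap SFP with known good SFP"))
```

Because each branch returns as soon as one change restores the link, the sketch mirrors the rule of changing only one variable at a time.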
Isolating a controller module expansion port connection fault
During normal operation, when a controller module’s expansion port is connected to a drive enclosure, the
expansion port status LED is green. If the connected port’s expansion port LED is off, the link is down. Use the
following procedure to isolate the fault.
NOTE: Do not perform more than one step at a time. Changing more than one variable at a time can complicate
the troubleshooting process.
1. Reseat the expansion cable, and inspect it for damage.
Is the expansion port status LED on?
•
Yes – Monitor the status to ensure there is no intermittent error present. If the fault occurs again, clean the
connections to ensure that a dirty connector is not interfering with the data path.
•
No – Proceed to the next step.
2. Move the expansion cable to a port on the controller enclosure with a known good link status.
This step isolates the problem to the expansion cable or to the controller module’s expansion port.
Is the expansion port status LED on?
•
Yes – You now know that the expansion cable is good. Return the cable to the original port. If the
expansion port status LED remains off, you have isolated the fault to the controller module’s expansion
port. Replace the controller module.
•
No – Proceed to the next step.
3. Move the expansion cable back to the original port on the controller enclosure.
4. Move the expansion cable on the drive enclosure to a known good expansion port on the drive enclosure.
Is the expansion port status LED on?
•
Yes – You have isolated the problem to the drive enclosure’s port. Replace the expansion module.
•
No – Proceed to the next step.
5. Replace the cable with a known good cable, ensuring the cable is attached to the original ports used by the
previous cable.
Is the expansion port status LED on?
•
Yes – Replace the original cable. The fault has been isolated.
•
No – It is likely that the controller module must be replaced.
6 Troubleshooting and replacing FRUs
This chapter provides procedures for replacing FRUs, including precautions, removal instructions, installation
instructions, and verification of successful installation. Each procedure addresses a specific task. Certain
procedures refer to related documentation. See Available FRUs for figures and lists of FRUs.
ESD
Before you begin any of the procedures, consider the following precautions and preventive measures.
Preventing ESD
To prevent ESD from damaging the system, be aware of the precautions to consider when setting up the system or
handling parts. A discharge of static electricity from a finger or other conductor may damage system boards or
other static-sensitive devices. This type of damage may reduce the life expectancy of the device.
CAUTION:
Parts can be damaged by ESD. Follow these precautions:
•
Avoid hand contact by transporting and storing products in static-safe containers.
•
Keep electrostatic-sensitive parts in their containers until they arrive at static-protected workstations.
•
Place parts in a static-protected area before removing them from their containers.
•
Avoid touching pins, leads, or circuitry.
•
Always be properly grounded when touching a static-sensitive component or assembly.
•
Remove clutter (plastic, vinyl, foam) from the static-protected workstation.
Grounding methods to prevent ESD
Several methods are used for grounding. Adhere to the following precautions when handling or installing
electrostatic-sensitive parts.
CAUTION:
Parts can be damaged by ESD. Use proper anti-static protection:
•
Keep the replacement FRU in the ESD bag until needed; and when removing a FRU from the enclosure,
immediately place it in the ESD bag and anti-static packaging.
•
Wear an ESD wrist strap connected by a ground cord to a grounded workstation or unpainted surface of the
computer chassis. Wrist straps are flexible straps with a minimum of 1 megohm (± 10 percent) resistance in
the ground cords. To provide proper ground, wear the strap snug against the skin.
•
If an ESD wrist strap is unavailable, touch an unpainted surface of the chassis before handling the component.
•
Use heel straps, toe straps, or boot straps at standing workstations. Wear the straps on both feet when standing
on conductive floors or dissipating floor mats.
•
Use conductive field service tools.
•
Use a portable field service kit with a folding static-dissipating work mat.
If you do not have any of the suggested equipment for proper grounding, have an authorized reseller install the
part. For more information on static electricity or assistance with product installation, contact an authorized
reseller.
Replacing chassis FRU components
Chassis FRUs replace a damaged chassis or chassis components. A fully functional chassis requires successful
installation of the following components:
•
One or two controller modules of the same model (for a given controller enclosure)
See Replacing a controller module or expansion module on page 74 for more information.
•
All disk drives and air management modules
See Replacing a disk drive module on page 80 for more information.
•
Two PSUs of the same type (AC or DC)
See Replacing a power supply module on page 83 for more information.
•
One or two expansion modules of the same model (per optional drive enclosure)*
See Replacing a controller enclosure chassis on page 89 for more information.
In addition to the FRUs identified above, replacement procedures are provided to address specific interface
protocols and replacement of the enclosure chassis:
•
Removal and installation of an FC transceiver
See Replacing an FC transceiver on page 87 for more information.
•
Removal and installation of a controller enclosure chassis
See Replacing a controller enclosure chassis on page 89 for more information.
Replacement of chassis FRU components is described within this chapter.
NOTE: NEXIO Farad 2300 Series controller enclosures support hot-plug replacement of redundant controller
modules, fans, power supplies, and IOMs. Hot-add of drive enclosures is also supported.
TIP: Many procedures refer to component LEDs and LED status. See System LEDs for descriptions of
model-specific front panel and rear panel LEDs.
Replacing a controller module or expansion module
Controller and expansion modules are hot-swappable, which means you can replace one module without halting
I/O to vdisks, or powering off the enclosure. In this case, the second module takes over operation of the storage
system until you install the new module.
CAUTION: Removing and inserting a controller module can cause up to 30 seconds of disruption to I/O to
the affected stack. Remove/replace with caution and only with the advice and consent of Customer Support.
IMPORTANT: When swapping controllers in the same enclosure, special precautions must be taken to ensure
the vdisks do not enter quarantine status. See Swapping controllers in the same enclosure on page 79.
Before you begin
Removing a controller or expansion module from an operational enclosure significantly changes air flow within
the enclosure. Openings must be populated. If you are replacing both controllers, use RAIDar to record
configuration settings before installing the new controller modules. See Removing a controller module on
page 77, and Installing a controller module or expansion module on page 78 for instructions on installing an
additional controller module.
CAUTION: When replacing a controller module, ensure that less than 10 seconds elapse between inserting it
into a slot and fully latching it in place. Failing to do so might cause the controller to fail. If it is not latched within
10 seconds, remove the controller module from the slot, and repeat the process.
Configuring PFU
If PFU is enabled, when you update firmware on one controller, the system automatically updates the partner
controller. Change the PFU settings (enable or disable PFU) only with the advice and consent of Customer
Support.
Use RAIDar or the CLI to change the PFU setting.
Using RAIDar
1. Sign in to RAIDar using default user manage and password !manage.
If the default user or password — or both — have been changed for security reasons, enter the secure login
credentials instead of the system defaults shown above.
2. In the Configuration View panel, right-click the system and select Configuration > Advanced Settings >
Firmware.
3. Select (enable) the Partner Firmware Upgrade option.
4. Click Apply.
Using the CLI
1. Log in to the CLI using default user manage and password !manage.
If the default user or password — or both — have been changed for security reasons, enter the secure login
credentials instead of the system defaults shown above.
2. To verify that PFU is enabled, run the following command:
show advanced-settings
3. If PFU is disabled, enable it by running the following command:
set advanced-settings partner-firmware-upgrade enabled
Verifying component failure
Select from the following methods to verify component failure:
•
Use RAIDar to check the health icons/values of the system and its components to either ensure that everything
is okay, or to drill down to a problem component. RAIDar uses health icons to show OK, Degraded, Fault, or
Unknown status for the system and its components. If you discover a problem component, follow the actions
in its Health Recommendations field to resolve the problem.
•
As an alternative to using RAIDar, you can run the show system command in the CLI to view the health
of the system and its components. If any component has a problem, the system health will be Degraded, Fault,
or Unknown. If you discover a problem component, follow the actions in its Health Recommendations field to
resolve the problem.
•
Monitor event notification — With event notification configured and enabled, use RAIDar to view the event
log, or use the CLI to run the show events detail command to see details for events.
•
Check Fault/Service Required LED (back of enclosure): Amber = Fault condition
•
Check that the FRU OK LED (back of enclosure) is off
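As a rough illustration of the health check described above, the sketch below filters a set of component health values down to the components that need attention. The four health values come from this section; the function name problem_components and the component names in the example are hypothetical, and the sketch does not read anything from RAIDar or the CLI.

```python
# Health values reported for the system and its components.
HEALTH_VALUES = ("OK", "Degraded", "Fault", "Unknown")


def problem_components(component_health: dict[str, str]) -> list[str]:
    """Return the names of components whose health is anything other than OK."""
    for name, health in component_health.items():
        if health not in HEALTH_VALUES:
            raise ValueError(f"unexpected health value for {name}: {health}")
    return sorted(name for name, health in component_health.items() if health != "OK")


# Hypothetical readings, entered by hand from RAIDar or the show system output.
readings = {"controller A": "OK", "disk 1.4": "Fault", "PSU 2": "Degraded"}
print(problem_components(readings))
```

For each component returned, follow the actions in its Health Recommendations field.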
Shutting down a controller module
Shutting down the SC in a controller module ensures that a proper failover sequence is used, which includes
stopping all I/O operations and writing any data in write cache to disk. If the SC in both controller modules is shut
down, hosts cannot access the system’s data. If possible, perform a shutdown before removing a controller
module or powering down the system.
CAUTION: You can continue to use the CLI when either or both SCs are shut down, but information shown
might be invalid.
Use RAIDar or the CLI to perform a shutdown.
Using RAIDar
1. Sign in to RAIDar using default user manage and password !manage.
If the default user or password — or both — have been changed for security reasons, enter the secure login
credentials instead of the system defaults shown above.
2. In the Configuration View panel, right-click the system and select Tools > Shut Down or Restart Controller.
3. In the main panel, set the options:
•
In the Operation field, select Shut down.
•
In the Controller type field, select Storage.
•
In the Controller field, select whether to shut down the processor in controller A, B, or both.
4. Click Shut down now. A confirmation dialog appears.
5. Click Yes to continue; otherwise click No. If you clicked Yes, a second confirmation dialog appears.
6. Click Yes to continue; otherwise click No. If you clicked Yes, a message describes shutdown activity.
Using the CLI
1. Log in to the CLI using default user manage and password !manage.
If the default user or password — or both — have been changed for security reasons, enter the secure login
credentials instead of the system defaults shown above.
2. Verify that the partner controller is online by running the command:
show redundancy-mode
3. Shut down the failed controller — A or B — by running the command:
shutdown a
or
shutdown b
The blue OK to Remove LED (back of enclosure) illuminates to indicate that the controller module can be
safely removed.
4. Illuminate the ID LED of the enclosure that contains the controller module to remove by running the
command:
set led enclosure 0 on
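For a written checklist or change record, the CLI steps above can be assembled in order. The sketch below only builds the command strings shown in this section; it does not connect to or run anything on the storage system. The function name shutdown_commands is hypothetical, and passing an enclosure ID other than the 0 shown above is an assumption.

```python
def shutdown_commands(controller: str, enclosure_id: int = 0) -> list[str]:
    """Return the CLI commands from the steps above, in order, for controller 'a' or 'b'."""
    controller = controller.lower()
    if controller not in ("a", "b"):
        raise ValueError("controller must be 'a' or 'b'")
    return [
        "show redundancy-mode",                   # confirm the partner controller is online
        f"shutdown {controller}",                 # shut down the failed controller
        f"set led enclosure {enclosure_id} on",   # illuminate the enclosure ID LED
    ]


for command in shutdown_commands("b"):
    print(command)
```

Run each command only after confirming the result of the previous one, as the numbered steps describe.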
Removing a controller module or expansion module
IMPORTANT:
•
You may hot-replace a controller module in an operational enclosure. If possible, you should first shut
down the faulty controller using either RAIDar or the CLI.
See ESD on page 73.
NOTE: Within these procedures, illustrations featuring controller module face plates are generic. They do not
show host ports, and they pertain to all NEXIO Farad 2300 Series controller module models.
1. Locate the enclosure whose Unit Locator LED (front right ear) is illuminated, and within the enclosure, locate
the controller module whose OK to Remove LED is blue (rear panel).
2. Disconnect any cables connected to the controller.
Label each cable to facilitate re-connection.
3. Turn the thumbscrews counterclockwise until they disengage from the controller (see Figure 1 on page 77).
Figure 1 Disengaging a controller module
4. Press both latches downward to disconnect the controller module from the midplane (see Figure 2).
Figure 2 Extracting a controller module
5. Pull the controller module straight out of the enclosure (see Figure 3).
Figure 3 Removing a controller module
Installing a controller module or expansion module
TIP: You can install a controller module into an enclosure that is powered on, provided you wait 60 seconds
after removing the old controller module. Check controller and midplane power connectors before inserting the
new controller module into the enclosure.
See ESD on page 73.
1. Loosen the thumbscrews; press the latches downward (see Figure 4).
2. Slide the controller module into the enclosure as far as it will go (1).
A controller module that is only partially seated will prevent optimal performance of the controller enclosure.
Verify that the controller module is fully seated before continuing.
3. Press the latches upward to engage the controller module (2); turn the thumbscrews clockwise until
finger-tight.
4. Reconnect the cables.
NOTE:
See the NEXIO Farad 2300 Series Setup Guide for cabling information.
Figure 4 Inserting a controller module
IMPORTANT: If PFU is enabled, when you update firmware on one controller, the system automatically
updates the partner controller.
Verifying component operation
After replacing the controller module, verify that the FRU OK LED (rear panel) illuminates green, indicating that
the controller has completed initializing, and is online/operating normally. It may take two to five minutes for the
replacement controller to become ready. If you are replacing either controller module, and PFU is enabled, you
may need to wait 30 minutes to ensure that the two controllers — with their respective ownership of the vdisks —
have enough time to fully stabilize.
Use RAIDar or the CLI to perform a restart only with the advice and consent of Customer Support.
Using RAIDar
1. Sign in to RAIDar using default user manage and password !manage.
If the default user or password — or both — have been changed for security reasons, enter the secure login
credentials instead of the system defaults shown above.
2. In the Configuration View panel, right-click the system and select Tools > Shut Down or Restart Controller.
3. In the main panel, set the options:
•
In the Operation field, select Restart.
•
In the Controller type field, select Storage.
•
In the Controller field, select whether to restart the processor in controller A, B, or both.
4. Click Restart now. A confirmation dialog appears.
5. Click Yes to continue; otherwise click No. If you clicked Yes, a second confirmation dialog appears.
6. Click Yes to continue; otherwise click No. If you clicked Yes, a message describes restart activity.
Using the CLI
If the enclosure’s Unit Locator LED is on, run the following command to turn it off:
set led enclosure 0 off
If the Fault/Service Required LED is amber, the controller module has not gone online, and likely failed its
self-test. Check the event log for errors, and try to bring the module online by restarting the controller. To
restart the controller (A or B), run the following command:
restart sc a
or
restart sc b
Swapping controllers in the same enclosure
When swapping controllers in the same enclosure, special precautions must be taken to ensure the vdisks do not
enter quarantine offline status. See Quarantined vdisks on page 79 for more information.
1. Check for unwritable cache data.
•
If unwritable cache data is found, and the unwritable cache data is for a volume that has been deleted, use
the CLI command clear cache to clear the unwritable cache data from the system. See clear cache on
page 30 for information on using this command.
•
If the unwritable cache data is for a volume that is offline for other reasons, do not clear the unwritable
cache data and do not proceed with the controller swap. Determine the cause of the
unwritable cache data and resolve the issue before performing the controller swap.
2. Perform a shutdown of both controllers. See Shutting down a controller module on page 75.
Following these steps ensures that the cache is clean and data integrity is maintained.
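The pre-swap check above reduces to three outcomes. The following sketch expresses that decision as a function; controller_swap_decision and its parameters are hypothetical names, and the inputs are assumed to be determined by the operator rather than read from the system.

```python
def controller_swap_decision(unwritable_cache: bool, volume_deleted: bool = False) -> str:
    """Map the unwritable-cache check to the action described in the steps above."""
    if not unwritable_cache:
        return "shut down both controllers, then swap"
    if volume_deleted:
        # Cache data for a deleted volume can be cleared before the swap.
        return "run clear cache, then shut down both controllers and swap"
    # Cache data for a volume that is offline for other reasons must be kept.
    return "do not swap; resolve the cause of the unwritable cache data first"


print(controller_swap_decision(unwritable_cache=True, volume_deleted=False))
```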
Quarantined vdisks
When unwritable cache data is present or if a clean shutdown of both controllers is not performed, vdisks will be
quarantined offline (QTOF) and event 485 will be logged for each affected vdisk. Data integrity is not guaranteed.
The vdisks can be dequarantined by shutting down and rebooting each controller separately. This ensures a proper
failover sequence is followed. See Shutting down a controller module on page 75 for instructions.
Updating firmware
You can view the current versions of firmware in controller modules, expansion modules (in drive enclosures),
and disks, and you can also install new firmware versions. Firmware should only be updated with the advice and
consent of Customer Support.
TIP: To ensure success of an online update, select a period of low I/O activity. This helps the update complete as
quickly as possible and avoids disruptions to hosts and applications due to timeouts. Attempting to update a
storage system that is processing a large, I/O-intensive batch job will likely cause hosts to lose connectivity with
the storage system.
A controller enclosure can contain one or two controller modules. Both controllers should run the same firmware
version. You can update the firmware in each controller module by loading a firmware file obtained from the
enclosure vendor.
If the PFU option is enabled, when you update one controller, the system automatically updates the partner
controller. If PFU is disabled, after updating firmware on one controller, you must log into the partner controller’s
IP address and perform this firmware update on that controller also.
NOTE: If a vdisk is quarantined, firmware update is not permitted due to the risk of losing unwritten data that
remains in cache for the vdisk volumes. Before you can update firmware, you must resolve the problem that is
causing the vdisk to be quarantined, as described in the “Removing a vdisk from quarantine” topic in the NEXIO
Farad 2300 Series RAIDar User Guide or online help.
For best results, the storage system should be in a healthy state before starting firmware update.
You can update firmware using RAIDar or using FTP. See the NEXIO Farad 2300 Series RAIDar User Guide for
more information.
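The interaction between PFU and the quarantine restriction can be sketched as a planning helper. This is an illustration under stated assumptions: firmware_update_plan is a hypothetical name, the controller labels are supplied by the caller, and the function performs no update itself.

```python
def firmware_update_plan(pfu_enabled: bool, controllers: list[str],
                         any_vdisk_quarantined: bool) -> list[str]:
    """Return the controllers that must be updated manually, per the rules above."""
    if any_vdisk_quarantined:
        # Firmware update is not permitted while a vdisk is quarantined.
        raise RuntimeError("resolve quarantined vdisks before updating firmware")
    if pfu_enabled:
        # Updating one controller is enough; the partner is updated automatically.
        return controllers[:1]
    # With PFU disabled, each controller is updated through its own IP address.
    return list(controllers)


print(firmware_update_plan(pfu_enabled=False, controllers=["A", "B"], any_vdisk_quarantined=False))
```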
Replacing a disk drive module
A disk drive module consists of a disk in a sled. Disk drive modules are hot-swappable, which means they can be
replaced without halting I/O to the vdisks, or powering off the enclosure. The new disk drive module must be of
the same type, and possess capacity equal to or greater than the smallest disk in the system. Otherwise, the storage
system cannot use the new disk to reconstruct the vdisk (see “About vdisks” and “About disk failure and vdisk
reconstruction” topics in the NEXIO Farad 2300 Series RAIDar User Guide).
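The replacement rule above can be checked before ordering or installing a disk. The sketch below is one reading of that rule: it assumes "same type" means the same type as the disk being replaced. The Disk class, the can_replace function, and the type labels and capacities in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    disk_type: str    # illustrative label, for example "SAS"
    capacity_gb: int


def can_replace(new: Disk, failed: Disk, system_disks: list[Disk]) -> bool:
    """True if `new` matches the failed disk's type and is at least as large as the smallest disk."""
    smallest = min(disk.capacity_gb for disk in system_disks)
    return new.disk_type == failed.disk_type and new.capacity_gb >= smallest


disks = [Disk("SAS", 600), Disk("SAS", 900)]
print(can_replace(Disk("SAS", 600), failed=disks[1], system_disks=disks))
```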
Air management modules (not supported)
An air management module looks like a disk drive module; however, it is an empty box — also known as a blank
— used to maintain optimum air flow for proper cooling within an enclosure. Air management modules are
installed in slots missing disk drive modules. If you must remove a disk drive module, but cannot immediately
replace it, you must either leave the faulty module in place, or insert an air management module in its place.
The blank is installed using the same procedure as Installing a disk drive module on page 81. Similarly, the blank is
removed using the same procedures as Removing a disk drive module on page 81.
Before you begin
CAUTION: Removing either a disk drive module or blank impacts the airflow and cooling ability of the
enclosure. If the internal temperature exceeds acceptable limits, the enclosure may overheat, and automatically
shut down or restart. To avoid potential overheating, wait 20 seconds to allow the internal disks to stop spinning,
then insert the new disk drive module or blank.
See ESD on page 73.
Verifying component failure
Before replacing a disk, perform the following steps to ensure that you have correctly identified the module
requiring removal and replacement.
CAUTION: Failure to identify the correct disk drive module could result in data loss if the wrong disk is
removed from the enclosure.
When a disk drive fault occurs, the failed disk’s fault indicator LED, located on the enclosure’s front panel,
illuminates solid amber (see System LEDs for a description of LEDs and disk drive slot numbering for your
enclosure). You can determine from visual inspection which disk in the enclosure is experiencing a fault/failure
using the fault LED for your disk type. If necessary, use the set led command in the CLI to illuminate the disk
locator LED.
Removing a disk drive module
1. Slide the release latch to the left to disengage the disk drive module (see Figure 5 on page 81).
Moving the latch to the left produces a clicking sound and causes the spring to shift inside
the chassis, partially ejecting the disk from its installed position within the disk drive slot.
The enclosure bezel is removed in the
illustration above to show disks.
Figure 5 Disengaging a disk drive module or blank
2. Wait 20 seconds for the internal disks to stop spinning.
Eject and extract LFF disk or blank
The enclosure bezel is removed to show disks
Figure 6 Removing a disk drive module or blank
Installing a disk drive module
1. Follow one of the two sub-steps below, according to your product disk drive type:
a. SFF disk: Squeeze the latch release flanges together, and then pull the latch, rotating it upward until it is
fully open (see Figure 5 on page 81).
b. LFF disk: No action required.
Proceed to step 2 below.
2. Follow the sub-steps below, according to your product disk drive type:
a. SFF disk: With the LEDs oriented to the bottom, slide the disk drive module into the drive slot as far as
it will go (see upper left illustration in Figure 7 on page 82).
b. LFF disk: With the LEDs oriented to the left, slide the disk drive module into the drive slot as far as it
will go (see illustration in Figure 7 on page 82).
Insert LFF disk or blank
The enclosure bezel is removed in the
illustration at left to show disks.
Figure 7 Installing a disk drive module or blank
c. Verify that you have inserted the disk drive module into the slot as far as it will go, to ensure that the
module is firmly seated in the enclosure midplane.
The installed disk drive module should now appear as shown in Figure 5 on page 81.
NOTE: Allow at least 30 seconds to elapse once you have completed both the “Removing a disk drive
module” and the “Installing a disk drive module” procedures.
If using RAIDar, execute steps 3 and 4. If using the CLI, execute steps 5 and 6. In either case, finish with
step 7 to complete this procedure.
Using RAIDar:
3. Sign in to RAIDar (use default user manage and password !manage, or the appropriate username and
password if they have been changed).
4. View the System Overview panel to determine whether the health of the new disk is OK. If the health is OK,
then the disk drive module installation process is complete. If the health is not OK, then in the Configuration
View panel, select the enclosure that the new disk is in to display the Enclosure Overview panel, then select
the disk and view details about it, such as Status and Health Recommendations.
Using the CLI:
5. Log in to the CLI (use default user manage and password !manage, or the appropriate username and
password if they have been changed).
6. To view information about disks, run the following command:
show disks <disk-ID>
Disks are specified by enclosure ID and slot number. Enclosure IDs increment from 1. Disk IDs increment
from 1 in each enclosure (e.g., show disks 1.7). Entering the command shown above will display the disk
health. If health is not OK, the command output will also display recommended actions.
7. For the rebuild (also known as “reconstruction”) to begin, you must assign the new drive as a Global Spare.
a. If you are using RAIDar (the recommended method), go to Provisioning > Manage Global Spares and
check the box for the newly inserted drive. Then select “Modify Spares.”
b. If you are using the CLI to create the Global Spare, use the following command:
set spares disks X.Y, A.B-C, a.b-c
where X is the enclosure number and Y is the slot number of the new drive and A.B-C, a.b-c defines the
locations of the other global spares already in the system. The CLI command requires that you always
specify the entire list of global spares. For this reason it is recommended that RAIDar be used to set global
spares after a drive replacement.
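Because the CLI expects the complete list of global spares every time, it can help to assemble the command text before typing it. The following Python sketch is illustrative only: the helper names parse_disk_id and build_set_spares_command are hypothetical, the sketch handles single enclosure.slot IDs but not ranges such as A.B-C, and it reproduces the comma-and-space separator exactly as printed above. Confirm the exact syntax in the CLI Reference Guide before running the resulting command.

```python
import re

# Disk IDs are written as <enclosure>.<slot>, for example 1.7 (both count from 1).
_DISK_ID = re.compile(r"^(\d+)\.(\d+)$")


def parse_disk_id(disk_id):
    """Split an enclosure.slot string into (enclosure, slot) integers."""
    match = _DISK_ID.match(disk_id.strip())
    if match is None:
        raise ValueError(f"not an enclosure.slot disk ID: {disk_id!r}")
    enclosure, slot = int(match.group(1)), int(match.group(2))
    if enclosure < 1 or slot < 1:
        raise ValueError(f"enclosure and slot numbers start at 1: {disk_id!r}")
    return enclosure, slot


def build_set_spares_command(new_disk, existing_spares):
    """Return the command text listing the new disk plus every existing spare."""
    ordered = []
    for disk_id in [new_disk, *existing_spares]:
        normalized = "{}.{}".format(*parse_disk_id(disk_id))
        if normalized not in ordered:  # skip duplicates, keep first-seen order
            ordered.append(normalized)
    return "set spares disks " + ", ".join(ordered)


# Hypothetical example: new drive in 1.7, existing spares in 1.12 and 2.3.
print(build_set_spares_command("1.7", ["1.12", "2.3"]))
```

The sketch only builds text; it does not contact the storage system, so a mistyped ID is caught before anything is changed.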
Determine if a disk is missing
You can determine whether a disk is missing by using management interfaces.
Using RAIDar
1. Sign in to RAIDar using default user manage and password !manage.
If the default user or password — or both — have been changed for security reasons, enter the secure login
credentials instead of the system defaults shown above.
2. In the Configuration View panel, right-click on the appropriate enclosure under Physical.
• Select the Front Graphical tab to display a pictorial representation of disks within slots and the supporting
enclosure table showing properties and values.
• Select the Front Tabular tab to display the Enclosure’s Front View data table and supporting enclosure
table showing properties and values.
3. Using the graphical and tabular views, look for gaps in the disk location sequence to determine if a disk is
missing.
Using the CLI
1. Log in to the CLI using default user manage and password !manage.
If the default user or password — or both — have been changed for security reasons, enter the secure login
credentials instead of the system defaults shown above.
2. To determine location of a missing or faulty drive, run the following command:
show disks
The command outputs a listing of detected disks’ properties by location. Review the information, and look for
gaps in the disk location sequence to determine whether any disks are missing.
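Looking for gaps by eye is error-prone on a full enclosure, so the check can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name find_missing_slots and the slots_per_enclosure parameter are hypothetical, the input is a list of enclosure.slot IDs that you have copied from the show disks listing by hand (the sketch does not parse the command output), and slot numbers are assumed to start at 1 as described for disk IDs earlier in this chapter.

```python
def find_missing_slots(detected_ids, slots_per_enclosure):
    """Return enclosure.slot IDs that are absent from the detected list."""
    seen = {}
    for disk_id in detected_ids:
        enclosure, slot = (int(part) for part in disk_id.split("."))
        seen.setdefault(enclosure, set()).add(slot)

    missing = []
    for enclosure in sorted(seen):
        for slot in range(1, slots_per_enclosure + 1):
            if slot not in seen[enclosure]:
                missing.append(f"{enclosure}.{slot}")
    return missing


# Hypothetical listing for a 4-slot example: slot 3 of enclosure 1 is not reported.
print(find_missing_slots(["1.1", "1.2", "1.4"], slots_per_enclosure=4))
```

A reported gap is not always a missing disk: a slot that holds an air management module also produces no entry, so compare the result with the intended population of the enclosure.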
Verifying component operation
Check that the Power/Activity/Fault LED — (the bottom LED) located on the front face of the disk drive’s
escutcheon — is illuminated green.
TIP:
See the System LEDs for descriptions of disk drive LEDs and other front panel LEDs.
Also see Using management interfaces on page 92 as an alternative to physically observing LEDs to verify
component operation.
Replacing a power supply module
This section provides procedures for replacing a failed AC or DC power supply module, also referred to as a PSU.
CAUTION: Power supply FRU replacement activities can cause enclosure cables to disconnect and disks to go
offline.
When replacing a power supply FRU, you might accidentally disconnect cables, causing disks to go offline.
Ensure that all cables are securely fastened, and proceed with great caution as you replace the power supply FRU
within the controller enclosure. Be very careful if moving a cabled/operational enclosure during the FRU
replacement process.
A single PSU is sufficient to maintain operation of the enclosure. You do not need to halt operations and
completely power-off the enclosure when replacing only one PSU; however, a complete shutdown is required if
replacing both PSUs.
TIP:
Power supply faults and recommended actions on page 84 provides additional information.
Before you begin
CAUTION: Removing a PSU significantly disrupts the enclosure’s airflow. Do not remove the PSU until you
have received the replacement module.
See ESD on page 73.
Verifying component failure
When either a fan or power supply component fails, RAIDar provides notification; faults are recorded in the event
log; and the PSU’s status LED color changes to amber to indicate a fault condition.
Table 15 Power supply faults and recommended actions
Problem: Power supply fan warning or failure, or power supply warning or failure. Event code 168.
Recommended action:
• Verify that all fans are working using RAIDar. In the Configuration View, expand Physical, right-click
the enclosure and select View > Overview. Select either Rear Graphical or Rear Tabular to view health
attributes.
• Ensure that the power supply modules are properly seated and secured within their slots.
• Ensure that no slots are left open for more than 2 minutes. If you must replace the FRU, leave the old
module in place until the replacement arrives to maintain optimal airflow and avoid overheating.
Problem: Power supply module failure status, or voltage event notification. Event code 168.
Recommended action:
• Verify that the power supply module is powered on.
• Verify that the power cables are securely attached to the power supply module and the appropriate power
source.
• Replace the FRU if necessary.
Problem: AC Power Good LED is off.
Recommended action: Same as above.
Problem: DC Voltage/Fan Fault/Service Required LED is illuminated.
Recommended action: Replace the power supply module FRU.
Alternatively, you can observe power supply component health (PSUs, fans) using management interfaces to
verify component failure or component operation (see Using management interfaces on page 92 for more
information).
PSUs
IMPORTANT: AC PSUs do not have power switches. These PSUs power on when connected to a power source,
and power off when disconnected.
AC PSU
Enclosures configured with AC PSUs that do not have a power switch rely on the power cord for power cycling.
Connecting the cord from the PSU power cord connector to the appropriate power source facilitates power on;
whereas disconnecting the cord from the power source facilitates power off.
Power cord connect
Figure 8 AC PSU
Powering off the PSU
1. Disconnect the power cord’s male plug from the power source.
2. Disconnect the power cord’s female plug from the power cord connector on the PSU.
Removing a PSU
1. If replacing a single power supply module via hot-swap, proceed to step 3.
2. If replacing both power supply modules, verify that the enclosure is powered off.
3. Verify that the power cord is disconnected.
4. Turn the thumbscrew at the top of the latch counterclockwise to loosen and disengage it from the module;
however, do not remove the thumbscrew from the latch.
5. Rotate the latch downward by approximately 45° supplying leverage to disconnect the module from the
internal connector.
See Figure 9 on page 85.
Figure 9 Removing a PSU (callouts: controller or drive enclosure; PSU, switchless AC model; revolved latch;
thumbscrew; PSU, installed position)
6. Use the latch to pull the module straight out of the chassis.
7. If replacing two power supply modules, repeat step 3 through step 6.
CAUTION: Do not lift the module by its latch; doing so could damage the latch. Lift and carry the module using
its metal casing.
Installing a PSU
AC model without power switch
Figure 10 Orienting a PSU
To install a power supply module, perform the following steps:
1. Orient the PSU with the AC power cord connector toward the right, as shown in Figure 9 on page 85 and
Figure 10 on page 86.
2. With the latch in the open position, slide the module into the appropriate power supply slot as far as it will go.
3. Rotate the PSU latch upward until it is flush against the PSU face, ensuring that the connector on the PSU
engages the connector inside the chassis.
4. Turn the thumbscrew located at the top of the power supply latch clockwise, until it is finger-tight, to secure
the latch to the PSU within the enclosure.
5. If replacing two power supply modules, repeat step 1 through step 4.
Connecting a power cable
This section describes how to connect a power cable to an enclosure.
Connecting an AC power cord
The diagram pertains to AC PSU models. (Diagram callouts: power supply module; rack power source.)
1. Install the power cord:
a. Connect the female plug to the AC PSU cord inlet.
b. Connect the male plug to the rack power source.
Verify connection of the primary power cord(s) from the rack to separate external power sources.
2. Power on the newly installed PSU:
• Connecting the power cord powers on the AC PSU. Wait several seconds for the disks to spin up.
3. If replacing two power supply modules, repeat step 1 through step 2.
Verifying component operation
Examine PSU module status as indicated in the table below.
Table 16 PSU LED descriptions
LED No./Description | Color | State | Definition
1: Input Source Power Good | Green | On | Power is on and input voltage is normal.
1: Input Source Power Good | Green | Off | Power is off, or input voltage is below the minimum threshold.
2: Voltage/Fan Fault/Service Required | Amber | On | Output voltage is out of range, or a fan is operating below the minimum required r/min.
2: Voltage/Fan Fault/Service Required | Amber | Off | Output voltage is normal.
LEDs for a PSU are located in the top right corner of the module, as shown in Figure 10 on page 86.
The top LED corresponds to LED number (1) above, and the bottom LED corresponds to number (2) above. If the
Voltage/Fan Fault/Service Required LED is illuminated amber, the PSU module has not gone online, and likely
failed its self-test. Remove and reinstall the PSU module. In addition to viewing the PSU LEDs, verify that the
cooling fans are spinning. Also see Using management interfaces on page 92 as an alternative to physically
observing LEDs to verify component operation.
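The LED states in Table 16 reduce to a small decision rule, sketched below in Python for anyone recording LED observations in a checklist or script. The function name psu_status and the returned strings are hypothetical; the two inputs are simply what you see on the module (top LED lit or not, bottom LED lit or not). Checking the fault LED first is an assumption made for this sketch, on the reasoning that a lit fault LED calls for action regardless of the input LED.

```python
def psu_status(power_good_on, fault_on):
    """Interpret the two PSU LEDs using the definitions in Table 16."""
    if fault_on:
        # LED 2 (amber) lit: output voltage out of range or a fan below minimum r/min.
        return "fault: reseat the PSU module, then replace it if the LED stays lit"
    if not power_good_on:
        # LED 1 (green) off: no power, or input voltage below the minimum threshold.
        return "no input: check the power cord and the power source"
    return "ok"


# Observed states are passed as booleans: (top LED lit, bottom LED lit).
print(psu_status(power_good_on=True, fault_on=False))
```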
Replacing an FC transceiver
This section provides steps for replacing an SFP transceiver connector used in an FC controller host port. An
example SFP connector is shown below.
Figure 11 Sample SFP connector
Before you begin
CAUTION: Mishandling fibre-optic cables can degrade performance. Do not twist, fold, pinch, or step on
fibre-optic cables. Do not bend them tighter than a 2-inch radius.
See ESD on page 73.
CAUTION: To prevent potential loss of access to data, be sure to identify the correct cable and SFP connector
for subsequent removal.
Verifying component failure
Transceivers are part of a data path that includes multiple components, such as the transceiver, a cable, another
SFP, and an HBA. A reported fault can be caused by any component in the data path. To identify the location of
the fault, check the Link Status and Activity LEDs on the controller enclosure and server. Also, check the cable for
kinks, crimping or other possible damage.
TIP:
See System LEDs for descriptions of rear panel LEDs.
Removing an SFP module
Perform the following procedure to remove an SFP connector. When removing an FC SFP that has previously
limited the port speed — and replacing it with a higher-rated SFP — it is possible, though rare, that
auto-negotiation will not enable the higher port speed. Rebooting the array or the host resolves the problem.
1. Disconnect the fibre-optic interface cable by squeezing the end of the cable connector.
If the SFP does not have a cable, it should have a plug (retained from installation).
Fibre-optic cable attached to SFP
Fibre-optic cable disconnected
Figure 12 Disconnect fibre-optic interface cable from SFP
2. SFPs are commonly held in place by a small wire bail actuator. Flip the actuator up.
Flip actuator/revolve upwards
Figure 13 Flip SFP actuator upwards
3. Grasp the SFP between your thumb and index finger, and carefully remove it from the controller module.
Installing an SFP module
Perform the following procedure to install an SFP connector.
1. To connect to an empty port, slide the SFP connector into the port until it locks into place.
If the SFP has a plug, remove it before sliding the connector into the FC port. Retain the plug.
2. Flip the actuator down.
Flip actuator/revolve downwards
Figure 14 Flip SFP actuator downwards
3. Connect the fibre-optic interface cable into the duplex jack at the end of the SFP connector.
Connect fibre-optic cable to SFP
Fibre-optic cable attached to SFP
Figure 15 Connect fibre-optic interface cable to SFP
Verifying component operation
View the Link Status and Link Activity LEDs on the controller module face plate. A blinking LED indicates that
no link is detected. Also check the link status and link activity LEDs on the host.
Replacing a controller enclosure chassis
The controller enclosure chassis replacement procedure replaces a damaged chassis FRU, which consists of the
structural support metal, the exterior sheet metal housing, and the assembled/installed midplane. The procedure
includes removing all FRU components from a damaged chassis and installing them in a replacement chassis.
A fully functional replacement chassis requires the successful removal and installation of the following
components:
• All disk drive modules and air management modules
• Two PSUs
• Two controller modules (of the same model type)
• FC transceiver
This procedure makes extensive use of the FRU component procedures described elsewhere in Chapter 2. Perform
this procedure by following the step-by-step process described below.
Before you begin
CAUTION: Do not remove the enclosure until you have received the replacement enclosure.
See ESD on page 73.
1. Schedule down time that will allow for shutdown; sixty minutes of replacement work; and restart.
2. Verify the existence of a known/good backup of the system.
3. Record system settings for future use.
4. Label all cables.
5. Prepare a suitable work environment to accommodate chassis replacement.
Verifying component failure
The controller enclosure FRU includes the enclosure’s metal housing and the midplane that connects controller
modules, disk drive modules, and power supply modules. This FRU replaces an enclosure that has been damaged,
or whose midplane has been damaged.
A damaged midplane often makes it appear as though a controller module has failed. If you replace a controller
module, and it does not remedy the fault, you may need to replace the enclosure.
Alternatively, you can observe enclosure health (front panel and rear panel) using management interfaces to verify
enclosure/component failure or enclosure/component operation (see Using management interfaces on page 92 for
more information).
If necessary, use the set led command in the CLI to illuminate the enclosure locator LED.
Preparing to remove a damaged storage enclosure chassis
Because you are removing and replacing an entire controller enclosure, this procedure cannot use the hot-swap
capability that applies to replacing individual redundant FRUs in an operational controller enclosure, or the
hot-add of a drive enclosure to an operational storage system.
1. If you are using a non-mirrored setup, stop all I/O from hosts to the system. You can verify that all activity has
stopped by observing the LED activity on all the drives from the front of the enclosure. The LEDs blink
periodically when there is activity on the system.
2. Shut down the controller(s). See Shutting down a controller module on page 75.
3. Power off the system (controller enclosure first; drive enclosure(s) next). See PSUs on page 84, and refer to the
power cycling procedures pertaining to your enclosure’s power supply modules.
Table 17 Removing and replacing a controller enclosure chassis and its FRUs
To accomplish this sequential process | See the following procedures
1. Remove disk drive modules and air management modules from the damaged chassis. [1] | a. Air management modules (not supported) on page 80. b. Before you begin on page 80. c. Removing a disk drive module on page 81.
2. Remove the damaged storage enclosure chassis from the rack. | Removing a damaged storage enclosure chassis from the rack on page 91.
3. Remove the PSUs from the damaged chassis, and install them in the replacement chassis. | a. Before you begin on page 84. b. PSUs on page 84. c. Removing a PSU on page 85. d. Installing a PSU on page 86.
4. Remove each IOM from the damaged chassis, and install it in the replacement chassis. [2] | a. Before you begin on page 74. b. Removing a controller module or expansion module on page 76. c. Installing a controller module or expansion module on page 78.
5. Remove each FC transceiver from the damaged chassis, and install it in the replacement chassis (FC models only). [3] | a. Before you begin on page 87. b. Removing an SFP module on page 88. c. Installing an SFP module on page 88.
6. Install the replacement storage enclosure chassis in the rack. | Installing the replacement storage enclosure chassis in the rack on page 91.
7. Install disk drive modules and air management modules in the replacement chassis. [1] | Installing a disk drive module on page 81.
8. Complete the installation process. | a. Connecting a power cable on page 86. b. Completing the process on page 91.
9. Verify proper operation for all removed and installed FRU components. | a. Disks: Verifying component operation on page 83. b. Controller module(s): Verifying component operation on page 78. c. PSUs: Verifying component operation on page 87. d. SFPs (if applicable): Verifying component operation on page 89. e. SFP+ (if applicable): Verifying component operation on page 89. f. Verify PFU enabled (if applicable): Configuring PFU on page 75.
[1] Within the replacement enclosure, reinstall each disk drive into the same disk slot from which it was removed from the damaged enclosure.
[2] Within the replacement enclosure, the IOM(s) and IOM blank, if applicable, must be reinstalled into the same IOM slots from which they were extracted from the damaged enclosure.
Removing a damaged storage enclosure chassis from the rack
This section provides a procedure for removing a damaged controller enclosure chassis from its rack location.
CAUTION: It is recommended that all disk drive modules and air management modules be removed before
removing the enclosure. If this is not possible, two people are required to move the enclosure. See Removing a
disk drive module on page 81.
1. Be sure to follow the preparation steps above including labeling and removing all cables and recommended
components.
2. Remove the retaining screws that secure the front and rear of the controller enclosure chassis to the rack and
rails.
3. Carefully slide the controller enclosure chassis from the rack.
4. Place the chassis on a work surface near the replacement controller enclosure chassis, the removed disk drive
modules, ear bezel components, and screws.
5. Remove the side bracket from each side of the damaged controller enclosure chassis.
6. Attach the side bracket to each side of the replacement controller enclosure chassis.
Installing the replacement storage enclosure chassis in the rack
This section provides a procedure for installing the replacement controller enclosure chassis in its rack location.
CAUTION: It is recommended that all disk drive modules and air management modules be removed before
lifting the enclosure. If this is not possible, two people are required to move the enclosure. See Removing a disk
drive module on page 81.
NOTE: Refer to Rackmount Bracket Kit Installation or 2-Post Rackmount Bracket Kit Installation for the
correct installation procedure and mounting hardware.
1. Attach side brackets (standard rackmount installation) or main brackets (2-post rackmount installation) on the
replacement controller enclosure chassis.
2. Support the bottom of the controller enclosure chassis. Carefully lift/align the chassis and slide it into the rack.
3. Using the appropriate mounting hardware, secure the controller enclosure chassis to the rack.
4. Using the applicable retaining screws, secure the front and rear of the controller enclosure chassis to the rack
and rails.
5. Reinstall all components previously removed (drives, PSUs, controllers, and so on).
Completing the process
This section provides a procedure for ensuring that the FRU components installed in the replacement controller
enclosure chassis function properly.
1. Reconnect data cables between devices, as needed, to return to the original cabling configuration:
• Between cascaded enclosures.
• Between the controller and peripheral or SAN devices.
• Between the controller enclosure and the host.
2. Reconnect power cables to the controller enclosure. See Connecting a power cable on page 86.
Verifying component operation
Restart system devices in the following sequence. Allow time for each device to complete its POST before
proceeding:
1. Disk drive enclosures
2. Controller enclosure
3. Host (if powered down for maintenance)
Using LEDs
View LEDs on the front and rear of the enclosure (see Troubleshooting using system LEDs).
Verify front panel LEDs:
• Verify that the Enclosure ID LED (located on the left ear) is illuminated green.
• Verify that the FRU OK and Temperature Fault LEDs are illuminated green, and that the Fault/Service
Required LED is off (all three LEDs are located on the right ear).
• Verify that the Power/Activity/Fault LED (bottom LED on front of disk) is illuminated green or blinking
green. (If your product model has an enclosure bezel, remove it to view disk LEDs.)
Verify rear panel LEDs:
• Verify that each power supply module’s Input Source Power Good LED (top LED on PSU) is illuminated
green.
• Verify that the FRU OK LED on each IOM face plate is illuminated green, indicating that the module has
completed initializing, and is online.
Using management interfaces
In addition to viewing LEDs as described above, you can use management interfaces to monitor the health status
of the system and its components, provided you have configured and provisioned the system (see “Getting
Started” within the NEXIO Farad 2300 Series RAIDar User Guide for more information).
Select from the following methods to verify component operation:
• Use RAIDar to check the health icons/values of the system and its components to either ensure that everything
is okay, or to drill down to a problem component. RAIDar uses health icons to show OK, Degraded, Fault, or
Unknown status for the system and its components. If you discover a problem component, follow the actions
in its Health Recommendations field to resolve the problem.
• As an alternative to using RAIDar, you can run the show system command in the CLI to view the health
of the system and its components. If any component has a problem, the system health will be Degraded, Fault,
or Unknown. If you discover a problem component, follow the actions in its Health Recommendations field to
resolve the problem.
• Monitor event notification: With event notification configured and enabled, you can view event logs to
monitor the health of the system and its components. If a message tells you to check whether an event has been
logged, or to view information about an event in the log, you can do so using either RAIDar or the CLI. Using
RAIDar, you would view the event log and then click on the event message to see detail about that event.
Using the CLI, you would run the show events detail command (with additional parameters to filter
the output) to see the detail for an event (see “Alphabetical list of commands > show events” within the
NEXIO Farad 2300 Series CLI Reference Guide for more information about command syntax and
parameters).
7
Voltage and temperature warnings
The storage system provides voltage and temperature warnings, which are generally input or environmental
conditions. Voltage warnings can occur if the input voltage is too low, or if a FRU is receiving too little or too
much power from the power supply module. Temperature warnings are generally the result of a fan failure, a FRU
being removed from an enclosure for a lengthy time period, or a high ambient temperature around an enclosure.
This chapter describes the steps to take to resolve voltage and temperature warnings and provides information
about the power supply, cooling fan, temperature and voltage sensor locations, and alarm conditions.
Resolving voltage and temperature warnings
1. Check that all of the fans are working by making sure the Voltage/Fan Fault/Service Required LED on each
power supply module is off, or by using RAIDar to check enclosure health status. In the Configuration View
panel, right click the enclosure and click View > Overview to view the health status of the enclosure and its
components.
See Determining storage-system status on page 15 for a description of health status icons and alternatives for
monitoring enclosure health.
2. Make sure that all modules are fully seated in their slots and that their latches are locked.
3. Make sure that no slots are left open for more than two minutes.
If you need to replace a module, leave the old module in place until you have the replacement or use a blank
module to fill the slot. Leaving a slot open negatively affects the airflow and can cause the enclosure to
overheat.
4. Try replacing each power supply one at a time.
5. Replace the controller modules one at a time.
Sensor locations
The storage system monitors conditions at different points within each enclosure to alert you to problems. Power,
cooling fan, temperature, and voltage sensors are located at key points in the enclosure. In each controller module
and expansion module, the EMP monitors the status of these sensors to perform SES functions.
The following sections describe each element and its sensors.
Power supply sensors
Each enclosure has two fully redundant power supplies with load-sharing capabilities. The power supply sensors
described in Table 18 monitor the voltage, current, temperature, and fans in each power supply. If the power
supply sensors report a voltage that is under or over the threshold, check the input voltage.
Table 18 Power supply sensor descriptions
Description | Event/Fault ID LED condition
Power supply 1 | Voltage, current, temperature, or fan fault
Power supply 2 | Voltage, current, temperature, or fan fault
Cooling fan sensors
Each power supply includes two fans. The normal range for fan speed is 4,000 to 6,000 r/min. When a fan speed
drops below 4,000 r/min, the EMP considers it a failure and posts an alarm in the storage system’s event log.
Table 19 lists the description, location, and alarm condition for each fan. If the fan speed remains under the 4,000
r/min threshold, the internal enclosure temperature may continue to rise. Replace the power supply reporting the
fault.
Table 19 Cooling fan sensor descriptions
Description | Location | Event/Fault ID LED condition
Fan 1 | Power supply 1 | < 4,000 r/min
Fan 2 | Power supply 1 | < 4,000 r/min
Fan 3 | Power supply 2 | < 4,000 r/min
Fan 4 | Power supply 2 | < 4,000 r/min
During a shutdown, the cooling fans do not shut off. This allows the enclosure to continue cooling.
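The fan rule above (a speed below 4,000 r/min counts as a failure, and the remedy is to replace the power supply that houses the fan) can be expressed as a short Python sketch. The names FAN_LOCATION, failed_fans, and power_supplies_to_replace are hypothetical, and the readings are example values typed in by hand; the sketch does not read sensors from the storage system.

```python
FAN_MIN_RPM = 4000  # below this the EMP treats the fan as failed

# Fan number -> power supply number, as listed in Table 19.
FAN_LOCATION = {1: 1, 2: 1, 3: 2, 4: 2}


def failed_fans(readings):
    """Return the fan numbers whose speed is below the minimum."""
    return sorted(fan for fan, rpm in readings.items() if rpm < FAN_MIN_RPM)


def power_supplies_to_replace(readings):
    """Return the power supplies that contain at least one failed fan."""
    return sorted({FAN_LOCATION[fan] for fan in failed_fans(readings)})


# Hypothetical readings in r/min: fan 3 is slow, fan 4 sits exactly on the limit.
readings = {1: 5200, 2: 5100, 3: 3900, 4: 4000}
print(failed_fans(readings), power_supplies_to_replace(readings))
```

A reading of exactly 4,000 r/min is treated as healthy here because the alarm condition is stated as "below 4,000 r/min".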
Temperature sensors
Extreme high and low temperatures can cause significant damage if they go unnoticed. Each controller module
has six temperature sensors. Of these, if the CPU or FPGA temperature reaches a shutdown value, the controller
module is automatically shut down. Each power supply has one temperature sensor.
When a temperature fault is reported, it must be remedied as quickly as possible to avoid system damage. This can
be done by warming or cooling the installation location.
Table 20 Controller module temperature sensor descriptions
Description | Normal operating range | Warning operating range | Critical operating range | Shutdown values
CPU temperature | 3°C–88°C | 0°C–3°C, 88°C–90°C | > 90°C | 0°C, 100°C
FPGA temperature | 3°C–97°C | 0°C–3°C, 97°C–100°C | None | 0°C, 105°C
Onboard temperature 1 | 0°C–70°C | None | None | None
Onboard temperature 2 | 0°C–70°C | None | None | None
Onboard temperature 3 (Capacitor temperature) | 0°C–70°C | None | None | None
CM temperature | 5°C–50°C | ≤ 5°C, ≥ 50°C | ≤ 0°C, ≥ 55°C | None
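As a worked example of reading Table 20, the Python sketch below classifies a CPU temperature against the CPU row only. The function name classify_cpu_temp is hypothetical. The table lists the shared endpoints (3°C and 88°C) in both the normal and warning ranges, so the handling of those exact values is an assumption of this sketch: they are treated as normal, while 0°C and 100°C are treated as shutdown values.

```python
def classify_cpu_temp(celsius):
    """Classify a CPU temperature using the CPU row of Table 20."""
    if celsius <= 0 or celsius >= 100:
        return "shutdown"   # shutdown values: 0°C and 100°C
    if celsius > 90:
        return "critical"   # critical range: above 90°C
    if celsius < 3 or celsius > 88:
        return "warning"    # warning ranges: 0°C to 3°C and 88°C to 90°C
    return "normal"         # normal range: 3°C to 88°C


# Hypothetical readings in degrees Celsius.
for reading in (45, 89, 95, 100, 1):
    print(reading, classify_cpu_temp(reading))
```

The same pattern extends to the other rows by swapping in their limits, although rows without a critical range or shutdown values need fewer branches.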
When a power supply sensor goes out of range, the Fault/ID LED illuminates amber and an event is logged to the
event log.
Table 21 Power supply temperature sensor descriptions
Description | Normal operating range
Power supply 1 temperature | –10°C–80°C
Power supply 2 temperature | –10°C–80°C
Power supply module voltage sensors
Power supply voltage sensors ensure that an enclosure’s power supply voltage is within normal ranges. There are
three voltage sensors per power supply.
Table 22 Voltage sensor descriptions
Sensor | Event/Fault LED condition
Power supply 1 voltage, 12V | < 11.00V or > 13.00V
Power supply 1 voltage, 5V | < 4.00V or > 6.00V
Power supply 1 voltage, 3.3V | < 3.00V or > 3.80V
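The limits in Table 22 amount to a range check per voltage rail. The Python sketch below shows that check for the three power supply 1 sensors; the names VOLTAGE_LIMITS and rails_out_of_range are hypothetical, and the readings are example values entered by hand. A value exactly on a limit is treated as in range, because the fault conditions are stated with strict less-than and greater-than signs.

```python
# Rail name -> (lowest normal voltage, highest normal voltage), from Table 22.
VOLTAGE_LIMITS = {
    "12V": (11.00, 13.00),
    "5V": (4.00, 6.00),
    "3.3V": (3.00, 3.80),
}


def rails_out_of_range(readings):
    """Return the rails whose measured voltage is outside the normal range."""
    faulty = []
    for rail, volts in readings.items():
        low, high = VOLTAGE_LIMITS[rail]
        if volts < low or volts > high:
            faulty.append(rail)
    return sorted(faulty)


# Hypothetical readings: the 5V rail has sagged below its lower limit.
print(rails_out_of_range({"12V": 12.1, "5V": 3.9, "3.3V": 3.3}))
```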
A
Event descriptions
Introduction
This appendix is for reference by storage administrators and technical support personnel to help troubleshoot
storage-system issues. It describes event messages that may be reported during system operation and specifies any
actions recommended in response to an event.
Events and event messages
When an event occurs in a storage system, an event message is recorded in the system’s event log and, depending
on the system’s event notification settings, may also be sent to users (using email) and host-based applications (via
SNMP or SMI-S).
Each event has a numeric code that identifies the type of event that occurred, and has one of the following
severities:
• Critical: A failure occurred that may cause a controller to shut down. Correct the problem immediately.
• Error: A failure occurred that may affect data integrity or system stability. Correct the problem as soon as possible.
• Warning: A problem occurred that may affect system stability but not data integrity. Evaluate the problem and correct it if necessary.
• Informational: A configuration or state change occurred, or a problem occurred that the system corrected. No immediate action is required. In this appendix, this severity is abbreviated as “Info.”
An event message may specify an associated error code or reason code, which provides additional detail for
technical support. Error codes and reason codes are outside the scope of this appendix.
Event format in this appendix
This appendix lists events by event code and severity, where the most severe form of an event is described first.
Events are listed in the following format.
Event code
Severity
Event description.
Recommended actions
• If the event indicates a problem, actions to take to resolve the problem.
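The listing order described above (by event code, with the most severe form of each event first) can be sketched in Python as follows. The class and function names are hypothetical and exist only for this illustration. Ranking Critical above Error, Warning, and Informational is an assumption based on the order in which the severities are introduced.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Assumed ranking: a higher value means a more severe event.
    INFO = 0
    WARNING = 1
    ERROR = 2
    CRITICAL = 3


@dataclass(frozen=True)
class EventEntry:
    code: int
    severity: Severity
    description: str


def appendix_order(entries):
    """Sort by event code, then by severity with the most severe first."""
    return sorted(entries, key=lambda e: (e.code, -e.severity))


entries = [
    EventEntry(21, Severity.INFO, "Vdisk verification succeeded."),
    EventEntry(21, Severity.ERROR, "Errors were found but not corrected."),
    EventEntry(3, Severity.ERROR, "The indicated vdisk went offline."),
]
for entry in appendix_order(entries):
    print(entry.code, entry.severity.name, entry.description)
```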
Resources for diagnosing and resolving problems
For further information about diagnosing and resolving problems, see:
• The troubleshooting chapter and the LED descriptions appendix in your product’s Setup Guide
• The topics about verifying component failure in your product’s FRU Installation and Replacement Guide
• For a summary of storage events and corresponding SMI-S indications, see Events sent as indications to SMI-S clients on page 140.
Event descriptions
1
Warning If the indicated vdisk is RAID 6, it is operating with degraded health due to the failure of two disks.
If the indicated vdisk is not RAID 6, it is operating with degraded health due to the failure of one disk. The vdisk
is online but cannot tolerate another disk failure.
If a dedicated or global spare of the proper type and size is present, that spare is used to automatically reconstruct
the vdisk; events 9 and 37 are logged to indicate this. If no usable spare disk is present, but an available disk of the
proper type and size is present and the dynamic spares feature is enabled, that disk is used to automatically
reconstruct the vdisk; event 37 is logged.
Recommended actions
• If no spare was present and the dynamic spares feature is disabled (that is, event 37 was not logged), configure an available disk as a dedicated spare for the vdisk or replace the failed disk and configure the new disk as a dedicated spare for the vdisk. That spare will be used to automatically reconstruct the vdisk; confirm this by checking that events 9 and 37 are logged.
• Otherwise, reconstruction automatically started and event 37 was logged. Replace the failed disk and configure the replacement as a dedicated or global spare for future use.
• If the replacement disk was previously used in another vdisk and has a status of leftover (LEFTOVR), its metadata must be cleared before the disk can be assigned as a spare. However, using a drive that was previously part of any vdisk as a replacement drive is not supported in NEXIO Farad Gen 2: if a replacement drive spins up and shows a LEFTOVR status, clearing the metadata will cause a temporary loss of I/O to the drives and may damage recordings or playback.
• Confirm that all failed disks have been replaced and that there are sufficient spare disks configured for future use.
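The spare-selection behavior described for event 1 can be summarized as a small decision function. The Python sketch below is only an illustration of that description: the function and parameter names are hypothetical, and "present" is taken to mean a disk of the proper type and size.

```python
def reconstruction_events(
    spare_present: bool,
    available_disk_present: bool,
    dynamic_spares_enabled: bool,
) -> list[int]:
    """Return the event codes expected after event 1, per the description."""
    # A dedicated or global spare is used first; events 9 and 37 are logged.
    if spare_present:
        return [9, 37]
    # Without a spare, an available disk is used only if dynamic spares are on.
    if available_disk_present and dynamic_spares_enabled:
        return [37]
    # No automatic reconstruction starts; see the recommended actions.
    return []


print(reconstruction_events(True, False, False))
print(reconstruction_events(False, True, True))
print(reconstruction_events(False, True, False))
```

An empty result corresponds to the first recommended action: configure a spare manually so that reconstruction can begin.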
3
Error
The indicated vdisk went offline.
One disk failed for RAID 0 or NRAID, three disks failed for RAID 6, or two disks failed for other RAID levels.
The vdisk cannot be reconstructed.
Recommended actions
• The CLI 'trust' command may be able to recover some or all of the data in the vdisk.
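The failure counts stated for event 3 can be written as a lookup. This Python sketch simply encodes those counts; the function name and the way RAID level strings are normalized are assumptions for illustration.

```python
def failed_disks_to_offline(raid_level: str) -> int:
    """Number of failed disks at which event 3 reports the vdisk offline."""
    # Normalize spellings such as "RAID 6", "RAID-6", and "raid6".
    level = raid_level.strip().upper().replace("-", "").replace(" ", "")
    if level in ("RAID0", "NRAID"):
        return 1
    if level == "RAID6":
        return 3
    # All other RAID levels, as stated in the event description.
    return 2


for name in ("RAID 0", "NRAID", "RAID 6", "RAID 5"):
    print(name, failed_disks_to_offline(name))
```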
4
Info.
The indicated disk had an uncorrectable error and the controller reassigned the indicated block.
Recommended actions
•
Monitor the error trend and whether the number of errors approaches the total number of bad-block
replacements available.
6
Warning A failure occurred during initialization of the indicated vdisk. This was probably caused by the failure of a disk
drive. The initialization may have completed but the vdisk is probably in a state of FTDN (fault tolerant with a
down disk), CRIT (critical), or OFFL (offline), depending on the RAID level and the number of disks that failed.
Recommended actions
• Look for another event logged at approximately the same time that indicates a disk failure, such as event 55, 58, or 412. Follow the recommended actions for that event.
Info.
Vdisk creation failed immediately. The user was given immediate feedback that it failed at the time they attempted
to create the vdisk.
Recommended actions
• No action is required.
7
Error
In a testing environment, a controller diagnostic failed and reports a product-specific diagnostic code.
Recommended actions
•
Perform failure analysis.
8
Warning The indicated disk in the indicated vdisk failed and the vdisk may have changed to a status of critical (CRIT) or
offline (OFFL). If a spare is present the controller automatically uses the spare to reconstruct the vdisk.
Subsequent events indicate the changes that happen to the vdisk.
When the problem is resolved, event 9 is logged.
Recommended actions
Table 23 Disk error conditions and recommended actions

Condition | Recommended action
Excessive media errors. | Obtain a replacement disk from Customer Support that has the same type (SAS SSD, enterprise SAS, or midline SAS) and the same or greater capacity and replace the disk.
Disk failure is imminent. | Obtain a replacement disk from Customer Support that has the same type (SAS SSD, enterprise SAS, or midline SAS) and the same or greater capacity and replace the disk.
The disk has a possible hardware failure. | Obtain a replacement disk from Customer Support that has the same type (SAS SSD, enterprise SAS, or midline SAS) and the same or greater capacity and replace the disk.
The disk is not supported. | Obtain a replacement disk from Customer Support that has the same type (SAS SSD, enterprise SAS, or midline SAS) and the same or greater capacity and replace the disk.
A user forced the disk out of the vdisk. | If the associated vdisk is offline or quarantined, contact technical support; otherwise, clear the disk’s metadata to reuse the disk.
A previously detected disk is no longer present. | Insert a replacement disk obtained from Customer Support of the same or greater capacity as the one that was in the slot. If the disk then has a status of leftover (LEFTOVR), its metadata would need to be cleared before the disk can be reused; however, clearing metadata on a LEFTOVR disk is not supported for Farad because it interrupts I/O to the affected stack. If the associated vdisk is offline or quarantined, contact Customer Support.
Unknown reason | If the associated vdisk is offline or quarantined, contact technical support; otherwise, clear the disk’s metadata to reuse the disk.
9
Info.
The indicated spare disk has been used in the indicated vdisk to bring it back to a fault-tolerant status.
Vdisk reconstruction starts automatically. This event indicates that a problem reported by event 8 is resolved.
Recommended actions
•
No action is required.
16
Info.
The indicated disk has been designated a global spare.
Recommended actions
•
No action is required.
18
Warning Vdisk reconstruction failed.
Recommended actions
• Determine whether the reconstruction failed due to a disk failure and whether replacing that disk will enable reconstruction to start and complete without further errors. To determine this, look for another event logged at approximately the same time that indicates a disk failure, such as event 55, 58, or 412. Follow the recommended actions for that event.
Info.
Vdisk reconstruction completed.
Recommended actions
• If the event message indicates that one or more uncorrectable media errors occurred during the reconstruction, some user data may have been lost. Use backup copies of the data or other means to restore any lost data.
• Otherwise, no action is required.
19
Info.
A rescan has completed.
Recommended actions
•
No action is required.
20
Info.
Storage Controller firmware update has completed.
Recommended actions
•
No action is required.
21
Error
Vdisk verification completed. Errors were found but not corrected.
Recommended actions
•
Perform a vdisk scrub to find and correct the errors.
Warning Vdisk verification did not complete because of an internally detected condition such as a failed disk.
If a disk fails, data may be at risk.
Recommended actions
• Resolve any non-disk hardware problems, such as a cooling problem or a faulty controller module, expansion module, or power supply.
• Check whether any disks in the vdisk have logged SMART events or unrecoverable read errors.
  • If so, and the vdisk is a non-fault-tolerant RAID level (RAID-0 or non-RAID), copy the data to a different vdisk and replace the faulty disks.
  • If so, and the vdisk is a fault-tolerant RAID level, replace the faulty disks. Before replacing a disk, confirm that a reconstruction is not currently running on the vdisk. It is also recommended to make a full backup of all the data in the vdisk before replacing disks. If more than one disk in the vdisk has errors, replace the disks one at a time and allow reconstruction to complete after each disk is replaced.
Info.
Vdisk verification failed immediately, was aborted by a user, succeeded, or parity or mirror errors were detected
and corrected.
Recommended actions
• No action is required.
23
Info.
Vdisk creation has started.
Recommended actions
•
No action is required.
25
Info.
The statistics for the indicated vdisk have been reset.
Recommended actions
•
No action is required.
27
Info.
Cache parameters have been changed for the indicated vdisk.
Recommended actions
•
No action is required.
28
Info.
Controller parameters have been changed.
This event is logged when general configuration changes are made; for example, utility priority, remote
notification settings, user interface passwords, and network port IP values. This event is not logged when changes
are made to vdisk or volume configuration.
Recommended actions
•
No action is required.
31
Info.
The indicated disk is no longer a global or dedicated spare.
Recommended actions
•
No action is required.
32
Info.
Vdisk verification has started.
Recommended actions
•
No action is required.
33
Info.
Controller time/date has been changed.
This event is logged before the change happens, so the timestamp of the event shows the old time. This event may
occur often if NTP is enabled.
Recommended actions
•
No action is required.
34
Info.
The controller configuration has been restored to factory defaults.
Recommended actions
•
For an FC controller, restart it to make the default loop ID take effect.
37
Info.
Vdisk reconstruction has started.
When complete, event 18 is logged.
Recommended actions
•
No action is required.
39
Warning The sensors monitored a temperature or voltage in the warning range.
Recommended actions
•
Check that the storage system’s fans are running.
•
Check that the ambient temperature is not too warm. The enclosure operating range is 5°C–40°C
(41°F–104°F).
•
Check for any obstructions to the airflow.
•
Check that there is a module or blank plate in every module slot in the enclosure.
•
If none of the above explanations apply, replace the controller module that reported the error.
When the problem is fixed, event 47 is logged.
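The ambient temperature check for event 39 can be illustrated with a short Python sketch that tests a reading against the stated enclosure operating range and converts it to Fahrenheit. The function names are hypothetical, and treating the range endpoints as in range is an assumption.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


def ambient_in_operating_range(celsius: float) -> bool:
    """Check a reading against the stated 5°C to 40°C enclosure range."""
    # Assumption: the endpoints 5°C and 40°C are treated as in range.
    return 5 <= celsius <= 40


for reading in (5, 40, 43):
    print(reading, celsius_to_fahrenheit(reading), ambient_in_operating_range(reading))
```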
40
Error
The sensors monitored a temperature or voltage in the failure range.
Recommended actions
•
Check that the storage system’s fans are running.
•
Check that the ambient temperature is not too warm. The enclosure operating range is 5°C–40°C
(41°F–104°F).
•
Check that there is a module or blank plate in every module slot in the enclosure.
•
If none of the above explanations apply, replace the controller module that reported the error.
When the problem is fixed, event 47 is logged.
41
Info.
A dedicated spare has been added.
Recommended actions
•
No action is required.
43
Info.
The indicated vdisk has been deleted.
Recommended actions
•
No action is required.
44
Warning The controller contains cache data for the indicated volume but the corresponding vdisk is not online.
Recommended actions
• Determine the reason why the disks comprising the vdisk are not online.
• If an enclosure is down, determine corrective action.
• If the vdisk is no longer needed, you can clear the orphan data; this will result in lost data.
• If the vdisk is missing and was not intentionally removed, see “Resources for diagnosing and resolving problems” in the WBI help for the event log panel, or the CLI help for the show events command.
47
Info.
An error detected by the sensors has been cleared. This event indicates that a problem reported by event 39 or 40
is resolved.
Recommended actions
•
No action is required.
48
Info.
The indicated vdisk has been renamed.
Recommended actions
•
No action is required.
49
Info.
A lengthy SCSI maintenance command has completed. (This typically occurs during disk firmware update.)
Recommended actions
•
No action is required.
50
Info.
A correctable ECC error occurred in buffer memory.
Recommended actions
•
No action is required.
51
Error
An uncorrectable ECC error occurred in buffer memory.
Recommended actions
•
If the event occurs more than once, replace the controller reporting the event.
52
Info.
Vdisk expansion has started.
This operation can take days, or weeks in some cases, to complete. Allow adequate time for the expansion to
complete.
When complete, event 53 is logged.
Recommended actions
•
No action is required.
53
Warning Too many errors occurred during vdisk expansion to allow the expansion to continue.
Recommended actions
• If the expansion failed because of a disk problem, replace the disk with a replacement obtained from Customer Support that has the same or greater capacity. If vdisk reconstruction starts, wait for it to complete and then retry the expansion.
Info.
Vdisk expansion either completed successfully, failed immediately, or was aborted by a user.
Recommended actions
• If the expansion failed because of a disk problem, replace the disk with a replacement obtained from Customer Support that has the same or greater capacity. If vdisk reconstruction starts, wait for it to complete and then retry the expansion.
• Otherwise, no action is required.
55
Warning The indicated disk reported a SMART event.
A SMART event indicates impending disk failure.
Recommended actions
•
Resolve any non-disk hardware problems, especially a cooling problem or a faulty power supply.
•
If the disk is in a vdisk that uses a fault-tolerant RAID level, replace the faulty disk. Before replacing the disk,
confirm that a reconstruction is not currently running on the vdisk. If more than one disk in the vdisk has
reported SMART events, replace the disks one at a time and allow reconstruction to complete after each disk is
replaced.
56
Info.
A controller has powered up or restarted.
Recommended actions
•
No action is required.
58
Error
A disk drive detected a serious error, such as a parity error or disk hardware failure.
Recommended actions
• Replace the disk with one obtained from Customer Support that has the same or greater capacity. For continued optimum I/O performance, the replacement disk should have performance that is the same as or better than the one it is replacing.
Warning A disk drive reset itself due to an internal logic error.
Recommended actions
• The first time this event is logged with Warning severity, if the indicated disk is not running the latest firmware, update the disk firmware.
• If this event is logged with Warning severity for the same disk more than five times in one week, and the indicated disk is running the latest firmware, replace the disk with one obtained from Customer Support that has the same or greater capacity.
Info.
A disk drive reported an event.
Recommended actions
• No action is required.
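The Warning-severity guidance for event 58 (more than five occurrences for the same disk in one week, with the latest firmware installed) can be checked against timestamps taken from the event log. The Python sketch below assumes "one week" means the seven days ending at the time of the check; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def should_replace_disk(
    warning_times: list[datetime], now: datetime, latest_firmware: bool
) -> bool:
    """Apply the event 58 replacement rule to one disk's Warning events."""
    # Assumption: "one week" is the rolling seven days ending at `now`.
    week_ago = now - timedelta(days=7)
    recent = [t for t in warning_times if week_ago < t <= now]
    # Replacement applies only when the disk already runs the latest firmware.
    return latest_firmware and len(recent) > 5


now = datetime(2014, 4, 30, 12, 0)
six_warnings = [now - timedelta(days=d) for d in range(6)]
print(should_replace_disk(six_warnings, now, latest_firmware=True))
```

If the disk is not running the latest firmware, the guidance is to update the firmware first, so the function returns False in that case.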
59
Warning The controller detected a parity event while communicating with the indicated SCSI device. The event was
detected by the controller, not the disk.
Recommended actions
• If the event indicates that a disk or an expansion module is bad, replace the indicated device.
Info.
The controller detected a non-parity error while communicating with the indicated SCSI device. The error was
detected by the controller, not the disk.
Recommended actions
• No action is required.
61
Error
The controller reset a disk channel to recover from a communication error. This event is logged to identify an error
trend over time.
Recommended actions
•
If the controller recovers, no action is required.
•
View other logged events to determine other action to take.
62
Warning The indicated global or dedicated spare disk has failed.
Recommended actions
• Replace the disk with one obtained from Customer Support that has the same or greater capacity.
• If the failed disk was a global spare, configure the new disk as a global spare.
65
Error
An uncorrectable ECC error occurred in the buffer memory on startup.
The controller is automatically restarted and its cache data are restored from the partner controller’s cache.
Recommended actions
•
Replace the controller module that logged this event.
67
Info.
The controller has identified a new disk or group of disks that constitute a vdisk and has taken ownership of the
vdisk. This can happen when disks containing data have been inserted from another enclosure. This event only
applies to non-Active-Active controllers.
Recommended actions
•
You may need to clear the disks’ metadata if you want to reuse them in one or more new vdisks.
68
Info.
The controller that logged this event is shut down, or both controllers are shut down.
Recommended actions
•
No action is required.
71
Info.
The controller has started or completed failing over.
Recommended actions
•
No action is required.
72
Info.
After failover, recovery has either started or completed.
Recommended actions
•
No action is required.
73
Info.
The two controllers are communicating with each other and cache redundancy is enabled.
Recommended actions
•
No action is required.
74
Info.
The FC loop ID for the indicated vdisk was changed to be consistent with the IDs of other vdisks. This can occur
when disks containing a vdisk are inserted from an enclosure having a different FC loop ID.
This event is also logged by the new owning controller after vdisk ownership is changed.
Recommended actions
•
No action is required.
75
Info.
The indicated volume’s LUN has been unassigned because it conflicts with LUNs assigned to other volumes. This
can happen when disks containing data for a mapped volume have been moved from one storage system to
another.
Recommended actions
•
If you want hosts to access the volume data in the inserted disks, map the volume with a different LUN.
76
Info.
The controller is using default configuration settings. This event occurs on the first power up, and might occur
after a firmware update.
Recommended actions
•
If you have just performed a firmware update and your system requires special configuration settings, you
must make those configuration changes before your system will operate as before.
77
Info.
The cache was initialized as a result of power up or failover.
Recommended actions
•
No action is required.
78
Warning This occurs when a disk in a vdisk fails and there is no dedicated spare available and all global spares are too small
or, if the dynamic spares feature is enabled, all global spares and available disks are too small, or if there is no
spare of the correct type. There may be more than one failed disk in the system.
Recommended actions
•
Replace each failed disk with one obtained from Customer Support that has the same or greater capacity. For
continued optimum I/O performance, the replacement disk should have performance that is the same as or
better than the one it is replacing.
•
Configure disks as global spares.
•
For a global spare, use a disk obtained from Customer Support.
79
Info.
A trust operation has completed for the indicated vdisk.
Recommended actions
•
Be sure to complete the trust procedure as documented in the CLI help for the trust command.
80
Info.
The controller enabled or disabled the indicated parameters for one or more disks.
Recommended actions
•
No action is required.
81
Info.
The current controller has unkilled the partner controller. The other controller will restart.
Recommended actions
•
No action is required.
83
Info.
The partner controller is changing state (shutting down or restarting).
Recommended actions
•
No action is required.
84
Warning The controller that logged this event forced the partner controller to fail over.
Recommended actions
•
Download the debug logs from your storage system and contact technical support. A service technician can
use the debug logs to determine the problem.
86
Info.
Host-port or disk-channel parameters have been changed.
Recommended actions
•
No action is required.
87
Warning The mirrored configuration retrieved by this controller from the partner controller has a bad cyclic redundancy
check (CRC). The local flash configuration will be used instead.
Recommended actions
•
Restore the default configuration by using the restore defaults command, as described in the CLI
Reference Guide.
88
Warning The mirrored configuration retrieved by this controller from the partner controller is corrupt. The local flash
configuration will be used instead.
Recommended actions
•
Restore the default configuration by using the restore defaults command, as described in the CLI
Reference Guide.
89
Warning The mirrored configuration retrieved by this controller from the partner controller has a configuration level that is
too high for the firmware in this controller to process.
Recommended actions
•
The controller that logged this event probably has down-level firmware. Update the firmware on the
down-level controller. Both controllers should have the same firmware versions.
When the problem is resolved, event 20 is logged.
90
Info.
The partner controller does not have a mirrored configuration image for the current controller, so the current
controller's local flash configuration is being used.
This event is expected if the other controller is new or its configuration has been changed.
Recommended actions
•
No action is required.
91
Error
In a testing environment, the diagnostic that checks hardware reset signals between controllers in Active-Active
mode failed.
Recommended actions
•
Perform failure analysis.
95
Error
Both controllers in an Active-Active configuration have the same serial number. Non-unique serial numbers can
cause system problems; for example, WWNs are determined by serial number.
Recommended actions
•
Remove one of the controller modules and insert a replacement, then return the removed module to be
reprogrammed.
96
Info.
Pending configuration changes that take effect at startup were ignored because customer data might be present in
cache.
Recommended actions
•
If the requested configuration changes did not occur, make the changes again and then use a user-interface
command to shut down or restart the controller.
103
Info.
The name has been changed for the indicated volume.
Recommended actions
•
No action is required.
104
Info.
The size has been changed for the indicated volume.
Recommended actions
•
No action is required.
105
Info.
The LUN (logical unit number) has been changed for the indicated volume.
Recommended actions
•
No action is required.
106
Info.
The indicated volume has been added to the indicated vdisk.
Recommended actions
•
No action is required.
107
Error
A serious error has been detected by the controller. In a single-controller configuration, the controller will restart
automatically. In an Active-Active configuration, the partner controller will kill the controller that experienced the
error.
Recommended actions
•
Download the debug logs from your storage system and contact technical support. A service technician can
use the debug logs to determine the problem.
108
Info.
The indicated volume has been deleted from the indicated vdisk.
Recommended actions
•
No action is required.
109
Info.
The statistics for the indicated volume have been reset.
Recommended actions
•
No action is required.
110
Info.
Ownership of the indicated vdisk has been given to the other controller.
Recommended actions
•
No action is required.
111
Info.
The link for the indicated host port is up.
This event indicates that a problem reported by event 112 is resolved. For a system with FC ports, this event also
appears after loop initialization.
Recommended actions
•
No action is required.
112
Warning The link for the indicated host port has unexpectedly gone down.
Recommended actions
• Look for corresponding event 111 and monitor excessive transitions. If this event occurs more than 8 times per hour, it should be investigated.
• This event is probably caused by equipment outside of the storage system, such as faulty cabling or a faulty switch.
• If the problem is not outside of the storage system, replace the controller module that logged this event.
Info.
The link for the indicated host port has gone down because the controller is starting up.
Recommended actions
• No action is required.
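The "more than 8 times per hour" threshold for event 112 can be checked with a sliding window over event timestamps. The Python sketch below is one possible approach, not a product feature. It assumes "per hour" means any span shorter than 60 minutes rather than a clock hour, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def exceeds_rate(times, limit=8, window=timedelta(hours=1)):
    """Return True if more than `limit` events fall within any one window."""
    ordered = sorted(times)
    start = 0
    for end, current in enumerate(ordered):
        # Drop events that are a full window or more older than the current one.
        while current - ordered[start] >= window:
            start += 1
        if end - start + 1 > limit:
            return True
    return False


base = datetime(2014, 4, 30, 9, 0)
# Nine link-down events five minutes apart, all inside a single hour.
burst = [base + timedelta(minutes=5 * i) for i in range(9)]
print(exceeds_rate(burst))
```

The same check could be applied to the disk-channel transitions described for event 114, which uses the same threshold.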
114
Info.
The link for the indicated disk-channel port is down. Note that events 114 and 211 are logged whenever a
user-requested rescan occurs and do not indicate an error.
Recommended actions
•
Look for corresponding event 211 and monitor excessive transitions indicating disk problems. If more than 8
transitions occur per hour, see “Resources for diagnosing and resolving problems” in the WBI help for the
event log panel, or the CLI help for the show events command.
116
Error
After a recovery, the partner controller was killed while mirroring write-back cache data to the current controller
that logged this event. The controller that logged this event restarted to avoid losing the data in the partner
controller’s cache, but if the other controller does not restart successfully, the data will be lost.
Recommended actions
•
To determine if data might have been lost, check whether this event was immediately followed by event 56
(Storage Controller booted up), closely followed by event 71 (failover started); the failover indicates that the
restart did not succeed.
118
Info.
Cache parameters have been changed for the indicated volume.
Recommended actions
•
No action is required.
127
Warning The controller has detected an invalid disk dual-port connection. This event indicates that a controller host port is
connected to an expansion port instead of to a port on a host or a switch.
Recommended actions
•
Disconnect the host port and expansion port from each other and connect them to the proper devices.
136
Warning Errors detected on the indicated disk channel have caused the controller to mark the channel as degraded.
Recommended actions
•
Determine the source of the errors on the indicated disk channel and replace the faulty hardware.
When the problem is resolved, event 189 is logged.
139
Info.
The Management Controller (MC) has powered up or restarted.
Recommended actions
•
No action is required.
140
Info.
The Management Controller (MC) is about to restart.
Recommended actions
•
No action is required.
141
Info.
This event is logged when the IP address used for management of the system has been changed by a user or by a
DHCP server (if DHCP is enabled). This event is also logged during power up or failover recovery, even when the
address has not changed.
Recommended actions
•
No action is required.
152
Warning The Management Controller (MC) has not communicated with the Storage Controller (SC) for 15 minutes, which
exceeds the MC communication timeout, and may have failed.
This event is initially logged as Informational severity. If the problem persists, this event is logged a second time
as Warning severity and the MC is automatically restarted in an attempt to recover from the problem. Event 156 is
then logged.
Recommended actions
• If this event is logged only one time as Warning severity, no action is required.
• If this event is logged more than one time as Warning severity, do the following:
  • If you are now able to access the management interfaces of the controller that logged this event, do the following:
    • Check the version of the controller firmware and update to the latest firmware if needed.
    • If the latest firmware is already installed, the controller module that logged this event probably has a hardware fault. Replace the module.
  • If you are NOT able to access the management interfaces of the controller that logged this event, do the following:
    • Shut down that controller and reseat the module.
    • If you are then able to access the management interfaces, check the version of the controller firmware and update to the latest firmware if needed.
    • If the problem recurs, replace the module.
Info.
The Management Controller (MC) has not communicated with the Storage Controller (SC) for 160 seconds.
If communication is restored in less than 15 minutes, event 153 is logged. If the problem persists, this event is
logged a second time as Warning severity.
NOTE: It is normal for this event to be logged as Informational severity during firmware update.
Recommended actions
• Check the version of the controller firmware and update to the latest firmware if needed.
• If the latest firmware is already installed, no action is required.
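The two time thresholds given for event 152 (160 seconds for Informational severity, 15 minutes for Warning severity) can be expressed as a simple mapping from the length of the communication gap to a severity. The Python sketch below only illustrates those thresholds; the function name is hypothetical, and treating each threshold as inclusive is an assumption.

```python
from typing import Optional


def mc_silence_severity(seconds_silent: float) -> Optional[str]:
    """Map the length of an MC-to-SC communication gap to an event 152 severity."""
    # Assumption: each threshold is inclusive.
    if seconds_silent >= 15 * 60:
        return "Warning"
    if seconds_silent >= 160:
        return "Informational"
    # Below 160 seconds, event 152 is not logged.
    return None


for gap in (100, 160, 900):
    print(gap, mc_silence_severity(gap))
```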
153
Info.
The Management Controller (MC) has re-established communication with the Storage Controller (SC).
Recommended actions
•
No action is required.
154
Info.
New firmware has been loaded in the Management Controller (MC).
Recommended actions
•
No action is required.
155
Info.
New loader firmware has been loaded in the Management Controller (MC).
Recommended actions
•
No action is required.
156
Warning The Management Controller (MC) has been restarted from the Storage Controller (SC) for the purpose of error
recovery.
Recommended actions
• See the recommended actions for event 152, which is logged at approximately the same time.
Info.
The Management Controller (MC) has been restarted from the Storage Controller (SC) in a normal case, such as
when initiated by a user.
Recommended actions
• No action is required.
157
Error
A failure occurred when trying to write to the Storage Controller (SC) flash chip.
Recommended actions
•
Replace the controller module that logged this event.
158
Info.
A correctable ECC error occurred in the CPU memory.
Recommended actions
•
No action is required.
161
Info.
One or more enclosures do not have a valid path to an enclosure management processor (EMP).
All enclosure EMPs are disabled.
Recommended actions
•
Download the debug logs from your storage system and contact technical support. A service technician can
use the debug logs to determine the problem.
162
Warning The host WWNs (node and port) previously presented by this controller module are unknown. In a dual-controller
system this event has two possible causes:
•
One or both controller modules have been replaced or moved while the system was powered off.
•
One or both controller modules have had their flash configuration cleared (this is where the previously used
WWNs are stored).
The controller module recovers from this situation by generating a WWN based on its own serial number.
Recommended actions
•
If the controller was replaced or someone reprogrammed its FRU ID data, verify the WWN information for
this controller module on all hosts that access it.
163
Warning The host WWNs (node and port) previously presented by the partner controller module, which is currently offline,
are unknown.
This event has two possible causes:
•
The online controller module reporting the event was replaced or moved while the system was powered off.
•
The online controller module had its flash configuration (where previously used WWNs are stored) cleared.
The online controller module recovers from this situation by generating a WWN based on its own serial number
for the other controller module.
Recommended actions
•
If the controller was replaced or someone reprogrammed its FRU ID data, verify the WWN information for the
other controller module on all hosts that access it.
166
Warning The RAID metadata level of the two controllers does not match, which indicates that the controllers have different
firmware levels.
Usually, the controller at the higher firmware level can read metadata written by a controller at a lower firmware
level. The reverse is typically not true. Therefore, if the controller at the higher firmware level failed, the surviving
controller at the lower firmware level cannot read the metadata on disks that have failed over.
Recommended actions
•
If this occurs after a firmware update, it indicates that the metadata format changed, which is rare. Update the
controller with the lower firmware level to match the firmware level in the other controller.
167
Warning A diagnostic test at controller bootup detected an abnormal operation, which might require a power cycle to
correct.
Recommended actions
•
Download the debug logs from your storage system and contact technical support. A service technician can
use the debug logs to determine the problem.
168
Error
The indicated SES alert condition was detected in the indicated enclosure. This event is logged as Error severity
when one of the power supplies in an enclosure has no power supplied to it or when a hardware failure is detected.
Recommended actions
•
Check that all modules in the enclosure are properly seated in their slots and that their latches are locked.
•
If the reported problem is with a power supply, check that each power cable is firmly
plugged into both the power supply and a functional electrical outlet.
•
If the reported problem is with a temperature sensor or fan or power supply, perform these checks:
•
Check that all of the enclosure's fans are running.
•
Check that the ambient temperature is not too warm. The enclosure operating range is 5°–40°C
(41°–104°F).
•
Check for any obstructions to the airflow.
•
Check that there is a module in every module slot in the enclosure.
•
If none of the above resolve the issue, the indicated FRU has probably failed and should be replaced. The
failed FRU will probably have an amber LED lit.
When the problem is resolved, event 169 is logged.
Warning The indicated SES alert condition was detected in the indicated enclosure.
Recommended actions
•
Check that all modules in the enclosure are properly seated in their slots and that their latches are locked.
•
If the reported problem is with a power supply, make sure that each power cable is firmly plugged into both the
power supply and a functional electrical outlet.
•
If the reported problem is with a temperature sensor or fan or power supply, perform these checks:
•
Check that all of the enclosure's fans are running.
•
Check that the ambient temperature is not too warm. The enclosure operating range is 5°–40°C
(41°–104°F).
•
Check for any obstructions to the airflow.
•
Check that there is a module or blank plate in every module slot in the enclosure.
•
If none of the above resolve the issue, the indicated FRU has probably failed and should be replaced. The
failed FRU will probably have an amber LED lit.
When the problem is resolved, event 169 is logged.
Info.
The indicated SES alert condition was detected in the indicated enclosure.
Recommended actions
•
No action is required.
169
Info.
The indicated SES alert condition has been cleared in the indicated enclosure. This event indicates that a problem
reported by event 168 is resolved.
Recommended actions
•
No action is required.
170
Info.
The last rescan detected that the indicated enclosure was added to the system.
Recommended actions
•
No action is required.
171
Info.
The last rescan detected that the indicated enclosure was removed from the system.
Recommended actions
•
No action is required.
172
Warning The indicated vdisk has been quarantined because not all of its disks are available. The vdisk does not contain
enough disks to be fault tolerant. The partial vdisk will be held in quarantine until it becomes fault tolerant.
Recommended actions
•
Ensure that all disks are latched into their slots and have power.
•
During quarantine, the vdisk is not visible to the host. If after latching disks into their slots and powering up
the vdisk, the vdisk is still quarantined, you can manually remove the vdisk from quarantine so that the host
can see the vdisk. The vdisk is still critical.
•
If disks have failed, replace them.
When the vdisk has been removed from quarantine, event 173 is logged.
173
Info.
The indicated vdisk has been removed from quarantine.
Recommended actions
•
No action is required.
174
Info.
Enclosure or disk firmware update has succeeded, been aborted by a user, or failed.
If the firmware update fails, the user will be notified about the problem immediately and should take care of the
problem at that time, so even when there is a failure, this event is logged as Informational severity.
Recommended actions
•
No action is required.
175
Info.
The network-port Ethernet link has changed status (up or down) for the indicated controller.
Recommended actions
•
If this event is logged indicating the network port is up shortly after the Management Controller (MC) has
booted up (event 139), no action is required.
•
Otherwise, monitor occurrences of this event for an error trend. If this event occurs more than 8 times per
hour, it should be investigated.
•
This event is probably caused by equipment outside of the storage system, such as faulty cabling or a
faulty Ethernet switch.
•
If this event is being logged by only one controller in a dual-controller system, swap the network-port
Ethernet cables between the two controllers. This will show whether the problem is outside or inside the
storage system.
•
If the problem is not outside of the storage system, replace the controller module that logged this event.
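The trend check described for event 175 can be automated. The following is a minimal Python sketch, not part of the product: it assumes the timestamps of event 175 for one controller have already been extracted from the event log (the function name and the list-of-datetime input are hypothetical) and reports whether more than 8 occurrences fall within any one-hour span.

```python
from datetime import datetime, timedelta


def exceeds_link_change_rate(timestamps, limit=8, window=timedelta(hours=1)):
    """Return True if more than `limit` timestamps fall within any span shorter than `window`."""
    ordered = sorted(timestamps)
    start = 0
    for end, current in enumerate(ordered):
        # Drop events that are a full window (or more) older than the current one.
        while current - ordered[start] >= window:
            start += 1
        if end - start + 1 > limit:
            return True
    return False


if __name__ == "__main__":
    base = datetime(2014, 4, 1, 12, 0, 0)
    # Nine link-status changes, five minutes apart (hypothetical data).
    sample = [base + timedelta(minutes=5 * i) for i in range(9)]
    print(exceeds_link_change_rate(sample))
```

A sliding window is used rather than fixed clock hours so that a burst spanning an hour boundary is still counted as one burst.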
176
Info.
The error statistics for the indicated disk have been reset.
Recommended actions
•
No action is required.
177
Info.
Cache data were purged for the indicated missing volume.
Recommended actions
•
No action is required.
181
Info.
One or more configuration parameters associated with the Management Controller (MC) have been changed, such
as configuration for SNMP, SMI-S, email notification, and system strings (system name, system location, etc.).
Recommended actions
•
No action is required.
182
Info.
All disk channels have been paused. I/O will not be performed on the disks until all channels are unpaused.
Recommended actions
•
If this event occurs in relation to disk firmware update, no action is required. When the condition is cleared,
event 183 is logged.
•
If this event occurs and you are not performing disk firmware update, see “Resources for diagnosing and
resolving problems” in the RAIDar help for the event log panel, or the CLI help for the show events
command.
183
Info.
All disk channels have been unpaused, meaning that I/O can resume. An unpause initiates a rescan, which when
complete is logged as event 19.
This event indicates that the pause reported by event 182 has ended.
Recommended actions
•
No action is required.
185
Info.
An enclosure management processor (EMP) write command has completed.
Recommended actions
•
No action is required.
186
Info.
Enclosure parameters have been changed by a user.
Recommended actions
•
No action is required.
187
Info.
The write-back cache has been enabled.
Event 188 is the corresponding event that is logged when write-back cache is disabled.
Recommended actions
•
No action is required.
188
Info.
Write-back cache has been disabled.
Event 187 is the corresponding event that is logged when write-back cache is enabled.
Recommended actions
•
No action is required.
189
Info.
A disk channel that was previously degraded or failed is now healthy.
Recommended actions
•
No action is required.
190
Info
The controller module's supercapacitor pack has started charging.
This change met a condition to trigger the auto-write-through feature, which has disabled write-back cache and
put the system in write-through mode. When the fault is resolved, event 191 is logged to indicate that write-back
mode has been restored.
Recommended Actions:
•
If event 191 is not logged within 5 minutes after this event, the supercapacitor has probably failed and the
controller module should be replaced.
191
Info
The auto-write-through trigger event that caused event 190 to be logged has been resolved.
Recommended Actions:
•
No action is required.
192
Info
The controller module's temperature has exceeded the normal operating range.
This change met a condition to trigger the auto-write-through feature, which has disabled write-back cache and
put the system in write-through mode. When the fault is resolved, event 193 is logged to indicate that write-back
mode has been restored.
Recommended Actions:
•
If event 193 has not been logged since this event was logged, the over-temperature condition probably still
exists and should be investigated. Another over-temperature event was probably logged at approximately the
same time as this event (such as event 39, 40, 168, 307, 469, 476, or 477); see the recommended actions for
that event.
193
Info
The auto-write-through trigger event that caused event 192 to be logged has been resolved.
Recommended Actions:
•
No action is required.
194
Info
The Storage Controller in the partner controller module is not up.
This indicates that a trigger condition has occurred that has caused the auto-write-through feature to disable
write-back cache and put the system in write-through mode. When the fault is resolved, event 195 is logged to
indicate that write-back mode has been restored.
Recommended Actions:
•
If event 195 has not been logged since this event was logged, the other Storage Controller is probably still
down and the cause should be investigated. Other events were probably logged at approximately the same time
as this event; see the recommended actions for those events.
195
Info
The auto-write-through trigger event that caused event 194 to be logged has been resolved.
Recommended Actions:
•
No action is required.
198
Info
A power supply has failed.
This indicates that a trigger condition has occurred that has caused the auto-write-through feature to disable
write-back cache and put the system in write-through mode. When the fault is resolved, event 199 is logged to
indicate that write-back mode has been restored.
Recommended Actions:
•
If event 199 has not been logged since this event was logged, the power supply probably does not have a
health of OK and the cause should be investigated. Another power-supply event was probably logged at
approximately the same time as this event (such as event 168); see the recommended actions for that event.
199
Info
The auto-write-through trigger event that caused event 198 to be logged has been resolved.
Recommended Actions:
•
No action is required.
200
Info
A fan has failed.
This indicates that a trigger condition has occurred that has caused the auto-write-through feature to disable
write-back cache and put the system in write-through mode. When the fault is resolved, event 201 is logged to
indicate that write-back mode has been restored.
Recommended Actions:
•
If event 201 has not been logged since this event was logged, the fan probably does not have a health of OK
and the cause should be investigated. Another fan event was probably logged at approximately the same time
as this event (such as event 168); see the recommended actions for that event.
201
Info
The auto-write-through trigger event that caused event 200 to be logged has been resolved.
Recommended Actions:
•
No action is required.
202
Info.
An auto-write-through trigger condition has been cleared, causing write-back cache to be re-enabled. The
environmental change is also logged at approximately the same time as this event (event 191, 193, 195, 199, 201,
or 241).
Recommended actions
•
No action is required.
203
Warning An environmental change occurred that allows write-back cache to be enabled, but the auto-write-back preference
is not set. The environmental change is also logged at approximately the same time as this event (event 191, 193,
195, 199, 201, or 241).
Recommended actions
•
Manually enable write-back cache.
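The auto-write-through events in this guide come in trigger/resolution pairs: 190/191, 192/193, 194/195, 198/199, 200/201, and 242/241. As a rough illustration only, the Python sketch below scans a chronological list of event codes and reports which triggers have no matching resolution yet. The function name and the plain list-of-integers input are assumptions; the storage system does not provide this function.

```python
# Trigger event -> event logged when that trigger is resolved, as listed in this guide.
AWT_PAIRS = {190: 191, 192: 193, 194: 195, 198: 199, 200: 201, 242: 241}


def unresolved_triggers(event_codes):
    """Given event codes in chronological order, return trigger codes still open."""
    resolved_by = {clear: trigger for trigger, clear in AWT_PAIRS.items()}
    open_triggers = []
    for code in event_codes:
        if code in AWT_PAIRS and code not in open_triggers:
            # A new auto-write-through trigger was logged.
            open_triggers.append(code)
        elif code in resolved_by and resolved_by[code] in open_triggers:
            # The matching resolution event closes the trigger.
            open_triggers.remove(resolved_by[code])
    return open_triggers


if __name__ == "__main__":
    # Hypothetical log excerpt: a fan failure that cleared, and an open over-temperature.
    print(unresolved_triggers([200, 192, 201, 202]))
```

Any code returned by the sketch points to an entry whose recommended actions still apply.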
204
Error
This event is generated by the hardware-flush firmware when the boot-processing firmware needs to inform the
user about something.
The CompactFlash card is used for backing up unwritten cache data when a controller goes down unexpectedly,
such as when a power failure occurs. This event is generated when the Storage Controller (SC) detects a problem
with the CompactFlash as it is booting up.
Recommended actions
•
Restart the SC that logged this event.
•
If this event is logged again, shut down the Storage Controller, remove the controller module from the
enclosure, replace the CompactFlash card (which is accessible from the rear of the controller module, as
shown in your product's User Guide), and reinstall the controller module.
•
If this event is then logged again, replace the controller module.
Warning This event is generated by the hardware-flush firmware when the boot-processing firmware needs to inform the
user about something.
The CompactFlash card is used for backing up unwritten cache data when a controller goes down unexpectedly,
such as when a power failure occurs. This event is generated when the Storage Controller (SC) detects a problem
with the CompactFlash as it is booting up.
Recommended actions
•
Restart the Storage Controller that logged this event.
•
If this event is logged again, shut down the Storage Controller and replace the controller module.
Info.
This event is generated by the hardware-flush firmware when the boot-processing firmware needs to inform the
user about something.
When logged as Informational severity, this event contains information that is primarily of interest to engineers.
Recommended actions
•
No action is required.
205
Info.
The indicated volume has been mapped or unmapped.
Recommended actions
•
No action is required.
206
Info.
Vdisk scrub has started.
The scrub checks disks in the vdisk for the following types of errors:
•
Data parity errors for a RAID 3, 5, 6, or 50 vdisk.
•
Mirror verify errors for a RAID 1 or RAID 10 vdisk.
•
Media errors for all RAID levels including RAID 0 and non-RAID vdisk.
When errors are detected, they are automatically corrected.
When the scrub is complete, event 207 is logged.
Recommended actions
•
No action is required.
•
Vdisk Scrub is not supported on NEXIO Farad.
207
Error
Vdisk scrub completed and found an excessive number of errors in the indicated vdisk.
This event is logged as Error severity when more than 100 parity or mirror mismatches are found and corrected
during a scrub or when 1 to 99 parity or mirror mismatches are found and corrected during each of 10 separate
scrubs of the same vdisk.
For non-fault-tolerant RAID levels (RAID-0 and non-RAID), media errors may indicate loss of data.
Recommended actions
•
Resolve any non-disk hardware problems, such as a cooling problem or a faulty controller module, expansion
module, or power supply.
•
Check whether any disks in the vdisk have logged SMART events or unrecoverable read errors.
•
If so, and the vdisk is a non-fault-tolerant RAID level (RAID-0 or non-RAID), copy the data to a different
vdisk and replace the faulty disks.
•
If so, and the vdisk is a fault-tolerant RAID level, replace the faulty disks. Before replacing a disk, confirm
that a reconstruction is not currently running on the vdisk. It is also recommended to make a full backup of
all the data in the vdisk before replacing disks. If more than one disk in the vdisk has errors, replace the
disks one at a time and allow reconstruction to complete after each disk is replaced.
•
Vdisk Scrub is not supported on NEXIO Farad.
Warning Vdisk scrub did not complete because of an internally detected condition such as a failed disk.
If a disk fails, data may be at risk.
Recommended actions
•
Resolve any non-disk hardware problems, such as a cooling problem or a faulty controller module, expansion
module, or power supply.
•
Check whether any disks in the vdisk have logged SMART events or unrecoverable read errors.
•
If so, and the vdisk is a non-fault-tolerant RAID level (RAID-0 or non-RAID), copy the data to a different
vdisk and replace the faulty disks.
•
If so, and the vdisk is a fault-tolerant RAID level, replace the faulty disks. Before replacing a disk, confirm
that a reconstruction is not currently running in the vdisk. It is also recommended to make a full backup of
all the data in the vdisk before replacing disks. If more than one disk in the vdisk has errors, replace the
disks one at a time and allow reconstruction to complete after each disk is replaced.
•
Vdisk Scrub is not supported on NEXIO Farad.
Info.
Vdisk scrub completed or was aborted by a user.
This event is logged as Informational severity when fewer than 100 parity or mirror mismatches are found and
corrected during a scrub.
For non-fault-tolerant RAID levels (RAID-0 and non-RAID), media errors may indicate loss of data.
Recommended actions
•
No action is required.
•
Vdisk Scrub is not supported on NEXIO Farad.
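The severity thresholds given for event 207 can be expressed as a small classifier. This Python sketch is illustrative only and rests on several assumptions: the function name and input are hypothetical, the input is the per-scrub mismatch count for one vdisk with the most recent scrub last, the 10 scrubs with 1 to 99 mismatches need not be consecutive, and a count of exactly 100 (which the text above does not cover) is treated as Informational. The Warning case, where the scrub does not complete, is not modeled.

```python
def scrub_severity(mismatches_per_scrub):
    """Classify the latest scrub of one vdisk as "Error" or "Info"."""
    latest = mismatches_per_scrub[-1]
    # More than 100 mismatches found and corrected in a single scrub.
    if latest > 100:
        return "Error"
    # 1 to 99 mismatches found and corrected in each of 10 separate scrubs.
    small_runs = sum(1 for count in mismatches_per_scrub if 1 <= count <= 99)
    if small_runs >= 10:
        return "Error"
    return "Info"


if __name__ == "__main__":
    # Hypothetical history: ten scrubs that each corrected a few mismatches.
    print(scrub_severity([3, 1, 4, 1, 5, 9, 2, 6, 5, 3]))
```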
208
Info.
A scrub-disk job has started for the indicated disk. The result will be logged with event 209.
Recommended actions
•
No action is required.
•
Scrub-disk is not supported on NEXIO Farad.
209
Error
A scrub-disk job logged with event 208 has completed and found one or more media errors, SMART events, or
hard (non-media) errors. If this disk is used in a non-fault-tolerant vdisk, data may have been lost.
Recommended actions
•
Replace the disk with one of the same type (SAS SSD, enterprise SAS, or midline SAS) and the same or
greater capacity. For continued optimum I/O performance, the replacement disk should have performance that
is the same as or better than the one it is replacing.
•
Scrub-disk is not supported on NEXIO Farad.
Warning A scrub-disk job logged with event 208 has reassigned a disk block. These bad-block replacements are reported as
"other errors". If this disk is used in a non-fault-tolerant vdisk, data may have been lost.
Recommended actions
•
Monitor the error trend and whether the number of errors approaches the total number of bad-block
replacements available.
•
Scrub-disk is not supported on NEXIO Farad.
Info.
A scrub-disk job logged with event 208 has completed and found no errors, or a disk being scrubbed (with no
errors found) has been added to a vdisk, or a user has aborted the job.
Recommended actions
•
No action is required.
•
Scrub-disk is not supported on NEXIO Farad.
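The replacement rule stated for the Error case of event 209 (same disk type, same or greater capacity) can be checked before a disk is ordered or installed. The Python sketch below is a hypothetical helper, not a product tool: the Disk record, its field names, and the capacity unit are assumptions, and the performance recommendation is not modeled because this guide gives no measurable criterion for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    disk_type: str    # for example "SAS SSD", "enterprise SAS", or "midline SAS"
    capacity_gb: int  # capacity in gigabytes (unit chosen for this sketch)


def is_acceptable_replacement(failed, candidate):
    """Return True if the candidate has the same type and the same or greater capacity."""
    same_type = candidate.disk_type == failed.disk_type
    enough_capacity = candidate.capacity_gb >= failed.capacity_gb
    return same_type and enough_capacity


if __name__ == "__main__":
    failed = Disk("enterprise SAS", 600)
    print(is_acceptable_replacement(failed, Disk("enterprise SAS", 900)))
    print(is_acceptable_replacement(failed, Disk("midline SAS", 2000)))
```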
211
Warning SAS topology has changed; no elements are detected in the SAS map. The message specifies the number of
elements in the SAS map, the number of expanders detected, the number of expansion levels on the native (local
controller) side and on the partner (partner controller) side, and the number of device PHYs.
Recommended actions
•
Perform a rescan to repopulate the SAS map.
•
If a rescan does not resolve the problem, then shut down and restart both controllers.
•
If the problem persists, see “Resources for diagnosing and resolving problems” in the RAIDar help for the
event log panel, or the CLI help for the show events command.
217
Error
A supercapacitor failure occurred in the controller.
Recommended actions
•
Replace the controller module that logged this event.
218
Warning The supercapacitor pack is near end of life.
Recommended actions
•
Replace the controller module reporting this event.
219
Info.
Utility priority has been changed by a user.
Recommended actions
•
No action is required.
232
Warning The maximum number of enclosures allowed for the current configuration has been exceeded.
The platform does not support the number of enclosures that are configured. The enclosure indicated by this event
has been removed from the configuration.
Recommended actions
•
Reconfigure the system.
233
Warning The indicated disk type is invalid and is not allowed in the current configuration.
All disks of the disallowed type have been removed from the configuration.
Recommended actions
•
Replace the disallowed disks with ones that are supported.
235
Error
An enclosure management processor (EMP) detected a serious error.
Recommended actions
•
Replace the indicated controller module or expansion module.
Info.
An EMP reported an event.
Recommended actions
•
No action is required.
236
Info.
A special shutdown operation has started. These special shutdown types are used as part of the firmware-update
process.
Recommended actions
•
No action is required.
237
Info.
A firmware update has started and is in progress. This event provides details of the steps in a firmware-update
operation that may be of interest if you have problems updating firmware.
Recommended actions
•
No action is required.
238
Warning An attempt to install a licensed feature failed due to an invalid license.
Recommended actions
•
Check the license for what is allowed for the platform, make corrections as appropriate, and reinstall.
239
Warning A timeout occurred while flushing the CompactFlash.
Recommended actions
•
Restart the Storage Controller that logged this event.
•
If this event is logged again, shut down the Storage Controller and replace the controller module.
240
Warning A failure occurred while flushing the CompactFlash.
Recommended actions
•
Restart the Storage Controller that logged this event.
•
If this event is logged again, shut down the Storage Controller and replace the controller module.
241
Info.
The auto-write-through trigger event that caused event 242 to be logged has been resolved.
Recommended actions
•
No action is required.
242
Error
The controller module's CompactFlash card has failed.
This change met a condition to trigger the auto-write-through feature, which has disabled write-back cache and
put the system in write-through mode. When the fault is resolved, event 241 is logged to indicate that write-back
mode has been restored.
Recommended actions
•
If event 241 has not been logged since this event was logged, the CompactFlash probably does not have health
of OK and the cause should be investigated. Another CompactFlash event was probably logged at
approximately the same time as this event (such as event 239, 240, or 481); see the recommended actions for
that event.
243
Info.
A new controller enclosure has been detected. This happens when a controller module is moved from one
enclosure to another and the controller detects that the midplane WWN is different from the WWN it has in its
local flash.
Recommended actions
•
No action is required.
245
Info.
An existing disk channel target device is not responding to SCSI discovery commands.
Recommended actions
•
Check the indicated target device for bad hardware or bad cable, then initiate a rescan.
246
Warning The coin battery is not present, is not properly seated, or has reached end-of-life.
The battery provides backup power for the real-time (date/time) clock. In the event of a power failure, the date and
time will revert to 1980-01-01 00:00:00.
Recommended actions
•
Replace the controller module that logged this event. A service technician must replace or reseat the battery.
247
Warning The FRU ID SEEPROM for the indicated field replaceable unit (FRU) cannot be read; FRU ID data might not be
programmed.
FRU ID data includes the worldwide name, serial numbers, firmware and hardware versions, branding
information, etc. This event is logged once each time a Storage Controller (SC) is started for each FRU that is not
programmed.
Recommended actions
•
Return the FRU to have its FRU ID data reprogrammed.
248
Info.
A valid feature license was successfully installed. See event 249 for details about each licensed feature.
Recommended actions
•
No action is required.
249
Info.
After a valid license is installed, this event is logged for each licensed feature to show the new license value for
that feature. The event specifies whether the feature is licensed, whether the license is temporary, and whether the
temporary license is expired.
Recommended actions
•
No action is required.
250
Warning A license could not be installed.
The license is invalid or specifies a feature that is not supported on your product.
Recommended actions
•
Review the readme file that came with the license. Verify that you are trying to install the license in the system
that the license was generated for.
255
Info.
The PBCs in the two controllers do not match because the PBC in controller A and the PBC in controller B are from
different vendors. This may limit the available configurations.
Recommended actions
•
No action is required.
259
Info.
In-band CAPI commands have been disabled.
Recommended actions
•
No action is required.
260
Info.
In-band CAPI commands have been enabled.
Recommended actions
•
No action is required.
261
Info.
In-band SES commands have been disabled.
Recommended actions
•
No action is required.
262
Info.
In-band SES commands have been enabled.
Recommended actions
•
No action is required.
263
Warning The indicated spare disk is missing. Either it was removed or it is not responding.
Recommended actions
•
Replace the disk with one obtained from Customer Support that has the same or greater capacity.
•
Configure the disk as a global spare.
269
Info.
A partner firmware update operation has started. This operation is used to copy firmware from one controller to
the other to bring both controllers up to the same version of firmware.
Recommended actions
•
No action is required.
270
Warning Either there was a problem reading or writing the persistent IP data from the FRU ID SEEPROM, or invalid data
were read from the FRU ID SEEPROM.
Recommended actions
•
Check the IP settings (including iSCSI host-port IP settings for an iSCSI system), and update them if they are
incorrect.
271
Info.
The storage system could not get a valid serial number from the controller’s FRU ID SEEPROM, either because it
couldn’t read the FRU ID data, or because the data on it isn’t valid or hasn’t been programmed. Therefore, the
MAC address is derived by using the controller’s serial number from flash. This event is only logged one time
during bootup.
Recommended actions
•
No action is required.
273
Info.
PHY fault isolation has been enabled or disabled by a user for the indicated enclosure and controller module.
Recommended actions
•
No action is required.
274
Warning The indicated PHY has been disabled, either automatically or by a user. Drive PHYs are automatically disabled
for empty disk slots or if a problem is detected. The following reasons indicate a likely hardware fault:
•
Disabled because of error count interrupts
•
Disabled because of excessive PHY change counts
•
PHY is ready but did not pass COMINIT
Recommended actions
•
If none of the reasons listed in the event description is indicated, no action is required.
•
If any of the reasons listed in the event description is indicated and the event occurs shortly after the storage
system is powered up, do the following:
•
Shut down the controllers. Then turn off the power for the indicated enclosure, wait a few seconds, and
turn it back on.
•
If the problem recurs and the event message identifies a disk slot, replace the disk in that slot.
•
If the problem recurs and the event message identifies a module, do the following:
•
If the indicated PHY type is Egress, replace the cable in the module's egress port.
•
If the indicated PHY type is Ingress, replace the cable in the module's ingress port.
•
For other indicated PHY types or if replacing the cable does not fix the problem, replace the indicated
module.
•
If the problem persists, check for other events that may indicate faulty hardware, such as an event
indicating an over-temperature condition or power supply fault, and follow the recommended actions for
those events.
•
If the problem still persists, the fault may be in the enclosure midplane. Replace the chassis-and-midplane
FRU.
•
If any of the reasons listed in the event description is indicated and this event is logged shortly after a failover,
user-initiated rescan, or restart, do the following:
•
If the event message identifies a disk slot, reseat the disk in that slot.
•
If the problem persists after reseating the disk, replace the disk.
•
If the event message identifies a module, do the following:
•
If the indicated PHY type is Egress, replace the cable in the module's egress port.
•
If the indicated PHY type is Ingress, replace the cable in the module's ingress port.
•
For other indicated PHY types or if replacing the cable does not fix the problem, replace the indicated
module.
•
If the problem persists, check for other events that may indicate faulty hardware, such as an event
indicating an over-temperature condition or power supply fault, and follow the recommended actions
for those events.
•
If the problem still persists, the fault may be in the enclosure midplane. Replace the chassis-and-midplane
FRU.
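The recommended actions for event 274 form a small decision tree. The Python sketch below restates that tree so the order of steps is easier to follow. It is an illustration under stated assumptions, not a diagnostic tool: the function and parameter names are hypothetical, and a value of False for after_power_up stands for the second case in the guide (shortly after a failover, user-initiated rescan, or restart).

```python
def phy_disabled_actions(fault_reason_indicated, after_power_up, target, phy_type=None):
    """Return the recommended steps for event 274, in order, as short strings.

    `target` is "disk" or "module" (hypothetical labels for what the event
    message identifies); `phy_type` is the indicated PHY type, if any.
    """
    if not fault_reason_indicated:
        # None of the listed hardware-fault reasons is indicated.
        return ["No action is required."]

    steps = []
    if after_power_up:
        steps.append("Shut down the controllers, then power-cycle the indicated enclosure.")

    if target == "disk":
        if after_power_up:
            steps.append("If the problem recurs, replace the disk in that slot.")
        else:
            steps.append("Reseat the disk in that slot.")
            steps.append("If the problem persists, replace the disk.")
    elif target == "module":
        if phy_type in ("Egress", "Ingress"):
            steps.append(f"Replace the cable in the module's {phy_type.lower()} port.")
            steps.append("If replacing the cable does not fix the problem, replace the indicated module.")
        else:
            steps.append("Replace the indicated module.")

    # Final escalation steps are the same in both cases.
    steps.append("If the problem persists, check for other events that indicate faulty hardware.")
    steps.append("If the problem still persists, replace the chassis-and-midplane FRU.")
    return steps


if __name__ == "__main__":
    for step in phy_disabled_actions(True, False, "module", "Ingress"):
        print(step)
```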
275
Info.
The indicated PHY has been enabled.
Recommended actions
•
No action is required.
298
Warning The controller's real-time clock (RTC) setting is invalid.
This event will most commonly occur after a power loss if the real-time clock battery has failed. The time may
have been set to a time that is up to 5 minutes before the power loss occurred, or it may have been reset to
1980-01-01 00:00:00.
Recommended actions
•
Check the system date and time. If either is incorrect, set them to the correct date and time.
•
Also look for event 246 and follow the recommended action for that event.
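When reviewing logs after a power loss, entries stamped at or shortly after 1980-01-01 00:00:00 suggest that the clock was reset as described for event 298. A minimal Python sketch of that check follows. The function name is hypothetical, and the one-day tolerance is an assumption chosen for illustration (the clock keeps running after a reset, so timestamps drift forward from the default).

```python
from datetime import datetime, timedelta

# Default date and time that the real-time clock reverts to, per this guide.
RTC_DEFAULT = datetime(1980, 1, 1, 0, 0, 0)


def looks_like_rtc_reset(timestamp, tolerance=timedelta(days=1)):
    """Return True if the timestamp falls at or shortly after the RTC default."""
    return RTC_DEFAULT <= timestamp < RTC_DEFAULT + tolerance


if __name__ == "__main__":
    print(looks_like_rtc_reset(datetime(1980, 1, 1, 0, 7, 30)))
    print(looks_like_rtc_reset(datetime(2014, 4, 1, 9, 0, 0)))
```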
299
Info.
The controller’s real-time clock (RTC) setting was successfully recovered.
This event will most commonly occur after an unexpected power loss.
Recommended actions
•
No action is required, but if event 246 is also logged, follow the recommended action for that event.
300
Info.
CPU frequency has changed to high.
Recommended actions
•
No action is required.
301
Info.
CPU frequency has changed to low.
Recommended actions
•
No action is required.
302
Info.
DDR memory clock frequency has changed to high.
Recommended actions
•
No action is required.
303
Info.
DDR memory clock frequency has changed to low.
Recommended actions
•
No action is required.
304
Info.
The controller has detected I2C errors that may have been fully recovered.
Recommended actions
•
No action is required.
305
Info.
A serial number in Storage Controller (SC) flash memory was found to be invalid when compared to the serial
number in the controller-module or midplane FRU ID SEEPROM. The valid serial number has been recovered
automatically.
Recommended actions
•
No action is required.
306
Info.
The controller-module serial number in Storage Controller (SC) flash memory was found to be invalid when
compared to the serial number in the controller-module FRU ID SEEPROM. The valid serial number has been
recovered automatically.
Recommended actions
•
No action is required.
307
Critical
A temperature sensor on a controller FRU detected an over-temperature condition that caused the controller to
shut down.
Recommended actions
•
Check that the storage system’s fans are running.
•
Check that the ambient temperature is not too warm. The enclosure operating range is 5°C–40°C
(41°F–104°F).
•
Check for any obstructions to the airflow.
•
Check that there is a module or blank plate in every module slot in the enclosure.
•
If none of the above explanations apply, replace the controller module that logged the error.
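The operating range quoted above can be checked with the standard Celsius-to-Fahrenheit conversion. The following Python sketch is illustrative only; the function and constant names are hypothetical and are not part of the storage system's tools.

```python
# Illustrative sketch only. Names are hypothetical, not part of the product.
OPERATING_RANGE_C = (5.0, 40.0)  # enclosure operating range stated in this guide


def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def ambient_in_range(celsius: float) -> bool:
    """Return True if an ambient reading is inside the enclosure operating range."""
    low, high = OPERATING_RANGE_C
    return low <= celsius <= high


print(c_to_f(5.0), c_to_f(40.0))  # 41.0 104.0
```

A reading of 45°C, for example, is outside the range, so ambient_in_range(45.0) returns False.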
309
Info.
Normally when the Management Controller (MC) is started, the IP data is obtained from the midplane FRU ID
SEEPROM, where it is persisted. If the system was unable to write the IP data to the SEEPROM the last time the
data changed, a flag is set in flash memory. This flag is checked during startup; if it is set, this event is logged
and the IP data in flash memory is used. The only time this would not be the correct IP data is if the controller
module was swapped, in which case whatever data is in the new controller’s flash memory is used.
Recommended actions
•
No action is required.
310
Info.
After a rescan, back-end discovery and initialization of data for at least one EMP (Enclosure Management
Processor) has completed. This event is not logged again when processing completes for other EMPs in the
system.
Recommended actions
•
No action is required.
311
Info.
This event is logged when a user initiates a ping of a host via the iSCSI interface.
Recommended actions
•
If the ping operation failed, check connectivity between the storage system and the remote host.
312
Info.
This event is used by email messages and SNMP traps when testing notification settings. This event is not
recorded in the event log.
Recommended actions
•
No action is required.
313
Error
The indicated controller module has failed. This event can be ignored for a single-controller configuration.
Recommended actions
•
If this is a dual-controller system, replace the failed controller module. The module’s Fault/Service Required
LED will be illuminated (not blinking).
314
Error
The indicated FRU has failed or is not operating correctly. This event follows some other FRU-specific event
indicating a problem.
315
Critical
The controller module is incompatible with the enclosure.
The controller will automatically shut down. If two incompatible controllers are inserted at the same time or
booted at the same time, one controller will crash and the other will hang. This behavior is expected and prevents
data loss.
Recommended actions
•
Move the controller module to a compatible enclosure.
317
Error
A serious error has been detected on the Storage Controller’s disk interface. The controller that logged this event
will be killed by its partner.
Recommended actions
•
Visually trace the cabling between the controller modules and expansion modules.
•
If the cabling is OK, replace the controller module that logged this event.
•
If the problem recurs, replace the expansion module that is connected to the controller module.
319
Warning The indicated available disk has failed.
Recommended actions
•
Replace the disk with one obtained from Customer Support that has the same or greater capacity.
322
Warning The controller has an older Storage Controller (SC) version than the version used to create the CHAP
authentication database in the controller’s flash memory.
The CHAP database cannot be read or updated. However, new records can be added, which will replace the
existing database with a new database using the latest known version number.
Recommended actions
•
Upgrade the controller firmware to a version whose SC is compatible with the indicated database version.
•
If no records were added, the database becomes accessible and remains intact.
•
If records were added, the database becomes accessible but contains only the new records.
352
Info.
Expander Controller (EC) assert data or stack-dump data are available.
Recommended actions
•
No action is required.
353
Info.
Expander Controller (EC) assert data and stack-dump data have been cleared.
Recommended actions
•
No action is required.
354
Warning SAS topology has changed on a host port; at least one PHY has gone down. For example, the SAS cable
connecting a controller host port to a host has been disconnected.
Recommended actions
•
Check the cable connection between the indicated port and the host.
•
Monitor the log to see if the problem persists.
Info.
SAS topology has changed on a host port; at least one PHY has gone up. For example, the SAS cable connecting a
controller host port to a host has been connected.
Recommended actions
•
No action is required.
355
Warning The controller module's debug button was found to be stuck in the On position during boot up.
Recommended actions
•
If the button remains stuck, replace the controller module.
356
Warning This event can only result from tests that are run in the manufacturing environment.
Recommended actions
•
Follow the manufacturing process.
357
Warning This event can only result from tests that are run in the manufacturing environment.
Recommended actions
•
Follow the manufacturing process.
358
Critical
All PHYs are down for the indicated disk channel. The system is degraded and is not fault-tolerant because all
disks are in a single-ported state.
Recommended actions
•
Turn off the power for the controller enclosure, wait a few seconds, and turn it back on.
•
If the condition doesn't persist (that is, if event 359 has been logged for the indicated channel), no further
action is required.
•
If the condition persists, this indicates a hardware problem in one of the controller modules or in the
controller enclosure midplane. For help identifying which FRU to replace, see “Resources for diagnosing
and resolving problems” in the RAIDar help for the event log panel, or the CLI help for the show
events command.
Warning Some, but not all, PHYs are down for the indicated disk channel.
Recommended actions
•
Monitor the log to see whether the condition persists.
•
If the condition doesn't persist (that is, if event 359 has been logged for the indicated channel), no further
action is required.
•
If the condition persists, this indicates a hardware problem in one of the controller modules or in the
controller enclosure midplane. For help identifying which FRU to replace, see “Resources for diagnosing
and resolving problems” in the RAIDar help for the event log panel, or the CLI help for the show
events command.
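The check described for event 358 (whether event 359 has since been logged for the same channel) can be sketched in Python as follows. This is a minimal illustration that assumes the event log has already been exported into a chronological list; the Event record and its channel field are hypothetical and do not reflect the actual log format.

```python
# Illustrative sketch only. The Event record and its fields are hypothetical.
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Event:
    code: int     # event code, for example 358 or 359
    channel: int  # disk channel the event refers to (assumed field)


def phy_condition_persists(events: list[Event], channel: int) -> bool:
    """Return True if the most recent 358 for the channel has no later 359.

    The list is assumed to be in chronological order, oldest first.
    """
    persists = False
    for ev in events:
        if ev.channel != channel:
            continue
        if ev.code == 358:
            persists = True   # PHYs reported down
        elif ev.code == 359:
            persists = False  # PHYs reported recovered
    return persists
```

If the function returns False for the indicated channel, the condition did not persist and no further action is required.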
359
Info.
All PHYs that were down for the indicated disk channel have recovered and are now up.
Recommended actions
•
No action is required.
360
Info.
The speed of the indicated disk PHY was renegotiated.
Recommended actions
•
No action is required.
361
Critical, Error, or Warning
The scheduler experienced a problem with the indicated schedule.
Recommended actions
•
Take appropriate action based on the indicated problem.
Info.
A scheduled task was initiated.
Recommended actions
•
No action is required.
362
Critical, Error, or Warning
The scheduler experienced a problem with the indicated task.
Recommended actions
•
Take appropriate action based on the indicated problem.
Info.
The scheduler experienced a problem with the indicated task.
Recommended actions
•
No action is required.
363
Error
When the Management Controller (MC) is restarted, firmware versions that are currently installed are compared
against those in the bundle that was most recently installed. When firmware is updated, it is important that all
components are successfully updated or the system may not work correctly. Components checked include the
CPLD, Expander Controller (EC), Storage Controller (SC), and MC.
Recommended actions
•
Reinstall the firmware bundle.
Info.
When the Management Controller (MC) is restarted, firmware versions that are currently installed are compared
against those in the bundle that was most recently installed. If the versions match, this event is logged as
Informational severity. Components checked include the CPLD, Expander Controller (EC), Storage Controller
(SC), and MC.
Recommended actions
•
No action is required.
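The comparison behind event 363 can be pictured as a per-component version check. The Python sketch below is an assumption about the general logic only, not the controller's implementation; the version strings and function names are hypothetical.

```python
# Illustrative sketch only. Not the controller's implementation; names are hypothetical.
COMPONENTS = ("CPLD", "EC", "SC", "MC")  # components listed for event 363


def mismatched_components(installed: dict, bundle: dict) -> list:
    """Return the components whose installed version differs from the bundle's."""
    return [c for c in COMPONENTS if installed.get(c) != bundle.get(c)]


def event_363_severity(installed: dict, bundle: dict) -> str:
    """Return "Info" if all versions match, otherwise "Error"."""
    return "Error" if mismatched_components(installed, bundle) else "Info"
```

A non-empty result from mismatched_components corresponds to the Error case, for which the recommended action is to reinstall the firmware bundle.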
364
Info.
The broadcast bus is running as generation 1.
Recommended actions
•
No action is required.
365
Warning An uncorrectable ECC error has occurred in the CPU memory.
Recommended actions
•
Replace the controller module that logged this event.
400
Info.
The indicated log has filled to a level at which it needs to be transferred to a log-collection system.
Recommended actions
•
No action is required.
401
Warning The indicated log has filled to a level at which diagnostic data will be lost if not transferred to a log-collection
system.
Recommended actions
•
Transfer the log file to the log-collection system.
402
Error
The indicated log has wrapped and has started to overwrite its oldest diagnostic data.
Recommended actions
•
Investigate why the log-collection system is not transferring the logs before they are overwritten. For example,
you might have enabled managed logs without configuring a destination to send logs to.
412
Warning One disk in the indicated RAID-6 vdisk failed. The vdisk is online but has a status of FTDN (fault tolerant with a
down disk).
If a global spare is present, that spare is used to automatically reconstruct the vdisk; events 9 and 37 are logged to
indicate this.
Recommended actions
•
If no spare was present (that is, event 37 was NOT logged), configure an available disk as a global spare for
the vdisk or replace the failed disk and configure the new disk as a global spare. That spare will be used to
automatically reconstruct the vdisk; confirm this by checking that events 9 and 37 are logged.
•
Otherwise, reconstruction automatically started and event 37 was logged. Replace the failed disk and
configure the replacement as a global spare for future use.
•
Confirm that all failed disks have been replaced and that there are sufficient spare disks configured for future
use.
427
Warning A communication error occurred when sending information between storage systems.
Recommended actions
•
Check your network or fabric for abnormally high congestion or connectivity issues.
442
Warning Power-On Self Test (POST) diagnostics detected a hardware error in a UART chip.
Recommended actions
•
Replace the controller module that logged this event.
454
Info.
A user changed the drive-spin-down delay for the indicated vdisk to the indicated value.
Recommended actions
•
No action is required.
455
Warning The controller detected that the configured host-port link speed exceeded the capability of a hardware component
such as an FC SFP. The speed has been automatically reduced to the maximum value supported by all hardware
components in the data path.
Recommended actions
•
Replace the SFP in the indicated port with an SFP that supports a higher speed.
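The speed reduction described for event 455 amounts to taking the lowest maximum speed among the components in the data path. The Python sketch below illustrates that idea; the function name and the example speeds are hypothetical.

```python
# Illustrative sketch only. Function name and example values are hypothetical.
def effective_link_speed(configured_gbps: float, component_max_gbps: list) -> float:
    """Return the highest speed that every component in the data path supports.

    The configured speed is capped by the slowest component.
    """
    return min([configured_gbps, *component_max_gbps])
```

For example, a port configured for 16 Gbit/s with an SFP limited to 8 Gbit/s yields effective_link_speed(16.0, [8.0, 16.0]) == 8.0, which is why replacing the SFP with a faster one is the recommended action.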
456
Warning The system's IQN was generated from the default OUI because the controllers could not read the OUI from the
midplane FRU ID data during startup. If the IQN is wrong for the system's branding, iSCSI hosts might be unable
to access the system.
Recommended actions
•
If event 270 with status code 0 is logged at approximately the same time, restart the controllers.
464
Warning A user inserted an unsupported cable or SFP into the indicated controller host port.
Recommended actions
•
Replace the cable or SFP with a supported type, as specified in your product’s Setup Guide.
465
Info.
A user removed an unsupported cable or SFP from the indicated controller host port.
Recommended actions
•
No action is required.
468
Info.
FPGA temperature has returned to the normal operating range and the speed of buses connecting the FPGA to
downstream adapters has been restored. The speed was reduced to compensate for an FPGA over-temperature
condition.
This event indicates that a problem reported by event 469 is resolved.
Recommended actions
•
No action is required.
469
Warning The speed of buses connecting the FPGA to downstream adapters has been reduced to compensate for an FPGA
over-temperature condition.
The storage system is operational but I/O performance is reduced.
Recommended actions
•
Check that the storage system’s fans are running.
•
Check that the ambient temperature is not too warm. The enclosure operating range is 5°C–40°C (41°F–104°F).
•
Check for any obstructions to the airflow.
•
Check that there is a module or blank plate in every module slot in the enclosure.
•
If none of the above explanations apply, replace the controller module that logged the error.
When the problem is resolved, event 468 is logged.
476
Warning The storage system is operational but I/O performance is reduced.
Recommended actions:
•
Check that the storage system’s fans are running.
•
Check that the ambient temperature is not too warm. The enclosure operating range is 5°C–40°C
(41°F–104°F).
•
Check for any obstructions to the airflow.
•
Check that there is a module or blank plate in every module slot in the enclosure.
•
If none of the above explanations apply, replace the controller module that logged the error.
When the problem is resolved, event 478 is logged.
477
Info.
The storage system is operational but I/O performance is reduced.
Recommended actions:
•
Check that the storage system’s fans are running.
•
Check that the ambient temperature is not too warm. The enclosure operating range is 5°C–40°C
(41°F–104°F).
•
Check for any obstructions to the airflow.
•
Check that there is a module or blank plate in every module slot in the enclosure.
•
If none of the above explanations apply, replace the controller module that logged the error.
When the problem is resolved, event 478 is logged.
478
Info.
This event indicates that a problem reported by event 476 or 477 is resolved.
Recommended actions:
•
No action is required.
479
Error
The controller reporting this event was unable to flush data to or restore data from non-volatile memory.
This most likely indicates a CompactFlash failure, but it could be caused by some other problem with the
controller module. The Storage Controller that logged this event will be killed by its partner controller, which will
use its own copy of the data to perform the flush or restore operation.
Recommended actions
•
If this is the first time this event has been logged, restart the killed Storage Controller.
•
If this event is then logged again, replace the controller module.
480
Error
An IP address conflict was detected for the indicated iSCSI port of the storage system. The indicated IP address is
already in use.
Recommended actions
•
Contact your data-network administrator to help resolve the IP address conflict.
481
Error
The periodic monitor of CompactFlash hardware detected an error. The controller was put in write-through mode,
which reduces I/O performance.
Recommended actions
•
Restart the Storage Controller that logged this event.
•
If this event is logged again, shut down the Storage Controller and replace the controller module.
482
Warning One of the PCIe buses is running with fewer lanes than it should.
This event is the result of a hardware problem that has caused the controller to use fewer lanes. The system works
with fewer lanes, but I/O performance is degraded.
Recommended actions
•
Replace the controller module that logged this event.
483
Error
An invalid expansion-module connection was detected for the indicated disk channel. An egress port is connected
to an egress port, or an ingress port is connected to an incorrect egress port.
Recommended actions
•
Visually trace the cabling between enclosures and correct the cabling.
484
Warning No compatible spares are available to reconstruct this vdisk if it experiences a disk failure.
This situation puts data at increased risk because it will require user action to configure a disk as a global spare
before reconstruction can begin on the indicated vdisk if a disk in that vdisk fails in the future.
If the last global spare has been deleted or used for reconstruction, all vdisks are at increased risk.
Note that even though there may be global spares still available, they cannot be used for reconstruction of a vdisk
if that vdisk uses larger-capacity disks or a different type of disk; so this event may be logged even when there are
unused global spares.
Recommended actions
•
Configure disks as global spares. Be sure to obtain replacement disks from Customer Support.
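The compatibility rule described for event 484 (a spare cannot be used if the vdisk uses larger-capacity disks or a different type of disk) can be sketched in Python as follows. The Disk record, its fields, and the capacity values are hypothetical and serve only to illustrate the rule as stated in this guide.

```python
# Illustrative sketch only. The Disk record and its fields are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    disk_type: str    # for example "enterprise SAS", "midline SAS", or "SAS SSD"
    capacity_gb: int  # usable capacity in GB (assumed unit)


def spare_is_compatible(spare: Disk, vdisk_member: Disk) -> bool:
    """A spare is usable only if it has the same type and is not smaller."""
    return (spare.disk_type == vdisk_member.disk_type
            and spare.capacity_gb >= vdisk_member.capacity_gb)


def vdisk_lacks_spare(spares: list, vdisk_member: Disk) -> bool:
    """Return True if no configured global spare could reconstruct the vdisk."""
    return not any(spare_is_compatible(s, vdisk_member) for s in spares)
```

This shows why event 484 can be logged even when unused global spares exist: a 600 GB enterprise SAS spare does not protect a vdisk built from 900 GB disks or from midline SAS disks.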
485
Warning The indicated vdisk was quarantined to prevent writing invalid data that may exist in the controller that logged this
event.
This event is logged to report that the indicated vdisk has been put in the quarantined offline state (status of
QTOF) to prevent loss of data. The controller that logged this event has detected (via information saved in the
vdisk metadata) that it may contain outdated data that should not be written to the vdisk. Data may be lost if you
do not follow the recommended actions carefully. This situation is typically caused by removal of a controller
module without shutting it down first, then inserting a different controller module in its place. To avoid having this
problem occur in the future, always shut down the Storage Controller in a controller module before removing it.
This situation may also be caused by failure of the CompactFlash card, as indicated by event 204.
Recommended actions
•
If event 204 is logged, follow the recommended actions for event 204.
•
If event 204 is NOT logged, perform the following recommended actions:
•
If event 486 is not logged at approximately the same time as event 485, reinsert the removed controller
module, shut it down, then remove it again.
•
If events 485 and 486 are both logged at approximately the same time, wait at least 5 minutes for the
automatic recovery process to complete. Then sign in and confirm that both controller modules are
operational. (You can determine if the controllers are operational with the show controllers CLI
command or with the WBI.) In most cases, the system will come back up and no further action is required.
If both controller modules do not become operational in 5 minutes, data may have been lost. If both
controllers are not operational, follow this recovery process:
•
Remove the controller module that first logged event 486.
•
Turn off the power for the controller enclosure, wait a few seconds, then turn it back on.
•
Wait for the controller module to restart, then sign in again.
•
Check the status of the vdisks. If any of the vdisks have a status of quarantined offline (QTOF),
dequarantine those vdisks.
•
Reinsert the previously removed controller module. It should now restart successfully.
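The decision sequence for event 485 can be summarized in a short Python sketch. It only restates the branching described above as an aid to reading; the function name, parameters, and returned strings are hypothetical, and the sketch is not a substitute for following the recommended actions carefully.

```python
# Illustrative sketch only. It restates the branching above; names are hypothetical.
from typing import Optional


def next_step_for_event_485(event_204_logged: bool,
                            event_486_logged: bool,
                            both_operational_after_wait: Optional[bool] = None) -> str:
    """Return a short description of the next step for event 485.

    both_operational_after_wait is None until the 5-minute wait has elapsed.
    """
    if event_204_logged:
        return "follow the recommended actions for event 204"
    if not event_486_logged:
        return "reinsert the removed controller module, shut it down, then remove it again"
    if both_operational_after_wait is None:
        return "wait at least 5 minutes, then confirm that both controller modules are operational"
    if both_operational_after_wait:
        return "no further action is required"
    return "follow the manual recovery process"
```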
486
Info.
A recovery process was initiated to prevent writing invalid data that may exist in the controller that logged this
event. The controller that logged this event has detected (via information saved in the vdisk metadata) that it may
contain outdated data that should not be written to the vdisks. The controller will log this event, restart the partner
controller, wait 10 seconds, then kill itself. The partner controller will then unkill this controller and mirror the
correct cache data to it. This procedure will, in most cases, allow all data to be correctly written without any loss
of data and without writing any outdated data.
Recommended actions
•
Wait at least 5 minutes for the automatic recovery process to complete. Then sign in and confirm that both
controller modules are operational. (You can determine if the controllers are operational with the show
redundancy-mode CLI command or the System Redundancy table in the System Overview panel of
RAIDar.) In most cases, the system will come back up and no action is required.
•
If both controller modules do not become operational in 5 minutes, see the recommended actions for event
485, which will be logged at approximately the same time.
487
Info.
Historical performance statistics were reset.
Recommended actions
•
No action is required.
488
Info
Creation of a group of volumes started.
Recommended actions
•
No action is required.
489
Info
Creation of a group of volumes completed.
Recommended actions
•
No action is required.
490
Info
Creation of a group of volumes failed.
Recommended actions
•
No action is required.
491
Info
Creation of a local group of volumes started.
Recommended actions
•
No action is required.
492
Info
A local group of volumes was dissolved.
Recommended actions
•
No action is required.
493
Info
A local group of volumes was modified.
Recommended actions
•
No action is required.
495
Warning The algorithm for best-path routing selected the alternate path to the indicated disk because the I/O error count on
the primary path reached its threshold.
The controller that logs this event indicates which channel (path) has the problem. For example, if the B controller
logs the problem, the problem is in the chain of cables and expansion modules connected to the B controller
module.
Recommended actions
•
If this event is consistently logged for only one disk in an enclosure, perform the following actions:
•
Replace the disk.
•
If that does not resolve the problem, the fault is probably in the enclosure midplane. Replace the
chassis-and-midplane FRU for the indicated enclosure.
•
If this event is logged for more than one disk in an enclosure or disks in multiple enclosures, perform the
following actions:
•
Check for disconnected SAS cables in the bad path. If no cables are disconnected, replace the cable
connecting to the ingress port in the most-upstream enclosure with reported failures. If that does not
resolve the problem, replace other cables in the bad path, one at a time until the problem is resolved.
•
If that does not resolve the problem, replace the expansion modules that are in the bad path. Begin with the
most-upstream module that is in an enclosure with reported failures. If that does not resolve the problem,
replace other expansion modules (and the controller module) upstream of the affected enclosure(s), one at
a time until the problem is resolved.
•
If that does not resolve the problem, the fault is probably in the enclosure midplane. Replace the
chassis-and-midplane FRU of the most-upstream enclosure with reported failures. If that does not resolve
the problem and there is more than one enclosure with reported failures, replace the chassis-and-midplane
FRU of the other enclosures with reported failures until the problem is resolved.
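The first triage decision for event 495 depends on how many distinct disks the event has been logged for. The Python sketch below illustrates that split, assuming the affected disks have been collected as (enclosure, slot) pairs; this representation and the function name are hypothetical.

```python
# Illustrative sketch only. The (enclosure, slot) representation is hypothetical.
def triage_event_495(affected_disks: set) -> str:
    """Classify event 495 occurrences by the number of distinct disks affected.

    affected_disks holds (enclosure_id, slot) tuples, one per disk for which
    the event was consistently logged.
    """
    if not affected_disks:
        raise ValueError("no affected disks supplied")
    if len(affected_disks) == 1:
        return "single disk: replace the disk, then suspect the enclosure midplane"
    return "multiple disks: check cables, then expansion modules, then midplanes"
```

Passing a set removes duplicates, so repeated log entries for the same disk still count as a single disk.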
496
Warning An unsupported disk type was found.
Recommended actions
•
Replace the disk with a supported type.
497
Info
A disk copyback operation started. The indicated disk is the source disk. NOTE: Slot Affinity and the Copyback
process are not supported on NEXIO Farad.
When a disk fails, reconstruction is performed using a spare disk. When the failed disk is replaced, the data that
was reconstructed on the spare disk (and any new data that was written to it) is copied to the disk in the slot where
the data was originally located. This is known as slot affinity. For the copyback operation, the reconstructed disk is
called the source disk, and the newly replaced disk is called the destination disk. All of the data is copied from the
source disk to the destination disk and the source disk then becomes a spare disk again.
Recommended actions
•
No action is required.
498
Warning A disk copyback operation failed. NOTE: Slot Affinity and the Copyback process are not supported on NEXIO
Farad.
When a disk fails, reconstruction is performed using a spare disk. When the failed disk is replaced, the data that
was reconstructed in the spare disk (and any new data that was written to it) is copied to the disk in the slot where
the data was originally located. However, this copyback operation failed. This is probably because the disk that
was inserted as a replacement for the failed disk is also faulty. This failure could also be caused by a fault in the
midplane that the disk is inserted into.
Recommended actions
•
Replace the destination disk with one of the same type (SAS SSD, enterprise SAS, or midline SAS) and the
same or greater capacity. For continued optimum I/O performance, the replacement disk should have
performance that is the same as or better than the one it is replacing. (See event 499 to identify the destination
disk.)
•
If the problem then recurs for the same slot, replace the chassis-and-midplane FRU.
Info
A disk copyback operation completed. NOTE: Slot Affinity and the Copyback process are not supported on
NEXIO Farad.
Recommended actions
•
If the event message indicates that one or more uncorrectable media errors occurred during the copyback,
some user data may have been lost. Use backup copies of the data or other means to restore any lost data.
•
Otherwise, no action is required.
499
Info
A disk copyback operation started. The indicated disk is the destination disk. NOTE: Slot Affinity and the
Copyback process are not supported on NEXIO Farad.
When a disk fails, reconstruction is performed using a spare disk. When the failed disk is replaced, the data that
was reconstructed in the spare disk (and any new data that was written to it) is copied to the disk in the slot where
the data was originally located. This is known as slot affinity. For the copyback operation, the reconstructed disk is
called the source disk, and the newly replaced disk is called the destination disk. All of the data is copied from the
source disk to the destination disk and the source disk then becomes a spare disk again.
Recommended actions
•
If the event message indicates that one or more uncorrectable media errors occurred during the copyback,
some user data may have been lost. Use backup copies of the data or other means to restore any lost data.
•
Otherwise, no action is required.
500
Info
A disk copyback operation completed. The indicated disk was restored to being a spare. NOTE: Slot Affinity and
the Copyback process are not supported on NEXIO Farad.
When a disk fails, reconstruction is performed using a spare disk. When the failed disk is replaced, the data that
was reconstructed in the spare disk (and any new data that was written to it) is copied to the disk in the slot where
the data was originally located. This is known as slot affinity. For the copyback operation, the reconstructed disk is
called the source disk, and the newly replaced disk is called the destination disk. All of the data is copied from the
source disk to the destination disk and the source disk then becomes a spare disk again.
Recommended actions
•
No action is required.
501
Error
The enclosure hardware is not compatible with the I/O module firmware.
The Expander Controller firmware detected an incompatibility with the midplane type. As a preventive measure,
disk access was disabled in the enclosure.
Recommended actions
•
Ensure that incompatible components have not been inserted. If this is not the case, contact your vendor.
502
Error
A solid-state disk (SSD) is nearing its end of life. This event is logged with Error severity when the SSD has 1%
of its life remaining. NOTE: Slot Affinity and the Copyback process are not supported on NEXIO Farad.
Recommended actions
•
Replace the SSD with one of the same type and capacity.
Warning A solid-state disk (SSD) is nearing its end of life. This event is logged with Warning severity when the SSD has
5% of its life remaining. NOTE: Slot Affinity and the Copyback process are not supported on NEXIO Farad.
When the device has 1% of its life left, this event will be logged again with a severity of Error.
Recommended actions
•
Be sure you have a spare SSD of the same type and capacity available.
•
If a spare is available, it is recommended to replace the SSD now.
Info.
A solid-state disk (SSD) is nearing its end of life. This event is logged with Informational severity when the SSD
has 20% of its life remaining. NOTE: Slot Affinity and the Copyback process are not supported on NEXIO Farad.
When the device has 5% of its life left, this event will be logged again with a severity of Warning.
Recommended actions
•
You should obtain a replacement SSD of the same type and capacity if you do not already have one available.
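The three severity levels for event 502 correspond to the remaining-life figures of 20%, 5%, and 1% given above. The Python sketch below maps a remaining-life percentage to a severity. It assumes that each threshold applies at or below the stated percentage, which this guide does not spell out; the function name is hypothetical.

```python
# Illustrative sketch only. Assumes thresholds apply "at or below" each value.
from typing import Optional


def ssd_event_502_severity(life_remaining_pct: float) -> Optional[str]:
    """Return the event 502 severity for a remaining-life percentage, or None."""
    if life_remaining_pct <= 1:
        return "Error"    # replace the SSD
    if life_remaining_pct <= 5:
        return "Warning"  # have a spare ready; replacing now is recommended
    if life_remaining_pct <= 20:
        return "Info"     # obtain a replacement SSD
    return None           # above 20%: event 502 is not expected
```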
Troubleshooting steps for leftover disk drives
Storage systems use metadata on hard drives to identify vdisk members and identify other disk members of the
vdisk.
Hard drives enter a Leftover state for several reasons:
•
Drive spin up was not completed before a controller polled the drive. When the controller queries the drive and
finds the drive is not in a ready state, the controller may place the drive into a Leftover state.
•
Improper power-on sequences.
•
Firmware upgrade (due to a timing issue).
•
Failover taking longer than expected.
•
The drive is swapped from another system, or removed and reinserted in the storage system.
NOTE: Swapping drives between systems or removing and then reinserting a drive is not supported on Farad
systems due to the effects of clearing metadata on a Leftover drive. If clear metadata is used, you can expect a
temporary loss of I/O to all the drives in the same stack.
Metadata on a disk identifies the disk as being a member of a vdisk. Improperly clearing the metadata from a disk
may cause permanent data loss.
CAUTION: Clearing metadata from a leftover drive should be done with extreme care and only with the advice
and consent of Customer Support. Only clear metadata if you are certain the drive has never been associated with
a vdisk in this system or contains no data. This situation most often occurs when inserting a previously used hard
drive into a live system or moving a drive between two systems.
Never clear metadata from a drive if any vdisk in the storage system is in an Offline, Quarantined, or inaccessible
state. Do not clear metadata from a drive if you are unsure this is the correct step to take. Clearing metadata from
a drive permanently clears all data from the drive. In these types of situations, a backup of data should be done if
possible.
Using the trust command
NOTE: Generally, the trust command is not used on NEXIO Farad. You should only use this command if you
are directed to do so by Customer Support.
The CLI trust command should only be used as a last step in a disaster recovery situation. This command has
the potential to cause permanent data loss and unstable operation of the vdisk. If a vdisk with a single disk is in a
leftover or failed condition, the trust command should never be used. The trust command should only be
used if the vdisk is in an Offline state.
If a single disk in a vdisk has failed or been placed into a Leftover state due to errors, reintegrating the disk into
the same or a different vdisk has the potential to cause data loss. A hard drive that has failed or been placed into a
Leftover state due to multiple errors should be replaced with a new hard drive. Assign the new hard drive back to
the vdisk as a spare and allow reconstruction to complete in order to return the vdisk to a fault-tolerant state.
The trust command attempts to resynchronize leftover disks in order to make any leftover disk an active
member of the vdisk again. The user might need to take this step when a vdisk is offline because there is no data
backup, or as a last attempt to try to recover the data on a vdisk. In this case, trust may work, but only as long
as the leftover disk continues to operate. When the "trusted" vdisk is back online, back up all data on the vdisk and
verify all data to ensure it is valid. The user then needs to delete the trusted vdisk, create a new vdisk, and restore
data from the backup to the new vdisk.
IMPORTANT: Using trust on a vdisk is only a disaster-recovery measure; the vdisk has no tolerance for
additional failures and should never be put back into a production environment.
CAUTION: Before trusting a vdisk, carefully read the cautions and procedures for using the trust command
in the CLI Reference Guide and online help.
Once the trust command has been issued on a vdisk, further troubleshooting steps may be limited to
disaster recovery. If you are unsure of the correct action to take, contact technical support for further assistance.
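The conditions stated in this section can be summarized as a short pre-check. The following Python sketch is illustrative only: the function name, its parameters, and the returned messages are assumptions made for this example and are not part of the CLI. It encodes only the rules given above and does not replace the cautions in the CLI Reference Guide.

```python
def trust_precheck(vdisk_state, bad_disk_count, directed_by_support):
    """Return (allowed, reason) for considering the CLI trust command.

    vdisk_state: vdisk state as text, for example "Offline" (assumed input).
    bad_disk_count: number of disks in a leftover or failed condition.
    directed_by_support: True only if Customer Support asked for trust.
    """
    if not directed_by_support:
        return False, "use trust only when directed by Customer Support"
    if vdisk_state.strip().lower() != "offline":
        return False, "trust applies only to a vdisk in the Offline state"
    if bad_disk_count == 1:
        # A single leftover or failed disk should be replaced instead.
        return False, "replace the disk and allow reconstruction to complete"
    return True, "last resort: back up and verify data, then recreate the vdisk"


if __name__ == "__main__":
    print(trust_precheck("Offline", 2, True))
    print(trust_precheck("Offline", 1, True))
```

A result of True only means that the stated preconditions are met. The trusted vdisk still has no tolerance for additional failures.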
Power supply faults and recommended actions
Table 24 Power supply faults and recommended actions

Fault: Power supply fan warning or failure. Power supply warning or failure. Power supply module status is listed as failed or you receive a voltage event notification. (Event code 168.)
Recommended action:
• Check that all modules in the enclosure are properly seated in their slots and that their latches are locked.
• Check that each power cable is firmly plugged into both the power supply and a functional electrical outlet.
• Check that all of the enclosure’s fans are running.
• Check that the ambient temperature is not too warm. The enclosure operating range is 5°–40°C (41°–104°F).
• Check for any obstructions to the airflow.
• Check that there is a module or blank plate in every module slot in the enclosure.
• If none of the above resolve the issue, the indicated power supply has probably failed and should be replaced. The failed power supply will probably have an amber LED lit.

Fault: Power LED is off.
Recommended action: Same as above.

Fault: Voltage/Fan Fault/Service Required LED is on.
Recommended action: Replace the power supply module.
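The ambient temperature check in Table 24 is simple arithmetic. The Python sketch below shows the range test and the Celsius to Fahrenheit conversion behind the 5°–40°C (41°–104°F) figures. The function and constant names are assumptions chosen for this example. How the ambient reading is obtained is outside the scope of the sketch.

```python
# Enclosure operating range in degrees Celsius, taken from Table 24.
OPERATING_RANGE_C = (5.0, 40.0)


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def ambient_in_range(celsius, limits=OPERATING_RANGE_C):
    """Return True if the ambient temperature is inside the operating range."""
    low, high = limits
    return low <= celsius <= high


if __name__ == "__main__":
    for reading in (22.5, 45.0):
        status = "within" if ambient_in_range(reading) else "outside"
        print(f"{reading} C ({c_to_f(reading)} F) is {status} the operating range")
```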
Events sent as indications to SMI-S clients
If the storage system’s SMI-S interface is enabled, the system will send events as indications to SMI-S clients so
that SMI-S clients can monitor system performance. For information about enabling the SMI-S interface, see the
chapter about configuring the system in the RAIDar User Guide.
The event categories below pertain to FRU assemblies and certain FRU components.
Table 25 Events and corresponding SMI-S indications
FRU/Event category | Corresponding SMI-S class | Operation status values that would trigger alert conditions
Controller | DHS_Controller | Down, Not Installed, OK
Hard Disk Drive | DHS_DiskDrive | Unknown, Missing, Error, Degraded, OK
Fan | DHS_PSUFan | Error, Stopped, OK
Power Supply | DHS_PSU | Unknown, Error, Other, Stressed, Degraded, OK
Temperature Sensor | DHS_OverallTempSensor | Unknown, Other, Error, Non-Recoverable Error, Degraded, OK
Battery/SuperCap | DHS_SuperCap | Unknown, Error, OK
FC Port | DHS_FCPort | Stopped, OK
SAS Port | DHS_SASTargetPort | Stopped, OK
iSCSI Port | DHS_ISCSIEthernetPort | Stopped, OK
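A monitoring script can hold Table 25 as a lookup table. The Python sketch below is a minimal illustration under stated assumptions: the dictionary and function names are invented for this example, and the code does not talk to an SMI-S provider. It only maps a FRU category to the class name and reports whether a status value is one of those listed in the table.

```python
# Category -> (SMI-S class, status values listed in Table 25).
SMIS_INDICATIONS = {
    "Controller": ("DHS_Controller", {"Down", "Not Installed", "OK"}),
    "Hard Disk Drive": ("DHS_DiskDrive",
                        {"Unknown", "Missing", "Error", "Degraded", "OK"}),
    "Fan": ("DHS_PSUFan", {"Error", "Stopped", "OK"}),
    "Power Supply": ("DHS_PSU",
                     {"Unknown", "Error", "Other", "Stressed", "Degraded", "OK"}),
    "Temperature Sensor": ("DHS_OverallTempSensor",
                           {"Unknown", "Other", "Error",
                            "Non-Recoverable Error", "Degraded", "OK"}),
    "Battery/SuperCap": ("DHS_SuperCap", {"Unknown", "Error", "OK"}),
    "FC Port": ("DHS_FCPort", {"Stopped", "OK"}),
    "SAS Port": ("DHS_SASTargetPort", {"Stopped", "OK"}),
    "iSCSI Port": ("DHS_ISCSIEthernetPort", {"Stopped", "OK"}),
}


def lookup_indication(category, status):
    """Return (SMI-S class name, True if status is listed for the category)."""
    smis_class, statuses = SMIS_INDICATIONS[category]
    return smis_class, status in statuses


if __name__ == "__main__":
    print(lookup_indication("Fan", "Stopped"))
    print(lookup_indication("FC Port", "Degraded"))
```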
B System LEDs
This appendix describes system LEDs for NEXIO Farad 2300 Series controller enclosures.
12-disk enclosure front panel LEDs
Figure 16 Farad 2300 Front Bezel
Figure 17 Farad 2300 Without Bezel (labels: left ear, right ear; disk slots 0 to 11; LED callouts 1 to 7, with silk screens on bezel)
Note: Bezel is removed to show front panel LEDs. Integers on disks indicate disk slot numbering sequence.
Table 26 LEDs: Enclosure Front Panel
LED | Description | Definition
1 Enclosure ID | Green (on) | Enables you to correlate the enclosure with logical views presented by management software. Sequential enclosure ID numbering of controller enclosures begins with the integer 0. The enclosure ID for an attached disk enclosure is nonzero.
2 Disk drive, upper LED | | See Disk drive LEDs on page 142.
3 Disk drive, lower LED | | See Disk drive LEDs on page 142.
4 Unit Locator | White (blink) | Enclosure is identified.
 | Off | Normal operation.
5 Fault/Service Required | Amber (on) | Enclosure-level fault condition exists. The event has been acknowledged but the problem needs attention.
 | Off | No fault condition exists.
6 FRU OK | Green (on) | The enclosure is powered on with at least one power supply operating normally.
 | Off | Both power supplies are off; the system is powered off.
7 Temperature Fault | Green (on) | The enclosure temperature is normal.
 | Amber (on) | The enclosure temperature is above threshold.
Disk drive LEDs
Figure 18 Disk Drive
Table 27 LEDs: Disk drive
LED No./Description | Color | State | Definition
1 Power/Activity | Green | On | The disk drive module is operating normally.
 | | Blink | The disk drive module is initializing; active and processing I/O; performing a media scan; or the vdisk is initializing or reconstructing.
 | | Off | If not illuminated and Fault is not illuminated, the disk is not powered on.
2 Fault | Amber | On | The disk has failed; experienced a fault; is a leftover; or the vdisk that it is associated with is down or critical.
 | | Blink | Physically identifies the disk; or locates a leftover (also see Blue).
 | | Off | If not illuminated and Power/Activity is not illuminated, the disk is not powered on.
 | Blue | Blink | Leftover disk from vdisk is located (alternates blinking amber).
Table 28 Disk Drive LEDs: Disk drive module LED behavior
Description | State | Color | Action
Disk drive OK, FTOL | Off | None | None
 | On (operating normally) | Green | On
 | OK to remove | Green | Blink
 | | Blue | On
 | Identifying self (offline/online) | Amber | Blink
Disk drive I/O | Initializing | Green | Blink
 | Active and processing I/O | Green | Blink
 | Performing a media scan | Green | Blink
Disk drive leftover | Disk drive is a leftover | Amber | On
 | Identifying a leftover | Amber (note 2) | Blink
 | | Blue (note 1) | On
Disk drive failed | Fault or failure | Amber | On
 | Fault and remove disk drive | Amber | On
 | | Blue | On
 | Fault and identify disk drive | Amber | Blink
 | Fault, identify, and remove disk drive | Amber | Blink
 | | Blue | On
Note 1: This color may or may not illuminate.
Note 2: Bitonal LED blinks amber/green.
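The Power/Activity and Fault states in Table 27 can be read as a small decision sequence. The following Python sketch is one possible encoding, offered as an illustration only: the function name and the "on", "off", and "blink" strings are assumptions for this example, and the blue leftover indication is left out to keep the sketch short.

```python
VALID_STATES = {"on", "off", "blink"}


def describe_disk_leds(power_activity, fault):
    """Summarize Table 27 for one disk from its two observed LED states."""
    for state in (power_activity, fault):
        if state not in VALID_STATES:
            raise ValueError(f"unknown LED state: {state!r}")
    # The Fault LED takes priority because it signals a condition to act on.
    if fault == "on":
        return ("disk has failed, experienced a fault, is a leftover, "
                "or its vdisk is down or critical")
    if fault == "blink":
        return "disk is being physically identified, or a leftover is being located"
    # From here on the Fault LED is off.
    if power_activity == "off":
        return "disk is not powered on"
    if power_activity == "blink":
        return ("disk is initializing, processing I/O, performing a media scan, "
                "or its vdisk is initializing or reconstructing")
    return "disk drive module is operating normally"


if __name__ == "__main__":
    print(describe_disk_leds("on", "off"))
    print(describe_disk_leds("off", "off"))
```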
Table 29 LEDs: Vdisk (Vdisk LED behavior)
Description | State | Color | Action
FTOL | On (operating normally) | Green | Blink
Vdisk activity | Vdisk is reconstructing | Green | Blink
 | Vdisk is initializing | Green | Blink
Vdisk degraded | Vdisk is critical/down | See note 1 below |
Note 1: Individual disks will display fault LEDs.
Controller enclosure: Rear panel layout
A controller enclosure accommodates two power supply FRUs of the same type (either both AC or both DC)
within the two power supply slots. The controller enclosure accommodates up to two controller module FRUs of
the same type within the IOM slots.
Figure 19 and Table 30 provide descriptions for the controller modules and power supply modules that are used on
a NEXIO Farad 2300 Series controller enclosure. Showing controller modules and power supply modules
separately from the enclosure enables improved clarity in identifying the component items called out in the
diagrams and described in the tables.
LED descriptions are also provided for optional disk enclosures supported by the NEXIO Farad 2300 Series
controller enclosures.
NEXIO Farad 2300 Controller Modules: Rear panel LEDs
Figure 19 NEXIO Farad 2300 Controller Modules (callouts 1 to 9 identify the LEDs described in Table 30; faceplate labels include FC 0 to FC 3, 6Gb/s, 2,4G, 8G, CACHE, CLI, ACT, LINK, and SERVICE)
Table 30 LEDs: NEXIO Farad Controller modules, rear panel
LED No./Description | Color | State | Definition
1 Host 2/4/8 Gbit FC Link Status/Link Activity | Green | On | Port is connected and the link is up. 2,4 G LED illuminates: link speed is 2 or 4 Gbit/s. 8 G LED illuminates: link speed is 8 Gbit/s.
 | | Off | Both LEDs off: link speed is 1 Gbit/s (see note 1).
 | | Blink | 1 Hz: no link detected.
2 Network Port Activity | Green | Off | Ethernet link has no I/O activity.
 | | Blink | Ethernet link has I/O activity.
3 Network Port Link Status | Green | On | The Ethernet link is up.
 | | Off | Ethernet port is not connected or the link is down.
4 OK to Remove | Blue | On | The controller module can be removed.
 | | Off | The controller module is not prepared for removal.
5 Unit Locator | White | Off | Normal operation.
 | | Blink | Physically identifies the controller module.
6 FRU OK | Green | On | Controller module is operating normally.
 | | Off | Controller module is not OK.
 | | Blink | System is booting.
7 Fault/Service Required | Amber | On | A fault is detected or a service action is required.
 | | Blink | Hardware-controlled power-up, or a cache flush or restore error.
8 Cache Status | Green | On | Cache is dirty and operation is normal.
 | | Off | Cache is clean (contains no unwritten data).
 | | Blink | CompactFlash flush or cache self-refresh is in progress, indicating cache activity. (See Cache Status LED details on page 145)
9 Expansion Port Status | Green | On | Port is connected and the link is up.
 | | Off | Port is empty or link is down.
Note 1: The 8 Gbit SFP modules do not support 1 Gbit link speeds.
NOTE:
Once a Link Status LED is lit, it remains so, even if the controller is shut down via RAIDar or CLI.
When a controller is shut down or otherwise rendered inactive its Link Status LED remains illuminated, falsely
indicating that the controller can communicate with the host. Though a link exists between the host and the chip
on the controller, the controller is not communicating with the chip. To reset the LED, the controller must be
power-cycled.
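The host port rows of Table 30 combine two LEDs (2,4G and 8G) into one link reading. The Python sketch below shows one way to express that combination. It is a sketch under assumptions: the function name and the state strings are invented for this example, and treating a blink on either LED as "no link detected" is an interpretation of the table. As the note above explains, a lit LED is not proof that a shut-down controller is communicating.

```python
VALID_STATES = {"on", "off", "blink"}


def fc_link_status(led_2_4g, led_8g):
    """Interpret the two host port speed LEDs described in Table 30."""
    for state in (led_2_4g, led_8g):
        if state not in VALID_STATES:
            raise ValueError(f"unknown LED state: {state!r}")
    if "blink" in (led_2_4g, led_8g):
        return "no link detected"            # 1 Hz blink
    if led_2_4g == "on" and led_8g == "on":
        # Table 30 does not describe both speed LEDs lit together.
        raise ValueError("combination not described in Table 30")
    if led_8g == "on":
        return "link up at 8 Gbit/s"
    if led_2_4g == "on":
        return "link up at 2 or 4 Gbit/s"
    return "link speed is 1 Gbit/s"          # both LEDs off


if __name__ == "__main__":
    print(fc_link_status("off", "on"))
    print(fc_link_status("blink", "off"))
```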
Cache Status LED details
If the LED is blinking evenly, a cache flush is in progress. When a controller module loses power and write cache is dirty
(contains data that has not been written to disk), the supercapacitor pack provides backup power to flush (copy) data from write
cache to CompactFlash memory. When cache flush is complete, the cache transitions into self-refresh mode.
If the LED is blinking momentarily slowly, the cache is in a self-refresh mode. In self-refresh mode, if primary power is
restored before the backup power is depleted (3–30 minutes, depending on various factors), the system boots, finds data
preserved in cache, and writes it to disk. This means the system can be operational within 30 seconds, and before the typical
host I/O time-out of 60 seconds, at which point system failure would cause host-application failure. If primary power is
restored after the backup power is depleted, the system boots and restores data to cache from CompactFlash, which can take
about 90 seconds.
The cache flush and self-refresh mechanism is an important data protection feature; essentially four copies of user data are
preserved: one in controller cache and one in CompactFlash of each controller.
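The two recovery paths described above depend only on whether primary power returns before the backup power is depleted. The following Python sketch captures that comparison. The function name, the path labels, and the use of seconds are assumptions for this example. The 30 and 90 second figures are the approximate times quoted above, not measured values.

```python
def recovery_after_power_loss(outage_seconds, backup_seconds):
    """Return (recovery path, approximate seconds until operational).

    outage_seconds: how long primary power was absent.
    backup_seconds: how long the backup power lasted (the text gives
    3 to 30 minutes, depending on various factors).
    """
    if outage_seconds < backup_seconds:
        # Cache stayed in self-refresh: data is still in cache and is
        # written to disk after boot.
        return "self-refresh", 30
    # Backup power ran out: cache is restored from CompactFlash.
    return "compactflash-restore", 90


if __name__ == "__main__":
    print(recovery_after_power_loss(120, 180))
    print(recovery_after_power_loss(3600, 180))
```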
Power supply LEDs
Power redundancy is achieved through two independent load-sharing power supplies. In the event of a power
supply failure, or the failure of the power source, the storage system can operate continuously on a single power
supply. Greater redundancy can be achieved by connecting the power supplies to separate circuits. AC power
supplies have a power switch and will power up when the cables are plugged in.
Figure 20 PSUs (AC model shown; callouts 1 and 2 identify the LEDs described in Table 31)
Table 31 LEDs: PSUs, rear panel
LED No./Description | Color | State | Definition
1 Input Source Power Good | Green | On | Power is on and input voltage is normal.
 | | Off | Power is off, or input voltage is below the minimum threshold.
2 Voltage/Fan Fault/Service Required | Amber | On | Output voltage is out of range, or a fan is operating below the minimum required r/min.
 | | Off | Output voltage is normal.
C Available FRUs
You can determine which FRUs pertain to your controller enclosure using the CLI. Access the controller via a
telnet client; log into the controller over the network (default user name manage and password !manage). If the
default user or password — or both — have been changed for security reasons, enter the secure login credentials
instead of the defaults shown above.
Enter a show frus query.
Execution of the show frus CLI command displays the FRU information pertaining to chassis (with
midplane), controller module(s), and power supplies.
NOTE:
See NEXIO Farad 2300 Series CLI Reference Guide for more information.
You can also determine which FRUs pertain to your controller enclosure by visual inspection of the component,
noting serial number and part number. This method applies to disk drives. FRUs and FRU make-up are subject to
change independent of documentation versions. Contact Customer Support to obtain replacement parts or spares.
Product overview
NOTE:
Tables and companion illustrations show FRUs for NEXIO Farad 2300 Series products.
Table 32 on page 148 shows components for the 3.5" 12-drive enclosure models (also 2U12).
Tables and supporting illustrations (following tables) show components for the NEXIO Farad 2300 Series product
line that can be ordered for replacement in the field. Contact your account manager for packaged FRU numbers
and ordering information. Data addressing NEXIO Farad 2300 Series products is provided to supplement the
illustrated replacement procedures described in Troubleshooting and replacing FRUs.
FRUs addressing 12-drive enclosures
Figure 21 Controller enclosure exploded view (callouts 1 to 8 identify the components listed in Table 32)
Figure 22 Controller enclosure assembly
Table 32 NEXIO Farad 2300 Series product components
Item | Enclosure component descriptions
1 | Disk Drive
2 | Ear (Not used on Farad 2300. A bezel is used instead.)
3 | Chassis
4 | Midplane (included with chassis)
5 | AC Power Supply (one shown)
6 | Controller Module for Enclosure (one shown)
7 | SFP
8 | Enclosure Cover
Not shown | Enclosure bezel sub-assembly featuring EMI shield and removable air filter (see Enclosure bezel for 12-drive model on page 149)
NOTE: The following illustrations visually describe Table 32 components for the chassis:
• Exploded view: Figure 23 on page 149
• Assembly (partial): Figure 24 on page 150 (shows bezel alignment)
• Assembly: Figure 25 on page 150 (shows bezel installed)
• Internal components sub-assembly: Figure 26 on page 151
Figure 23 Controller enclosure exploded view (callouts 1 to 7)
The enclosure illustration above intentionally does not show exploded ear covers. See Figure 24 on page 150 for
an illustration showing how the bezel aligns for attachment to the enclosure front panel, via insertion with bezel
edges inside the front of the chassis with magnets to hold the bezel in place.
Enclosure bezel for 12-drive model
The 12-drive enclosure includes a bezel sub-assembly that attaches to the front of the chassis (see Figure 24 on
page 150). The bezel — comprised of a vented cover attached to an EMI shield — is pre-assembled and
foam-packed within a box contained in the enclosure master shipping container. The bezel might optionally
include a removable air filter that can be serviced or replaced. Hard copy instructions for attaching/removing the
bezel, and for servicing or replacing the air filter, are provided in the shipping container of a new enclosure.
NOTE:
The air filter is optional and may or may not be used in your product.
CAUTION: Whether configured with or without an air filter, to ensure adequate EMI protection for the disk
drives, the bezel should be properly installed while the enclosure is in operation.
Figure 24 Partial controller enclosure assembly showing bezel alignment (2U12). Labels: ball stud on chassis ear (typical 4 places); enclosure bezel sub-assembly (EMI shield and removable air filter).
NOTE: Not an accurate drawing of the Farad bezel.
Figure 25 Controller enclosure assembly with bezel installed
Figure 26 Enclosure architecture: internal components sub-assembly (labels: PSU, IOM, disk drive module, midplane)
Glossary
Additional Sense
Code/Additional Sense
Code Qualifier
ASC/ASCQ.
Advanced Encryption
Standard
AES.
AES
Advanced Encryption Standard. A specification for the encryption of data using a symmetric-key
algorithm.
ASC/ASCQ
Additional Sense Code/Additional Sense Code Qualifier. Information on sense data returned by a
SCSI device.
atomic write
A mode that guarantees if a failure (such as I/O being aborted or a controller failure) interrupts a
data transfer between a host and the storage system, controller cache will contain either all the old
data or all the new data, not a mix of old and new data. This option has a slight performance cost
because it maintains a secondary copy of data in cache so that if a data transfer is not completed,
the old cache data can be restored.
auto-write-through
AWT.
available disk
A disk that is not being used in a vdisk, is not configured as a spare, and is not in the leftover state.
It is available to be configured as a part of a vdisk or as a spare.
AWT
Auto-write-through. A setting that specifies when the RAID controller cache mode automatically
changes from write-back to write-through.
CAPI
Configuration Application Programming Interface. A proprietary protocol used for
communication between the SC and the MC in a controller module. CAPI is always enabled.
chunk size
The amount of contiguous data that is written to a vdisk member before moving to the next
member of the vdisk.
CIM
Common Information Model. The data model for WBEM. It provides a common definition of
management information for systems, networks, applications and services, and allows for vendor
extensions.
CIM Query Language
CQL.
CIMOM
Common Information Model Object Manager. A component in CIM that handles the interactions
between management applications and providers.
CMIP
Common Management Interface Protocol. A model that allows modification of information on
managed objects.
comma separated values
CSV.
Common Information
Model
CIM.
Common Information
Model Object Manager
CIMOM.
Common Management
Interface Protocol
CMIP.
compatible disk
A disk that can be used to replace a failed member disk of a vdisk because it both has enough
capacity and is of the same type (SAS SSD, enterprise SAS, or midline SAS) as the disk that
failed.
complex programmable
logic device
CPLD.
Configuration
Application
Programming Interface
CAPI.
Coordinated Universal
Time
UTC.
CPLD
Complex programmable logic device. An electronic component used to build reconfigurable
digital circuits. It can replace large numbers of logic gates.
CQL
CIM Query Language.
CRC
Cyclic Redundancy Check. A mathematical algorithm that, when implemented in software or
hardware, can be used to detect errors in data.
CSV
Comma separated values. A format to store tabular data in plain-text form.
Cyclic Redundancy
Check
CRC.
DAS
Direct Attached Storage. A digital storage system attached directly to a server or workstation
without the use of a network.
Data Encryption
Standard
DES.
DDR
Double data rate. A class of memory integrated circuits used in computers.
dedicated spare
A disk that is reserved for use by a specific vdisk to replace a failed disk. See compatible disk.
default mapping
Host-access settings that are configured when a volume is created, and that apply to all hosts that
are not explicitly mapped to that volume using different settings. See also explicit mapping and
masking.
DES
Data Encryption Standard. An algorithm for the encryption of electronic data.
DHCP
Dynamic Host Configuration Protocol. A network configuration protocol for hosts on IP
networks.
Direct Attached Storage
DAS.
Distributed Management
Task Force
DMTF.
DMTF
Distributed Management Task Force. An industry organization that develops and maintains
standards for system management.
double data rate
DDR.
drive spin down
DSD.
DSD
Drive spin down. A power-saving feature that monitors disk activity in the storage system and
spins down inactive SAS disks based on user-selectable policies.
dual-port disk
A disk in a dual-controller environment connected to both controllers so it has two data paths,
achieving fault tolerance.
Dynamic Host
Configuration Protocol
DHCP.
dynamic spare
An available disk that is automatically assigned, if the dynamic spares option is enabled, to
replace a failed disk in a vdisk. See compatible disk.
EC
Expander controller. A processor located in the SAS expander in each controller module and
expansion module that controls the SAS expander and provides SES functionality. See also EMP.
electromagnetic
interference
EMI.
EMI
Electromagnetic interference.
EMP
Enclosure management processor. An EC subsystem that provides SES data such as temperature,
power supply and fan status, and the presence or absence of disks.
enclosure management
processor
EMP.
expander controller
EC.
explicit mapping
Access settings for a host to a volume that override the volume’s default mapping. See also
default mapping and masking.
extrinsic methods
Methods which are particular to a specific class in SMI-S.
FC-AL
Fibre Channel Arbitrated Loop. The FC topology in which devices are connected in a one-way
loop.
Fibre Channel Arbitrated
Loop
FC-AL.
field-programmable gate
array
FPGA.
field-replaceable unit
FRU.
FPGA
Field-programmable gate array. An integrated circuit designed to be configured after
manufacturing.
FRU
Field-replaceable unit. A part that can be removed and replaced by the user or support technician
without having to send the product to a repair facility.
global spare
A compatible disk that is reserved for use by any vdisk to replace a failed disk.
HBA
Host Bus Adapter. A device that facilitates I/O processing and physical connectivity between a
host and the storage system.
host port
A port on a controller module that interfaces to a host computer, either directly or through a
network switch.
host bus adapter
HBA.
Input/Output Module
IOM.
intrinsic methods
Methods inherited from CIM and present in all classes such as getclass, createinstance,
enumerateinstances, and associatorNames in SMI-S.
IOM
Input/Output Module. An IOM can be either a controller module or an expansion module.
large form factor
LFF.
LBA
Logical Block Address. The address used for specifying the location of a block of data.
leftover
The state of a disk that the system has excluded from a vdisk because the timestamp in the disk’s
metadata is older than the timestamp of other disks in the vdisk, or because the disk was not
detected during a rescan. A leftover disk cannot be used in another vdisk until the disk’s metadata
is cleared; for information and cautions about doing so, see documentation topics about clearing
disk metadata.
LFF
Large form factor. A type of disk drive.
LIP
Loop Initialization Primitive. An FC primitive used to determine the loop ID for a controller.
Logical Block Address
LBA.
Logical Unit Number
LUN.
loop
FC-AL topology.
Loop Initialization
Primitive
LIP.
LUN
Logical Unit Number. A number that identifies a mapped volume to a host.
MAC Address
Media Access Control Address. A unique identifier assigned to network interfaces for
communication on a network.
management controller
MC.
Management
Information Base
MIB.
masking
A volume-mapping setting that specifies no access to that volume by hosts. See also default
mapping and explicit mapping.
MC
Management controller. A processor located in a controller module that is responsible for
human-computer interfaces and computer-computer interfaces, including the WBI, CLI, and FTP
interfaces, and interacts with the SC.
Media Access Control
Address
MAC Address.
Memory reference code
MRC.
metadata
Data in the first sectors of a disk drive that stores all disk-, vdisk-, and volume-specific
information including vdisk membership or spare ID, vdisk ownership, volumes, host mapping of
volumes, and results of the last media scrub.
MIB
Management Information Base. A database used for managing the entities in SNMP.
MMF
Multimode fiber.
MRC
Memory reference code.
Multimode fiber
MMF.
native command queuing
NCQ.
NCQ
Native command queuing.
network port
An Ethernet port on a controller module through which its MC is connected to the network.
network time protocol
NTP.
NTP
Network time protocol.
object identifier
OID.
OID
Object Identifier. A number assigned to devices in a network for identification purposes.
orphan data
See unwritable cache data.
Partner Firmware
Upgrade
PFU.
PCBA
Printed circuit board assembly. A printed circuit board populated with electronic components.
persistent group
reservations
PGR.
PFU
Partner Firmware Upgrade. The automatic update of the partner controller when the user updates
firmware on one controller in a dual-controller system.
PGR
Persistent group reservations.
point-to-point
The FC topology where two ports are directly connected.
POST
Power-On Self Test. Tests that run immediately after a device is powered on.
Power-on Self Test
POST.
Power Supply Unit
PSU.
printed circuit board
assembly
PCBA.
PSU
Power Supply Unit. The power supply FRU.
real-time clock
RTC.
recipe
A pseudo-client code added to SMI-S to demonstrate usage of methods and associations.
RTC
Real-time clock. A circuit in the controller module that maintains the date and time. The RTC has
a battery backup that maintains the time even when there is no power attached to the module.
SC
Storage Controller. A processor located in a controller module that is responsible for RAID
controller functions. The SC is also referred to as the RAID controller.
SCSI Enclosure Services
SES.
secure hash algorithm
SHA.
secure shell
SSH.
Secure Sockets Layer
SSL.
SEEPROM
Serial electrically erasable programmable ROM. A type of nonvolatile (persistent if power
removed) computer memory used as FRU ID devices.
Self-Monitoring
Analysis and Reporting
Technology
SMART.
serial electrically
erasable programmable
ROM
SEEPROM.
Service Location
Protocol
SLP.
SES
SCSI Enclosure Services. The protocol that allows the initiator to communicate with the
enclosure using SCSI commands.
SFCB
Small Footprint CIM Broker.
SFF
Small form factor. A type of disk drive.
SHA
Secure Hash Algorithm. A cryptographic hash function.
SIM
Systems Insight Manager. HP-owned and operated technology supporting servers and storage
devices.
single-port disk
A disk with a single data path not connected to both controllers so it is not fault tolerant.
Single-port disk types are identified with the suffix -S.
SLP
Service Location Protocol. Enables computers and other devices to find services in a local area
network without prior configuration.
Small Footprint CIM
Broker
SFCB.
small form factor
SFF.
SMART
Self-Monitoring Analysis and Reporting Technology. A monitoring system for disk drives that
monitors reliability indicators for the purpose of anticipating disk failures and reporting those
potential failures.
SMI-S
Storage Management Initiative - Specification. The SNIA standard that enables interoperable
management of storage networks and storage devices.
The interpretation of CIM for storage. It provides a consistent definition and structure of data,
using object-oriented techniques.
SNIA
Storage Networking Industry Association. An association regarding storage networking
technology and applications.
SSH
Secure Shell. A network protocol for secure data communication.
SSL
Secure Sockets Layer. A cryptographic protocol that provides security over the internet.
storage controller
SC.
Storage Management
Initiative - Specification
SMI-S.
Storage Networking
Industry Association
SNIA.
Systems Insight
Manager
SIM.
UCS Transformation
Format - 8-bit
UTF-8.
ULP
Unified LUN Presentation. A RAID controller feature that enables a host to access mapped
volumes through either controller’s host ports. ULP incorporates ALUA extensions.
Unified LUN
Presentation
ULP.
Uninterruptible Power
Source
UPS.
unwritable cache data
Cache data that has not been written to disk and is associated with a volume that no longer exists
or whose disks are not online. If the data is needed, the volume’s disks must be brought online. If
the data is not needed it can be cleared. Unwritable cache data is also called orphan data.
UPS
Uninterruptible Power Source.
UTC
Coordinated universal time. The primary time standard by which the world regulates clocks and
time. It replaces Greenwich Mean Time.
UTF-8
UCS transformation format - 8-bit. A variable-width encoding that can represent every character
in the Unicode character set used for the CLI and WBI interfaces.
vdisk
A virtual disk comprised of the capacity of one or more physical disks. The number of disks that a
vdisk can contain is determined by its RAID level.
virtual disk
vdisk.
volume
A portion of the capacity of a vdisk that can be presented as a storage device to a host.
WBEM
Web-Based Enterprise Management. A set of management and internet standard technologies
developed to unify the management of enterprise computing environments. See also CIM.
web-based
interface/web-browser
interface
WBI.
WBI
Web-based interface/web-browser interface. The primary interface for managing the system. A
user can enable the use of HTTP, HTTPS for increased security, or both.
Web-Based Enterprise
Management
WBEM.
Windows Management
Instrumentation Query
Language
WQL.
World Wide Name
WWN.
World Wide Node Name
WWNN.
World Wide Port Name
WWPN.
WQL
Windows Management Instrumentation Query Language.
WWN
World Wide Name. A globally unique 64-bit number that identifies a device used in storage
technology.
WWNN
World Wide Node Name. A globally unique 64-bit number that identifies a node.
WWPN
World Wide Port Name. A globally unique 64-bit number that identifies a port.
Index
A
ALUA 153
ASC/ASCQ 153
Atomic Write 153
audience 11
C
cache
clearing 30
CAPI 153
chunk 153
CIM 153
CIMOM 153
CLI
accessing 147
default password 147
default user name 147
more information 147
show FRUs (show frus) command 147
CLI help, view command 29
CMPI 153
CompactFlash properties 20
controller module properties 19
controller redundancy mode, showing 50
conventions
document 12
CPLD 154
CRC 154
D
DAS 154
data paths
isolating faults 27
debug interface
enable/disable 40
debug log parameters
setting 36
viewing 41
dedicated spare 154
default configuration settings, restoring 35
DES 154
disabled PHY 28
disk
state (how used) values 17
disk drive
LEDs
general 142
disk drives
air management modules 80
identifying faulty disks 23
locating 23
disk metadata
clear 31
clearing 22
disk properties 17, 18
disks
clear metadata 31
show data transfer rate 19
DMTF 154
document
conventions 12
prerequisite knowledge 11
related documentation 11
DSD 154
dual-port disk 154
E
EC 154
electrostatic discharge 73
grounding methods 73
precautions 73
EMP 154
enclosure
viewing information about 18
enclosure properties 18
enclosure status, showing 51
enclosures, re-evaluate IDs 25
error code 97
errors
PHY 28
event code 60
event log
clear 32
viewing 21
event logs
viewing using RAIDar 59
event severity 59, 60
event severity icons 21
events, showing 42
expander fault isolation, enabling or disabling 37
expander PHYs, enabling or disabling 38
expander status and error counters, clearing 32
expander status, showing 44
expansion module properties 20
expansion port properties 20
Extrinsic Methods 155
F
fault isolation 28
faults
identifying
disk drive 23
isolating
a host-side connection 69
expansion port connection fault 70
methodology 13
faults and error conditions
PSU faults and recommended actions 84
FC-AL 155
firmware
dual controller 75
single controller 75
update 79
FPGA 155
FRU information, showing 47
FRUs
available for 3000 Series
determining FRU identifiers 147
enclosure assembly
2U12 (reduced-depth) 150
illustrated parts breakdown
2U12 (reduced-depth) 149
internal components sub-assembly
2U12 (reduced-depth) 151
P/N tables 147
FTP interface
enable/disable 40
G
global spares 155
H
HBA 155
host channel
See host ports
host link
See host ports
host port properties 20
host ports
reset 33
HTTP interface
enable/disable 40
HTTPS interface
enable/disable 40
I
I/O module properties 20
icons, event severity 21
In port properties 21
Intrinsic Methods 155
L
LBA 155
LED
illuminating drive module Power/Activity/Fault 39
illuminating enclosure Unit Locator 39
LEDs
2U12 front panel
Disk drive 141
Enclosure ID 141
FRU OK 142
Temperature Fault 142
Unit Locator 141
3000 Series controller enclosure rear panel 144
3720/3730 face plate
Cache Status 145
Expansion Port Status 145
Fault/Service Required 144
FRU OK 144
Link Activity 144
Link Status 144
Network Port Activity 144
Network Port Link Status 144
OK to Remove 144
Unit Locator 144
controller module 75, 92
disk drive module
LFF 83
Disk without dongle
Fault 142
Power/Activity 142
enclosure
rear panel 92
enclosure status
front panel 92
Power Supply Unit (PSU)
AC 146
DC 146
power supply unit (PSU)
AC 87
DC 87
leftover 155
leftover disk 22, 31
link rate adjustment 19
LIP 155
log file
event messages 97
events 60
LUN 155
M
MAC address 155
Management Controller 155
Management Controller, restarting 34
Management Information Base 155
MC 155
Media Access Control Address 155
metadata 156
clear disk 31
clearing disk 22
MIB 156
N
network port 156
network port properties 20
O
OID 156
orphan data 156
Out port properties 20, 21
P
partner firmware update (PFU) 75
PCBA 156
PFU 156
PHY
disabled 28
errors 28
fault isolation 28
fencing 28
rescan disks 28
point-to-point 156
POST 156
power supply properties 19
power supply unit (PSU) 83
AC PSU without power switch 84
power cable 86
AC 86
verifying component failure 84
power-on, problems after 25
prerequisite knowledge 11
procedures
general precaution 73
replacing a controller enclosure chassis 89
damaged chassis removal 91
replacement chassis installation 91
replacing a controller or expansion module 74
replacing a disk drive module 80
replacing a Fibre Channel SFP 87
replacing a PSU 83
replacing chassis FRUs 73
protocols, service and security
enabling or disabling 40
showing status of 50
R
RAIDar
locating a disk drive 23
Recipe 156
reconstruct 23, 63
redundancy mode, showing 50
related documentation 11
rescan
disks 28
restart, problems after 25
S
SAS expander. See expander and Expander Controller
SC 156
scheduling tasks 24
scrub
abort 30
See Advanced Encryption Standard 153
SEEPROM 156
sensors
locating 93
power supply 93
temperature 94
voltage 94
SES 157
SES interface
enable/disable 40
SFP transceiver
fibre-optic cable 87
small form-factor pluggable 87
SHA 157
SIM 157
single-port disk 157
SLP 157
SMART 157
SMI-S 157
event logs 97
SMI-S interface
enable/disable secure 40
enable/disable unsecure 40
SNIA 157
SNMP
enable/disable interface 40
event logs 97
SSH 157
SSH interface
enable/disable 40
SSL 157
statistics
show vdisk performance 53
show volume performance 57
Storage Controller, restarting 34
system
viewing event log 21
T
task scheduling 24
Telnet interface
enable/disable 40
troubleshooting
expansion port connection fault 70
host-side connection fault 69
U
ULP 157
UTC 158
V
vdisk
abort scrub 30
properties 16
viewing information about 16
vdisk status values 16
vdisks
show performance statistics 53
viewing information about all 16
virtual disk
reconstructing 24
volumes
show performance statistics 57
W
warnings
temperature 93
voltage 93
WBEM 158
WBI 158
WWN 158
WWPN 158