Ultrastar SSD400M OEM Specification


18.15.2 Recommendations for System Error Log

The system error log should contain information about the Drive error that will allow recovery actions. The system error logs should contain all the error information returned in the sense data. At a minimum, the following information about each error occurrence should be logged.

• Valid bit and error code (Sense byte 0)

• Sense Key (Sense byte 2)

• Information bytes (Sense bytes 3 through 6)

• Command specific information (Sense bytes 8 through 11)

• Additional Sense Code (Sense byte 12)

• Additional Sense Code Qualifier (Sense byte 13)

• Field Replaceable Unit (Sense byte 14)

• Sense Key Specific (Sense bytes 15, 16, and 17)

• Vendor Unique error information (Sense bytes 20 through 23)
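As an illustration of the list above, the following minimal sketch extracts the recommended fields from a sense data buffer. The function name `sense_log_record` and the dictionary keys are hypothetical. The bit positions inside bytes 0 and 2 (Valid in bit 7, error code in bits 6 to 0, Sense Key in the low four bits) are assumed from the standard SCSI fixed format sense data layout and are not spelled out in this section.

```python
def sense_log_record(sense: bytes) -> dict:
    """Collect the fields recommended for the system error log.

    Byte offsets follow the list above. Bit positions within bytes 0 and 2
    are an assumption taken from the standard fixed format sense layout.
    """
    if len(sense) < 24:
        raise ValueError("need at least 24 bytes of sense data")
    return {
        "valid": bool(sense[0] & 0x80),        # bit 7 of byte 0 (assumed)
        "error_code": sense[0] & 0x7F,         # bits 6..0 of byte 0 (assumed)
        "sense_key": sense[2] & 0x0F,          # low nibble of byte 2 (assumed)
        "information": sense[3:7].hex(),       # bytes 3 through 6
        "command_specific": sense[8:12].hex(), # bytes 8 through 11
        "asc": sense[12],                      # Additional Sense Code
        "ascq": sense[13],                     # Additional Sense Code Qualifier
        "fru": sense[14],                      # Field Replaceable Unit
        "sense_key_specific": sense[15:18].hex(),  # bytes 15, 16, and 17
        "vendor_unique": sense[20:24].hex(),   # bytes 20 through 23
    }
```

Multi-byte fields are kept as hexadecimal strings so that the log entry preserves the raw bytes without interpreting them.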

18.15.3 Data Recovery Procedure

No action can be taken on hard or soft read errors. Block retirement happens automatically based on the block retirement policy in the firmware. LBAs that report a hard read error will become readable after a write. Until a write command is received for the affected LBAs, a hard error will be reported on a read to the affected LBAs.

18.15.4 Nondata Error Recovery Procedure

The Drive will follow a logical recovery procedure for nondata errors. The initiator options for nondata errors are limited to logging the error, retrying the failing command, or replacing the drive.

These recovery procedures assume the initiator practices data back-up and logs errors at the system level for interrogation by service personnel.

18.15.4.1 Drive Busy

The Drive is busy performing an operation. This is not an error condition. The initiator can test for completion of the operation by issuing Test Unit Ready (00) (or media access) commands.

• If the Test Unit Ready (00) (or media access) command completes with Check Condition Status, then issue a Request Sense (03).

- If the specified recovery procedure for the sense data is for a condition other than drive busy, follow the recovery procedure for the condition reported.

- If the specified recovery procedure for the sense data is for a drive busy condition, then continue re-issuing the Test Unit Ready (00) and Request Sense commands for the duration of a media access time-out or until the drive returns Good Status.

- If the drive has been busy for longer than the limit specified in Section 18.14, “Command Time out Limits” on page 249, then service the drive using the service guidelines recommended in Section 18.15.1, “Drive Service Strategy” on page 251. Otherwise return to normal processing.

• If the Test Unit Ready (00) (or media access) command completes with Good Status, then return to normal processing.
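The polling procedure above could be sketched as follows. This is a minimal sketch, not a driver implementation: `test_unit_ready` and `sense_reports_busy` are hypothetical callables supplied by the host software, the status strings are placeholders, and `timeout_s` stands in for the limit from Section 18.14. The clock and sleep functions are parameters so the loop can be exercised without hardware.

```python
import time
from typing import Callable

GOOD = "GOOD"
CHECK_CONDITION = "CHECK_CONDITION"

def wait_while_busy(
    test_unit_ready: Callable[[], str],       # hypothetical: issues TUR (00), returns status
    sense_reports_busy: Callable[[], bool],   # hypothetical: issues Request Sense (03)
    timeout_s: float,                         # limit from Section 18.14 (caller supplies)
    poll_interval_s: float = 0.1,             # assumption: not specified in this section
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return 'ready', 'other_condition', or 'service_drive'."""
    deadline = clock() + timeout_s
    while True:
        if test_unit_ready() == GOOD:
            return "ready"                # return to normal processing
        # Check Condition Status: look at the sense data to find out why.
        if not sense_reports_busy():
            return "other_condition"      # follow the procedure for that condition
        if clock() >= deadline:
            return "service_drive"        # busy longer than the time-out limit
        sleep(poll_interval_s)
```

The caller is expected to map `"other_condition"` to the recovery procedure for whatever sense data was returned.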

18.15.4.2 Unrecovered Drive Error

The initiator should retry the failing command.

HGST Ultrastar SSD400M (SAS) Solid State Drive Specification

252

1. If the retry of the failing command completes with Good Status or recovered Sense Key, follow the recovery procedure in Section 18.15.4.3, “Recovered Drive Error” on page 253.

2. If the retry of the failing command completes with hardware error sense, verify there is no outside cause (e.g., power supply) for the failure, then retry the failing command.

a. If the retry of the failing command completes with Good Status, follow the recovery procedure in Section 18.15.4.3, “Recovered Drive Error” on page 253.

b. If the retry of the failing command completes with Recovered sense or Hardware error sense, then service the drive using the service guideline recommended in Section 18.15.1, “Drive Service Strategy” on page 251.
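The retry decisions in this procedure can be summarized as a small decision function. This is a sketch under stated assumptions: the outcome strings (`"good"`, `"recovered"`, `"hardware_error"`) and the returned action names are hypothetical labels, and outcomes that this section does not cover raise an error rather than guess at an action.

```python
from typing import Optional

RECOVERED = "follow_recovered_drive_error_procedure"  # Section 18.15.4.3
SERVICE = "service_drive"                             # Section 18.15.1

def unrecovered_error_action(first_retry: str,
                             second_retry: Optional[str] = None) -> str:
    """Map retry outcomes to the next initiator action (labels are illustrative)."""
    if first_retry in ("good", "recovered"):
        return RECOVERED
    if first_retry != "hardware_error":
        raise ValueError("outcome not covered by this procedure: " + first_retry)
    if second_retry is None:
        # The first retry failed with hardware error sense; the second
        # retry has not been attempted yet.
        return "verify_no_outside_cause_then_retry"
    if second_retry == "good":
        return RECOVERED
    if second_retry in ("recovered", "hardware_error"):
        return SERVICE
    raise ValueError("outcome not covered by this procedure: " + second_retry)
```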

18.15.4.3 Recovered Drive Error

The initiator should log the error as a soft error, together with the recovery level.

18.15.4.4 Drive Not Ready

The initiator should do the following:

1. Issue a Start Stop Unit (1B) command.

2. Verify that the drive comes ready within the time specified in Table 8, “SSD Response time” on page 9.

3. If the drive fails to come ready within the specified time, service the drive using the service guidelines specified in Section 18.15.1, “Drive Service Strategy” on page 251.

4. Retry the failing command.

a. If the failing command completes with Good Status, log the error as recovered.

b. If the failing command completes with Not Ready sense, verify there is no outside cause (for example, the power supply). Then service the drive using the service guidelines specified in Section 18.15.1, “Drive Service Strategy” on page 251.
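The four steps above might be wired together as in the sketch below. All three callables are hypothetical hooks into the host's SCSI layer, and `ready_timeout_s` stands in for the response time the specification gives in Table 8. The final branch, for outcomes other than Good Status or Not Ready sense, is an assumption, since this section does not cover that case.

```python
from typing import Callable

def recover_not_ready(
    start_unit: Callable[[], None],          # hypothetical: issues Start Stop Unit (1B)
    wait_ready: Callable[[float], bool],     # hypothetical: True if ready within the limit
    retry_command: Callable[[], str],        # hypothetical: "good", "not_ready", or other
    ready_timeout_s: float,                  # value from Table 8 (caller supplies)
) -> str:
    """Run the Drive Not Ready procedure and return the resulting action."""
    start_unit()                                    # step 1
    if not wait_ready(ready_timeout_s):             # steps 2 and 3
        return "service_drive"
    outcome = retry_command()                       # step 4
    if outcome == "good":
        return "log_recovered"                      # step 4a
    if outcome == "not_ready":
        return "verify_no_outside_cause_then_service"  # step 4b
    # Assumption: any other outcome is handled by its own recovery procedure.
    return "follow_reported_condition"
```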

18.15.4.5 Degraded Mode

Refer to Section 18.1.8, “Degraded Mode” on page 236, for the definition of this state. There are three causes for entering degraded mode. In all cases the Sense Key is Not Ready. The causes are the following:

1. Sense Code/Qualifier of Logical Unit Not Ready, initializing command required. The media is not accessible. This may not be an error condition. The initiator should issue a Start Stop Unit (1B) command to enable media access. If the Drive fails to come ready in the time specified in Section 18.14, “Command Time out Limits” on page 249, service the drive using the service guideline recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

2. Sense Code/Qualifier of Diagnostic Failure. Failure of a Send Diagnostic self test, a start up sequence, or other internal target failures.

- Failure of a send diagnostic self test or a start up sequence.

This failure occurs when the diagnostics executed during power on, or in response to a Send Diagnostic (1D) command, detect a failure. As with the RAM code not loaded and the configuration data not loaded, the recovery is either a power cycle or issuing the Send Diagnostic (1D) command with the self test bit set active.

Recovery for a failed Send Diagnostic (1D) is achieved in one of the following ways:

• Executing the Send Diagnostic (1D) command

• Power cycling the drive

If the failure repeats, service the drive using the service guideline recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

Recovery for a failed power up sequence is achieved in one of the following ways:

• Issuing a Start Stop Unit (1B) command

• Power cycling the drive

If the failure repeats, service the drive using the service guideline recommended in Section 18.15.1, “Drive Service Strategy” on page 251.


- Internal target failures

Recovery of this condition is either a power cycle or successful completion of the Send Diagnostic (1D). Service the drive using the recommended service guidelines specified in Section 18.15.1, “Drive Service Strategy” on page 251, if the power cycle or the Send Diagnostic (1D) command fails to complete successfully.

3. Sense Code/Qualifier of Format Command Failed Format Unit (04).

Recovery from a failed Format Unit (04) is achieved by retrying the command. If the command fails a second time, service the drive following the procedure defined in Section 18.15.1, “Drive Service Strategy” on page 251.

If the above defined recovery procedures fail to clear the degraded mode condition, the Drive should be replaced. Follow the procedure in Section 18.15.1, “Drive Service Strategy” on page 251, when replacing the drive.

18.15.4.6 Interface Protocol

For all interface protocol errors, the initiator should complete the following steps:

1. Correct the parameter that caused the Illegal Request.

2. Retry the failing command.

3. If the first retry of the failing command completes with

- Good Status, log the error as recovered.

- Check Condition Status with sense data for an Illegal Request, verify there is no outside cause (for example, the power supply) for the failure.

- Other, follow the recommendations for the error condition reported.

4. Retry the failing command. If this retry of the failing command completes with

- Good Status, log the error as recovered.

- Check Condition Status with sense data for an Illegal Request, service the drive using the service guideline recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

- Other, follow the recommendations for the error condition reported.

18.15.4.7 Aborted Command

The initiator should determine the cause from the Additional Sense Code (byte 12):

• Sense Key = B (Aborted Command) with an Additional Sense Code of 1B, 25, 43, 49, or 4E is an initiator caused abort condition. The initiator should correct the condition that caused the abort and retry the failing command.

• Sense Key = B (Aborted Command) with an Additional Sense Code of 44 or 48 is a drive caused abort condition. The initiator should:

1. Retry the failing command.

2. If the retry of the failing command completes with

- Good Status, log the error as recovered.

- Aborted Command sense, verify there is no outside cause (e.g., power supply) for the failure.

3. Retry the failing command.

4. If the retry of the failing command completes with

- Good Status, log the error as recovered.

- Aborted Command sense, then service the drive using the service guideline recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

• Sense Key = B (Aborted Command) with an Additional Sense Code of 47 can be an initiator or Drive caused abort condition. The initiator should follow the above procedure for initiator caused abort conditions if the Drive detected the SCSI bus parity error. The initiator should follow the above procedure for Drive caused abort conditions if the initiator detected the SCSI bus parity error.
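The classification by Additional Sense Code can be expressed as a simple lookup. In this sketch the function name and the returned labels are hypothetical, and the code values listed in this section are assumed to be hexadecimal, as is conventional for SCSI sense codes.

```python
# Additional Sense Codes from the text above, assumed to be hexadecimal.
INITIATOR_CAUSED = {0x1B, 0x25, 0x43, 0x49, 0x4E}
DRIVE_CAUSED = {0x44, 0x48}
EITHER = {0x47}  # depends on which side detected the parity error

def classify_abort(sense_key: int, asc: int) -> str:
    """Classify an Aborted Command by its Additional Sense Code (byte 12)."""
    if sense_key != 0xB:
        return "not_aborted_command"
    if asc in INITIATOR_CAUSED:
        return "initiator_caused"   # correct the condition, then retry
    if asc in DRIVE_CAUSED:
        return "drive_caused"       # follow the retry steps above
    if asc in EITHER:
        return "initiator_or_drive_caused"
    return "unclassified"           # not listed in this section
```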


18.15.4.8 Unit Attention Condition

Unit Attention Conditions are not errors. They alert the initiator that the drive had an action that may have changed an initiator controlled state in the drive. These conditions are the following:

Not Ready to Ready Transition

Not ready to ready transition, unit formatted. This Unit Attention Condition will not be reported to the initiator that issued the Format Unit (04).

Reset

Reset - This means the drive was reset by either a power-on reset, LIP Reset, Target Reset or an internal reset.

Mode Parameters Changed

A Mode Select (15) command successfully completed. This means that the mode parameters that are the current value may have changed. The parameters may or may not have changed but the command to change the parameters successfully completed. The Drive does not actually compare the old current and the new current parameters to determine if the parameters changed. This Unit Attention Condition will not be reported to the initiator that issued the Mode Select (15).

Microcode Has Changed

Write Buffer (3B) to download microcode has successfully completed. This means that the microcode that controls the Drive has been changed. The code may or may not be the same as the code currently being executed. The Drive does not compare old level code with new code.

Commands Cleared by Another Initiator

Tagged commands cleared by a clear queue message. This means that the command queue has been cleared. The Unit Attention Condition is not reported to the initiator that issued the clear queue message. The Unit Attention Condition is reported to all initiators that had commands active or queued.

Reissue any outstanding command.

Log Select Parameters Changed

A Log Select (4C) command successfully completed. This means that the Log Select command cleared statistical information successfully (see Section 16.6, “LOG SELECT (4C)” on page 77). The Unit Attention Condition is reported to all initiators excluding the initiator that issued the Log Select command.

Device Identifier Changed

A Set Device Identifier (A4) command successfully completed. This means that the Set Device Identifier information field has been updated (see Section 16.41, “SET DEVICE IDENTIFIER (A4/06)” on page 198). A Unit Attention Condition is reported to all initiators excluding the initiator that issued the Set Device Identifier command.


18.15.4.9 Components Mismatch

The compatibility test is performed at a power cycle. The compatibility test verifies the microcode version of the electronics.

When the Drive detects a microcode version mismatch, the most likely cause is incorrect parts used during a service action.

If the error reported is Sense Key/Code/Qualifier 4/40/80, Diagnostic failure, bring-up fail, the initiator should do the following:

1. Retry the power cycle.

2. Check the send diagnostic end status. If the status is

- Good Status, return to normal processing.

- Check Condition Status, issue a Request Sense (03) and follow the recommendations for the sense data returned unless the sense data is for a component mismatch. If the sense data is for a component mismatch, service the drive using the service guideline recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

18.15.4.10 Self Initiated Reset

The Drive will initiate a self reset when the condition of the Drive cannot be determined. The internal reset will terminate any outstanding commands, release any reserved initiators, and reset the firmware. The initiator can recover by

1. Logging the error

2. Retrying the failing command. If the failing command completes with:

- Good Status, return to normal processing

- Self initiated reset sense, service the drive according to the guidelines recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

- Other, follow the recommendations for the error reported.

18.15.4.11 Defect List Recovery

This is not an error condition.

The initiator either requested a defect list in a format (block or vendor specific) that the Drive does not support, or the requested defect list(s) exceed the maximum list length that can be returned. If the Sense Key/Code/Qualifier are:

• 1/1F/00, the requested list(s) exceed the maximum length that can be supported. The initiator should request one list at a time. If a single list exceeds the maximum returnable length, this may be an indication of a marginally operational drive. Service the drive following the service guidelines in Section 18.15.1, “Drive Service Strategy” on page 251.

• 1/1C/01 or 1/1C/02, the requested defect list is not in the format that the Drive supports. The requested defect list is returned in the physical format. This is the default format. There is no initiator action required for this condition.
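The two cases above can be captured as a lookup keyed on the Sense Key/Code/Qualifier triple. This is a minimal sketch: the table name, function name, and action strings are hypothetical, and the triples are assumed to be hexadecimal.

```python
# (Sense Key, Additional Sense Code, Qualifier) -> initiator action.
# Values come from the text above and are assumed to be hexadecimal.
DEFECT_LIST_ACTIONS = {
    (0x1, 0x1F, 0x00): "request one list at a time; "
                       "service the drive if a single list is still too long",
    (0x1, 0x1C, 0x01): "none; list was returned in the physical (default) format",
    (0x1, 0x1C, 0x02): "none; list was returned in the physical (default) format",
}

def defect_list_action(key: int, asc: int, ascq: int) -> str:
    """Return the recommended action for a defect list sense triple."""
    return DEFECT_LIST_ACTIONS.get(
        (key, asc, ascq), "not a defect list condition covered here"
    )
```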


18.15.4.12 Miscompare Recovery

A miscompare can occur on a Verify (2F) command or a Write and Verify (2E) with the byte check (BytChk) bit active. Recovery for a miscompare error is different for the two commands.

Verify Command

The initiator should do the following:

1. Verify that the data sent to the drive is the correct data for the byte-by-byte compare.

2. Read the data from the media with a Read (08) or Read (28) command and verify that the data from the media is the expected data for the byte-by-byte compare.

- If all data are correct, this is an indication that the data may have been read from the media incorrectly without an error detected. Service the drive using the procedure specified in Section 18.15.1, “Drive Service Strategy” on page 251.

- If the data are not all correct, this is an indication that the data on the media is not the data the initiator expected. Rewrite the correct data to the media.

Write and Verify Command

The drive uses the same data in the data buffer to write then read and compare. A miscompare error on the Write and Verify (2E) command is an indication that the drive cannot reliably write or read the media. Service the drive using the procedures specified in Section 18.15.1, “Drive Service Strategy” on page 251.

18.15.4.13 Microcode Error

The microcode from the interface is validated before the device operates using that microcode. When the validation detects incorrect or incomplete data, the Drive enters degraded mode.

If the initiator attempted to load microcode using the Write Buffer (3B), retry the Write Buffer (3B). If the command completes with

• Good Status, return to normal processing.

• Check Condition Status, service the drive using the service guidelines recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

If the check sum error occurred during normal processing, the initiator may attempt to load microcode before deciding to service the drive using the service guidelines recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

To load new microcode, the initiator should issue a Write Buffer (3B) command with the download and save option. If the Write Buffer (3B) command completes with

• Good Status, return to normal processing. Retry the failing command. If the task completes with

- Good Status, continue normal processing.

- Check Condition Status for check sum error, service the drive using the service guidelines recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

- Check Condition Status for any other error, follow the recommended recovery procedure for the error reported.

• Check Condition Status for check sum error, service the drive using the service guidelines recommended in Section 18.15.1, “Drive Service Strategy” on page 251.

• Check Condition Status for any other error, follow the recommendations for the returned sense data.


18.15.4.14 Predictive Failure Analysis

The Drive performs error log analysis and will alert the initiator of a potential failure. The initiator should determine if this device is the only device with error activity.

If this drive is the only drive attached to the initiator with error activity, service the drive using the procedures specified in Section 18.15.1, “Drive Service Strategy” on page 251.

Note: Service for this drive can be deferred. The longer service is deferred, the more probable it is that a failure will occur that requires immediate service.

If more than one drive is experiencing error activity, this drive is probably not at fault. Locate and service the outside source causing error activity on this drive.
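The decision described in this section depends only on whether other drives are also showing error activity. A minimal sketch, assuming the host tracks a set of drive identifiers with recent error activity (the function name, identifiers, and action labels are hypothetical):

```python
from typing import Set

def pfa_action(reporting_drive: str, drives_with_errors: Set[str]) -> str:
    """Choose the response to a predictive failure alert from one drive."""
    others = drives_with_errors - {reporting_drive}
    if others:
        # Several drives show error activity, so an outside source
        # is the more likely cause.
        return "locate_and_service_outside_source"
    # Only the reporting drive shows error activity. Service may be
    # deferred, at increasing risk of a failure needing immediate service.
    return "service_drive"
```

For example, `pfa_action("drive0", {"drive0"})` selects drive service, while `pfa_action("drive0", {"drive0", "drive3"})` points to an outside source.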
