Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1009745.1
Update Date: 2010-06-28
Keywords:

Solution Type  Troubleshooting Sure

Solution  1009745.1 :   Troubleshooting Sun StorEdge[TM] T3 and 6120 Disk Failures  


Related Items
  • Sun Storage 6120 Array
  • Sun Storage T3+ Array
  • Sun Storage T3 Array
Related Categories
  • GCS>Sun Microsystems>Storage - Disk>Modular Disk - Other
  • GCS>Sun Microsystems>Storage - Disk>Modular Disk - 6xxx Arrays

Previously Published As
213390


Applies to:

Sun Storage T3+ Array
Sun Storage T3 Array
Sun Storage 6120 Array - Version: Not Applicable and later    [Release: NA and later]
All Platforms

Purpose

This document describes how to identify failed or failing disk drive(s)
in the array from the symptoms listed below.
Symptoms:

    •    Performance degraded
    •    Disk Fault LED lit/on
    •    Global Fault LED lit/on

Please validate that each troubleshooting step below is true for your 
environment. The steps will provide instructions or a link to a document, for
validating the step and taking corrective action as necessary. The steps are
ordered in the most appropriate sequence to isolate the issue and identify the
proper resolution. Please do not skip a step.

Last Review Date

June 17, 2010

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

1. Validate that you can telnet into your 6120/T3.

    •    If you cannot log in to your array via telnet, see <Document: 1012660.1> Troubleshooting Sun StorEdge[TM] T3, T3+ and 6120 Access Problems.
    •    Otherwise continue to Step 2.

2. Validate disk drive status against a detailed FRU status report by executing the command fru stat.

Example:


CTLR    STATUS   STATE       ROLE        PARTNER    TEMP
------  -------  ----------  ----------  -------    ----
u1ctr   ready    enabled     master      u2ctr      41.0
u2ctr   ready    enabled     alt master  u1ctr      37.0

DISK    STATUS   STATE       ROLE        PORT1      PORT2      TEMP  VOLUME
------  -------  ----------  ----------  ---------  ---------  ----  ------
u1d1    ready    enabled     data disk   ready      ready      26    v0
u1d2    ready    enabled     data disk   ready      ready      35    v0
u1d3    ready    enabled     data disk   ready      ready      42    v0


  •   If status is ready-enabled, go to Step 5.
  •   If status is substituted, go to Step 7.
  •   If status is ready-disabled, go to Step 3.
  •   If status is fault-disabled, go to Step 4.
  •   If more than one drive is in a state other than ready-enabled, go to Step 5.
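The branching above can be sketched in Python as a rough illustration. This is not a Sun-supplied tool: the function names parse_disk_states and next_step are hypothetical, the parser assumes the DISK table has the same column layout as the example, and the sample text is the example output with u1d2 changed to "disabled" for illustration. States that the list does not cover return None rather than a guessed step.

```python
# Hypothetical helper names; assumes `fru stat` output laid out as in the example.
NEXT_STEP = {"ready-enabled": 5, "ready-disabled": 3, "fault-disabled": 4}

def parse_disk_states(fru_stat_text):
    """Return {disk_id: "status-state"} from the DISK table of `fru stat` output."""
    disks = {}
    in_disk_table = False
    for line in fru_stat_text.splitlines():
        fields = line.split()
        if not fields:
            in_disk_table = False      # a blank line ends the current table
            continue
        if fields[0] == "DISK":
            in_disk_table = True       # header row of the DISK table
            continue
        if in_disk_table and not fields[0].startswith("-"):
            disks[fields[0]] = f"{fields[1]}-{fields[2]}"
    return disks

def next_step(disks):
    """Map the parsed disk states to the next step number from the list above."""
    not_healthy = [s for s in disks.values() if s != "ready-enabled"]
    if len(not_healthy) != 1:
        return 5                       # all healthy, or more than one drive affected
    state = not_healthy[0]
    if "substituted" in state:
        return 7
    return NEXT_STEP.get(state)        # None for states the list does not cover

# Example output with u1d2 altered to "disabled" for illustration only.
SAMPLE = """\
DISK    STATUS   STATE       ROLE        PORT1      PORT2      TEMP  VOLUME
------  -------  ----------  ----------  ---------  ---------  ----  ------
u1d1    ready    enabled     data disk   ready      ready      26    v0
u1d2    ready    disabled    data disk   ready      ready      35    v0
u1d3    ready    enabled     data disk   ready      ready      42    v0
"""

print(next_step(parse_disk_states(SAMPLE)))  # prints 3
```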

3. Validate local and/or global hot spare presence and state.

Verify the presence and status of a hot spare by:
a) executing the command vol list to confirm that a local hot spare is listed under the column "standby"
b) executing the command global_standby list to confirm that a global hot spare exists


NOTE:  The command global_standby list is not available on arrays running firmware earlier than version 3.x. Use the command ver to display the firmware version.


    ▪    If a hot spare is present, go to Step 4 to verify whether a reconstruction is in progress.
    ▪    If there are no hot spares configured, go to Step 8.
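As a small sketch of the Step 3 decision, the Python below combines the firmware note with the two hot spare checks. The function names are hypothetical, and the version string format (for example "3.1.4") is an assumption, since the output of ver is not shown in this document. On firmware earlier than 3.x the sketch ignores any global spare value because it cannot be confirmed with global_standby list.

```python
# Hypothetical helper names; the dotted version string format is an assumption.
def supports_global_standby(firmware_version):
    """True when the major firmware version is 3 or higher (per the NOTE above)."""
    major = int(firmware_version.split(".")[0])
    return major >= 3

def hot_spare_next_step(firmware_version, has_local_spare, has_global_spare=False):
    """Return 4 when a hot spare is present, otherwise 8."""
    if not supports_global_standby(firmware_version):
        # `global_standby list` is unavailable, so a global spare cannot be confirmed.
        has_global_spare = False
    return 4 if (has_local_spare or has_global_spare) else 8

print(hot_spare_next_step("3.1.4", has_local_spare=False, has_global_spare=True))  # prints 4
```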

4.  Check whether a reconstruction is in progress by executing the command proc list and looking for a vol recon process.


Example:
myarray:/:<1>proc list
VOLUME          CMD_REF  PERCENT      TIME  COMMAND
tray0_pool1       21568       74  53928:47  vol verify
tray1_pool2       25666       27    178:04  vol recon   <--- reconstruction process


  • If there is no reconstruction process, AND the drive is ready-disabled, go to Step 8.
  • If there is no reconstruction process, AND the drive is fault-substituted, go to Step 7.
  • If a reconstruction is ongoing, allow it to complete before proceeding, and re-evaluate the drive status in Step 2.
  • If there is no ongoing reconstruction and the drive isn't in a fault-substituted state, go to Step 8.
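The Step 4 branching can be sketched as follows. The function names are hypothetical, and the check simply looks for the text "vol recon" in each line of proc list output, which assumes the COMMAND column appears as it does in the example. A return value of 2 means "wait for the reconstruction to finish, then re-evaluate in Step 2".

```python
# Hypothetical helper names; assumes `proc list` output as shown in the example.
def recon_in_progress(proc_list_text):
    """True if any line of `proc list` output reports a vol recon process."""
    return any("vol recon" in line for line in proc_list_text.splitlines())

def step_after_recon_check(proc_list_text, drive_state):
    """Return the next step number for a drive in the given status-state."""
    if recon_in_progress(proc_list_text):
        return 2      # let the reconstruction complete, then repeat Step 2
    if drive_state == "fault-substituted":
        return 7      # drive has failed and needs replacement
    return 8          # includes ready-disabled with no reconstruction

PROC_LIST_SAMPLE = """\
VOLUME          CMD_REF PERCENT    TIME COMMAND
 tray0_pool1             21568      74 53928:47 vol verify
 tray1_pool2             25666      27  178:04 vol recon
"""

print(step_after_recon_check(PROC_LIST_SAMPLE, "fault-disabled"))  # prints 2
```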

5.  Check the status of the volume associated with the drive by executing the command vol stat.


myarray:/:<3>vol stat

v0            u1d1   u1d2   u1d3   u1d4   u1d5   u1d6   u1d7   u1d8   u1d9
mounted        0      0      0      0      0      0      0      0      0  
myarray:/:<4>


  • If the volume is "mounted", but more than one drive has a non-zero status, go to Step 8.
  • If the volume is "unmounted", the volume has sustained drive failures beyond what its RAID level can tolerate.
  • Otherwise continue to Step 6.
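A minimal sketch of the Step 5 check is shown below. The function names are hypothetical, and the parser assumes the input holds only the table lines of vol stat (no prompt lines), arranged in pairs as in the example: a line with the volume name and drive names, then a line with the mount state and per-drive status codes. Because this document gives no step number for an unmounted volume, the sketch returns None in that case.

```python
# Hypothetical helper names; assumes `vol stat` table lines only, paired as in the example.
def parse_vol_stat(vol_stat_text):
    """Return {volume: (mount_state, {drive: status_code})}."""
    rows = [line.split() for line in vol_stat_text.splitlines() if line.split()]
    volumes = {}
    # Rows come in pairs: [volume, drives...] then [mount_state, codes...].
    for header, status in zip(rows[0::2], rows[1::2]):
        volumes[header[0]] = (status[0], dict(zip(header[1:], status[1:])))
    return volumes

def step_after_vol_stat(mount_state, drive_codes):
    """Return the next step number, or None for an unmounted volume."""
    if mount_state == "unmounted":
        return None   # failure beyond the RAID level; no step number is given
    non_zero = [d for d, code in drive_codes.items() if code != "0"]
    return 8 if len(non_zero) > 1 else 6

VOL_STAT_SAMPLE = """\
v0            u1d1   u1d2   u1d3   u1d4   u1d5   u1d6   u1d7   u1d8   u1d9
mounted        0      0      0      0      0      0      0      0      0
"""

state, codes = parse_vol_stat(VOL_STAT_SAMPLE)["v0"]
print(step_after_vol_stat(state, codes))  # prints 6
```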

6.  Check the LEDs of each disk drive that is in the ready-enabled state.

  • If there is an amber fault LED or any other LED lit for the disk drive, go to Step 8.
  • If no LEDs are lit, you have verified that the disk drive is healthy.


7.  You have validated that a drive has failed in the array, and requires replacement. 

Collect the following information and contact Oracle Support for a drive replacement:

The output of:
fru stat
vol stat
proc list
fru list

OR

Collect the array data from a Solaris host by running:  /opt/SUNWexplo/explorer -w !default,t3extended

8.  At this point, if you have validated that each troubleshooting step above is true for your environment, and the issue still exists, further troubleshooting is required. 

Please open a Service Request with Oracle Support.

Please include:

    •    A statement of the symptoms you see that pertain to the disk drive
    •    The array data collected from a Solaris host by running:  /opt/SUNWexplo/explorer -w !default,t3extended


Internal Comments
This document contains normalized content and is managed by the Domain Lead(s) of the respective domains. To notify content owners of a knowledge gap contained in this document, and/or prior to updating this document, please contact the domain engineers who manage this document via the “Document Feedback” alias.


The Knowledge Work Queue for this article is KNO-STO-MIDRANGE_DISK.

T3, T3+, normalized, failed hard drive, vol verify, multiple disk failure, Audited
Previously Published As
86534

Change History
Date: 2007-11-13
User Name: 7058
Action: Approved
Comment: Internal link referenced in external section.
Fixed.
Version: 7
Date: 2007-11-13
User Name: 7058
Action: Update Started
Comment: Fix link
Version: 0
Date: 2007-07-16
User Name: 7058
Action: Approved
Comment: Notes for Normalization:
Subset of: N/A
Subset Root path: N/A
References: 86540, 52569
Project: Minnow Normalization



Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.