Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1001280.1
Update Date:2011-02-28
Keywords:

Solution Type  Sun Alert Sure

Solution  1001280.1 :   Under Certain Conditions, Power Cycling Sun StorEdge 6920 Rack May Cause Array to Become Inaccessible  


Related Items
  • Sun Storage 6920 System
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

PreviouslyPublishedAs
201740


Product
Sun StorageTek 6920 System

Bug Id
<SUNBUG: 6357740>

Date of Workaround Release
13-DEC-2005

Date of Resolved Release
30-NOV-2006

Impact

Under certain conditions, power cycling an SE6920 rack may render the array inaccessible, causing loss of access to volumes and data, loss of management access to volumes, and/or incorrect volume status.


Contributing Factors

This issue can occur on the following platform:

  • Sun StorEdge 6920 array with DSP firmware patch before 118396-52

Notes:

  1. This issue can occur with all Sun StorEdge 6920 DSP firmware releases earlier than patch 118396-52.
  2. Firmware versions are not Solaris dependent.
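
As a rough illustration of the "before 118396-52" condition, the following Python sketch compares an installed DSP firmware patch ID against the fixed revision. The function names are hypothetical, and how the installed patch ID is obtained from the system is not covered here; only the patch number 118396-52 comes from this alert.

```python
# Patch revision at which the issue is resolved (from this alert).
FIXED_PATCH = "118396-52"


def parse_patch_id(patch_id):
    """Split a patch ID such as '118396-52' into (base, revision) integers."""
    base, sep, rev = patch_id.strip().partition("-")
    if not sep or not base.isdigit() or not rev.isdigit():
        raise ValueError("unrecognized patch ID: %r" % patch_id)
    return int(base), int(rev)


def is_affected(installed_patch_id):
    """Return True if the installed DSP firmware patch predates the fix.

    Hypothetical helper: the caller supplies the installed patch ID.
    """
    fixed_base, fixed_rev = parse_patch_id(FIXED_PATCH)
    base, rev = parse_patch_id(installed_patch_id)
    if base != fixed_base:
        raise ValueError("not a 118396 DSP firmware patch: %r" % installed_patch_id)
    return rev < fixed_rev
```

For example, `is_affected("118396-51")` is true, while `is_affected("118396-52")` is false.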

This issue can occur when the DSP is rebooted before, or at about the same time as, the 6920 array. This condition may be triggered by loss of power (power blackout) to the SE6920, or by a partial or complete power shutdown initiated from SE6920 configuration services.


Symptoms

Should the described issue occur, SE6920 volumes will be in an incomplete or missing state. From the Sun StorEdge 6920 Configuration Service GUI (under the "Logical Storage Volumes" tab) the state of some or all volumes will be missing or incomplete. Under this same tab, the condition may be one of the following (depending on the configuration):

  • Broken - The volume is not currently operational
  • Degraded - The volume is degraded, and one or more input or output data paths is not operating properly; however, the redundant failover paths are still intact
  • Incomplete - The volume configuration is incomplete, most likely due to previous failures, possibly in its configuration
  • Lost Communications - The volume is offline, most likely due to a loss of communications with part or all of the storage that makes up that volume

Directions for collecting a "solution extract" can be found in the Help section of the Configuration Services GUI. To get to the Help section:

  1. Log in to the web console and select the "Help" button on the upper right hand corner of the GUI.
  2. In the new window that pops up, select the "Search" tab on the upper left hand side.
  3. In the search window that appears, type in "solution extract" and select the search button. A list of options will be displayed in the GUI, including "Generating a Solution Extract."
  4. Select the "Generating a Solution Extract" link for directions on how to collect the extract.

From the solution extract output, the show volumes -detailed file can have entries similar to the following:

    vol/ORGA_archive   SAN    N/A    N/A MISSING    N/A
    N/A    N/A    N/A    ORGA    60:00:15:D0:00:03:5D:00:00:00:00:00:00:00:11:10
    vol/ORGA_backup    SAN    N/A    N/A MISSING    N/A
    N/A    N/A    N/A    ORGA    60:00:15:D0:00:03:5D:00:00:00:00:00:00:00:11:18
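
When an extract contains many volumes, entries like these can be picked out with a short script. The Python sketch below is a minimal example that assumes the layout shown in the excerpt above (a line starting with the `vol/` name, with `MISSING` as a separate whitespace-delimited field on that line); the real file may contain headers or other states that this sketch ignores. The function name and the `SAMPLE` text are illustrative only.

```python
# Excerpt copied from the example above.
SAMPLE = """\
vol/ORGA_archive   SAN    N/A    N/A MISSING    N/A
N/A    N/A    N/A    ORGA    60:00:15:D0:00:03:5D:00:00:00:00:00:00:00:11:10
vol/ORGA_backup    SAN    N/A    N/A MISSING    N/A
N/A    N/A    N/A    ORGA    60:00:15:D0:00:03:5D:00:00:00:00:00:00:00:11:18
"""


def missing_volumes(text):
    """Return the names of volumes whose entry line carries a MISSING field."""
    names = []
    for line in text.splitlines():
        tokens = line.split()
        # Assumption: a volume entry begins with its "vol/..." name.
        if tokens and tokens[0].startswith("vol/") and "MISSING" in tokens[1:]:
            names.append(tokens[0])
    return names


if __name__ == "__main__":
    for name in missing_volumes(SAMPLE):
        print(name)
```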

Also, vdisks may have path information missing or no active path listed and the state will be "Free" instead of "In-Use", as in the following example (from the extractor):

    Vendor Name    Product ID Raw/Initialized      Size
    -----------    --------------------------      ----
       SUN                   T4                136.52 GB/136.40 GB
    Proc      Path         Flags      State        Port WWN
    ----      ----         -----      -----        --------
    3-3      disk/3/6/2/0            Blocked       20:03:00:03:BA:68:F1:15
    4-3      disk/4/6/3/0    P       Initialized   20:03:00:03:BA:CC:BD:09
    Slice    State    Slice Size   Free Space Total/Max
    -----    -----    ----------   --------------------
      0       Free     136.40            N/A
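
The same kind of check can be sketched for the vdisk listing. The Python example below parses the path and slice tables from text laid out like the excerpt above, so that path states and slice states (for example "Free" rather than "In-Use") can be reviewed in one place. The column layout is inferred from this one excerpt only, and the function name is hypothetical.

```python
# Excerpt copied from the example above.
SAMPLE = """\
Vendor Name    Product ID Raw/Initialized      Size
-----------    --------------------------      ----
   SUN                   T4                136.52 GB/136.40 GB
Proc      Path         Flags      State        Port WWN
----      ----         -----      -----        --------
3-3      disk/3/6/2/0            Blocked       20:03:00:03:BA:68:F1:15
4-3      disk/4/6/3/0    P       Initialized   20:03:00:03:BA:CC:BD:09
Slice    State    Slice Size   Free Space Total/Max
-----    -----    ----------   --------------------
  0       Free     136.40            N/A
"""


def parse_vdisk(text):
    """Return (paths, slices) parsed from a vdisk listing."""
    paths, slices = [], []
    section = None
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and the dashed rulers under each header.
        if not tokens or set(line.strip()) <= {"-", " "}:
            continue
        if tokens[0] == "Proc":
            section = "paths"
        elif tokens[0] == "Slice":
            section = "slices"
        elif tokens[0] == "Vendor":
            section = None
        elif section == "paths" and len(tokens) >= 4:
            # The Flags column may be empty, so State and Port WWN are
            # taken from the end of the row.
            paths.append({
                "proc": tokens[0],
                "path": tokens[1],
                "flags": tokens[2:-2],
                "state": tokens[-2],
                "wwn": tokens[-1],
            })
        elif section == "slices" and len(tokens) >= 2:
            slices.append({"slice": tokens[0], "state": tokens[1]})
    return paths, slices
```

A caller could then list the slices whose state is "Free" and the paths whose state is "Blocked" as candidates for closer inspection.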

 


Workaround

To work around the described issue, reboot the DSP. This temporarily resolves the issue until a permanent fix can be applied.

Note: If the SE6920 has SSRR enabled, a remote services engineer can log in to the array and reboot the DSP (login and password information is typically available only to Sun personnel).

To reboot the DSP, log in to the NTC through the serial connection, and connect to the DSP from the NTC by using the following commands:

    ntc0: connect local port_3
    Local protocol emulation 1.0  - Local Switch: <\1B>
    dsp00#

Once connected to the DSP, execute a "reboot now". You will be asked if you want to overwrite the configuration file; accept the default response of yes [Y]:

    dsp00# reboot now
    Are you sure you want to reboot the system[N]? y
    OK to overwrite configuration file[Y]? y

 


Resolution

This issue is addressed on the following platform:

  • Sun StorEdge 6920 with DSP firmware patch 118396-52 or later


Modification History
Date: 30-NOV-2006

30-Nov-2006:

  • Updated Contributing Factors and Resolution sections


References

<SUNPATCH: 118396-52>

Previously Published As
102094
Internal Comments


 



See also BugID 6214084.



See also InfoDoc 83343 at http://sunsolve.central.sun.com/search/document.do?assetkey=1-9-83343-1



Optional Workaround for qualified personnel only:



If a planned power reset is to take place, then before the array is powered back up, unplug the two power cables from the DSP. Wait about 7 minutes (long enough for every component apart from the DSP to come up on its own), then plug the power cables back into the DSP.



This procedure avoids the problem by preventing the DSP from coming online before the arrays (6020s) are up and running. The DSP has no power switch, so the only two ways to power it off are to trip the circuit breakers on the rack (which turns off all the components) or to pull the power cords from the DSP. With the DSP power cords pulled and the rest of the rack powered on, the arrays in the rack (6020s) come up and are available before the DSP. Once the arrays are up (after about seven minutes), the DSP can be brought online by plugging its power cords back in, and all the components (volumes) should be discovered.


Internal Contributor/submitter
[email protected]

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Eng Responsible Engineer
[email protected]

Internal Services Knowledge Engineer
[email protected]

Internal Escalation ID
1-13463737, 1-13049846, 1-12084793, 1-11999817, 1-12803372, 1-12691525

Internal Resolution Patches
118396-52

Internal Sun Alert Kasp Legacy ID
102094

Internal Sun Alert & FAB Admin Info
Critical Category: Availability ==> Pervasive
Significant Change Date: 2005-12-13, 2006-11-30
Avoidance: Patch
Responsible Manager: [email protected]
Original Admin Info: [WF 30-Nov-2006, dave m: patch released, resolved]
[WF 13-Dec-2005, Dave M: sending for release]
[WF 07-Dec-2005, Dave M: draft created, will be held over for KASP 2 day outage]

Product_uuid
67794720-356d-11d7-8ef2-ce2ac2bc9136|Sun StorageTek 6920 System

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.