Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1019245.1
Update Date: 2011-02-17
Keywords:

Solution Type  Sun Alert Sure

Solution  1019245.1 :   Sun StorageTek T3+ (T3B) Array and Sun StorageTek 6120, 6320 and 6920 Arrays May Reboot Unexpectedly and Lose Host Connectivity after 994 Days of Continuous Operation  


Related Items
  • Sun Storage 6960 Array
  • Sun Storage 6910 Array
  • Sun Storage T3+ Array
  • Sun Storage 6120 Array
  • Sun Storage 3960 Array
  • Sun Storage 6920 System
  • Sun Storage 3910 Array
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Data Loss
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

PreviouslyPublishedAs
237605


Bug Id
<SUNBUG: 6643328>

Product
Sun StorageTek T3+ Array
Sun StorageTek 6120 Array
Sun StorageTek 6320 Array
Sun StorageTek 3910
Sun StorageTek 3960
Sun StorageTek 6910
Sun StorageTek 6920 System
Sun StorageTek 6960

Date of Resolved Release
08-May-2008


1. Impact

The following firmware releases are subject to an issue that could affect array availability and possibly data: firmware version 2.1.4 (and later) for Sun StorageTek T3+ (T3B) arrays; firmware version 3.0.0 (and later) for Sun StorageTek 6120 arrays; baseline firmware 2.3.2 (and later) for Sun StorageTek 3910/3960/6910/6960 arrays; baseline firmware 1.1 (and later) for Sun StorageTek 6320 arrays; and baseline firmware 2.0.3 (and later) for Sun StorageTek 6920 arrays.

The above listed arrays may reboot unexpectedly and lose host connectivity for several minutes if the array has run continuously for 994 days without a complete power cycle.  Data may be inaccessible, with a possible loss of data integrity.

2. Contributing Factors


This issue can occur on the following platforms:
  • Sun StorageTek T3+ (T3B) array with firmware 2.1.4 or later
  • Sun StorageTek 6120 array with firmware 3.0.0 or later
  • Sun StorageTek 3910/3960/6910/6960 arrays with baseline firmware 2.3.2 or later
  • Sun StorageTek 6320 array with baseline firmware 1.1 or later
  • Sun StorageTek 6920 array with baseline firmware 2.0.3 or later

To determine the firmware revision on one of these systems, the following command can be run directly on the T3B or 6120:

6120:/:<1>ver
6120 Release 3.1.6 Thu Feb  3 16:48:03 PST 2005 (10.16.10.131)
Copyright (C) 1997-2003 Sun Microsystems, Inc., All Rights Reserved

On the 3910, 3960, 6910, 6960, 6320 and 6920, a telnet connection to the internal T3B or 6120 array is required to run 'ver'.
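
The check above can be sketched in Python for anyone who collects 'ver' output from several arrays. This is only an illustrative sketch: the function names are hypothetical, and the "T3B" model token is an assumption, since only the 6120 banner is shown in this document. It covers the T3B and 6120 firmware thresholds only, not the baseline firmware levels of the larger systems.

```python
import re

# Minimum affected firmware per model, taken from the platform list above.
# ASSUMPTION: a T3B banner starts with "T3B"; only the 6120 banner is shown here.
AFFECTED_FROM = {"6120": (3, 0, 0), "T3B": (2, 1, 4)}


def parse_ver(output):
    """Return (model, version tuple) from the first 'Release' line, or None."""
    for line in output.splitlines():
        match = re.match(r"\s*(\S+) Release (\d+(?:\.\d+)*)", line)
        if match:
            version = tuple(int(part) for part in match.group(2).split("."))
            # Pad short versions such as "3.1" to three fields before comparing.
            version += (0,) * (3 - len(version))
            return match.group(1), version
    return None


def is_affected(output):
    """True or False for a known model, None when no verdict is possible."""
    parsed = parse_ver(output)
    if parsed is None or parsed[0] not in AFFECTED_FROM:
        return None
    model, version = parsed
    return version >= AFFECTED_FROM[model]


sample = """6120 Release 3.1.6 Thu Feb  3 16:48:03 PST 2005 (10.16.10.131)
Copyright (C) 1997-2003 Sun Microsystems, Inc., All Rights Reserved"""
print(parse_ver(sample), is_affected(sample))
```

A result of None means the banner was not recognized and the firmware level should be checked by hand.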

3. Symptoms


If this issue occurs, systems may log events similar to those listed below:

22709 Apr 22 19:46:27 array00 ISR1[1]: W: ISP2200[1] LOOP DOWN detected.
...
22762 Apr 22 19:51:46 array00 LPCT[2]: N: u2d13 Bypassed on loop 2
22763 Apr 22 19:51:46 array00 LPCT[2]: N: u2d14 Bypassed on loop 2
22764 Apr 22 19:51:51 array00 ROOT[2]: N: Initializing loop 1 ISP2200 ... firmware status = 3
22765 Apr 22 19:51:51 array00 ROOT[2]: N: Detected 15 FC-AL ports on loop 1
22766 Apr 22 19:51:51 array00 ROOT[2]: N: loop 1 TARGET_ID = 0xf (ALPA = 0xce)
22767 Apr 22 19:52:18 array00 ROOT[2]: N: Initializing loop 2 ISP2200 ... firmware status = 3
22768 Apr 22 19:52:18 array00 ROOT[2]: N: Detected 29 FC-AL ports on loop 2
22769 Apr 22 19:52:18 array00 ROOT[2]: N: loop 2 TARGET_ID = 0xf (ALPA = 0xce)
22770 Apr 22 19:53:05 array00 ROOT[2]: N: u2ctr found 28 disks in the system
22771 Apr 22 19:53:24 array00 ROOT[2]: N: 6120 Release 3.2.6 Mon Feb  5 02:26:22 MST 2007 (192.168.0.40)
22772 Apr 22 19:53:24 array00 ROOT[2]: N: u2ctr Reset (3000) lpc_hbt.c line 290, Assert(0) => 0

Note: Although the event "uXctr Reset (3000) lpc_hbt.c line xxx, Assert(0) => 0" is a good indicator for this issue, the complete array logs should be analyzed to confirm this.
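
As a first-pass filter before a full log review, the indicator event can be searched for with a short script. The sketch below is an assumption-laden illustration: the regular expression is derived only from the sample message shown above, and the function name is hypothetical. A match does not confirm the issue by itself.

```python
import re

# Pattern built from the sample entry:
#   "u2ctr Reset (3000) lpc_hbt.c line 290, Assert(0) => 0"
INDICATOR = re.compile(
    r"(u\d+ctr) Reset \(3000\) lpc_hbt\.c line (\d+), Assert\(0\) => 0"
)


def find_indicator_events(log_text):
    """Return (controller, source_line, raw_entry) for every matching log entry."""
    hits = []
    for raw in log_text.splitlines():
        match = INDICATOR.search(raw)
        if match:
            hits.append((match.group(1), int(match.group(2)), raw.strip()))
    return hits


log_sample = """\
22770 Apr 22 19:53:05 array00 ROOT[2]: N: u2ctr found 28 disks in the system
22772 Apr 22 19:53:24 array00 ROOT[2]: N: u2ctr Reset (3000) lpc_hbt.c line 290, Assert(0) => 0
"""
for controller, source_line, entry in find_indicator_events(log_sample):
    print(controller, source_line)
```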

4. Workaround

To avoid this issue, perform the following procedure on each array at least once every 994 days (the recommendation is to perform it every 2 years).

Loopcards:

for a 2x2 array: u1l1, u1l2, u2l1, u2l2

for a 2x4 array: u1l1, u1l2, u2l1, u2l2, u3l1, u3l2, u4l1, u4l2

for a 2x6 array: u1l1, u1l2, u2l1, u2l2, u3l1, u3l2, u4l1, u4l2, u5l1, u5l2, u6l1, u6l2
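
The loopcard names follow a regular pattern (two loopcards per unit), so the list and the matching 'lpc reboot' commands from the procedures in this section can be generated rather than typed. This is a minimal sketch with hypothetical helper names; it only prints the command strings and does not connect to an array.

```python
def loopcards(units):
    """List loopcard names u<unit>l<card> for an array with 2, 4 or 6 units."""
    if units not in (2, 4, 6):
        raise ValueError("only 2x2, 2x4 and 2x6 arrays are described here")
    return [f"u{unit}l{card}" for unit in range(1, units + 1) for card in (1, 2)]


def reboot_commands(units):
    """Build one 'lpc reboot' command string per loopcard."""
    return [f"lpc reboot {name}" for name in loopcards(units)]


# Example: the commands for a 2x4 array (4 units, 8 loopcards).
for command in reboot_commands(4):
    print(command)
```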

Procedure for the T3B and 6120:
  1. Stop the I/O access to the array.
  2. Wait 2 min.
  3. Run 'lpc reboot u#l#' for each loopcard (where u# is the unit number and l# is the loopcard designator).
  4. Run 'reset -y' on the array.
  5. Resume the I/O access once you confirm that the array is up.
Procedure for the 3910, 3960, 6910 and 6960:

For each array inside of the 3910, 3960, 6910, 6960:
  1. Stop the I/O access to the array.
  2. Wait 2 min. 
  3. Run 'lpc reboot u#l#' for each loopcard (where u# is the unit number and l# is the loopcard designator).
  4. Run 'reset -y' on the array.
  5. Resume the I/O access once you confirm that the array is up.
Procedure for the 6320 (serial/SSRR access required):

For each array in the 6320:
  1. Stop the I/O access to the array.
  2. Wait 2 min. 
  3. Run 'lpc reboot u#l#' for each loopcard (where u# is the unit number and l# is the loopcard designator).
  4. Run 'reset -y' on the array.
  5. Resume the I/O access once you confirm that the array is up.
Procedure for the 6920 (serial/SSRR access required):

For each array in the 6920:
  1. Stop the I/O access to the array.
  2. Wait 2 min. 
  3. Run 'lpc reboot u#l#' for each loopcard (where u# is the unit number and l# is the loopcard designator).
  4. Run 'reset -y' on the array.
  5. Resume the I/O access once you confirm that the array is up and visible to the DSP.
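
For scheduling, the 994-day limit and the 2-year recommendation can be converted into calendar dates. The sketch below is illustrative only: the function name is hypothetical, "2 years" is approximated as 730 days, and the start date is assumed to be the day the array was last power cycled or last had this procedure performed.

```python
from datetime import date, timedelta

LIMIT_DAYS = 994            # continuous run time after which the reboot may occur
RECOMMENDED_DAYS = 2 * 365  # "every 2 years", approximated here as 730 days


def maintenance_dates(last_procedure):
    """Return (recommended_by, latest_by) dates for the next procedure."""
    recommended_by = last_procedure + timedelta(days=RECOMMENDED_DAYS)
    latest_by = last_procedure + timedelta(days=LIMIT_DAYS)
    return recommended_by, latest_by


# Example with an arbitrary start date.
recommended_by, latest_by = maintenance_dates(date(2008, 5, 8))
print("recommended by:", recommended_by.isoformat())
print("no later than: ", latest_by.isoformat())
```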

5. Resolution

Please see the "Workaround" section above.

This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2009 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.


Modification History
15-Jul-2008: Updated Workaround section
28-Jan-2009: Updated Title and Impact


Internal Comments
Please send technical questions to the following email:
[email protected]
and CC the following persons:
Internal Contributor/Submitter
Internal Eng Responsible Engineer
Internal Services Knowledge Engineer


There will be no fix from Engineering for this issue. The only way to
avoid this issue is to perform a complete and full power cycle of the
system as described in the "Workaround" section.
Internal Eng Responsible Engineer
[email protected]

Internal Eng Responsible Engineer
[email protected]

Internal Services Knowledge Engineer
[email protected]

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Sun Alert & FAB Admin Info
07-May-2008, david m: draft created, send for review
08-May-2008, david m: review completed, send to publish
28-Jan-2009, david m: per mgmt request, update Title and Impact statements



Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.