Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1019006.1
Update Date: 2011-02-17
Keywords:

Solution Type  Sun Alert Sure

Solution  1019006.1 :   Collecting Support Data Using Common Array Manager while Drives are Bypassed May Cause Controllers to Reboot  


Related Items
  • Sun Storage 6540 Array
  • Sun Storage Flexline 280 Array
  • Sun Storage 6130 Array
  • Sun Storage 6140 Array
  • Sun Storage Flexline 240 Array
  • Sun Storage Flexline 380 Array
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved

Previously Published As
231802


Bug Id
<SUNBUG: 6649629>

Product
Sun StorageTek 6130 Array
Sun StorageTek 6140 Array
Sun StorageTek 6540 Array
Sun StorageTek Flexline 240 Array
Sun StorageTek Flexline 280 Array
Sun StorageTek Flexline 380 Array
Sun StorageTek Common Array Manager Software 6.0

Date of Workaround Release
15-Feb-2008

Date of Resolved Release
04-Mar-2008

Collecting Support Data Using Common Array Manager While Drives are Bypassed May Cause Controllers to Reboot

1. Impact

On Sun StorageTek 6130, 6140, and 6540 arrays (with firmware 06.19.25.16 or later) and Flexline 240, 280, and 380 arrays (with firmware 06.19.25.26 or later), collecting support data using Common Array Manager (CAM) 6.0 while drives are bypassed may cause the controllers to reboot without warning. A controller reboot may cause device path failovers or, if both the primary and secondary paths to a device become inaccessible, a loss of access.

2. Contributing Factors

This issue can occur on the following platforms:
  • Sun StorEdge 6130 Array with firmware 06.19.25.16 or later
  • Sun StorageTek 6140 Array with firmware 06.19.25.16 or later
  • Sun StorageTek 6540 Array with firmware 06.19.25.16 or later
  • Sun StorageTek Flexline 240 Array with firmware 06.19.25.26 or later
  • Sun StorageTek Flexline 280 Array with firmware 06.19.25.26 or later
  • Sun StorageTek Flexline 380 Array with firmware 06.19.25.26 or later
with:
  • Sun StorageTek Common Array Manager 6.0
Notes:
  1. For this issue to occur, one or more drives must be removed or in a "Bypassed" state (removing an otherwise optimal drive places it into the Bypassed state).
  2. This issue can only happen during array support data collection using Common Array Manager (CAM) 6.0, specifically during the "Capture State" function of the collection, which generates the stateCaptureData.dmp file.  This applies to data collected by either the CAM Service Advisor or the supportData collection utility.

3. Symptoms

At minimum, systems will experience device path failovers.  At worst, there will be a loss of access due to both primary and secondary paths to a device being inaccessible.

This issue can only occur during array support data collection (see Contributing Factors), and only when the array has at least one drive in a "Bypassed" state.  To check for this condition, review the disk status using either of the following:

Browser:
CAM->Array_Name->Physical Devices->Disks

SSCS:
sscs list -a <array> disk
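
The SSCS check above could be scripted as a pre-check before starting a collection. The following is a minimal sketch, not a CAM feature: it assumes that the disk listing contains the word "Bypassed" on the line for any bypassed drive. This alert does not show the listing format, so the match pattern and the function name has_bypassed_drive are assumptions to adjust against real output.

```shell
#!/bin/sh
# Hypothetical helper: reads a disk listing on stdin and returns 0 if any
# line mentions a bypassed drive, non-zero otherwise.
# ASSUMPTION: bypassed drives are reported with the text "Bypassed"
# (matched case-insensitively); verify against actual sscs output.
has_bypassed_drive() {
    grep -qi 'bypass'
}

# Example use (ARRAY is a placeholder for the array name):
#   if sscs list -a "$ARRAY" disk | has_bypassed_drive; then
#       echo "Bypassed drive found: do not collect support data with CAM 6.0" >&2
#   fi
```

A drive that has been physically removed may not appear in the listing in the same way, so a clean result from this check should be confirmed in the browser view as well.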

Host message events showing SCSI or Fibre Channel messages indicating loss of access to one or both array controllers are the most common symptom associated with this issue.  Hosts can also log path failovers as one or both controllers go through the boot cycle.

CAM may throw an error during collection, similar to the following:
Exception while accessing hids,108 on Controller A:
devmgr.versioned.jrpc.RPCError: TIMEOUT
Exception while accessing hids,108 on Controller B:
devmgr.versioned.jrpc.RPCError: TIMEOUT
This error is displayed during a collection run from the Service Advisor, and is also recorded in the "stateCaptureData.dmp" file included in the support data collection.
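
To check an existing collection for this error, the state capture file can be searched for the timeout text shown above. This is a minimal sketch that assumes stateCaptureData.dmp has already been extracted from the support data collection and that the error text appears in it as plain text; the function name state_capture_timed_out is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given state capture file contains
# the RPC timeout error quoted in this alert, non-zero otherwise.
# ASSUMPTION: the file path points at an extracted stateCaptureData.dmp.
state_capture_timed_out() {
    grep -q 'RPCError: TIMEOUT' "$1"
}

# Example use:
#   if state_capture_timed_out stateCaptureData.dmp; then
#       echo "hids timeout recorded; check controllers for an unexpected reboot"
#   fi
```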

4. Workaround

This issue can be avoided by not collecting support data with CAM 6.0 while any drive is removed or has been bypassed by the storage subsystem.  In that situation, collect the data with the "service" command instead.

The following CLI commands can be run to obtain support data for CAM:
service -d <array_name> -c print -t arrayprofile > arrayProfileSummary.txt
service -d <array_name> -c print -t mel > majorEventLog.txt
service -d <array_name> -c read -q nvsram region=0xEE > NVSRAMdata.txt
service -d <array_name> -c print -t rls > readLinkStatus.csv
sscs list alarm > alarms.txt
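
The commands above can be wrapped in a small script so that all five outputs are collected in one step. The following is a sketch only: the function name collect_support_data, the optional output directory argument, and the SERVICE and SSCS override variables are illustrative assumptions and not part of CAM. The command arguments and output file names are taken unchanged from the list above.

```shell
#!/bin/sh
# Sketch: run the workaround commands from this alert for one array and
# write each output file into a chosen directory.
# SERVICE and SSCS are hypothetical overrides (not CAM settings) for cases
# where the CAM "service" and "sscs" commands are not first in PATH.
collect_support_data() {
    array=$1
    outdir=${2:-.}
    svc=${SERVICE:-service}
    sscs_cmd=${SSCS:-sscs}
    rc=0

    if [ -z "$array" ]; then
        echo "usage: collect_support_data <array_name> [output_dir]" >&2
        return 2
    fi
    mkdir -p "$outdir" || return 1

    # Same commands and output file names as listed in the alert.
    "$svc" -d "$array" -c print -t arrayprofile > "$outdir/arrayProfileSummary.txt" || rc=1
    "$svc" -d "$array" -c print -t mel > "$outdir/majorEventLog.txt" || rc=1
    "$svc" -d "$array" -c read -q nvsram region=0xEE > "$outdir/NVSRAMdata.txt" || rc=1
    "$svc" -d "$array" -c print -t rls > "$outdir/readLinkStatus.csv" || rc=1
    "$sscs_cmd" list alarm > "$outdir/alarms.txt" || rc=1

    # Non-zero if any command failed, so callers can detect a partial set.
    return "$rc"
}

# Example use (array name and directory are placeholders):
#   collect_support_data my_array /var/tmp/my_array_data
```

Returning a non-zero status when any single command fails makes it easier to notice an incomplete data set before sending it to support.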

5. Resolution

This issue is addressed in the following release:
  • Common Array Manager 6.0.1 and later
for all affected arrays listed in Contributing Factors.

CAM 6.0.1 is available for download at:

https://cds.sun.com/is-bin/INTERSHOP.enfinity/WFS/CDS-CDS_SMI-Site/en_US/-/USD/ViewProductDetail-Start?ProductRef=SSTKCAM-6.0.1.11-OTH-G-F@CDS-CDS_SMI

This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.


Modification History
04-Mar-2008: Updated Resolution section, RESOLVED
26-Mar-2008: Updated Workaround section
10-Jul-2008: Updated Workaround for clarification


Internal Comments
Please send technical questions to the following email:
[email protected]
and CC the following persons:
Internal Contributor/Submitter
Internal Eng Responsible Engineer
Internal Services Knowledge Engineer


Pending release on the SDLC, CAM 6.0.1.11 will have the "hids 108"
command removed from the support data collection.  Avoid running
'hids' altogether during serial sessions, especially when any drive is
in a non-optimal state.



The excLogShow output will show an entry similar to the following:

---- Log Entry #29 JAN-11-2008 09:13:02 AM ----

Exception: Data Abort

cpsr:  a0000013   pc:  0x01348d50  toString__CQ23hid2LUi + 0x5a4

Alignment Fault   fsr: f3      far: e7fddf0a   biusr: 10000059    bear: ffffe1ac

Registers:

   r0     =        0   r1     =        9   r2     =  a280244   r3     = e7fddefe
   r4     =        0   r5     =  a280244   r6     =  9a28a00   r7     =  a35c7d8
   r8     =        5   r9     =  a35c7d8   r10    =  a28022c   r11/fp =  a280498
   r12/ip =  a280154   r13/sp =  a280154   r14/lr =  1348d28   pc     =  1348d50
   cpsr   = a0000013

Stack Trace:

======== STACK SHOW ========

Showing for task id = 0xa280840 (symTask2), Running
FP=0xa280498, SP=0xa280154, PC=0x1348d50
Current executing task id = 0xa280840 (symTask2); not interrupted

Frame Ptr   Ret Addr  Return Name + Offset             Called Name + Offset
========== ========== ================================ ========================
0x0a2807f0 0x00139bc8 vxTaskEntry + 0x2c               vkiTask
0x0a28078c 0x0005d824 vkiTask + 0xd4                   srcOpTask
0x0a280748 0x012f29c0 srcOpTask + 0x120                cmdProcess
0x0a280730 0x014e5dbc cmdProcess + 0x74                symSYMbolCommandHandler
0x0a2806e0 0x016caa8c symSYMbolCommandHandler + 0x54   svciov_dispatch
0x0a280630 0x016dc3b4 svciov_dispatch + 0x3ec          stateCapture_1
0x0a280610 0x016c8ac8 stateCapture_1 + 0x38            systemStateCapture
0x0a280530 0x0171e8c0 systemStateCapture + 0x5a0       hids
0x0a280510 0x01355f44 hids + 0x4c                      show__Q23hid10HidManagerUlUlUl
0x0a2804e8 0x01356048 show__Q23hid10HidManagerUlUlUl + 0x90  showFunc__Q23hid10HidManagerP9HIDS_INFO
0x0a2804d0 0x013530e4 showFunc__Q23hid10HidManagerP9HIDS_INFO + 0x2ac  traverseSingleList__Q23hid10HidManagerP9HIDS_INFO
0x0a2804a8 0x013539b8 traverseSingleList__Q23hid10HidManagerP9HIDS_INFO + 0xe4  show__CQ23utl14ShowableObjecti
0x0a28048c 0x0182e138 show__CQ23utl14ShowableObjecti + 0xcc  toString__CQ23hid2LUi
           0x01348d50 toString__CQ23hid2LUi + 0x5a4

********

Task Id:         0xa280840
Name:            "symTask2"
Status:          0x00 (ready)
Options:         0x0004 (dealloc_stk)
Priority:        125
Stack base:      0xa280840
Stack end:       0xa27cea4 (adjusted for name)
Stack size:      0x399c (14748)
Stack margin:    0x1708 (5896)
Stack limit:     0xa27ceb0
Pend queue:      0x98c1b10
Last errno:      0x1c0001

Internal Contributor/Submitter
[email protected]

Internal Eng Responsible Engineer
[email protected], [email protected]

Internal Services Knowledge Engineer
[email protected]

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Escalation ID
1-23211257

Internal Sun Alert & FAB Admin Info
WF 15-Feb-2008, david m: send for release
WF 14-Feb-2008, david m: draft created


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.