Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-77-1000088.1
Update Date:2011-02-25
Keywords:

Solution Type  Sun Alert Sure

Solution  1000088.1 :   500GB SATA Drives in Sun StorageTek 6140 and 6540 Arrays May Have the Incorrect Interposer Card Firmware  


Related Items
  • Sun Storage 6540 Array
  • Sun Storage 6140 Array
Related Categories
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Availability
  • GCS>Sun Microsystems>Sun Alert>Release Phase>Resolved
  • GCS>Sun Microsystems>Sun Alert>Criteria Category>Data Loss

PreviouslyPublishedAs
200104


Product
Sun StorageTek 6140 Array
Sun StorageTek 6540 Array

Bug Id
<SUNBUG: 6502044>

Date of Workaround Release
11-DEC-2006

Date of Resolved Release
16-JAN-2007

Impact

Systems might experience a high number of SATA drive failures possibly resulting in data loss.


Contributing Factors

This issue can occur on the following platforms:

  • X-Option XTA-ST1CF-500G7K / 500GB7200RPM SATA Drive with StorageTek 6140/6540 Arrays
  • X-Option XTA-ST1CF-500G7K supplied as Part 540-6635

Note: This issue only occurs if the interposer firmware is not "LP1145". Drives that originally shipped in trays are not affected.

To identify potentially affected drives, check the firmware reported by the Common Array Manager (CAM); affected drives show the interposer firmware as LP1131B-K2A0AJ0A.

Users can check this on the Array->Physical->Drives page in the Browser Interface, or by running the following sscs(1M) command:

   sscs list -a array disk t51d10,t51d11
   Tray: 51    Disk: t51d10
   Capacity:       465.761 GB
   Type:           SATA
   Speed (RPM):    7200
   Status:         Optimal
   State:          Enabled
   Role:           Data
   Virtual Disk:   6
   Firmware:       LP1131b-K2AOAJ0A <--- Affected
   Serial number:  KRVN65ZAJGMAAF
   WWN:            20:00:00:A0:B8:25:CB:34
   Tray: 51    Disk: t51d11
   Capacity:       465.761 GB
   Type:           SATA
   Speed (RPM):    7200
   Status:         Optimal
   State:          Enabled
   Role:           Data
   Virtual Disk:   6
   Firmware:       LP1145-K2AOAJ0A  <--- Not Affected
   Serial number:  KRVP65ZAJPT04F
   WWN:            20:00:00:A0:B8:25:9F:DE

The first string in the Firmware field before the "-" is the interposer firmware. Drives with LP1131b for interposer firmware are affected.
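
The check described above can be sketched in a few lines of Python. This is only an illustration, not a Sun-supplied tool: it assumes the sscs output has been saved as text in the layout shown above, and the function names (interposer_firmware, classify, scan) are hypothetical. The comparison ignores case because the firmware level appears both as LP1131B and LP1131b in this document.

```python
import re

# Sample text in the layout shown in the sscs output above (trimmed).
SAMPLE = """\
Tray: 51    Disk: t51d10
Firmware:       LP1131b-K2AOAJ0A
Tray: 51    Disk: t51d11
Firmware:       LP1145-K2AOAJ0A
"""


def interposer_firmware(firmware_field):
    """Return the part of the Firmware field before the first '-'."""
    return firmware_field.strip().split("-", 1)[0]


def classify(firmware_field):
    """Classify one Firmware field using the criteria in this alert."""
    level = interposer_firmware(firmware_field).upper()
    if level == "LP1145":
        return "not affected"
    if level == "LP1131B":
        return "affected"
    # Any other level is not described in this alert; review it manually.
    return "unknown"


def scan(text):
    """Map each disk name found in the text to its classification."""
    results = {}
    disk = None
    for line in text.splitlines():
        found = re.search(r"Disk:\s*(\S+)", line)
        if found:
            disk = found.group(1)
            continue
        found = re.match(r"\s*Firmware:\s*(\S+)", line)
        if found and disk is not None:
            results[disk] = classify(found.group(1))
    return results


if __name__ == "__main__":
    for name, status in scan(SAMPLE).items():
        print(name, status)
```

Any drive reported as "unknown" carries an interposer level that this alert does not mention, so the sketch leaves that decision to the engineer instead of guessing.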


Symptoms

Systems experience drive failures due to communication failures between the Interposer, Tray IO Module, and Array RAID Controller.

Failures can occur during normal drive read or write I/O, during drive replacement, or during hot sparing of Virtual Disk data to one of the allocated Global Hotspares in the array.


Workaround

Please see the Resolution below.


Resolution

This issue is addressed in the following releases:

  • Common Array Manager (CAM) 5.0.2 on Solaris with patch 124945-01 or later
  • Common Array Manager (CAM) 5.0.2 on Windows with patch 124955-01 or later

Please read the patch README for instructions on how to upgrade the SATA Interposer firmware for your affected drives to version "LP1145".

This firmware update is an offline process: I/O must be quiesced before the drive firmware is updated. See the patch README for details.



Modification History
Date: 16-JAN-2007
  • State: Resolved
  • Updated Resolution section

 



References

<SUNPATCH: 124945-01>
<SUNPATCH: 124955-01>

Previously Published As
102748
Internal Comments


The firmware upgrade will require an outage, as the drive update process is an offline process. The upgrade will also cause one or both of the Array RAID Controllers to panic, but will result in the appropriate firmware update. This expectation must be set by PTS/Backline or escalating Front Line engineers and Field Support.



Additional Symptoms:

The Major Event Log can show event type 100A (Drive CHECK CONDITION) with event specific codes b/88/1 and 4/88/0 for SATA drives.

Example:

  Date/Time: 11/28/06 1:02:54 PM
  Sequence number: 1538
  Event type: 100A
  Event category: Error
  Priority: Informational
  Description: Drive returned CHECK CONDITION
  Event specific codes: b/88/1
  Component type: Drive
  Component location: Tray 85, Slot 6
  Logged by: Controller in slot A


Internal Contributor/submitter
[email protected]

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Eng Responsible Engineer
[email protected]

Internal Services Knowledge Engineer
[email protected]

Internal Escalation ID
1-20683373, 43535125, 43536769, 65255955

Internal Resolution Patches
124945-01, 124955-01

Internal Sun Alert Kasp Legacy ID
102748

Internal Sun Alert & FAB Admin Info
Critical Category: Data Loss, Availability ==> Pervasive
Significant Change Date: 2006-12-11, 2007-01-16
Avoidance: Patch
Responsible Manager: [email protected]
Original Admin Info: [WF 16-Jan-2007, karened: I verified the CR was closed and updated before releasing as resolved]

[WF 08-Dec-2006, karened: created - was submitted as FAB]

Internal SA-FAB Eng Submission
-------- Original Message --------
Subject: Draft Sun Alert: 500GB SATA Drives in Sun StorageTek[TM] 6140 and 6540 Arrays,Can Have the Wrong Interposer Card Firmware
Date: Fri, 08 Dec 2006 12:52:47 -0500
From: Curtis DeCotis
To: [email protected], [email protected], [email protected]

I have spoken to Karen Edwards about this issue, and we agree that
it is a Sun Alert.


Synopsis: 500GB SATA Drives in Sun StorageTek[TM] 6140 and 6540 Arrays
Can Have the Wrong Interposer Card Firmware

Category: [X] Availability

[ ] Diagnosis
[ ] HA-Failure
[X] Pervasive (reported by four or more external customers)


Product: StorageTek 6140 and StorageTek 6540

BugID: 6502044

Avoidance: [ ] Workaround
[ ] Binary
[ ] T-Patch
[ ] Patch
[ ] Upgrade
[ ] FCO
[ ] HW
[ X] None (Preliminary)

State: [X ] Preliminary
[ ] Workaround
[ ] Resolved


1. Impact:

Customers can experience a high number of SATA drive failures, possibly
resulting in data loss.


2. Contributing Factors:

-Sun StorageTek 6140 Arrays with X-Option XTA-ST1CF-500G7K / 500GB
7200RPM SATA Drive
-Sun StorageTek 6540 Arrays with X-Option XTA-ST1CF-500G7K / 500GB
7200RPM SATA Drive
-X-Option XTA-ST1CF-500G7K supplied as Part 390-0247 / Hitachi
HDS725050KLA360
without interposer firmware LP1145


3. Symptoms:

Customers will experience a large number of drive failures due to communication
failures between the Interposer, Tray IO Module, and Array RAID Controller.
Failures include normal drive read or write IO, failures during drive
replacement, or during hot sparing of Virtual Disk data to one of the
allocated Global Hotspares in the array.

To identify potentially affected drives, check the firmware reported by
Common Array Manager (CAM); affected drives show the interposer firmware
as LP1131B-K2A0AJ0A.

Users can check this on the Array->Physical->Drives page in the Browser
Interface, or with the following sscs(1M) Command Line Interface command:

sscs list -a array disk t51d10,t51d11
Tray: 51 Disk: t51d10
Capacity: 465.761 GB
Type: SATA
Speed (RPM): 7200
Status: Optimal
State: Enabled
Role: Data
Virtual Disk: 6
Firmware: LP1131b-K2AOAJ0A <---BAD!!!
Serial number: KRVN65ZAJGMAAF
WWN: 20:00:00:A0:B8:25:CB:34

Tray: 51 Disk: t51d11
Capacity: 465.761 GB
Type: SATA
Speed (RPM): 7200
Status: Optimal
State: Enabled
Role: Data
Virtual Disk: 6
Firmware: LP1145-K2AOAJ0A <---GOOD!!
Serial number: KRVP65ZAJPT04F
WWN: 20:00:00:A0:B8:25:9F:DE

The first string in the Firmware field before the "-" is the interposer
firmware. Drives with LP1131b for interposer firmware meet the criteria
for this Sun Alert.



4. Relief/Workaround:

No Workaround is available at this time.

5. Resolution:

Should you observe the above symptoms, please contact Sun Services
for help in correcting this issue until a final resolution is available.

A final Resolution is pending completion.

6. Internal Section:

Escalation IDs: 1-20683373, 43535125, 43536769, 65255955
Pending Patches:
Resolution Patches:
FIN:
FCO:
Submitter: [email protected]
Responsible Engineer:
Responsible Manager: [email protected]
PTS/Engineering organization:

[ ] SSG WGS (Workgroup Systems)
[ ] SSG NSN (Netra Systems and Networking)
[ ] SSG ES (Enterprise Systems)
[ ] SSG SW (Platform Software)
[ ] SSG PNP (Processor)
[ ] NSG (Network Systems Group)
[X ] NWS (Network Storage)
[ ] OP/N1 RPE (Operating Platforms/N1 Revenue Product Engin.)
[ ] JPSE (Java Platform Sustaining Engineering)
[ ] JWSSE (Java Web Services Sustaining Engineering)
[ ] USG (User Software Group)
[ ] SSG HS (Horizontal Systems - T2000/Ontario)

Distribution: [ ] Public SunSolve
[ X] Contract SunSolve


Comments:

The RSL's for these drive types have been purged.

PTS/TSC Backline Escalations should be filed if the customer has the
criteria and symptoms addressed above. PTS/Backline Engineers have
access to a firmware download utility and interposer firmware for
updating customer systems.

The firmware upgrade *will* require an outage, as the drive update
process is an offline process. The upgrade will also cause one or
both of the Array RAID Controllers to panic, but will result in the
appropriate firmware update. This expectation must be set by PTS/Backline
or escalating Front Line engineers and Field Support.

Additional Symptoms:

The Major Event Log can show event type 100A(Drive CHECK CONDITION)
with event specific codes: b/88/1 and 4/88/0 for SATA drives:

Example:

Date/Time: 11/28/06 1:02:54 PM
Sequence number: 1538
Event type: 100A
Event category: Error
Priority: Informational
Description: Drive returned CHECK CONDITION
Event specific codes: b/88/1
Component type: Drive
Component location: Tray 85, Slot 6
Logged by: Controller in slot A


PTS Reviewer (approved by): [email protected]




1) This process requires a complete outage of data access to the array.
2) No activity can take place on the array during the update.
3) Use the csmservice -s command as defined in the README.

   This can and will cause a controller panic. Choose "option 6" to
   upgrade, then verify the change to the drive firmware using
   "option 1".

4) Check the status of both controllers in CAM. Bring any failed
   controller online using the Service Advisor in CAM.

5) Repeat for all drives with the AJOA firmware.

6) After ensuring that all drives have been updated and that the
   controllers are online and optimal, reset each controller:

   -CAM->Array->Physical Devices->Controllers
   -Click the "Reset Controller" button for controller A
   -Wait 2 minutes
   -Click the "Reset Controller" button for controller B

7) Collect supportData via the Service Advisor and forward it to Sun
   Services.
Product_uuid
8ac7dca5-a8bd-11da-85b4-080020a9ed93|Sun StorageTek 6140 Array
e35cfcfc-a31a-11da-85b4-080020a9ed93|Sun StorageTek 6540 Array


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.