Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-73-1020970.1
Update Date: 2010-08-27
Keywords:

Solution Type  FAB (standard) Sure

Solution  1020970.1 :   Hard media errors on Sun Storage 7000 systems.  


Related Items
  • Sun Storage 7110 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun Storage 7210 Unified Storage System
  • Sun Storage 7410 Unified Storage System
Related Categories
  • GCS>Sun Microsystems>Sun FAB>Standard>Reactive

PreviouslyPublishedAs
268248


Bug Id
<SUNBUG: 6878294>

Product
Sun Storage 7110 Unified Storage System
Sun Storage 7210 Unified Storage System
Sun Storage 7310 Unified Storage System
Sun Storage 7410 Unified Storage System

Date of Preliminary Release
29-Sep-2009

Hard media errors on Sun Storage 7000 systems (see details below).

Impact

A single hard media error on a drive can lead to unnecessary drive replacement.  The fault does not automatically trigger any ZFS action (hot spare activation or resilvering), and its only effect on the system is to prompt the user to replace the drive before replacement is actually needed.

Contributing Factors

This issue affects Sun Storage 7000 series systems running appliance software (firmware) older than 2009.Q3.

The issue has been most prevalent on Seagate 1TB ST31000NSSUN1.0T drives (primarily used in the Sun Storage 7410/7310), but it can occur on any drive type.
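The "older than 2009.Q3" test can be sketched in a few lines of Python. This is only an illustration: it assumes release labels have the "YYYY.Qn" shape used in this bulletin, and the names parse_release and is_affected are hypothetical helpers, not part of the appliance software.

```python
import re

# Release labels in this bulletin look like "2009.Q3"; the parser below
# assumes that "YYYY.Qn" shape and nothing more.
_RELEASE = re.compile(r"^(\d{4})\.Q([1-4])")

# First release in which the DISK-8000-4Q diagnosis is disabled.
FIXED_IN = "2009.Q3"


def parse_release(label):
    """Return (year, quarter) for a label such as '2009.Q3'."""
    match = _RELEASE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized release label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(label, fixed_in=FIXED_IN):
    """True when the release predates the one that disables the diagnosis."""
    # Tuples compare element by element, so (2008, 4) < (2009, 3).
    return parse_release(label) < parse_release(fixed_in)
```

For example, is_affected("2009.Q2") is true and is_affected("2009.Q3") is false. Labels in any other format raise ValueError rather than being guessed at.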

Symptoms

A fault is generated with the message ID "DISK-8000-4Q".  It can be found in the logs in the BUI or in the support bundle:

   fmdump fm/fltlog | grep DISK-8000
   fmdump -V -u UUID fm/fltlog

The Problems page will show the following description:

   "The command was terminated with a non-recovered error condition
   that may have been caused by a flaw in the media or an error in
   the recorded data."

In the CLI, maintenance hardware show (or bundle/hw/hw.aksh in the bundle) shows a drive as faulted instead of ok.

The pool has no errors displayed (all drives are ONLINE in bundle/zfs/status.out).
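When many bundles need to be checked, the grep step above can be approximated in Python. The following is a minimal sketch that works on fmdump output already saved as text. The SAMPLE lines are hypothetical and only illustrate the kind of input expected; the function does not depend on column order, it simply looks for the message ID as a whole word and for a UUID-shaped token on the same line.

```python
import re

MSG_ID = "DISK-8000-4Q"

# Generic 8-4-4-4-12 hexadecimal UUID pattern.
_UUID = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")

# Hypothetical sample text, invented for illustration only.
SAMPLE = """\
TIME                 UUID                                 SUNW-MSG-ID
Sep 20 11:02:13.4521 0f3c2a1e-7b4d-4c1a-9e2f-5a6b7c8d9e0f DISK-8000-4Q
Sep 21 08:15:42.0017 1a2b3c4d-5e6f-4a7b-8c9d-0e1f2a3b4c5d DISK-8000-0X
"""


def find_media_error_uuids(fmdump_text, msg_id=MSG_ID):
    """Return the UUIDs of fault entries that carry the given message ID."""
    uuids = []
    for line in fmdump_text.splitlines():
        # Match the message ID as a whole token, not as a substring.
        if msg_id not in line.split():
            continue
        match = _UUID.search(line)
        if match and match.group(0) not in uuids:
            uuids.append(match.group(0))
    return uuids
```

Each UUID returned can then be passed to fmdump -V -u UUID fm/fltlog to inspect the full fault record.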

Root Cause

Seagate drives are exhibiting a higher-than-normal number of hard media errors.  The disk firmware reports this error as unrecoverable, but the self-healing capabilities of a redundant ZFS configuration will rewrite the bad blocks.  The drive firmware then remaps the LBA to a spare block, which repairs the problem without requiring a complete disk replacement.  Complete details are available in CR 6878294.

Corrective Action

Workaround:

This issue is corrected in the 2009.Q3 software by disabling this particular diagnosis until the root cause can be fixed.  This workaround is covered by CR 6878521 until CR 6878294 is fixed.

Customers can upgrade to the 2009.Q3 software release to avoid seeing these faults in the future.
This release can be found at the following URL:

  http://wikis.sun.com/display/FishWorks/Sun+Storage+7000+Series+Software+Updates

If this problem has already been diagnosed, or upgrading to the new release is not feasible, it is safe to mark the problem repaired through the Maintenance -> Problems page.  It is important that this is done only for this particular fault (as described in the symptoms above).  Other disk failures may indicate fatal problems with the hardware.
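The condition "only for this particular fault" can be written down as a small checklist. The sketch below assumes the three inputs have been collected by hand from the sources named in the symptoms (the fault's message ID, the drive state from maintenance hardware show, and the device states from zfs/status.out). The function name and its parameters are hypothetical, and the check is an aid to analysis, not a substitute for it.

```python
def safe_to_mark_repaired(msg_id, drive_state, pool_device_states):
    """Return True only when all three symptoms of this bulletin are present.

    msg_id             -- message ID of the fault, e.g. "DISK-8000-4Q"
    drive_state        -- drive state as reported by the hardware view
    pool_device_states -- list of device states read from the pool status
    """
    return (
        msg_id == "DISK-8000-4Q"            # this fault and no other
        and drive_state == "faulted"        # hardware view shows the drive faulted
        and len(pool_device_states) > 0     # an empty list proves nothing
        and all(state == "ONLINE" for state in pool_device_states)
    )
```

Any other message ID, or any pool device that is not ONLINE, makes the function return False, which matches the warning that other disk failures may indicate fatal hardware problems.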

Resolution:

A complete fix that still allows for diagnosis of persistent or widespread media failures will be available in a future release.

Comments

There have been many reports of disk failures in the Sun Storage 7000 series.  These failures have a number of root causes that are actively being investigated, of which this is only one.  Any failure must be carefully analyzed to verify that it matches the symptoms of this fault before applying the workaround or communicating to the customer that their particular issue has been fixed.

References:

  BugID:  6878294



Internal Contributor/submitter
[email protected], [email protected]

Internal Eng Responsible Engineer
[email protected] Responsible Manager: [email protected]

Internal Services Knowledge Engineer
[email protected]

Internal Eng Business Unit Group
NWS (Network Storage)

Internal Sun Alert & FAB Admin Info
25-Sep-2009: Completed draft and sent to Extended Review.
29-Sep-2009: Feedback from Ext Rvw resolved - sending to Publish.


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.