Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1018045.1
Update Date: 2010-10-08
Keywords:

Solution Type: Problem Resolution Sure

Solution 1018045.1: Proper sequence of replacing disks in the event of disabled/substituted disks on a Sun StorEdge[TM] T3 RAID 5 configuration


Related Items
  • Sun Storage T3 Array
  • Sun Storage T3+ Array
Related Categories
  • GCS>Sun Microsystems>Storage - Disk>Modular Disk - Other

Previously Published As
229351


Symptoms
When two disks indicate a problem in a RAID 5 configuration on one of the following arrays:
  • Sun StorEdge[TM] T3
  • Sun StorEdge[TM] T3+
  • Sun StorEdge[TM] 6x20
the proper disk replacement sequence must be followed to prevent data loss.
If one disk shows a status of "disabled" and the other shows "substituted", then in order to keep the volume intact and mounted, replace the disk showing "disabled" first, followed by the disk showing "substituted". Replacing them in the reverse order can leave the volume dead and result in data loss.
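As a minimal illustration of this ordering rule, the following Python sketch (not a Sun tool; the function name and state labels are hypothetical) sorts problem drives into the safe replacement order:

# Hypothetical sketch of the replacement-order rule described above:
# a "disabled" drive must be replaced before a "substituted" one.

def replacement_order(drive_states):
    """Given {drive: state}, return problem drives in safe replacement order."""
    priority = {"disabled": 0, "substituted": 1}  # disabled goes first
    problems = [d for d, s in drive_states.items() if s in priority]
    return sorted(problems, key=lambda d: priority[drive_states[d]])

# Example matching this document: u1d1 is disabled, u1d4 is substituted.
states = {"u1d1": "disabled", "u1d2": "healthy", "u1d4": "substituted"}
print(replacement_order(states))  # -> ['u1d1', 'u1d4']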
Detailed Description of the problem
===================================
(This behavior has been reproduced in the lab and is described in Bug ID 4751163.)
1) This is the output of a stable Sun StorEdge[TM] T3. Both volumes v0 and v1 show "mounted" and all drives report status "0", meaning healthy. Good to go.
T3B Release 2.01.00 2002/03/22 18:35:03 (10.1.1.20)
Copyright (C) 1997-2001 Sun Microsystems, Inc.
All Rights Reserved.
t3a:/:<1>vol stat
v0            u1d1   u1d2   u1d3   u1d4   u1d5   u1d6   u1d7   u1d8   u1d9
mounted        0      0      0      0      0      0      0      0      0
v1            u2d1   u2d2   u2d3   u2d4   u2d5   u2d6   u2d7   u2d8   u2d9
mounted        0      0      0      0      0      0      0      0      0
2) We disable one of the drives; this fails the drive but does not activate the hot spare.
t3a:/:<1>vol stat
v0            u1d1   u1d2   u1d3   u1d4   u1d5   u1d6   u1d7   u1d8   u1d9
mounted        4D     0      0      0      0      0      0      0      0
v1            u2d1   u2d2   u2d3   u2d4   u2d5   u2d6   u2d7   u2d8   u2d9
mounted        0      0      0      0      0      0      0      0      0
Now the RAID 5 volume is degraded but still available. Note that drive u1d1 has a status of "4D", indicating that the drive has failed.
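Why a single failed drive is survivable: RAID 5 keeps a parity strip computed as the XOR of the data strips, so any one missing strip can be rebuilt from the survivors. A generic Python illustration of that arithmetic (not T3 firmware code) follows:

# Generic RAID 5 parity arithmetic, for illustration only (not T3 code).
from functools import reduce

def xor_strips(strips):
    """XOR equal-length byte strips column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*strips))

data = [b"AAAA", b"BBBB", b"CCCC"]   # data strips on three drives
parity = xor_strips(data)            # parity strip on a fourth drive

# Lose one strip (one failed drive) and rebuild it from the rest:
rebuilt = xor_strips([data[1], data[2], parity])
assert rebuilt == data[0]            # a single failure is recoverable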
3) Now we "substitute" drive u1d4 and copy its contents to the hot spare, so the status shows:
v0            u1d1   u1d2   u1d3   u1d4   u1d5   u1d6   u1d7   u1d8   u1d9
mounted        4D     0      0      0S     0      0      0      0      0
v1            u2d1   u2d2   u2d3   u2d4   u2d5   u2d6   u2d7   u2d8   u2d9
mounted        0      0      0      0      0      0      0      0      0
Drive u1d4 is "substituted" and we have a degraded volume that consists of
drives u1d2, u1d3, u1d5, u1d6, u1d7, u1d8, and u1d9 (the hot spare).
We are still good; the volume is still available.
4) If we now replace drive u1d4 in this condition, the status becomes:
t3a:/:<1>vol stat
v0            u1d1   u1d2   u1d3   u1d4   u1d5   u1d6   u1d7   u1d8   u1d9
unmounted      4D     0      0      4S     0      0      0      0      0
v1            u2d1   u2d2   u2d3   u2d4   u2d5   u2d6   u2d7   u2d8   u2d9
mounted        0      0      0      0      0      0      0      0      0
Volume v0 immediately becomes "unmounted", drive u1d4 shows as failed
(status 4S), and the volume becomes unavailable to the host.
Ideally we would still have a degraded RAID 5 volume with a parity group
capable of rebuilding the original failed drive, but instead we now have a
crashed volume.


Resolution
In this condition, first replace the "disabled" (failed) drive, i.e. u1d1 in the example above, followed by u1d4, the substituted drive. The volume then stays mounted and available.
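Before pulling any drive, it is worth confirming which drive the vol stat output actually flags. A hypothetical Python helper (not a Sun tool; it assumes the two-line vol stat layout shown above) pairs each drive with its status code and applies this document's rule:

# Hypothetical parser for the "vol stat" layout shown above (not a Sun tool).
# Applies this document's rule: replace the disabled drive (4D) before the
# substituted one (0S).

def parse_vol_stat(text):
    lines = [line.split() for line in text.strip().splitlines()]
    volumes = {}
    for names, stats in zip(lines[::2], lines[1::2]):
        drives = dict(zip(names[1:], stats[1:]))   # drive -> status code
        volumes[names[0]] = (stats[0], drives)     # volume -> (state, drives)
    return volumes

output = """
v0            u1d1   u1d2   u1d3   u1d4   u1d5   u1d6   u1d7   u1d8   u1d9
mounted        4D     0      0      0S     0      0      0      0      0
"""
state, drives = parse_vol_stat(output)["v0"]
replace_first = [d for d, c in drives.items() if c == "4D"]
replace_next = [d for d, c in drives.items() if c == "0S"]
print("replace first:", replace_first, "then:", replace_next)
# -> replace first: ['u1d1'] then: ['u1d4']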

Additional Information
For details on the disk states for firmware versions 2.x and 3.x, please see Technical Instruction <Document: 1012433.1> Disk States from Vol Stat output using 3.x firmware.


Product
Sun StorageTek 6120/6320 Controller Firmware 3.2
Sun StorageTek T3 Multi-Platform 1.1
Sun StorageTek T3 Array
Sun StorageTek T3+/6X20 Controller Firmware 3.1
Sun StorageTek T3+ Array Controller FW 2.1
Sun StorageTek T3+ Array

Internal Comments
For internal Sun use only.

Please see Bug ID 4751163
Case numbers 63159110 and 37206400.
T3, T3+, T3B, double failure
Previously Published As
79384

Change History
Date: 2004-12-02
User Name: 71396
Action: Approved
Comment: Reviewed document, publishing.
Version: 3
Date: 2004-12-02
User Name: 71396
Action: Accept
Comment:
Version: 0

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.