and later [Release: N/A and later ]
Sun Storage 3510 FC Array - Version: Not Applicable and later [Release: N/A and later]
Sun Storage 3310 Array - Version: Not Applicable and later [Release: N/A and later]
Sun Storage 3320 SCSI Array - Version: Not Applicable and later [Release: N/A and later]
All Platforms
Purpose
Description
Troubleshooting Sun Storage[TM] 351x and 33x0 Controllers and Array Health.
Symptoms:
- "show config" shows a failed controller
- "show redundancy" shows Redundancy status: Failed or Scanning on a dual controller array
- "show disks" shows only one channel on a dual controller array
- "show channel" shows only one channel on a dual channel array
- Controller Alert: Redundant Controller Failure Detected
- Amber controller LED indicating failure
- Chassis sounds audible alarm
- Controller appears hung
- DRAM Parity errors
Please refer to the Sun StorEdge 3000 Family RAID Firmware 4.x
User Guide, Appendix E for additional Controller related Event
Messages.
Last Review Date
May 13, 2011
Instructions for the Reader
A Troubleshooting Guide is provided to assist
in debugging a specific issue. When possible, diagnostic tools are included in the document
to assist in troubleshooting.
Troubleshooting Details
Steps to Follow
NOTE: This is a sub-set of DocID 1011431.1 : "Troubleshooting Sun StorEdge[TM] 33x0 /351x Hardware". The steps below will help verify and resolve controller problems.
Please validate that each troubleshooting step below is true for your environment. Each step will provide instructions via a link to the document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.
Step 1 - Verify that the Redundancy status of the controllers is Enabled and that the correct number of controller serial numbers is visible by issuing the "show redundancy" command from the sccli.
Refer to Chapter 6 of the Sun StorEdge 3000 Family FRU Installation Guide or the Sun StorEdge 3000 Family CLI 2.x User's Guide for an explanation of possible states.
Redundancy enabled on dual controller array:
sccli> show redundancy
Primary controller serial number: 8003752
Primary controller location: Lower
Redundancy mode: Active-Active
Redundancy status: Enabled
Secondary controller serial number: 8000596
Example of failed state:
sccli> show redundancy
Primary controller serial number: 8021942
Primary controller location: Lower
Redundancy mode: Active-Active
Redundancy status: Failed
Secondary controller serial number: 8025533
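The check in Step 1 can be sketched as a small script that reads captured "show redundancy" output. This is only an illustration: the function names parse_redundancy and redundancy_ok are hypothetical, the field labels are taken from the two examples above, and the script assumes the output has already been saved as text rather than showing how to collect it.

```python
# Sample outputs copied from the examples in this document.
HEALTHY = """\
sccli> show redundancy
Primary controller serial number: 8003752
Primary controller location: Lower
Redundancy mode: Active-Active
Redundancy status: Enabled
Secondary controller serial number: 8000596
"""

FAILED = """\
sccli> show redundancy
Primary controller serial number: 8021942
Primary controller location: Lower
Redundancy mode: Active-Active
Redundancy status: Failed
Secondary controller serial number: 8025533
"""


def parse_redundancy(text):
    """Turn 'label: value' lines into a dict; skip the prompt line."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("sccli>") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def redundancy_ok(fields, dual_controller=True):
    """True when status is Enabled and the expected serial numbers are present."""
    if fields.get("Redundancy status") != "Enabled":
        return False
    if not fields.get("Primary controller serial number"):
        return False
    if dual_controller and not fields.get("Secondary controller serial number"):
        return False
    return True


if __name__ == "__main__":
    print(redundancy_ok(parse_redundancy(HEALTHY)))  # True
    print(redundancy_ok(parse_redundancy(FAILED)))   # False
```

Any status other than Enabled (for example Failed or Scanning) is treated as a reason to continue with the remaining steps.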
Step 2 - Verify that the FRU status of the FC_RAID_IOM is OK by issuing the "show fru" command from the sccli.
Note: In a dual controller array, two FC_RAID_IOMs are listed as FRUs:
3510 Examples:
Name: FC_RAID_IOM
Description: SE3510 I/O w/SES + RAID Cont 1GB
Part Number: 370-5537
Serial Number: 000665
Revision: 04
Initial Hardware Dash Level: 04
FRU Shortname: 370-5537-04
Manufacturing Date: Wed Sep 17 05:36:35 2003
Manufacturing Location: Milpitas California, USA
Manufacturer JEDEC ID: 0x0301
FRU Location: UPPER FC RAID IOM SLOT
Chassis Serial Number: 000F26
FRU Status: OK
Lower controller has a Fault state:
Name: FC_RAID_IOM
Description: SE3510 I/O w/SES + RAID Cont 1GB
Part Number: 370-5537
Serial Number: 003115
Revision: 04
Initial Hardware Dash Level: 04
FRU Shortname: 370-5537-04
Manufacturing Date: Thu Sep 11 14:27:35 2003
Manufacturing Location: Milpitas California, USA
Manufacturer JEDEC ID: 0x0301
FRU Location: LOWER FC RAID IOM SLOT
Chassis Serial Number: 000F26
FRU Status: Fault
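As a rough illustration of Step 2, the following sketch scans saved FRU output and lists every record whose FRU Status is not OK. The helper names parse_frus and faulted_frus are hypothetical, the sample text is an abbreviated copy of the two records above, and the script assumes each record starts with a "Name:" line as in those examples.

```python
# Abbreviated copy of the two FRU records shown in this document.
FRU_SAMPLE = """\
Name: FC_RAID_IOM
Part Number: 370-5537
Serial Number: 000665
Manufacturing Date: Wed Sep 17 05:36:35 2003
FRU Location: UPPER FC RAID IOM SLOT
FRU Status: OK
Name: FC_RAID_IOM
Part Number: 370-5537
Serial Number: 003115
Manufacturing Date: Thu Sep 11 14:27:35 2003
FRU Location: LOWER FC RAID IOM SLOT
FRU Status: Fault
"""


def parse_frus(text):
    """Split the output into one dict per FRU; a 'Name:' line starts a record."""
    frus = []
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so time stamps keep their colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Name":
            current = {}
            frus.append(current)
        if current is not None:
            current[key] = value
    return frus


def faulted_frus(frus):
    """Return (name, location, status) for every FRU whose status is not OK."""
    return [
        (f.get("Name"), f.get("FRU Location"), f.get("FRU Status"))
        for f in frus
        if f.get("FRU Status") != "OK"
    ]


if __name__ == "__main__":
    for name, location, status in faulted_frus(parse_frus(FRU_SAMPLE)):
        print(f"{name} in {location}: {status}")
```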
Step 3 - Verify the health of the enclosure and all listed components are OK by issuing a "show enclosure-status" command from the sccli:
3510 Example:
sccli> show enclosure-status
Ch Id Chassis Vendor/Product ID Rev PLD WWNN WWPN
-------------------------------------------------------------------------------
2 12 000F26 SUN StorEdge 3510F A 1080 1000 204000C0FF000F26 214000C0FF000F26
3 12 000F26 SUN StorEdge 3510F A 1080 1000 204000C0FF000F26 224000C0FF000F26
Enclosure Component Status:
Type Unit Status FRU P/N FRU S/N Add'l Data
Fan 0 OK 370-5398 006568 --
Fan 1 OK 370-5398 006568 --
Fan 2 OK 370-5398 006573 --
Fan 3 OK 370-5398 006573 --
PS 0 OK 370-5398 006568 --
PS 1 OK 370-5398 006573 --
Temp 0 OK 370-5535 000F26 temp=33
Temp 1 OK 370-5535 000F26 temp=35
Temp 2 OK 370-5535 000F26 temp=33
...........
DiskSlot 7 OK 370-5535 000F26 addr=7,led=off
DiskSlot 8 OK 370-5535 000F26 addr=8,led=off
DiskSlot 9 OK 370-5535 000F26 addr=9,led=off
DiskSlot 10 OK 370-5535 000F26 addr=10,led=off
DiskSlot 11 OK 370-5535 000F26 addr=11,led=off
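The component check in Step 3 could be automated along the following lines. This is a sketch under stated assumptions: the function name non_ok_components is hypothetical, the column order (Type, Unit, Status) is taken from the example above, and any Status value other than OK is treated as a problem because this document only shows the healthy value.

```python
# Shortened copy of the 3510 example shown in this document.
ENCLOSURE_SAMPLE = """\
sccli> show enclosure-status
Ch Id Chassis Vendor/Product ID Rev PLD WWNN WWPN
2 12 000F26 SUN StorEdge 3510F A 1080 1000 204000C0FF000F26 214000C0FF000F26
Enclosure Component Status:
Type Unit Status FRU P/N FRU S/N Add'l Data
Fan 0 OK 370-5398 006568 --
PS 1 OK 370-5398 006573 --
Temp 0 OK 370-5535 000F26 temp=33
...........
DiskSlot 7 OK 370-5535 000F26 addr=7,led=off
"""


def non_ok_components(text):
    """Return (type, unit, status) for each component row whose status is not OK."""
    problems = []
    in_table = False
    for line in text.splitlines():
        if line.strip().startswith("Enclosure Component Status"):
            in_table = True  # component rows follow this heading
            continue
        if not in_table:
            continue
        tokens = line.split()
        # A component row looks like: <Type> <Unit number> <Status> ...
        # The column header and the elision line fail the digit test.
        if len(tokens) < 3 or not tokens[1].isdigit():
            continue
        kind, unit, status = tokens[0], int(tokens[1]), tokens[2]
        if status != "OK":
            problems.append((kind, unit, status))
    return problems


if __name__ == "__main__":
    print(non_ok_components(ENCLOSURE_SAMPLE))  # []
```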
Step 4 - Verify there are no critical controller events. If 4.x firmware is installed (determined via the sccli> "show inq" command), issue the sccli> "show persistent" command out of band; otherwise, issue "show events".
See Appendix E of the Sun StorEdge 3000 Family RAID Firmware 4.x User's Guide for a list of controller events.
Examples:
# sccli -o 129.153.49.188 show persistent
or
sccli> show events
Step 5 - Verify there are no mis-seated or marginal controllers/IOMs
Refer to Doc ID 1006856.1 "Troubleshooting StorEdge [TM] 351x Redundant Loop Failures".
Step 6 - Verify that the SES and PLD firmware are not mismatched by issuing the sccli "show ses" command.
If a SES or PLD mis-match is detected, refer to Doc ID 1012024.1 "How To Sun StorEdge[TM] 351x Array: SES or PLD Firmware Mismatches".
Example of matched PLD and SES:
sccli> show ses
Ch Id Chassis Vendor/Product ID Rev PLD WWNN WWPN
-------------------------------------------------------------------------------
2 12 000F26 SUN StorEdge 3510F A 1080 1000 204000C0FF000F26 214000C0FF000F26
3 12 000F26 SUN StorEdge 3510F A 1080 1000 204000C0FF000F26 224000C0FF000F26
Example of unmatched PLD:
sccli> show ses
Ch Id Chassis Vendor Product ID Rev PLD WWPN
-----------------------------------------------------------------------
2 124 SUN StorEdge 3510F A 1040 A000 204000C0FF000008
3 124 SUN StorEdge 3510F A 1040 8B00* 204000C0FF000008
* indicates SES or PLD firmware mismatch.
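The asterisk marker described above lends itself to a simple scan of saved "show ses" output. The sketch below is illustrative only: the function name ses_mismatches is hypothetical, and it assumes that device rows begin with a numeric channel and that the asterisk is attached directly to the mismatched value, as in the example.

```python
# Sample outputs copied from the examples in this document.
SES_MATCHED = """\
sccli> show ses
Ch Id Chassis Vendor/Product ID Rev PLD WWNN WWPN
2 12 000F26 SUN StorEdge 3510F A 1080 1000 204000C0FF000F26 214000C0FF000F26
3 12 000F26 SUN StorEdge 3510F A 1080 1000 204000C0FF000F26 224000C0FF000F26
"""

SES_MISMATCHED = """\
sccli> show ses
Ch Id Chassis Vendor Product ID Rev PLD WWPN
2 124 SUN StorEdge 3510F A 1040 A000 204000C0FF000008
3 124 SUN StorEdge 3510F A 1040 8B00* 204000C0FF000008
* indicates SES or PLD firmware mismatch.
"""


def ses_mismatches(text):
    """Return (channel, id, flagged value) for each row carrying an asterisk."""
    flagged = []
    for line in text.splitlines():
        tokens = line.split()
        # Device rows start with a numeric channel; this skips the prompt,
        # the header, separator lines, and the footnote that explains '*'.
        if len(tokens) < 3 or not tokens[0].isdigit():
            continue
        for token in tokens[2:]:
            if token.endswith("*") and len(token) > 1:
                flagged.append((tokens[0], tokens[1], token.rstrip("*")))
    return flagged


if __name__ == "__main__":
    print(ses_mismatches(SES_MATCHED))     # []
    print(ses_mismatches(SES_MISMATCHED))  # [('3', '124', '8B00')]
```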
Step 7 - If physical access to the array is possible, verify the controller LED is Solid or blinking Green.
If both controller LEDs are flashing or solid green, refer to Doc ID 1017618.1 "How to Resolve RAID Controller "Race Conditions" on a StorEdge 3310, SE 3320, SE 3510, or 3511 Array".
Step 8 - Verify SFP Link status LED is Solid Green.
If the LED is not lit, re-seat and/or replace the SFP and/or cable. If the SFP LED remains unlit, move the cable to another port on the HBA.
For further information, see Doc ID 1009556.1: "Verifying HBA Connectivity".
Step 9 - In a dual controller array, if the redundancy status is Failed but the redundancy mode is Active-Active and the primary serial number is visible:
Issue the sccli> sec unfail command and reply y when prompted (issue sccli> show redund to verify).
Wait up to 5 minutes for device detection to complete and the controller redundancy status to become Enabled.
Issue the sccli> show redundancy command to confirm that the redundancy status is Enabled.
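The wait in Step 9 can be expressed as a bounded polling loop. The sketch below is an assumption-laden illustration, not a supported tool: wait_for_enabled is a hypothetical name, the 15-second interval is an arbitrary choice, and fetch_status is a caller-supplied function that returns the current Redundancy status string; how that string is obtained from the array is deliberately left out.

```python
import time


def wait_for_enabled(fetch_status, timeout_s=300, interval_s=15,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the status is Enabled or the timeout (default 5 minutes) expires.

    fetch_status is a hypothetical callable supplied by the caller; it must
    return the current 'Redundancy status' value as a string.
    """
    deadline = clock() + timeout_s
    while True:
        if fetch_status() == "Enabled":
            return True
        if clock() >= deadline:
            return False  # still Failed or Scanning after the full wait
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status sequence; a real caller would query the array instead.
    simulated = iter(["Failed", "Scanning", "Enabled"])
    print(wait_for_enabled(lambda: next(simulated), interval_s=0))  # True
```

The sleep and clock parameters exist only so the loop can be exercised without real delays. If the function returns False, continue with the later steps in this document.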
Step 10 - If FC_RAID_IOM modules (controller or IOM) are missing from the sccli "show frus" command output,
see Doc ID 1012692.1: "Sun StorEdge[TM] 351x Array: How to Resolve Devices Missing from The Se3kxtr "show_frus" and "show_ses-devices" Output".
Step 11 - If symptoms remain, stop I/O to the array and reset the controller.
See Doc ID 1010657.1: "The Proper Way to De-stage Cache in a Sun StorEdge[TM] 351x/33x0 Array is to Use the "shutdown" Command".
Issue the sccli> show redundancy command to confirm that the redundancy status is Enabled.
Step 12 - If Step 11 above fails, power the array off and back on.
Issue the sccli> "show redundancy" command to confirm that the redundancy status is Enabled.
Step 13 - If the controller remains in a Failed state, replace the controller using the following:
Doc ID 1018906.1: "Sun StorEdge[TM] 3510 FC Array and Sun StorEdge[TM] 3511 SATA Array: Replacing the I/O Controller Module".
Sun StorEdge 3310 SCSI Array Controller Module Replacement Guide
Sun StorEdge 3320 SCSI Array Controller Module Replacement Guide
Sun StorEdge 3510 FC Array and 3511 SATA Array Controller Replacement Guide
Sun StorEdge 3000 Family FRU Installation Guide
FAB 1017358.1: Sun StorEdge 3310/3510/3511 controllers must be allowed to complete the firmware cross loading process during controller replacement.
If other problems are found during the course of this document, refer back to
Doc ID 1011431.1: "Troubleshooting Sun StorEdge[TM] 33x0/351x Hardware".
At this point, if you have validated that each troubleshooting step above is true for your environment, and the issue still exists, further troubleshooting is required. Please engage the next level of support.
http://pts-storage.us.oracle.com/products/SE33xx/toi/nvram.html
Change History
Date: 2010-12-08
User Name: [email protected]
Action: Currency & Update links
Attachments
This solution has no attachment