Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1005456.1
Update Date:2010-12-22
Keywords:

Solution Type  Problem Resolution Sure

Solution  1005456.1 :   Sun Fire[TM] V20z/V40z may show false amber light on disk drives after system board replacement


Related Items
  • Sun Fire V20z Server
  • Sun Fire V40z Server
Related Categories
  • GCS>Sun Microsystems>Servers>x64 Servers




Symptoms
If the system is configured with hardware RAID on the onboard LSI1030 controller and the system board is replaced, the disks take a long time to sync. During this time, the disk drives may show an amber light until the syncing process is complete.

Resolution
Check the hardware layer first: enter the LSI BIOS and see whether the volume is listed as degraded there.
From the service processor:
# ssh -l user <ip_sp>
hostname_sp# platform console
Log in to the OS, then reboot:
# init 6
When the LSI controller initializes and prompts you, press Ctrl+C to enter the setup utility. Check whether the status is degraded and whether the reason is syncing.
For example:
LSI Logic MPT SCSI Setup Utility   Version  MPTBIOS-IME-5.10.02.04
<Boot Adapter List>  <Global Properties>
LSI Logic Host Bus Adapters
Adapter    PCI  Dev/  Port    IRQ  NVM  Boot   LSI Logic  RAID
           Bus  Func  Number            Order  Control    Status
<LSI1030   2    20>   2000    11   Yes  0      Enabled    Optimal
or
           Bus  Func  Number            Order  Control    Status
<LSI1030   2    20>   2000    11   Yes  0      Enabled    Resyncing
or
           Bus  Func  Number            Order  Control    Status
<LSI1030   2    20>   2000    11   Yes  0      Enabled    Degraded
Select the controller above and press [Enter].
The utility scans for devices.
Scroll down, select "RAID Properties", and press [Enter].

You will see:
RAID Properties     Array:  IM  SCSI ID:   0    Size(MB):    35003
SCSI  Device Identifier             Array  Hot    Status       Predict  Size
ID                                  Disk?  Spare               Failure  (MB)
0     SEAGATE ST336607LC      0007  Yes    No     Primary      No       35003
1     SEAGATE ST336607LC      0006  Yes    No     Ok           No       35003
(Check what the Status column says. If you are unsure, try deleting the mirror and recreating the RAID volume. Once the disks resync, you will get the true status of the RAID volume and of any hardware failure, if there is one.)
or
Check the Operating Environment to see whether the RAID volume shows as degraded. On Solaris, you would normally expect to see:
# raidctl
RAID            RAID            RAID            Disk
Volume          Status          Disk            Status
------------------------------------------------------
c1t0d0          RESYNCING       c1t0d0          OK
                                c1t1d0          OK
After a system board replacement, the Operating Environment can often misleadingly report the volume as degraded and the disks as failed:
# raidctl
RAID            RAID            RAID            Disk
Volume          Status          Disk            Status
------------------------------------------------------
c1t0d0          DEGRADED        c1t0d0          FAILED
                                c1t1d0          FAILED
You can use raidctl to delete and recreate the RAID volume:
# raidctl -d c1t0d0
# raidctl -c c1t0d0 c1t1d0
The disks will then show as syncing, as in the first raidctl listing above.
The duration of the re-sync depends on the volume size and on whether the OS and LSI drivers are loaded.
Note: If the server is rebooted before the volume synchronization completes, the re-sync resumes when the server restarts.
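The raidctl listings above can also be checked by a script rather than by eye. The following is a minimal Python sketch, not a Sun-supplied tool. The function names parse_raidctl and volume_summary are hypothetical, and the parser assumes the output has exactly the column layout shown in this article (a dashed separator line, four fields on a volume line, two fields on each additional disk line).

```python
SAMPLE_RESYNC = """\
RAID            RAID            RAID            Disk
Volume          Status          Disk            Status
------------------------------------------------------
c1t0d0          RESYNCING       c1t0d0          OK
                                c1t1d0          OK
"""

SAMPLE_DEGRADED = """\
RAID            RAID            RAID            Disk
Volume          Status          Disk            Status
------------------------------------------------------
c1t0d0          DEGRADED        c1t0d0          FAILED
                                c1t1d0          FAILED
"""


def parse_raidctl(text):
    """Parse raidctl output (layout assumed from this article) into a list of volumes."""
    volumes = []
    in_body = False
    for line in text.splitlines():
        if line.strip().startswith("---"):
            in_body = True  # data rows start after the dashed separator
            continue
        fields = line.split()
        if not in_body or not fields:
            continue
        if len(fields) == 4:
            # Volume line: volume, volume status, first disk, disk status
            volume, status, disk, disk_status = fields
            volumes.append({"volume": volume, "status": status,
                            "disks": {disk: disk_status}})
        elif len(fields) == 2 and volumes:
            # Continuation line: another disk belonging to the last volume
            disk, disk_status = fields
            volumes[-1]["disks"][disk] = disk_status
    return volumes


def volume_summary(volume):
    """Return a one-line description of a parsed volume."""
    disks = ", ".join(f"{d}={s}" for d, s in volume["disks"].items())
    return f"{volume['volume']}: {volume['status']} ({disks})"


for sample in (SAMPLE_RESYNC, SAMPLE_DEGRADED):
    for vol in parse_raidctl(sample):
        print(volume_summary(vol))
```

A DEGRADED volume with FAILED disks reported right after a system board replacement is the misleading case described above, so the script output should be read together with the LSI BIOS status, not on its own.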
OS and LSI drivers are loaded
When the OS is loaded with the proper SCSI drivers, the data transfer is done in synchronous mode, allowing for transfer speeds up to 3 MB/sec.
The re-sync is faster because the I/Os can run at speed. However, because host I/Os are running at the same time, the time is split between re-sync I/Os and host I/Os.
The method used to calculate the approximate time necessary to re-sync two disk drives is as follows (Vol represents the Volume size in Gigabytes):

resync time (hours) = ((Vol * 1024) / 3 MB/sec) / 3600

In the case of two 73 GB disk drives, the re-sync time would be:
((73 * 1024) / 3) / 3600 = 6.92 hours, or 6 hours 55 minutes (as opposed to 73 hours previously).
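The same estimate can be worked out for other volume sizes with a few lines of Python. This is only a sketch of the arithmetic above: the function names resync_hours and format_hours are hypothetical, and the 3 MB/sec default is the figure quoted in this article for the case where the OS and LSI drivers are loaded.

```python
def resync_hours(volume_gb, rate_mb_per_sec=3.0):
    """Approximate re-sync time in hours: ((Vol * 1024) / rate) / 3600."""
    if volume_gb < 0 or rate_mb_per_sec <= 0:
        raise ValueError("volume must be >= 0 and rate must be > 0")
    seconds = (volume_gb * 1024) / rate_mb_per_sec  # GB -> MB, then MB / (MB/sec)
    return seconds / 3600                            # seconds -> hours


def format_hours(hours):
    """Render a fractional hour count as 'H hours M minutes' (rounded to the minute)."""
    total_minutes = round(hours * 60)
    return f"{total_minutes // 60} hours {total_minutes % 60} minutes"


hours = resync_hours(73)
print(f"{hours:.2f} hours, or {format_hours(hours)}")
```

For the 73 GB example this prints 6.92 hours, or 6 hours 55 minutes, matching the figure above. Because host I/O shares the bus with re-sync I/O, the result is an approximation, not a guarantee.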


Additional Information
If you are not running Solaris[TM], LSI provides tools to view and monitor the RAID volume. One of them is a Java application called CIM Browser, which is typically installed on a separate system and used to monitor multiple V20z/V40z RAID configurations.
The CIM Solution consists of two software components, the CIM Provider and the CIM Browser. It is available for Linux and Windows.
CIM Provider

The CIM Provider is a daemon thread that runs on the systems being managed (i.e. Sun Fire[TM] V20z or V40z servers). It provides the information about the controller and the devices. It must be installed on every server to be managed.

CIM Browser

The CIM Browser is a Java application that may be installed on any machine on the network and that will connect over the network to the servers being managed. A single CIM Browser can monitor multiple elements on the network. The monitored elements include host adapters, peripheral devices, and device drivers.
Download CIM Browser:

To download CIM Browser, go to the following URL:

http://www.lsi.com/cm/DownloadSearch.do?locale=EN

CIM Browser will be at the bottom of the Web page.

Another option is the open-source tool mpt-status.

mpt-status is a small open-source tool that reports the status of the physical and logical drives attached to an LSI 1020/1030 RAID controller. Here is what the output looks like:

  ioc0 vol 0 type IM, 2 phy, 68 GB, flags ENABLED, state OPTIMAL
  ioc0 phy 0 IBM-ESXS MAS3735NC     FN C901, 68 GB, state ONLINE
  ioc0 phy 1 IBM-ESXS ST373453LC    FN B85D, 68 GB, state ONLINE

IMPORTANT: mpt-status is open-source software. Sun does not support open-source software.
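For monitoring, output like the sample above can be reduced to a simple healthy/unhealthy answer. The Python sketch below is illustrative only. The names parse_mpt_status and all_healthy are hypothetical, the regular expression is written against the three sample lines in this article, and treating only OPTIMAL (volumes) and ONLINE (physical disks) as healthy is an assumption drawn from that sample, since other state strings are not listed here.

```python
import re

SAMPLE = """\
  ioc0 vol 0 type IM, 2 phy, 68 GB, flags ENABLED, state OPTIMAL
  ioc0 phy 0 IBM-ESXS MAS3735NC     FN C901, 68 GB, state ONLINE
  ioc0 phy 1 IBM-ESXS ST373453LC    FN B85D, 68 GB, state ONLINE
"""

# controller, entry kind (vol or phy), index, and the trailing "state XXX" value
LINE_RE = re.compile(r"^\s*(ioc\d+)\s+(vol|phy)\s+(\d+)\b.*\bstate\s+(\w+)")


def parse_mpt_status(text):
    """Return (controller, kind, index, state) tuples for each recognized line."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            controller, kind, index, state = match.groups()
            entries.append((controller, kind, int(index), state))
    return entries


def all_healthy(entries):
    """True only if every volume is OPTIMAL and every physical disk is ONLINE.

    Assumption: these are the healthy states, based on the sample output only.
    """
    expected = {"vol": "OPTIMAL", "phy": "ONLINE"}
    return bool(entries) and all(
        state == expected[kind] for _, kind, _, state in entries
    )


entries = parse_mpt_status(SAMPLE)
for controller, kind, index, state in entries:
    print(f"{controller} {kind} {index}: {state}")
print("healthy" if all_healthy(entries) else "needs attention")
```

An empty result is reported as unhealthy on purpose, so that unrecognized or missing output does not pass silently.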

Download mpt-status:

To download mpt-status, go to the following URL:

http://www.red-bean.com/~mab/mpt-status.html

Product
Sun Fire V40z Server
Sun Fire V20z Server

Internal Comments
reference for the lsi controller:


http://nsg-tm.sfbay/nsg-tm-root/stinger-LSI-RAID-sync.html
For creating a raid instance refer to: Technical Instruction <Document: 1010776.1>
Title:    Sun Fire[TM] V20z/V40z Server: How to Set up Mirroring (RAID 1) on Internal Disks running Linux
stinger, hardware raid, raidctl, mptutil, LSI
Previously Published As
87786

Change History
Date: 2006-11-27
User Name: 71396
Action: Approved
Comment: Performed final review of article.
Updated product name and trademarking.
Changed audience from FREE to CONTRACT. For more information regarding audience selection please see:
http://kmo.central/howto/content/voyager-contributor-standards.html
reference # T07
Publishing.
Version: 3

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.