Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-73-1000288.1
Update Date:2010-09-02
Keywords:

Solution Type  FAB (standard) Sure

Solution  1000288.1 :   LDEV Blockade may occur during Microcode upgrade, or PDEV installation, leading to Host System unscheduled outage.  


Related Items
  • Sun Storage 9970 System
  • Sun Storage 9910 System
  • Sun Storage 9960 System
  • Sun Storage 9980 System
Related Categories
  • GCS>Sun Microsystems>Sun FAB>Standard>Reactive

PreviouslyPublishedAs
200403


Product
Sun StorageTek 9910
Sun StorageTek 9960 System
Sun StorageTek 9970 System
Sun StorageTek 9980 System

Xoption
  • Xoption Number: -
  • Xoption Description: Sun StorEdge 9910 Array
Xoption
  • Xoption Number: -
  • Xoption Description: Sun StorEdge 9960 Array
Xoption
  • Xoption Number: -
  • Xoption Description: Sun StorEdge 9970 Array
Xoption
  • Xoption Number: -
  • Xoption Description: Sun StorEdge 9980 Array

Impact
Due to a Microcode bug, an LDEV Blockade may occur during a Microcode
upgrade or PDEV installation. This can result in an unscheduled outage
of connected Host Systems.
A Microcode bug has been identified that can cause an out of
synchronization condition between the SVP and the DKC if an SVP to DKC
communication time-out occurs during a DKU Microcode exchange.
Restarting a DKU Microcode exchange, or installing additional HDDs,
after this out of synchronization condition has occurred can lead to
the subsystem attempting to load the wrong DKU Microcode for the HDD
model(s) installed, ** when there are two or more different HDD models
installed and/or present **.  The subsystem will detect this
inconsistency during prechecking and begin blocking HDD ports and
HDDs.  When two or more HDDs in a parity group become blocked,
logical devices (LDEVs) then become blocked.
   -ALL- Microcode versions other than the fixed versions are affected.
Subsystems with only one HDD model type installed are NOT impacted.
The following SIM Reference Codes will be recorded:
   DF8xxx/DF9xxx - Drive Port Blockade (port 0/1)
   EF1xxx - Drive Blockade.
   DFAxxx/DFBxxx - LDEV Blockade.
As a result of these SIMs, and the associated hardware / HDD blocking,
some Parity Groups will be in a Blocked status, and some host systems
may have suffered an unscheduled outage.
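When reviewing a list of logged SIMs, the reference codes above can be
matched by their first three characters. The following is a minimal
sketch only. The prefixes and descriptions come from the list in this
FAB; the function name, the dictionary name, and the assumption that a
reference code is exactly six characters (inferred from the "DF8xxx"
pattern) are illustrative and are not part of any SVP tool.

```python
# Prefix -> description, copied from the SIM Reference Code list in this FAB.
# SIM_PREFIXES and classify_sim are hypothetical names used for illustration.
SIM_PREFIXES = {
    "DF8": "Drive Port Blockade (port 0/1)",
    "DF9": "Drive Port Blockade (port 0/1)",
    "EF1": "Drive Blockade",
    "DFA": "LDEV Blockade",
    "DFB": "LDEV Blockade",
}


def classify_sim(ref_code):
    """Return the blockade type for a SIM reference code, or None.

    Assumes a six-character code, as suggested by the 'DF8xxx' pattern.
    """
    code = ref_code.strip().upper()
    if len(code) != 6:
        return None
    return SIM_PREFIXES.get(code[:3])
```

Codes that do not match one of the listed prefixes return None, so
unrelated SIMs are not mistaken for this failure.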
The sequence of events during a Microcode update that can lead to the
out of synchronization condition is as follows:
1. While the FC Wizard is in use, a status message (left side of the
   bottom line of the FC Wizard) is posted: SVP-DKC Communication Time
   Out. At this point, due to the Microcode bug, an out of
   synchronization condition has occurred between the SVP and the DKC.
2. An SVP message is displayed: [SMT2435E - An error occurred when
   replacing a microprogram.  Please check status by using the
   Maintenance window and check logs by using the Information window.]
   The Microcode exchange stops. Status is checked and found to
   be normal. There are no SIMs or SSBs posted that would indicate
   an issue.
3. During an attempt to restart the Microcode exchange manually, without
   using the FC Wizard, an SVP message is displayed: [INS2268E -
   Exclusive task (Install, Diagnosis, Replace, etc.) is already running
   on the SVP.  Please try this operation after finishing the task.]
   This SVP message can be displayed several times when attempting to
   restart the Microcode exchange.
4. Eventually the Microcode exchange restarts without the INS2268E
   message being displayed (and without waiting to check that the
   appropriate sense information, SSB A673, has been produced).
5. The Microcode exchange then appears to continue normally, including
   loading code to the CHAs/DKAs.
6. "Exchanging DKU microprogram" messages are then observed. At this
   time, the SVP message SMT2435E is displayed again.  The DKU
   Microcode upgrade stops. Status is checked and found to be
   normal.  There are no SIMs or SSBs posted that indicate an issue.
7. The DKU portion of the Microcode exchange is restarted. Shortly
   thereafter, HDD ports begin to block, leading to the LDEV blockade
   situation described above.

Symptoms


Root Cause

Resolution
INTERIM CIRCUMVENTION: For circumvention until fixed Microcode is
installed on a subsystem:
1. Perform a manual (non-FC Wizard) DKU Microcode exchange separately
   from other portions of the new Microcode set.
2. Before beginning a DKU Microcode exchange, verify the health of
   the SVP to DKC communication by checking status and version
   (both "Running" and "FM").  If there are no SVP - DKC communication
   errors and all MPs display a version for both "Running" and "FM",
   proceed with the Microcode exchange.  If not, troubleshoot by
   SVP reboot, LAN Check diagnosis, and self-replacement / replacement
   of the PCB.
3. In the event that a DKU Microcode exchange is performed on a
   subsystem with more than one model type of HDD installed and/or
   present, and the DKU Microcode exchange is stopped with SVP message
   SMT2435E or INS2268E, or with SVP-DKC Communication Time Out, wait
   for a minimum of 10 minutes TIMES the number of different installed
   HDD models. (For example, if DKR2D-J72 and DKR2E-72GB HDDs are
   installed, 10 x 2 = 20 minutes minimum wait time.) Check for SSB
   EC=A673 to be logged with a time stamp after the wait period. (To
   view SSBs, use SVP -> Information -> Log -> SSB, then "List", and
   view the SSB screen.) Only once SSB EC=A673 has been generated,
   retry the DKU code exchange or install additional HDDs.  The reason
   for this wait is that both the SVP and the DKC will detect the
   time-out, and the out of synchronization condition will eventually
   be resolved, as evidenced by SSB A673. (NOTE: The SE9990 subsystem
   is NOT affected by this Microcode bug, as only one model of HDD is
   currently released for SE9990 subsystem installation.  In addition,
   by design, the SE9990 will not block both ports of an HDD.)
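The wait rule in step 3 is simple arithmetic: 10 minutes multiplied by
the number of different HDD models installed. It can be sketched as
follows. The function and parameter names are hypothetical and chosen
for illustration; only the 10-minute factor and the example HDD model
names come from this FAB.

```python
def minimum_wait_minutes(installed_hdd_models, minutes_per_model=10):
    """Minimum wait before checking for SSB EC=A673, per step 3.

    installed_hdd_models: iterable of HDD model names. Repeated names
    count once, because the rule uses the number of DIFFERENT models.
    """
    distinct = {m.strip().upper() for m in installed_hdd_models if m.strip()}
    return minutes_per_model * len(distinct)


def may_retry_exchange(elapsed_minutes, installed_hdd_models, a673_logged):
    """True only if the wait period has passed AND SSB EC=A673 is logged."""
    waited = elapsed_minutes >= minimum_wait_minutes(installed_hdd_models)
    return waited and a673_logged


# Example from step 3: two different models installed.
print(minimum_wait_minutes(["DKR2D-J72", "DKR2E-72GB"]))  # 20
```

Note that elapsed time alone is not sufficient: per step 3, the retry
is allowed only once SSB EC=A673 has actually been generated.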
In the event that a subsystem is impacted by multiple HDD port blockade
failures, resulting in an LDEV blockade condition, follow these
recovery steps.
RECOVERY:
1. Stop all host system I/O to the subsystem.
2. Turn off AC BOX Main breakers.
3. Do not unplug any PCBs or jumpers.
4. After waiting 10 seconds, perform normal subsystem power on.
5. Check subsystem status. Previously blocked parity groups should
   now be in Correction Access status.
6. If all parity groups are in Normal or Correction Access status,
   return subsystem to customer use.
7. Recover failed HDDs by performing HDD self-replacement.
   Use the "Replace (Inline)" button. When prompted by SVP message:
   DO NOT diagnose the device;
   DO NOT update the microprogram in the device;
   DO recover the device.
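The decision in recovery step 6 can be expressed as a simple check
over the parity group statuses noted from the SVP. This is a sketch
under stated assumptions: the statuses are assumed to have been
transcribed by hand into a dictionary, and the function name and the
parity group labels are hypothetical. The two acceptable status names
are taken from step 6.

```python
# Statuses that step 6 accepts before returning the subsystem to the customer.
OK_STATUSES = {"normal", "correction access"}


def ready_for_customer_use(parity_group_status):
    """Return (ready, not_ready_groups).

    parity_group_status: dict mapping a parity group label (hypothetical
    format) to the status string observed for that group.
    """
    not_ready = sorted(
        group
        for group, status in parity_group_status.items()
        if status.strip().lower() not in OK_STATUSES
    )
    return (not not_ready, not_ready)
```

Any group still listed in the second element of the result is in a
status other than Normal or Correction Access, so the condition in
step 6 is not yet met.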
NOTE: There are two parts to this issue:
1. The communication issue between DKC and SVP during Microcode upgrade.
2. The blockade of both HDD ports due to the communication problem.
For (9900V) SE9970 / SE9980: Microcode DKCMAIN 21-12-01-00/00 (and any later
version of Microcode) contains countermeasures for both problems #1 and #2
described above.
For (9900) SE9910 / SE9960: Microcode DKCMAIN 01-19-89-00/00 (and any later
version of Microcode) contains a countermeasure for problem #2 described above.
A countermeasure for problem #1 will be included in a future SE9910 / SE9960
Microcode version.
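To decide whether an installed DKCMAIN level already includes these
countermeasures, the version strings can be compared against the fixed
levels listed above. The sketch below ASSUMES that DKCMAIN versions
order numerically, field by field, from left to right; this FAB does
not state the ordering rule, so confirm it before relying on the
result. The function names and the "9900V" / "9900" keys are
hypothetical.

```python
def parse_dkcmain(version):
    """Turn a string such as '21-12-01-00/00' into a tuple of integers."""
    main, _, suffix = version.strip().partition("/")
    fields = main.split("-") + ([suffix] if suffix else [])
    return tuple(int(field) for field in fields)


# Fixed level and the problem numbers it addresses, as listed in this FAB.
FIXED_LEVELS = {
    "9900V": ("21-12-01-00/00", {1, 2}),  # SE9970 / SE9980
    "9900": ("01-19-89-00/00", {2}),      # SE9910 / SE9960
}


def countermeasures(family, installed_version):
    """Return the set of problem numbers addressed at installed_version.

    ASSUMPTION: field-by-field numeric comparison reflects "later version".
    """
    fixed_version, problems = FIXED_LEVELS[family]
    if parse_dkcmain(installed_version) >= parse_dkcmain(fixed_version):
        return set(problems)
    return set()
```

For an SE9910 / SE9960 at or above the fixed level, the result
contains only problem 2, so the interim circumvention for problem #1
still applies to those subsystems.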

Modification History
Date: 21-MAR-2007
  • Changed Product from KASP to SE9910/9960/9970/9980


Previously Published As
100691
Internal Comments




Acronyms Used in this FIN.

CHA - CHannel Adapter

DKA - DisK Adapter

DKC - DisK Controller

DKU - DisK Unit

HDD - Hard Disk Drive

LDEV - Logical DEVice

MP - MicroProcessor

PCB - Printed Circuit Board

SIM - Service Information Message

SSB - System Sense Byte

SVP - SerVice Processor

PDEV - Physical DEVice



============================================================================


Related Information
  • URL: http://webhome.sfbay/sejsc/AL/9900v_01_034755R5.htm
    http://webhome.sfbay/sejsc/AL/9900_01_034795R3.htm
  • Other: HDS Alerts: 9900V_034755, 9900_034795

Internal Eng Business Unit Group
KE Authors

Internal Services Knowledge Engineer
[email protected] (as of 3/21/07)

Internal Kasp FAB Legacy ID
100691, I1169-1 (FIN)

Internal Sun Alert & FAB Admin Info
Critical Category:
Significant Change Date: 2005-06-15
Avoidance: Patch
Responsible Manager: null
Original Admin Info: WF - chgd Product fm KASP to SE9910/9960/9970/9980. - Joe 3/21/07

Internal SA-FAB Eng Submission
LDEV Blockade may occur during Microcode upgrade, or PDEV installation, leading to Host System unscheduled outage.

Product_uuid
2a918ae2-0a18-11d6-834a-c679537eebe7|Sun StorageTek 9910
2a94fb3c-0a18-11d6-90a8-c9c08656284f|Sun StorageTek 9960 System
4ea4b951-9fc9-4f1f-b64e-69572a400fb4|Sun StorageTek 9970 System
c2428fbe-8ab7-41d0-8b6e-ab489823c9d4|Sun StorageTek 9980 System

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.