Document Audience:INTERNAL
Document ID:I1147-1
Title:Sun StorEdge 3511 FC Array with SATA and JBOD with RAID fw prior to 4.11, SES fw prior to 0413, and SSCS sw prior to 2.0 may experience various downtime, drive offline, and inaccurate component status report.
Copyright Notice:Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved
Update Date:2005-05-18

------------------------------------------------------------
            - Sun Proprietary/Confidential: Internal Use Only -
------------------------------------------------------------------------

  ***  Sun Confidential:  Internal Use and Authorized VARs Only  ***
________________________________________________________________________

  This message including any attachments is confidential information
  of Sun Microsystems, Inc.  Disclosure, copying or distribution is
  prohibited without permission of Sun.  If you are not the intended
  recipient, please reply to the sender and then delete this message.
________________________________________________________________________

                        FIELD INFORMATION NOTICE
               (For Authorized Distribution by Sun Service)
FIN #: I1147-1
Synopsis: Sun StorEdge 3511 FC Array with SATA and JBOD with RAID fw prior to 4.11, SES fw prior to 0413, and SSCS sw prior to 2.0 may experience various downtime, drive offline, and inaccurate component status report.
Create Date: May/05/05
SunAlert: No
Top FIN/FCO Report: No
Products Reference: Sun StorEdge 3511 FC Array with SATA and JBOD
Product Category: Storage / Diag-Doc-Service
Product Affected: 
Systems Affected:
-----------------
Mkt_ID      Platform    Model     Description                    Serial Number
------      --------    -----     -----------                    -------------
  -          Anysys       -       System Platform Independent          -


X-Options Affected:
-------------------
Mkt_ID      Platform    Model    Description                   Serial Number
------      --------    -----    -----------                   -------------
  -          SE3511      ALL     Sun StorEdge 3511 FC Array          -
  -          SE3511      ALL     Sun StorEdge 3511 FC JBOD           -
Parts Affected: 
----------------------
Part Number              Description                          Model
-----------              -----------                          -----
     -                        -                                 -
References: 
PatchId: 113724-04

URL: http://sdpsweb.central/FIN_FCO/FIN/FINI1147-1_dir/SPE/SE3511_Cust_List.sxc
     http://sdpsweb.central/FIN_FCO/FIN/FINI1147-1_dir/Customer_Letter.sxw

Manual: 816-7290: Sun StorEdge 3000 Family Installation; Operation; 
                  and Service Manual.
        816-7293: Sun StorEdge 3000 Family Best Practices Manual for
                  the Sun StorEdge 3310 SCSI Array.
        816-7326: Sun StorEdge 3000 Family FRU Installation Guide.
        817-3629: Sun StorEdge 3000 Family Rack Installation Guide for 
                  2U Arrays.
        817-3711: Sun StorEdge 3000 Family RAID Firmware 4.0 Users 
                  Guide.
        817-3764: Sun StorEdge 3000 Family Software Installation Manual.
        817-3337: Sun StorEdge 3000 Family Configuration Service 2.0 
                  Users Guide.
        817-3338: Sun StorEdge 3000 Family Diagnostic Reporter 2.0  
                  Users Guide.
        817-4951: Sun StorEdge 3000 Family CLI 2.0 Users Guide.
        816-7930: Sun StorEdge 3000 Family Safety Regulatory and  
                  Compliance Manual.

Sun Alerts: 57612: DRAM Parity Errors or SDRAM ECC Errors on Sun StorEdge 
                   3310, Sun StorEdge 3510 or 3511 FC Array May Cause File 
                   System Integrity Issue.

            57702: Sun StorEdge 3310/3510/3511 FC Array Controllers May 
                   Incorrectly Offline Drives.

            57644: Changing the Cache Optimization Mode Incorrectly on 
                   Sun StorEdge 3310, 3510, 3511 May Cause Issues 
                   Affecting Filesystem Availability and Data Integrity.

            57690: Sun StorEdge 3310/3510/3511 Disk Rebuild Operation 
                   Fails to Complete.

            57604: Using Certain sccli(1M) Commands to Manage a Sun  
                   StorEdge 3510 Fiber Channel Array May Cause the 
                   Controller to Hang.

            57588: SNMP Functionality on Sun StorEdge 3510 With Certain 
                   Firmware Revisions May Not Function.

            57589: Sun StorEdge 3510 and 3511 FC Array in Loop Mode May 
                   Cause Attached Hosts to Experience Excessive SCSI  
                   Timeouts Upon a Reboot.
Issue Description: 
Sun StorEdge 3511 arrays with RAID firmware (FW) earlier than 4.11,
CLI software earlier than 2.0, and SES firmware earlier than 0413 may
cause system downtime and data integrity issues, as described in the
Sun Alerts listed above. The revisions of the FW, the SES code, and
the software can be checked through sccli, as explained in the patch
README file. The installation manual also describes how to check the
revisions through other interfaces.
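
As a rough illustration of the revision check, the sketch below
compares revision strings against the minimums named in the Synopsis
(RAID FW 4.11, SES FW 0413, CLI 2.0). It does not call sccli. The
function names, the plain dotted-number input format, and the choice
to flag an array when any one component is below its minimum are
assumptions made for this example only.

```python
# Minimum revisions taken from the Synopsis of this FIN.
MIN_RAID_FW = (4, 11)   # RAID controller firmware 4.11
MIN_SES_FW = 413        # SES firmware "0413", compared as an integer
MIN_CLI = (2, 0)        # CLI software 2.0


def parse_dotted(version):
    """Turn a dotted revision such as '4.11' into a tuple like (4, 11).

    Assumption: the revision is plain dotted digits with no suffix.
    """
    return tuple(int(part) for part in version.strip().split("."))


def is_affected(raid_fw, ses_fw, cli):
    """Return True if any component is older than its minimum revision."""
    return (
        parse_dotted(raid_fw) < MIN_RAID_FW
        or int(ses_fw) < MIN_SES_FW
        or parse_dotted(cli) < MIN_CLI
    )


if __name__ == "__main__":
    # Hypothetical revisions typed in by hand after checking an array.
    print(is_affected("3.27", "0411", "1.6.2"))
    print(is_affected("4.11", "0413", "2.0"))
```

Revisions are compared as integer tuples rather than as strings, so
that 4.9 sorts before 4.11.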

Below are more details about these specific issues:

DRAM Parity Errors or SDRAM ECC Errors on Sun StorEdge 3310, Sun
StorEdge 3510 or 3511 FC Arrays May Cause File System Integrity Issue.
This issue can occur when the controller firmware fails to distinguish
between single-bit ECC errors and multi-bit ECC errors. A single-bit
ECC error is recoverable, while a multi-bit ECC error is not. The
controller nevertheless appears to continue working normally after a
multi-bit error, which leads to loss of file system integrity. With
4.11 FW, the controller shuts itself down if this issue occurs.
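
The difference in behavior can be summarized with a small sketch. It
models only the policy described above, not the controller's actual
ECC logic; the type and function names are hypothetical.

```python
from enum import Enum


class EccError(Enum):
    SINGLE_BIT = "single-bit"   # recoverable
    MULTI_BIT = "multi-bit"     # not recoverable


class Action(Enum):
    CONTINUE = "continue running"
    SHUT_DOWN = "shut down controller"


def action_before_4_11(error):
    """Older FW did not distinguish the two error types, so the
    controller kept running in both cases."""
    return Action.CONTINUE


def action_with_4_11(error):
    """4.11 FW shuts the controller down on an unrecoverable error."""
    if error is EccError.MULTI_BIT:
        return Action.SHUT_DOWN
    return Action.CONTINUE


if __name__ == "__main__":
    for err in EccError:
        print(err.value, "->", action_before_4_11(err).value,
              "/", action_with_4_11(err).value)
```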

---------

Sun StorEdge 3310/3510/3511 FC Array Controllers May Incorrectly
Offline Drives.  During the recovery from a failure, Sun StorEdge
3310/3510/3511 FC array controllers may incorrectly offline good drives,
causing multiple drive failures. As a result, logical devices may
become degraded, thereby causing applications to stop running.  The 4.11
FW handles faults properly and does not cause this issue.

---------

Sun StorEdge 3310/3510/3511 Disk Rebuild Operation Fails to Complete.
In the event of a disk failure, disk rebuilding commences on the
spare drive (if configured), but the rebuild may stop after 99% and
never complete. The logical device then remains in the degraded
state. Should another drive failure occur, this condition could
result in loss of data integrity.

---------

Changing the Cache Optimization Mode Incorrectly on Sun StorEdge 3310,
3510, 3511 May Cause Issues Affecting Filesystem Availability and Data
Integrity.  The 1.6.2 CLI release and subsequent releases prevent the
user from changing the cache optimization mode while a logical drive
(LD) exists.  In addition, the relevant section of code in the 4.11
controller firmware release was rewritten so that LDs with different
modes can exist in one controller, which eliminates the problem.

---------

Using certain sccli(1M) commands to manage a Sun StorEdge 3510 or 3511
Fiber Channel Array that is configured with more than 16 LUN masks (or
filters) per LUN may cause the controller to hang. As a result, the
host (system) could experience a loss of access to the Sun StorEdge
3510 or 3511 LUNs. There are two SW bug fixes for this issue:

   A. CLI 1.5.0 supports only 256 filter map entries because of a bug
      in the code, which uses the wrong macro definition. Even when
      the controller FW supports more than 256 entries, the CLI shows
      only the first 256. The limit will be extended to 2048 in the
      2.0 release.

   B. CLI 2.0.0 fixed this by fetching at most 64 filter map entries,
      and the FW allows at most 64 entries for one .

---------

SE3511 now has the ability to send SNMP traps without the use of the
Sun StorEdge Configuration Service Console (SSCS) software.  The
primary agent did not properly handle trap requests causing loss of the
out-of-band connection and potential controller hang-up condition.  The
primary agent and IP stack in the 4.11 release were redesigned and have
eliminated the problem.

---------

A specific loop configuration of a Sun StorEdge 3510 or 3511 FC Array
may cause attached hosts to experience excessive SCSI timeouts when a
single host is rebooted.  The root cause is that the firmware did not
always handle interrupts properly during login/logout and/or loop
initialization resulting in command timeouts. Each time a LIP or
LOGIN/LOGOUT occurs on the loop the firmware must validate the WWNs for
LUN filtering. This validation process causes various internal commands
to be issued resulting in host commands being discarded intermittently.
This specific validation process occurs in loop mode only, with or
without LUN filtering in use. The problem will not occur in
point-to-point mode since a different LUN filter validation process is
used by the firmware. The problem with the validation process has
been fixed in FW 4.11.


More info about 4.11/2.0 release
--------------------------------

Firmware version 4.11 and CLI software version 2.0 add the following 
new features to StorEdge 3511 RAID arrays: 
 

1. Common source code for RAID controller firmware with separate  
   bindings specific to FC, SATA, and SCSI 

2. Improved interoperability with StorADE with regard to:
 
      2.1. Instrumentation - Discover arrays, gather telemetry data
	   and retrieve event logs; 
      2.2. Fault Management - Identify FRU faults by applying 
	   pre-established thresholds and policies from the 
	   instrumentation data; 
      2.3. Diagnostics - Ability to invoke diagnostic tools in order  
	   to isolate to a single failing FRU.
	   
3. New features:
 
      3.1. Cache specific
      -------------------
       
	   3.1.1. Independent policies for logical drives (LD): 
		  user-configurable cache policy per LD; currently
		  the cache policy is per RAID.
		     
	   3.1.2. Write-behind cache mode.
	      
      3.2. Fault management specific
      ------------------------------
       
	   3.2.1. Automatically switch to write-through mode based upon: 
		  Low Battery level, AC loss, Fan Failure, Power supply  
		  failure, Notification of high temperature in  
		  controller or enclosure.
		  
	   3.2.2. Enhanced SNMP trap
	   
	   3.2.3. Automatic system shutdown based on critical  
	          environmental conditions.
	          
      3.3. Logical Device and Logical Volume specific
      -----------------------------------------------
      
	   3.3.1. Variable stripe size support (4KB -> 256KB) per LD. 
	          Increments will be done by powers of 2 (i.e., 4KB, 8KB, 
	          16KB, ...). Currently the stripe size is set per RAID and 
	          can be either 32KB for random or 256KB for sequential.
	          
	   3.3.2. Increase the total number of Terabytes supported per  
	          LD to 64TB for sequential and 16TB for random  
	          configurations; currently these numbers are 2TB and 
	          512GB, respectively.
	          
	   3.3.3. Increase the total number of supported drives per LD 
	          to 36.
	          
	   3.3.4. Increase the number of LDs supported per controller
	          to 16; currently this number is 8.
	          
	   3.3.5. Automatic availability of RAID sets at start of 
	          initialization.
	          
	   3.3.6. 16-byte SCSI Command Descriptor Blocks (CDBs) to 
	          support > 2TB file systems.
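
Two of the numbers in items 3.3.1 and 3.3.6 can be reproduced with a
short calculation. The sketch below lists the power-of-2 stripe sizes
between 4KB and 256KB, and shows where the 2TB boundary comes from:
a 10-byte read/write CDB carries a 32-bit logical block address,
while a 16-byte CDB carries a 64-bit one. The 512-byte block size and
the function names are assumptions made for this example.

```python
def stripe_sizes_kb(smallest=4, largest=256):
    """List stripe sizes in KB, doubling from smallest to largest."""
    sizes = []
    size = smallest
    while size <= largest:
        sizes.append(size)
        size *= 2
    return sizes


def max_addressable_bytes(lba_bits, block_size=512):
    """Largest capacity reachable with an LBA field of lba_bits bits.

    Assumption: fixed 512-byte logical blocks.
    """
    return (2 ** lba_bits) * block_size


TIB = 2 ** 40

if __name__ == "__main__":
    print(stripe_sizes_kb())                    # [4, 8, 16, 32, 64, 128, 256]
    print(max_addressable_bytes(32) // TIB)     # 2, the limit of a 10-byte CDB
    print(max_addressable_bytes(64) > 64 * TIB)  # True for a 16-byte CDB
```

With 32 address bits and 512-byte blocks the limit is exactly 2^41
bytes, which is why larger LDs need the 16-byte CDB format.
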
Implementation: 
         ---
        | X |   MANDATORY (Fully Proactive)
         ---


         ---
        |   |   CONTROLLED PROACTIVE (per Sun Geo Plan)
         ---


         ---
        |   |   REACTIVE (As Required)
         ---
Corrective Action: 
The following recommendation is provided as a guideline for authorized
Sun Services Field Representatives who may encounter the above
mentioned issue.

Install patch 113724-04 as explained in the README file.

The 4.11/2.0 release is a FW & SW upgrade and does not require any HW
change.  While FW updates for this product have been non-disruptive
in previous releases, the large differences between the current code
and this release mean that the upgrade requires a controller reset
and is a disruptive process.

Before upgrading, Sun strongly recommends performing the following:

   1. Schedule a maintenance window.
   
   2. Back up all data.
   
   3. Perform a complete review of your specific configuration and 
      record it where it can be conveniently retrieved.

Please use the following link for the Customer List to identify sites 
which may be affected.

   http://sdpsweb.central/FIN_FCO/FIN/FINI1147-1_dir/SPE/SE3511_Cust_List.sxc

Please use the following link for the Customer Letter, as needed, to 
communicate this issue with the customers.

   http://sdpsweb.central/FIN_FCO/FIN/FINI1147-1_dir/Customer_Letter.sxw
Comments: 
None.

============================================================================

NOTE: FIN Tracking Instructions for Radiance/SPWeb:
--------------------------------------------------

If a Radiance case involves the application of a FIN to solve a customer
issue, please complete the following steps in Radiance/SPWeb prior to
closing the case:

    o Select "Field Information Notice" in the REFERENCE TYPE field.

    o Enter FIN ID number in the REFERENCE ID field.
      For example: I1111-1.

If possible, include additional details in the REFERENCE SUMMARY field
(e.g., implementation complete or customer declined).
--------------------------------------------------------------------------


Implementation Notes:
--------------------

In case of "Mandatory" FINs, Sun Services will attempt to contact
all known customers to recommend proactive implementation.

For "Controlled Proactive" FINs, Sun Services mission critical
support teams will initiate proactive implementation efforts for
their respective accounts as required.

For "Reactive" FINs, Sun Services and partners will implement
the necessary corrective actions as the need arises.


Billing Information:
-------------------

Warranty: On-Site Labor Rates are based on specified Warranty deliverables
          for the affected product.

Contract: On-Site Labor Rates are based on the type of service contract.

Non Contract: On-Site implementation by Sun is available based on On-Site
              Labor Rates defined in the Price List.

--------------------------------------------------------------------------

All FIN documents are accessible via Internal SunSolve.  Type "sunsolve"
in a browser and follow the prompts to Search Collections.

For questions on this document, please email:

        [email protected]

The FIN and FCO homepage is available at:

        http://sdpsweb.central/FIN_FCO/index.html

For more information on how to submit a FIN, go to:

        http://pronto.central/fin.html

To access the Service Partner Exchange, use:

        https://spe.sun.com
--------------------------------------------------------------------------