Document Audience:INTERNAL
Document ID:I1040-1
Title:SCSI errors are occurring on the Sun Fire V440 server configurations with internal Hitachi 36GB and 73GB Differential UltraSCSI 320 disk drives.
Copyright Notice:Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved
Update Date:2004-01-20

---------------------------------------------------------
            - Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                        FIELD INFORMATION NOTICE
               (For Authorized Distribution by Sun Services)
FIN #: I1040-1
Synopsis: SCSI errors are occurring on the Sun Fire V440 server configurations with internal Hitachi 36GB and 73GB Differential UltraSCSI 320 disk drives.
Create Date: Jan/16/04
SunAlert: No
Top FIN/FCO Report: Yes
Products Reference: Sun Fire V440 Servers
Product Category: Server / Diag-Doc-Service
Product Affected: 
Systems Affected:
-----------------  
Mkt_ID   Platform   Model    Description                    Serial Number
------   --------   -----    -----------                    -------------
  -       V440       A42     Sun Fire V440 Server                 - 


X-Options Affected:
-------------------
Mkt_ID    Platform   Model     Description          Serial Number
------    --------   -----     -----------          -------------
  -          -         -            -                     -
Parts Affected: 
----------------------
Part Number    Description   	                     Model
-----------    -----------   	                     -----
540-5462-01    ASSY 36GB 10K 1-in SCSI4 SPUD DRV     Hitachi DK32EJ
540-5456-01    ASSY 73GB 10K 1-in SCSI4 SPUD DRV     Hitachi DK32EJ
References: 
PatchId: 115275-02: mpt scsi driver patch.
         115662-01: LSI1030 scsi controller patch.
         116369-01: Hitachi 36GB disk drive firmware patch. 
                    (firmware PQ0B)
         116370-01: Hitachi 73GB disk drive firmware patch. 
                    (firmware PQ0B)

ECO:     WO_27772 
         WO_27811

ESC:     549558 - Hitachi 36GB and 73GB disk with firmware PQ08 - SCSI 
                  resync errors on V440.
Issue Description: 
Sun Fire V440 server customers have opened numerous service calls
reporting a variety of SCSI errors on the system's internal
Differential UltraSCSI 320 disk drives.  All reported errors occurred
on Hitachi 36GB and 73GB disks running Hitachi drive firmware version
PQ08.

A later version of the Hitachi drive firmware, PQ0B, has shipped with
Sun Fire V440 server systems built on or after January 1st, 2004.
Systems built from that date also have updated LSI firmware
installed.  Each of these firmware updates addresses some of the
reported SCSI drive errors.

In addition, some modifications were made to the mpt driver in Solaris
to address some of the reported SCSI driver errors, as listed in the
Sun Fire V440 Server Product Notes document as shipped with the
system.  These modifications are included in a Solaris patch for the
mpt driver.

Reported errors have occurred on the Sun Fire V440 server mpt and scsi 
drivers.  Sun Fire V440 server users have reported the following types 
of errors:

   . Connected Command Timeout
   . Target X reducing sync. transfer rate
   . mpt.sync_wide_backoff
   . Target X reverting to async. mode
   . read/write block errors with "ASC: 0x29" or "ASC: 0x48".  

Examples of SCSI error messages reported are as follows:

   Nov 30 14:07:09 rhapsody genunix: [ID 408822 kern.info] NOTICE: mpt0: 
       fault detected in device; service still available
   Nov 30 14:07:09 rhapsody genunix: [ID 611667 kern.info] NOTICE: mpt0: 
       Connected command timeout for Target 1

   Nov 30 14:45:15 rhapsody genunix: [ID 408822 kern.info] NOTICE: mpt0: 
       fault detected in device; service still available
   Nov 30 14:45:15 rhapsody genunix: [ID 611667 kern.info] NOTICE: mpt0: 
       Connected command timeout for Target 1

   Nov 30 14:47:15 rhapsody scsi: [ID 365881 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2 (mpt0):
   Nov 30 14:47:15 rhapsody        backoff failed for target 1
   Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.notice] 
       /pci@1f,700000/scsi@2 (mpt0):
   Nov 30 14:47:15 rhapsody        got external SCSI bus reset.
   Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2 (mpt0):
   Nov 30 14:47:15 rhapsody        mpt_check_task_mgt: Task 3 failed. ioc 
       status = 4a target= 0
   Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2/sd@3,0 (sd4):
   Nov 30 14:47:15 rhapsody        SCSI transport failed: reason 'reset': 
       retrying command
   Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2/sd@2,0 (sd3):
   Nov 30 14:47:15 rhapsody        SCSI transport failed: reason 'reset': 
       retrying command
   Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2/sd@1,0 (sd2):
   Nov 30 14:47:15 rhapsody        SCSI transport failed: reason 'reset': 
       retrying command

   Nov 30 23:08:02 rhapsody genunix: [ID 408822 kern.info] NOTICE: mpt0: 
       fault detected in device; service still available
   Nov 30 23:08:02 rhapsody genunix: [ID 611667 kern.info] NOTICE: mpt0: 
       Connected command timeout for Target 1
   Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2 (mpt0):
   Nov 30 23:08:02 rhapsody        Target 1 disabled wide SCSI mode
   Nov 30 23:08:02 rhapsody mpt: [ID 554818 kern.warning] WARNING: 
       ID[SUNWpd.mpt.sync_wide_backoff.6012]
   Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2 (mpt0):
   Nov 30 23:08:02 rhapsody        Target 1 reverting to async. mode
   Nov 30 23:08:02 rhapsody mpt: [ID 675377 kern.warning] WARNING: 
       ID[SUNWpd.mpt.sync_wide_backoff.6013]
   Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2/sd@3,0 (sd4):
   Nov 30 23:08:02 rhapsody        SCSI transport failed: reason 'reset': 
       retrying command
   Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2/sd@0,0 (sd1):
   Nov 30 23:08:02 rhapsody        SCSI transport failed: reason 'reset': 
       retrying command
   Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2/sd@1,0 (sd2):
   Nov 30 23:08:02 rhapsody        SCSI transport failed: reason 'reset': 
       retrying command
   Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2/sd@2,0 (sd3):
   Nov 30 23:08:02 rhapsody        SCSI transport failed: reason 'reset': 
       retrying command

   Dec  1 12:09:42 rhapsody scsi: [ID 107833 kern.warning] WARNING: 
       /pci@1f,700000/scsi@2/sd@1,0 (sd2):
   Dec  1 12:09:42 rhapsody        Error for Command: write(10) 
       Error Level: Retryable
   Dec  1 12:09:42 rhapsody scsi: [ID 107833 kern.notice]  Requested Block: 
       46910388                  Error Block: 46910388
   Dec  1 12:09:42 rhapsody scsi: [ID 107833 kern.notice]  Vendor: HITACHI 
                            Serial Number: 0337S1C9KT
   Dec  1 12:09:42 rhapsody scsi: [ID 107833 kern.notice]  Sense Key: Unit 
       Attention
   Dec  1 12:09:42 rhapsody scsi: [ID 107833 kern.notice]  ASC: 0x29 
      (), ASCQ: 0x3, FRU: 0x0
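
The error signatures listed above can be searched for in a saved copy
of the system log.  The following is a minimal sketch only, not a
supported tool: the function name scan_scsi_errors is hypothetical, and
it assumes a grep that accepts -E (on Solaris, /usr/xpg4/bin/grep
rather than /usr/bin/grep).  It counts matching lines; it does not
diagnose the cause.

```shell
#!/bin/sh
# Sketch: count log lines that match the SCSI error signatures named in
# this FIN.  scan_scsi_errors is a hypothetical helper; it reads log
# text on stdin and prints the number of matching lines.
scan_scsi_errors() {
    grep -E -i -c \
        'Connected command timeout|reducing sync\. transfer rate|sync_wide_backoff|reverting to async\. mode|ASC: 0x(29|48)'
}

# Example invocation (not run here):
#   scan_scsi_errors < /var/adm/messages
```

A count of zero does not prove the system is unaffected, since older
messages may already have been rotated out of the log.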

To check whether your system has the latest firmware and patches to
avoid known SCSI issues, please run the following commands.

   /usr/sbin/patchadd -p | grep 115275      -  Verify that -02 or later is 
                                               installed 

   /usr/sbin/prtconf -vp | grep 'firmware-version'  -  Check if LSI firmware 
                                                       is 1.03.11.01 or later

   /usr/sbin/format
      -> Select a disk #
            format> inquiry    - Check drive firmware is at PQ0B or later
                               - Repeat for each Hitachi DK32EJ 36GB or 73GB disk drive
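
The first two checks above can be sketched as shell helpers.  This is
an illustration under stated assumptions, not a supported utility: the
names patch_rev_ok and version_ge are hypothetical, each installed
patch is assumed to appear in the patchadd -p output as
"Patch: <id>-<rev>", and an awk that accepts -v is assumed (nawk on
Solaris).  The LSI firmware version string still has to be read from
the prtconf output by hand and passed in as an argument.

```shell
#!/bin/sh
# patch_rev_ok BASE MINREV
# Reads `patchadd -p` style output on stdin; succeeds if patch BASE is
# installed at revision MINREV or later.
# Assumption: installed patches appear as "Patch: <id>-<rev>".
patch_rev_ok() {
    base=$1 min=$2
    awk -v base="$base" -v min="$min" '
        { for (i = 1; i < NF; i++)
            if ($i == "Patch:") {
                split($(i + 1), p, "-")
                if (p[1] == base && p[2] + 0 >= min + 0) found = 1
            }
        }
        END { exit(found ? 0 : 1) }'
}

# version_ge A B
# Succeeds if dotted numeric version A is equal to or later than B,
# comparing field by field (missing fields count as 0).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "[.]"); nb = split(b, y, "[.]")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Example invocations (not run here):
#   /usr/sbin/patchadd -p | patch_rev_ok 115275 2 && echo "mpt patch OK"
#   version_ge "<version from prtconf>" 1.03.11.01 && echo "LSI f/w OK"
```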

The root cause was found to be a firmware bug in the Hitachi drives,
interacting with other bugs identified in the LSI firmware and mpt
driver.

As of January 1st, 2004, a revised firmware version, PQ0B, which
addresses the Hitachi drive timeout bug, has been included with all
newly built Sun Fire V440 server systems.  ECO WO_27772 implemented
the revised firmware, PQ0B, in new systems, FRUs and disk drive
x-options.

Based on customer reports of SCSI errors, we believe that this error
has occurred in approximately 1% of systems shipped as of January 1st,
2004.

The action plan for resolving SCSI drive errors on the Sun Fire V440 
server is to apply the following patches:

   115275-02 mpt scsi driver patch
   115662-01 LSI1030 scsi controller patch

and one or both of the following, as needed:

   116369-01 Hitachi 36GB disk drive firmware patch (firmware PQ0B)
   116370-01 Hitachi 73GB disk drive firmware patch (firmware PQ0B)
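
Which of the two drive firmware patches is needed depends on the drives
installed and their current firmware.  The following sketch shows one
way to derive that from a hand-made inventory.  The function name
fw_patches_needed and the input format (one "CAPACITY_GB FW_REV" pair
per line, taken from the format inquiry output) are hypothetical, and
it is an assumption that Hitachi revision strings sort lexically, so
that PQ08 sorts before PQ0B.

```shell
#!/bin/sh
# fw_patches_needed
# Reads lines of "CAPACITY_GB FW_REV" on stdin (hypothetical inventory
# format) and prints the drive firmware patch IDs from this FIN that
# apply to drives below PQ0B.
# Assumption: revision strings compare lexically (PQ08 < PQ0B).
fw_patches_needed() {
    awk '$2 < "PQ0B" {
            if ($1 == 36) print "116369-01"
            else if ($1 == 73) print "116370-01"
         }' | sort -u
}

# Example invocation (not run here):
#   printf '36 PQ08\n73 PQ0B\n' | fw_patches_needed
```

The mpt driver patch and the LSI1030 controller patch are listed
unconditionally in the action plan, so they are not part of this
selection.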
Implementation: 
         ---
        |   |   MANDATORY (Fully Proactive)
         ---    
         
  
         ---
        |   |   CONTROLLED PROACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        | X |   REACTIVE (As Required)
         ---
Corrective Action: 
The following recommendation is provided as a guideline for authorized
Sun Services Field Representatives who may encounter the above
mentioned issue.

Please apply the following patches in order to resolve the issue
mentioned above:

     115275-02 mpt scsi driver patch
     115662-01 LSI1030 scsi controller patch
     116369-01 Hitachi 36GB disk drive firmware patch (f/w PQ0B)
     116370-01 Hitachi 73GB disk drive firmware patch (f/w PQ0B)
 
The system must be booted in single-user mode from the network when
installing these patches, to avoid any complications with SDS/SVM or
VxVM and to ensure the disks are idle during the firmware upgrades.

Note that if the system is booted in single-user mode from a CDROM,
the disk drive firmware download utility will core-dump when it tries
to read the firmware on the CDROM drive itself.  Any hardware RAID1
mirror configured with the raidctl utility must also be deleted first;
otherwise, affected drives in the hardware mirror will not be
available for upgrading.
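
Before starting the firmware download, the single-user requirement can
be confirmed from the who -r output.  The following is a minimal
sketch; the function name run_level is hypothetical, and it assumes the
who -r output contains the word "run-level" followed by the current
level.  It does not verify that the system was booted from the network
or that no raidctl mirror remains; those still need to be checked
separately.

```shell
#!/bin/sh
# run_level
# Reads `who -r` output on stdin and prints the current run level.
# Assumption: the output contains "run-level <level>".
run_level() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i == "run-level") { print $(i + 1); exit } }'
}

# Example invocation (not run here):
#   [ "`who -r | run_level`" = "S" ] || echo "Not in single-user mode"
```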
Comments: 
None.

============================================================================
Implementation Footnote: 
i)   In case of MANDATORY FINs, Sun Services will attempt to contact   
     all affected customers to recommend implementation of the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Sun Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Sun Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.central/FIN_FCO/index.html

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.central/

* From there, select the appropriate link to browse the FIN or FCO index.

Internet Access:
----------------
* Access the top level URL of https://spe.sun.com
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to [email protected]
--------------------------------------------------------------------------
Status:active