Document Audience: | INTERNAL |
Document ID: | I1040-1 |
Title: | SCSI errors are occurring on the Sun Fire V440 server configurations with internal Hitachi 36GB and 73GB Differential UltraSCSI 320 disk drives. |
Copyright Notice: | Copyright © 2005 Sun Microsystems, Inc. All Rights Reserved |
Update Date: | 2004-01-20 |
---------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by Sun Services)
FIN #: I1040-1
Synopsis: SCSI errors are occurring on the Sun Fire V440 server configurations with internal Hitachi 36GB and 73GB Differential UltraSCSI 320 disk drives.
Create Date: Jan/16/04
SunAlert: No
Top FIN/FCO Report: Yes
Products Reference: Sun Fire V440 Servers
Product Category: Server / Diag-Doc-Service
Product Affected:
Systems Affected:
-----------------
Mkt_ID Platform Model Description Serial Number
------ -------- ----- ----------- -------------
- V440 A42 Sun Fire V440 Server -
X-Options Affected:
-------------------
Mkt_ID Platform Model Description Serial Number
------ -------- ----- ----------- -------------
- - - - -
Parts Affected:
----------------------
Part Number Description Model
----------- ----------- -----
540-5462-01 ASSY 36GB 10K 1-in SCSI4 SPUD DRV Hitachi DK32EJ
540-5456-01 ASSY 73GB 10K 1-in SCSI4 SPUD DRV Hitachi DK32EJ
References:
PatchId: 115275-02: mpt scsi driver patch.
115662-01: LSI1030 scsi controller patch.
116369-01: Hitachi 36GB disk drive firmware patch.
(firmware PQ0B)
116370-01: Hitachi 73GB disk drive firmware patch.
(firmware PQ0B)
ECO: WO_27772
WO_27811
ESC: 549558 - Hitachi 36GB and 72GB disk with firmware PQ08 - SCSI
resync errors on V440.
Issue Description:
Sun Fire V440 server customers have opened numerous service calls
reporting a variety of SCSI errors on the server's internal
Differential UltraSCSI 320 disk drives. All reported errors occurred
on Hitachi 36GB and 73GB disks with Hitachi drive firmware version
PQ08.
A later version of the Hitachi drive firmware, PQ0B, began shipping
with Sun Fire V440 server systems built on or after January 1st, 2004.
Systems built from that date also have updated LSI firmware installed.
Each of these firmware updates addresses some of the reported SCSI
drive errors.
In addition, some modifications were made to the mpt driver in Solaris
to address some of the reported SCSI driver errors, as listed in the
Sun Fire V440 Server Product Notes document as shipped with the
system. These modifications are included in a Solaris patch for the
mpt driver.
Reported errors have occurred on the Sun Fire V440 server mpt and scsi
drivers. Sun Fire V440 server users have reported the following types
of errors:
. Connected Command Timeout
. Target X reducing sync. transfer rate
. mpt.sync_wide_backoff
. Target X reverting to async. mode
. read/write block errors with "ASC: 0x29" or "ASC: 0x48".
Examples of SCSI error messages reported are as follows:
Nov 30 14:07:09 rhapsody genunix: [ID 408822 kern.info] NOTICE: mpt0:
fault detected in device; service still available
Nov 30 14:07:09 rhapsody genunix: [ID 611667 kern.info] NOTICE: mpt0:
Connected command timeout for Target 1
Nov 30 14:45:15 rhapsody genunix: [ID 408822 kern.info] NOTICE: mpt0:
fault detected in device; service still available
Nov 30 14:45:15 rhapsody genunix: [ID 611667 kern.info] NOTICE: mpt0:
Connected command timeout for Target 1
Nov 30 14:47:15 rhapsody scsi: [ID 365881 kern.warning] WARNING:
/pci@1f,700000/scsi@2 (mpt0):
Nov 30 14:47:15 rhapsody backoff failed for target 1
Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.notice]
/pci@1f,700000/scsi@2 (mpt0):
Nov 30 14:47:15 rhapsody got external SCSI bus reset.
Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2 (mpt0):
Nov 30 14:47:15 rhapsody mpt_check_task_mgt: Task 3 failed. ioc
status = 4a target= 0
Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2/sd@3,0 (sd4):
Nov 30 14:47:15 rhapsody SCSI transport failed: reason 'reset':
retrying command
Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2/sd@2,0 (sd3):
Nov 30 14:47:15 rhapsody SCSI transport failed: reason 'reset':
retrying command
Nov 30 14:47:15 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2/sd@1,0 (sd2):
Nov 30 14:47:15 rhapsody SCSI transport failed: reason 'reset':
retrying command
Nov 30 23:08:02 rhapsody genunix: [ID 408822 kern.info] NOTICE: mpt0:
fault detected in device; service still available
Nov 30 23:08:02 rhapsody genunix: [ID 611667 kern.info] NOTICE: mpt0:
Connected command timeout for Target 1
Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2 (mpt0):
Nov 30 23:08:02 rhapsody Target 1 disabled wide SCSI mode
Nov 30 23:08:02 rhapsody mpt: [ID 554818 kern.warning] WARNING:
ID[SUNWpd.mpt.sync_wide_backoff.6012]
Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2 (mpt0):
Nov 30 23:08:02 rhapsody Target 1 reverting to async. mode
Nov 30 23:08:02 rhapsody mpt: [ID 675377 kern.warning] WARNING:
ID[SUNWpd.mpt.sync_wide_backoff.6013]
Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2/sd@3,0 (sd4):
Nov 30 23:08:02 rhapsody SCSI transport failed: reason 'reset':
retrying command
Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2/sd@0,0 (sd1):
Nov 30 23:08:02 rhapsody SCSI transport failed: reason 'reset':
retrying command
Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2/sd@1,0 (sd2):
Nov 30 23:08:02 rhapsody SCSI transport failed: reason 'reset':
retrying command
Nov 30 23:08:02 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2/sd@2,0 (sd3):
Nov 30 23:08:02 rhapsody SCSI transport failed: reason 'reset':
retrying command
Dec 1 12:09:42 rhapsody scsi: [ID 107833 kern.warning] WARNING:
/pci@1f,700000/scsi@2/sd@1,0 (sd2):
Dec 1 12:09:42 rhapsody Error for Command: write(10)
Error Level: Retryable
Dec 1 12:09:42 rhapsody scsi: [ID 107833 kern.notice] Requested Block:
46910388 Error Block: 46910388
Dec 1 12:09:42 rhapsody scsi: [ID 107833 kern.notice] Vendor: HITACHI
Serial Number: 0337S1C9KT
Dec 1 12:09:42 rhapsody scsi: [ID 107833 kern.notice] Sense Key: Unit
Attention
Dec 1 12:09:42 rhapsody scsi: [ID 107833 kern.notice] ASC: 0x29
(), ASCQ: 0x3, FRU: 0x0
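The error types listed above can be searched for in a saved system log.
The following is a minimal sketch only, not a Sun-supplied tool: the
function name scan_mpt_errors is hypothetical, the search strings are
taken from the list and the example messages above, and a
POSIX-compatible shell (for example ksh) is assumed.

```shell
#!/bin/sh
# Sketch only: count log lines matching each SCSI error signature
# listed in this FIN. scan_mpt_errors is a hypothetical helper name.
scan_mpt_errors() {
    logfile=$1
    [ -r "$logfile" ] || { echo "cannot read $logfile" >&2; return 2; }
    for sig in \
        'Connected command timeout' \
        'reducing sync. transfer rate' \
        'sync_wide_backoff' \
        'reverting to async. mode' \
        'ASC: 0x29' \
        'ASC: 0x48'
    do
        # grep -c prints the number of matching lines; -i ignores case.
        # "." is a regex wildcard, which also matches a literal period.
        # "|| true" keeps a zero count from being treated as a failure.
        n=$(grep -ic "$sig" "$logfile" || true)
        echo "$n $sig"
    done
}
```

For example, assuming the log is in the default location,
"scan_mpt_errors /var/adm/messages" prints one count per error type.
A non-zero count only indicates that the message text was logged; it
does not by itself confirm the firmware issue described in this FIN.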
To check whether your system has the latest firmware and patches to
avoid known SCSI issues, please run the following commands.
/usr/sbin/patchadd -p | grep 115275 - Verify that -02 or later is
installed
/usr/sbin/prtconf -vp | grep 'firmware-version' - Check if LSI firmware
is 1.03.11.01 or later
/usr/sbin/format
-> Select a disk #
format> inquiry - Check drive firmware is at PQ0B or later
- Repeat for each Hitachi DK32EJ 36GB and 73GB disk drive
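The "or later" comparisons in the checks above can be scripted. The
following is a minimal sketch only, assuming a POSIX-compatible shell
(for example ksh), that each line of "patchadd -p" output begins with
"Patch: <id>-<rev>", and that the firmware version is a dotted string
of numeric fields. The helper names patch_rev_ok and version_ge are
hypothetical.

```shell
#!/bin/sh
# Sketch only: helpers for the "or later" checks in this FIN.

# patch_rev_ok BASE MINREV
# Reads "patchadd -p" output on standard input. Succeeds if patch BASE
# is installed at revision MINREV or later.
patch_rev_ok() {
    base=$1
    minrev=$2
    # Assumption: each installed patch is listed as "Patch: <id>-<rev> ..."
    revs=$(sed -n "s/^Patch: *$base-\([0-9][0-9]*\).*/\1/p")
    for r in $revs; do
        if [ "$r" -ge "$minrev" ]; then
            return 0
        fi
    done
    return 1
}

# version_ge A B
# Succeeds if dotted version A is greater than or equal to B, comparing
# field by field as numbers. Missing fields count as 0.
# Assumption: every field is purely numeric.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}
        fb=${b%%.*}
        # Drop the field just read; with no dot left, nothing remains.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
        fa=${fa:-0}
        fb=${fb:-0}
        if [ "$fa" -gt "$fb" ]; then return 0; fi
        if [ "$fa" -lt "$fb" ]; then return 1; fi
    done
    return 0
}
```

For example, "/usr/sbin/patchadd -p | patch_rev_ok 115275 02" succeeds
when the mpt driver patch is at -02 or later, and
"version_ge 1.03.11.01 1.03.11.01" succeeds because the versions are
equal. The LSI firmware version string itself still has to be read from
the prtconf output shown above.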
Root cause was found to be a firmware bug on the Hitachi drives and
interaction with other bugs identified in the LSI firmware and mpt
driver.
As of January 1st, 2004, a revised version of the drive firmware, PQ0B,
which addresses the Hitachi drive timeout bug, has been included with
all newly built Sun Fire V440 server systems. ECO WO_27772 implemented
the revised firmware, PQ0B, in new systems, FRUs and disk drive
x-options.
Based on customer reports of SCSI errors, we believe that this error
has occurred in approximately 1% of systems shipped as of January 1st,
2004.
The action plan for resolving SCSI drive errors on the Sun Fire V440
server is to apply the following patches:
115275-02 mpt scsi driver patch
115662-01 LSI1030 scsi controller patch
and one or both of the following as needed:
116369-01 Hitachi 36GB disk drive firmware patch (firmware PQ0B)
116370-01 Hitachi 73GB disk drive firmware patch (firmware PQ0B)
Implementation:
---
| | MANDATORY (Fully Proactive)
---
---
| | CONTROLLED PROACTIVE (per Sun Geo Plan)
---
---
| X | REACTIVE (As Required)
---
Corrective Action:
The following recommendation is provided as a guideline for authorized
Sun Services Field Representatives who may encounter the above
mentioned issue.
Please apply the following patches in order to resolve the issue
mentioned above:
115275-02 mpt scsi driver patch
115662-01 LSI1030 scsi controller patch
116369-01 Hitachi 36GB disk drive firmware patch (f/w PQ0B)
116370-01 Hitachi 73GB disk drive firmware patch (f/w PQ0B)
The system must be booted in single-user mode from a network
connection when installing these patches, both to avoid any
complications with SDS/SVM or VxVM and to ensure the disks are idle
during the firmware upgrades.
Note that if the system is booted in single-user mode from a CDROM, the
disk drive firmware download utility will core-dump when it tries to
read the firmware on the CDROM drive itself. Any hardware RAID1 mirror
configured with the raidctl utility must also be deleted before the
upgrade; otherwise, affected drives in a hardware mirror will not be
available for upgrading.
Comments:
None.
============================================================================
Implementation Footnote:
i) In case of MANDATORY FINs, Sun Services will attempt to contact
all affected customers to recommend implementation of the FIN.
ii) For CONTROLLED PROACTIVE FINs, Sun Services mission critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Sun Services will implement the FIN as the
need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.central/FIN_FCO/index.html
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.central/
* From there, select the appropriate link to browse the FIN or FCO index.
Internet Access:
----------------
* Access the top level URL of https://spe.sun.com
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to [email protected]
--------------------------------------------------------------------------