Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1010555.1
Update Date:2010-01-05
Keywords:

Solution Type  Technical Instruction Sure

Solution  1010555.1 :   I/O Timeouts Result in Unexplainable Delay  


Related Items
  • Sun Fire 280R Server
Related Categories
  • GCS>Sun Microsystems>Servers>Entry-Level Servers

PreviouslyPublishedAs
214514


Description
The customer generates I/O with
dd if=/dev/rdsk/c3t0d0s0 of=/dev/null bs=32k
and watches its progress with 'iostat -xtcn 1' to see when the I/O
stops and restarts.
The host is a Sun Fire 280R running Solaris[TM] 8 with patch 108528-27,
connected as follows:
Sun Fire 280R---[Cisco MDS Switch]===3 Inter Switch Links===[Cisco MDS
Switch]---HDS 9970v
While the I/O is running, the customer pulls one of the links between the
two MDS switches. The frames that are 'on the wire' at that moment are
lost, so the affected I/O times out. In the iostat output, the I/O pauses
for the value of sd:sd_io_time plus 20 seconds and then restarts.
(Example: if sd_io_time=60, the observed pause is 80 seconds.)
This document explains where the extra 20 seconds come from.
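The arithmetic in the example above can be sketched as a small shell
helper. The function name expected_pause and the variable name
LV_OFFLINE_DELAY are illustrative only and are not part of any Solaris
tool; the 20-second value is the one given in this document.

```shell
# Illustrative helper (hypothetical name): expected I/O pause, in seconds,
# after a link between the switches is pulled.
LV_OFFLINE_DELAY=20    # extra delay described in this document

expected_pause() {
    # $1 = sd_io_time (or ssd_io_time) in seconds
    expr "$1" + "$LV_OFFLINE_DELAY"
}

expected_pause 60    # prints 80
```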


Steps to Follow
The sd target driver sets a timeout of sd_io_time seconds on each I/O. The
HBA driver uses this value to decide when to time out a command. When a
target device stops responding to commands, each I/O can therefore take up
to sd_io_time * sd_retry_count seconds before it is failed.
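As a rough illustration of that upper bound, the sketch below multiplies
the two tunables. The function name worst_case_fail_time is hypothetical,
and the values in the example call are sample inputs, not a statement of
the driver defaults.

```shell
# Illustrative helper (hypothetical name): upper bound, in seconds, before
# one I/O to an unresponsive target is failed.
worst_case_fail_time() {
    # $1 = sd_io_time in seconds, $2 = sd_retry_count
    expr "$1" \* "$2"
}

# Sample inputs only; check the values in effect on the system in question.
worst_case_fail_time 60 5    # prints 300
```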
The SAN configuration in the problem description actually uses the ssd
driver, so the relevant parameter is ssd:ssd_io_time. In an all-Sun SAN
stack, the HBA driver attaches to the ssd target driver. A 3rd party HBA
driver (such as lpfc from Emulex, or the JNI drivers) attaches to the sd
target driver instead. The latest driver from JNI is an exception: it is
Leadville (LV) compliant, uses the Solaris[TM] drivers for port, transport
and scsi-fc mapping (fp/fctl/fcp), and therefore attaches to ssd.
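Because the parameter name depends on which target driver is in use, it can
help to generate the setting from the driver name. The sketch below assumes
the tunable is set in /etc/system using the 'set module:variable=value'
form, which matches the sd:sd_io_time notation used in this document. The
function name io_time_setting is hypothetical, and the value is written in
hexadecimal only by convention.

```shell
# Illustrative helper (hypothetical name): print an /etc/system line for
# the I/O timeout of the given target driver.
io_time_setting() {
    # $1 = target driver name (sd or ssd), $2 = timeout in seconds
    printf 'set %s:%s_io_time=0x%x\n' "$1" "$1" "$2"
}

io_time_setting ssd 60    # prints: set ssd:ssd_io_time=0x3c
```

The helper only prints the line; it does not edit /etc/system.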
Within the Leadville (LV) stack, there is a 20-second delay that guards
against the unintentional removal of a cable. It ensures that a loss of
sync is due to a real failure and not to someone removing the wrong cable
and then realizing the mistake. The 20 seconds give time to correct the
mistake. If the cable is not put back within 20 seconds, error recovery
starts.
A simple way to determine whether a 3rd party HBA driver adds a similar
delay on top of sd_io_time is to remove the cable, plug it back in, and
measure how long the I/O pauses.
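One way to turn that measurement into a number is sketched below. It
assumes the throughput has been logged as one sample per line in the form
'<epoch-seconds> <kr/s>'; this log format and the function name
longest_stall are assumptions made for illustration and are not something
iostat produces directly.

```shell
# Illustrative helper (hypothetical name): longest stretch, in seconds,
# during which the logged throughput stayed at zero.
# stdin: one sample per line, "<epoch-seconds> <kr/s>".
longest_stall() {
    awk '
        NF < 2 { next }                      # skip blank or short lines
        $2 == 0 {
            if (!stalled) { start = $1; stalled = 1 }
            next
        }
        stalled {                            # first non-zero sample after a stall
            gap = $1 - start
            if (gap > max) max = gap
            stalled = 0
        }
        END { print max + 0 }                # a stall still open at EOF is ignored
    '
}

# Example: printf '100 5120\n101 0\n140 0\n181 4096\n' | longest_stall
# prints 80
```

Subtracting sd_io_time from the result gives the additional delay
contributed by the HBA stack under test.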
The Leadville stack is made of several Sun[TM] drivers:
# modinfo | egrep '(SunFC|mpxio|scsi_vhci)'
 21 101fb03a  1002c 150   1  fcp (SunFC FCP v6.0.1-2-1.20)
 22 1020ab02   6f48   -   1  fctl (SunFC Transport v6.0.1-2-1.17)
 23 102101a2   49ac   -   1  mpxio (MDI Library v6.0.1-1-1.7)
 24 1021484c   7ac8 195   1  scsi_vhci (SCSI vHCI Driver v6.0.1-1-1.8)
 25 1021be3c  10cd3 149   1  fp (SunFC Port v6.0.1-2-1.19)
 27 10239c2a  48988 153   1  qlc (SunFC Qlogic FCA v6.0.1-2-1.19)
In this example, 6.0.1 is the version of the LV stack.
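The version can also be pulled out of the modinfo output with a short
filter. The sketch below assumes every module description ends in a string
of the form 'vX.Y.Z-...' as in the listing above; other releases may format
this differently. The function name lv_stack_version is hypothetical.

```shell
# Illustrative helper (hypothetical name): extract the LV stack version
# from modinfo lines read on stdin.
lv_stack_version() {
    # Capture the digits and dots that follow " v", up to the first "-".
    sed -n 's/.* v\([0-9][0-9.]*\)-.*/\1/p' | sort -u
}

# Usage on a live system:
#   modinfo | egrep '(SunFC|mpxio|scsi_vhci)' | lv_stack_version
```

More than one line of output would suggest that the loaded modules come
from different versions of the stack.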


Product
Sun Fire 280R Server

Internal Comments
Audited/updated 12/03/09 [email protected], Entry Level SPARC Content
Team Member

I/O, timeouts
Previously Published As
73581

Change History
Date: 2010-01-05
User Name: Silvana Villamil
Action: updated
Comments: Currency check, audited by Silvana Villamil, Entry Level SPARC Content Team Member
Date: 2004-02-05
User Name: C149439
Action: Approved
Comment: Made minor edits to the document.

Gail Waldron
Version: 0
Date: 2004-02-05
User Name: 26074
Action: Approved
Comment: This appears to be technically correct.
Version: 0
Date: 2004-02-03
User Name: 27190
Action: Approved
Comment: Thanks for the review. Most of the doc came from Renaud Manus and Sanjay Tripathi.
Version: 0
Date: 2004-01-27
User Name: 27190
Action: Created
Comment:
Version: 0

  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.