Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1002804.1
Update Date:2010-08-19
Keywords:

Solution Type  Problem Resolution Sure

Solution  1002804.1 :   Sun Fire[TM] 12K/15K/E20K/E25K: Hpost fails an IO board which "Shares an expander with an unthrottled USIV+ board which is active in another domain."  


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
203828


Applies to:

Sun Fire E20K Server
Sun Fire E25K Server
Sun Fire 12K Server
Sun Fire 15K Server
All Platforms

Symptoms

Hpost fails an IO board on a split Expander:

stage asic_probe: ASIC probe and JTAG/CBus integrity test...
FAIL Slot IO5: Shares an expander with an unthrottled USIV+ board which is active in another domain.
There is no FRU service action indicated for this failure.

Changes

Sun Fire[TM] 12K/15K/E20K/E25K: Hpost fails an IO board which "Shares an expander with an unthrottled USIV+ board which is active in another domain"

This new behavior was introduced in Patch ID 124319-01, which included additional checks for AR3 (Address Repeater version 3).

The behavior is not a bug, but an additional safety check to prevent the possibility of a Dstop due to an AR overflow.

Cause

A split expander is an expander that has its System Board (SB) in one domain and its IO board (IO) in a different domain.

For example, configuring SB0 in Domain A and IO0 in Domain B would split expander EX0 between domains A and B.
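
The definition above can be checked mechanically. The following is a minimal sketch, not an SMS tool: the function name `find_split_expanders` and the input dictionary are hypothetical, and it assumes that a board named SBn or IOn sits on expander EXn, as in the SB0/IO0/EX0 example.

```python
from collections import defaultdict


def find_split_expanders(assignments):
    """Return {expander_number: sorted domain list} for every split expander.

    `assignments` (hypothetical input) maps a board name such as "SB0" or
    "IO0" to a domain letter. Boards assigned to no domain (None) are ignored.
    """
    domains_by_expander = defaultdict(set)
    for board, domain in assignments.items():
        if domain is None:
            continue
        # Assumption: "SB5" and "IO5" both sit on expander 5.
        expander = int(board[2:])
        domains_by_expander[expander].add(domain)
    # An expander is split when its boards belong to more than one domain.
    return {
        expander: sorted(domains)
        for expander, domains in domains_by_expander.items()
        if len(domains) > 1
    }


example = {"SB0": "A", "IO0": "B", "SB1": "A", "IO1": "A"}
print(find_split_expanders(example))  # {0: ['A', 'B']}
```

Here EX0 is reported as split between domains A and B, while EX1 is not, because both of its boards are in domain A.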

Solution

The preferred method to resolve this issue is to unsplit the offending Expander.

To do this, check whether rearranging the System Boards in the system would unsplit the expanders. If at all possible, this is by far the best solution.

If it is impossible or impractical to unsplit the expander through SB re-assignment, the second choice is to configure the US-IV+ board with the .postrc directive:

 force_us4plus_throttle_vector x<board_mask in hex>

There is a slight performance penalty when an SB is configured this way, so it is probably not appropriate to throttle every board. Choose the vector so that the only USIV+ boards throttled are those likely to share an expander with an IO board in another domain.

The vector is a hex bitmask.
For example, to throttle SBs 5, 12 and 13, the mask would look like this:

 0x03020 

The mask becomes more obvious when the hex is broken down into binary:

   0x 0     3     0     2     0
      |     |     |     |     |
     00  0011  0000  0010  0000
      ^     ^     ^     ^     ^
Bit  16    12     8     4     0
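
The same mask can be derived programmatically. This is a minimal sketch: the helper names `throttle_mask` and `boards_in_mask` are hypothetical, and the 18-bit width (boards 0 to 17) is an assumption taken from the diagram above.

```python
MAX_BOARDS = 18  # Assumption: bits 0..17, matching the 18-bit diagram above.


def throttle_mask(boards):
    """Build the bitmask with one bit set per SB number in `boards`."""
    mask = 0
    for number in boards:
        if not 0 <= number < MAX_BOARDS:
            raise ValueError(f"board number out of range: {number}")
        mask |= 1 << number  # bit N corresponds to SB N
    return mask


def boards_in_mask(mask):
    """Decode a bitmask back into the list of SB numbers it selects."""
    return [n for n in range(MAX_BOARDS) if mask & (1 << n)]


mask = throttle_mask([5, 12, 13])
print(f"0x{mask:05x}")          # 0x03020
print(boards_in_mask(0x03020))  # [5, 12, 13]
```

Decoding an existing vector with `boards_in_mask` is a quick way to confirm which boards a .postrc entry will throttle before it is put into use.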


Counting from right to left, starting at 0, it can be seen that bits 5, 12 and 13 are set to 1, and all other bits are set to 0.

The .postrc directive can be added to the .postrc file as a single line. Be aware that there is a search path for .postrc files. The first .postrc file found will be read and used by hpost:
  1. A file named with the -p command line option. If this is invoked, the file must exist and be readable by the hpost process, or this is considered a fatal error. The argument to -p can also be the reserved word "none", which causes no .postrc file to be read.
  2. The current working directory is searched for a file named ./.postrc. Starting with SMS 1.4, this step is skipped if the current working directory is the location of the domain or platform .postrc files (see below), or if it is /var/tmp, the home directory of SMS daemons.
  3. A domain-unique .postrc file in
    /etc/opt/SUNWSMS/config/[A-R]/.postrc
    where [A-R] is the current domain designator passed with -d.
  4. The platform default .postrc file:
    /etc/opt/SUNWSMS/config/platform/.postrc
If you introduce a domain-specific .postrc file, make sure that the settings from the platform .postrc file at /etc/opt/SUNWSMS/config/platform/.postrc are duplicated in the domain's .postrc file, because only the first .postrc file found is read.
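
The search order listed above can be modeled as follows. This is an illustrative sketch only, not hpost source code: the function `resolve_postrc` and its parameters are hypothetical, the SMS 1.4 skip rule for step 2 is assumed to apply, and the `exists` callback only tests for existence (the real check also requires the file to be readable by hpost).

```python
import os
import posixpath

CONFIG_ROOT = "/etc/opt/SUNWSMS/config"


def resolve_postrc(domain, p_option=None, cwd=None, exists=os.path.isfile):
    """Return the .postrc path that would be used, or None if none is read."""
    # Step 1: a file named with -p wins; the reserved word "none" disables .postrc.
    if p_option is not None:
        if p_option == "none":
            return None
        if not exists(p_option):
            raise FileNotFoundError(f"fatal: cannot read {p_option}")
        return p_option

    domain_dir = posixpath.join(CONFIG_ROOT, domain)
    platform_dir = posixpath.join(CONFIG_ROOT, "platform")
    cwd = posixpath.normpath(cwd if cwd is not None else os.getcwd())

    candidates = []
    # Step 2: ./.postrc, skipped (SMS 1.4 and later) for these directories.
    if cwd not in (domain_dir, platform_dir, "/var/tmp"):
        candidates.append(posixpath.join(cwd, ".postrc"))
    # Step 3: the domain-unique file; step 4: the platform default.
    candidates.append(posixpath.join(domain_dir, ".postrc"))
    candidates.append(posixpath.join(platform_dir, ".postrc"))

    for path in candidates:
        if exists(path):
            return path  # the first file found is the only one used
    return None


# Usage with a simulated file system (a set of paths that "exist").
fake_files = {
    "/etc/opt/SUNWSMS/config/A/.postrc",
    "/etc/opt/SUNWSMS/config/platform/.postrc",
}
print(resolve_postrc("A", cwd="/var/tmp", exists=fake_files.__contains__))
# /etc/opt/SUNWSMS/config/A/.postrc
```

Because the search stops at the first hit, the platform file in this example is never read for domain A, which is why platform settings have to be copied into a domain-specific file.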

To see whether a USIV+ Uniboard is throttled, check the last hpost log for that domain. The log shows which .postrc file was used (if any) and whether the throttle vector was set in it. Be aware that SMS also throttles the USIV+ Uniboard automatically if the domain containing the IO board was brought up first. In that case no .postrc throttle vector appears in the hpost logs, yet the USIV+ Uniboard is still throttled. The order in which the domains were brought up must therefore be investigated as well to get the complete picture.


FAQ:

Q1. We have a US IV+ SB which is not throttled, and therefore we cannot use the IO board in the same expander in another domain. If we DR out the SB, set the throttle vector, and DR the SB back in, then we can boot the other domain with the IO board in that (split) expander. So does DR re-read the vector settings in the .postrc file?

A1. Yes it does.

Q2. Does every HPOST level re-read the throttle vector settings? So does even an "init 6" or a DSTOP event cause the throttle vector to be re-read?

A2. Any POST run other than a "-Q (reboot)" will read the vector and implement it. The reboot case is special, in that the POST code will read the vector and check it against the current state read from the hardware. If they differ, then a message will be printed and the -Q run will be aborted. The result of this is that the normal domain recovery procedure takes over and POST is rerun at a higher level to bring the domain up, causing the vector to be read and implemented by the second POST run. The end result of changing the .postrc entries and rebooting is that the vectors will be implemented as desired but the reboot could take substantially longer than normal.
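
The behavior described in A2 can be summarized as a small decision model. This is only an illustration of the answer's wording, not actual POST code: the function `post_run` and its arguments are hypothetical.

```python
def post_run(is_reboot_q, postrc_vector, hardware_vector):
    """Model which vector ends up in hardware and what happens on the way.

    Returns (resulting_vector, events). Hypothetical model of answer A2.
    """
    events = []
    if not is_reboot_q:
        # Any POST run other than -Q reads the vector and implements it.
        events.append("vector read and implemented")
        return postrc_vector, events

    # -Q (reboot): the vector is read and compared with the hardware state.
    if postrc_vector == hardware_vector:
        events.append("-Q run proceeds: vector matches hardware state")
        return hardware_vector, events

    events.append("-Q run aborted: vector differs from hardware state")
    events.append("domain recovery reruns POST at a higher level")
    events.append("vector read and implemented")
    return postrc_vector, events


vector, events = post_run(True, 0x03020, 0x00000)
print(hex(vector))  # 0x3020
print(len(events))  # 3
```

The three events in the mismatch case correspond to the aborted -Q run followed by the second, higher-level POST run, which is why such a reboot can take substantially longer than normal.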


Product
Sun Fire E25K Server
Sun Fire E20K Server
Sun Fire 12K Server
Sun Fire 15K Server


Internal Comments:

This is the new behavior due to a workaround for the AR3 Q overflow problem.
The change went into patch 124319-01

There is also a bug that caused this issue to happen during DR
even if there were NO split expanders in the domain. See Bug ID
6532306 for details.

This is fixed in SMS Patch
120648-06 for SMS 1.5
124319-03 for SMS 1.6

See also Bug ID 6459140

For additional considerations on split expanders, please see Document 1018758.1

For additional information on the processing of the .postrc file, please see Document 1008906.1

Additional info about .postrc directives can be found here

Thanks to Don Kay whose mail on [email protected] provided the wisdom for this document.

split expander, sms 1.6, US IV+, AR3, force_us4plus_throttle_vector, hpost, postrc, .postrc, DR, vector, throttle

Previously Published As DocId 88887

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.