Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1007568.1
Update Date:2011-05-23
Keywords:

Solution Type  Technical Instruction Sure

Solution  1007568.1 :   Sun Fire[TM] 12K/15K/E20K/E25K: Testing a single slot 0 board with no slot 1 board in a domain  


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
210473


Applies to:

Sun Fire 12K Server
Sun Fire E20K Server
Sun Fire E25K Server
Sun Fire 15K Server
All Platforms

Goal

Before putting a new or replacement System Board into a production environment, run extended hardware POST (HPOST) tests against the board and its components to confirm that the hardware is sound.

To reduce domain downtime during this testing, configure the new or replacement board in its own "test" domain and run the extended HPOST tests there.

The method below tests a single Slot0 board (System Board) in a "test" domain that has no Slot1 (I/O or MaxCPU) board in its configuration. After testing completes successfully, the new or replacement board can be dynamically reconfigured (DR'ed) into the production domain with confidence that it does not contain bad hardware.

Solution

A slot 0 board can be tested in a domain that has no slot 1 board by following the steps below. The example uses the unused domain 'R' and tests system board SB15.

Edit the postrc file for domain R ($SMSETC/config/R/.postrc) so that it contains the three postrc directives shown below. Make sure the file is world-readable (chmod 644), and remove this temporary postrc file when your work is complete.
level 64
no_ioadapt_ok
no_obp_handoff
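
The postrc step above can be sketched as a small shell helper. This is only an illustration: the function name write_test_postrc and its arguments are hypothetical, not part of SMS. It writes the three directives into the given domain configuration directory and sets the 644 permissions described above. The level defaults to 64 and can be raised (for example to 96 when new memory was inserted).

```shell
# Hypothetical helper (not an SMS command): write a temporary test .postrc.
#   $1 = domain config directory, e.g. "$SMSETC/config/R"
#   $2 = POST level (optional, defaults to 64)
write_test_postrc() {
    config_dir="$1"
    level="${2:-64}"
    postrc="$config_dir/.postrc"

    # The three directives from the procedure above
    printf 'level %s\nno_ioadapt_ok\nno_obp_handoff\n' "$level" > "$postrc" || return 1

    # The file must be world-readable
    chmod 644 "$postrc"
}
```

For the example in this document the call would be: write_test_postrc "$SMSETC/config/R". Note that the helper overwrites any existing .postrc in that directory, so it assumes an unused domain; remove the file again when testing is finished.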

NOTES:
  • hpost should not be run manually from the command line.
  • Sun strongly encourages running POST level 64 or higher on newly inserted hardware. If new memory is inserted, level 96 or higher is advised.

In order to perform the test:
% addboard -d R -c assign SB15
% setkeyswitch -d R on

Powering on: CSB at CS1
Already powered on: CSB at CS1
Powering on: CSB at CS0
Already powered on: CSB at CS0
Powering on: EXB at EX15
Already powered on: EXB at EX15
Powering on: CPU at SB15
Already powered on: CPU at SB15
NOTE: There are no Slot 1 system boards assigned to this domain.
Significant contents of .postrc (domain) /etc/opt/SUNWSMS/SMS1.5/config/R/.postrc:
no_ioadapt_ok
no_obp_handoff
level 64
no_obp_handoff in .postrc. COD CPU license requests will be skipped
to facilitate offline hardware testing.
.
.
stage final_config: Final configuration...
Skipping OBP handoff as requested
Key to resource status value codes:
=Unknown p=Present c=Crunched _=Undefined m=Missing
i=Misconfig o=FailedOBP f=Failed b=Blacklisted r=Redlisted
x=NotInDomain u=G,unconfig P=Passed ==G,lockstep l=NoLicense
e=EmptyCasstt
CPU_Brds:  Proc  Mem P/B: 3/1 3/0 2/1 2/0 1/1 1/0 0/1 0/0
Slot  Gen  3210       /L: 10  10  10  10  10  10  10  10  CDC
SB15:   P  PPPP           PP  PP  PP  PP  PP  PP  PP  PP    P
Exitcode = 36: Non-configuration special hpost mode successful
POST (level=64, verbose=20) execution time 66:48
[5304] Domain failed by hpost: ecode=36
Resetting and deconfiguring: CPU at SB15
Resetting and deconfiguring: EXB at EX15
Powering on: CSB at CS0
Powering on: CSB at CS1
%


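To determine whether the tested System Board is OK, check the resource status value codes listed at the very end of the POST run. Only one System Board is listed, which is expected when testing a single board. In the example above, every status code for SB15 is P (Passed) and hpost reports Exitcode = 36, "Non-configuration special hpost mode successful".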
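
The check described above can also be done mechanically. The following is a minimal sketch, assuming the setkeyswitch/POST output was captured to a text file with one output line per line; the function name board_all_passed and the log file are hypothetical. It succeeds only if the log contains the "Exitcode = 36" line and every status code on the board's summary line is P.

```shell
# Hypothetical helper: verify a captured POST log for one board.
#   $1 = board name as printed in the summary, e.g. SB15
#   $2 = file containing the captured POST output
board_all_passed() {
    board="$1"
    log="$2"

    # The special test mode must have ended with exit code 36
    grep -q 'Exitcode = 36' "$log" || return 1

    # Take the board's summary line, drop the "SB15:" label and all blanks
    codes=$(grep "^ *$board:" "$log" | tail -1 | sed "s/^ *$board://" | tr -d ' \t')

    # No summary line found: treat as not verified
    [ -n "$codes" ] || return 1

    # Any character other than P means something did not pass
    case "$codes" in
        *[!P]*) return 1 ;;
        *)      return 0 ;;
    esac
}
```

The function returns 0 when the board looks good and 1 otherwise, so it can be used as: board_all_passed SB15 post_R.log && echo "SB15 passed". Treat a failure as a prompt to read the full POST output, not as a diagnosis.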

Do not be fooled by the following side-effects of having no_obp_handoff in domain R's .postrc file:

  • Your messages file will report that setkeyswitch failed:

Apr 30 08:34:54 2006 s4u-12ka-sc0 setkeyswitch[20826]-R(): [5304 78389629078374 ERR KeyswitchUtls.cc 1963] Domain failed by hpost: ecode=36
  • Regardless of the POST test result, showboards will report Unknown in the Test Status column. This happens because the PCD (platform configuration database) is not updated with the test result, and showboards queries the PCD:

Location  Pwr  Type of Board  Board Status  Test Status  Domain
--------  ---  -------------  ------------  -----------  ------
SB15      On   CPU            Assigned      Unknown      R
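
When scanning logs after such a test run, the expected "failure" can be told apart from a real one by its exit code. The sketch below is illustrative only; the function names hpost_ecode and is_expected_test_failure are hypothetical. It extracts the ecode value from a "Domain failed by hpost" message and treats 36 as the expected result of the no_obp_handoff test mode.

```shell
# Hypothetical helper: print the hpost exit code found in a log message line.
# Prints nothing if the line is not a "Domain failed by hpost" message.
hpost_ecode() {
    printf '%s\n' "$1" |
        sed -n 's/.*Domain failed by hpost: ecode=\([0-9][0-9]*\).*/\1/p'
}

# Return 0 if the message only reflects the special test mode (ecode=36).
is_expected_test_failure() {
    [ "$(hpost_ecode "$1")" = "36" ]
}
```

Any other ecode value, or a failure message for a domain that is not running with no_obp_handoff, should be investigated as a genuine error.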

Product
Sun Fire 15K Server
Sun Fire 12K Server
Sun Fire E25K Server
Sun Fire E20K Server

Internal Section

Additional References:

Keywords: HPOST, slot0, slot1, DR, diagnostic, post, test, 12k, 15k, 20k, 25k

Previously Published As 47497



  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.