Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1002109.1
Update Date: 2011-05-27
Keywords:

Solution Type  Technical Instruction Sure

Solution  1002109.1 :   Sun Fire[TM] 12K/15K/E20K/E25K: POST Overview  


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

Previously Published As
203006


Applies to:

Sun Fire 12K Server
Sun Fire 15K Server
Sun Fire E20K Server
Sun Fire E25K Server
All Platforms
***Checked for relevance on 27-May-2011***

Goal

This document is an overview of POST (Power-On Self Test) on the Sun Fire[TM] 12K/15K/E20K/E25K platform.

Solution

POST is the software that takes control of the hardware in Sun Fire 12K/15K/E20K/E25K domains at power-on or an equivalent reset. It probes, tests, and configures the domain resources, then transfers control to OBP.

When lpost executes for a CPU, it checks the lpost versions in FPROM and in SMS (see "CPU lpost version" below).

POST is a multi-threaded application (number of threads = number of processors + 1 for hpost). Domain processors are typically sequenced in parallel by hpost using local tests called lpost.

The hpost component

hpost (Host POST) is the controlling entity of POST. It is part of the SMS package. hpost communicates with several SMS daemons, most notably hwad and pcd. All platform hardware communication from hpost goes through hwad, across the console bus.

The lpost components

lpost (local POST) is a component of hpost that is executed by a domain CPU. The CPU lpost tests are stored in FPROM on slot 0. The SC also has a disk copy of each lpost file in /opt/SUNWSMS/hostobjs.

For non-system boards, such as I/O and expander boards, the appropriate lpost image is downloaded from the SC's disk to the domain's memory.

In all cases lpost is slave to hpost. Communication between the two is through SRAM on a specific I/O board.

CPU lpost version

To check the CPU lpost version on the SC and in FPROM (on the SB), use the command shown below.

Example for domain A SBs:

v4u-15ka-sc0:sms-svc:1> flashupdate -d a -f /opt/SUNWSMS/hostobjs/sgcpu.flash -n

    Current System Board FPROM Information
    ========================================
    CPU at SB2, FPROM 0:
    POST   03/05/04 11:28:00  Release 5.17.0  Build 6.4 I/F 12
    OBP    03/05/04 11:27:00  Release 5.17.0  Build 6.4
    Ver    03/05/04 11:28:00  Release 5.17.0  Build 6.4

    CPU at SB2, FPROM 1:
    POST   03/05/04 11:28:00  Release 5.17.0  Build 6.4 I/F 12
    OBP    03/05/04 11:27:00  Release 5.17.0  Build 6.4
    Ver    03/05/04 11:28:00  Release 5.17.0  Build 6.4

    Flash Image Information
    ==========================
    POST   03/05/04 11:28:00  Release 5.17.0 Build 6.4 I/F 12
    OBP    03/05/04 11:27:00  Release 5.17.0 Build 6.4
    Ver    03/05/04 11:28:00  Release 5.17.0 Build 6.4

    Do you wish to update the FPROM (yes/no)  N

  • If the version of lpost in FPROM is the same as the version in SMS: no issues.
  • If FPROM lpost is an earlier version than SMS lpost: CPUs associated with that FPROM may fail POST.
  • If FPROM lpost is a later version than SMS lpost: hpost will log an uprev warning. SMS needs to be patched if this warning is seen.
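
The three outcomes above can be sketched as a small comparison routine. This is only an illustration: the function names are hypothetical, and it assumes that the "Release" field on the POST line of the flashupdate output (for example 5.17.0) is the value being compared, which the output above suggests but does not state.

```python
import re
from typing import Optional, Tuple


def extract_release(post_line: str) -> Optional[str]:
    """Pull the 'Release x.y.z' value out of one line of flashupdate output."""
    match = re.search(r"Release\s+(\d+(?:\.\d+)*)", post_line)
    return match.group(1) if match else None


def parse_release(release: str) -> Tuple[int, ...]:
    """Turn '5.17.0' into (5, 17, 0) so releases compare numerically."""
    return tuple(int(part) for part in release.strip().split("."))


def classify_lpost_versions(fprom_release: str, sms_release: str) -> str:
    """Map an FPROM/SMS release pair onto the three cases listed above."""
    fprom = parse_release(fprom_release)
    sms = parse_release(sms_release)
    if fprom == sms:
        return "match: no issues"
    if fprom < sms:
        return "FPROM older: CPUs associated with that FPROM may fail POST"
    return "FPROM newer: hpost logs an uprev warning, SMS needs patching"


# Lines copied from the sample output above (assumed representative).
fprom_line = "POST   03/05/04 11:28:00  Release 5.17.0  Build 6.4 I/F 12"
image_line = "POST   03/05/04 11:28:00  Release 5.17.0 Build 6.4 I/F 12"
print(classify_lpost_versions(extract_release(fprom_line),
                              extract_release(image_line)))
```

Comparing integer tuples rather than raw strings avoids ordering 5.9.0 after 5.17.0.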

To check the lpost version for non-system boards, use these commands:

    starcat-sc0:sms-svc:29> cd /opt/SUNWSMS/hostobjs

    starcat-sc0:sms-svc:31> mcs -p pcilpost.elf | grep elf
    pcilpost.elf:
    SMI sun4u_Sun_Fire_15K pcilpost.elf 5.17.0 Fri Mar  5 19:30:41 GMT 2004

    starcat-sc0:sms-svc:33> mcs -p caged_pcilpost.elf | grep elf
    caged_pcilpost.elf:
    SMI sun4u_Sun_Fire_15K caged_pcilpost.elf 5.17.0 Fri Mar  5 19:30:41 GMT 2004

    starcat-sc0:sms-svc:34> mcs -p explpost.elf | grep elf
    explpost.elf:
    SMI sun4u_Sun_Fire_15K explpost.elf 5.17.0 Fri Mar  5 19:31:30 GMT 2004
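
When several lpost images need to be checked, the comment lines printed by mcs can be picked apart mechanically. The sketch below assumes every image carries a comment in the same layout as the three samples above (SMI, platform string, image name, release, build date). That layout is inferred from the samples, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class LpostImage(NamedTuple):
    platform: str
    image: str
    release: str
    built: str


# Layout inferred from the sample output: "SMI <platform> <image>.elf <release> <date>"
_COMMENT_RE = re.compile(
    r"^\s*SMI\s+(\S+)\s+(\S+\.elf)\s+(\d+(?:\.\d+)*)\s+(.+?)\s*$"
)


def parse_mcs_comment(line: str) -> Optional[LpostImage]:
    """Return the fields of one mcs comment line, or None if it does not match."""
    match = _COMMENT_RE.match(line)
    if match is None:
        return None
    platform, image, release, built = match.groups()
    # Collapse the double space that pads single-digit days ("Mar  5").
    return LpostImage(platform, image, release, " ".join(built.split()))


sample = "    SMI sun4u_Sun_Fire_15K pcilpost.elf 5.17.0 Fri Mar  5 19:30:41 GMT 2004"
info = parse_mcs_comment(sample)
print(info.image, info.release)
```

Lines that only echo the file name, such as "pcilpost.elf:", do not match the pattern and yield None, so they can be skipped safely.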

POST order of operation

  1. POST reads information from the PCD and determines which slots (0/1) are assigned to the domain.
  2. POST assumes that all hardware resources assigned to the domain are powered on. Powering on the components is the responsibility of SMS (setkeyswitch).
  3. POST clears the error state in the domain resources. If the error state cannot be cleared, the component is marked as failed.
  4. The domain's resources are scanned to inventory the components present (such as CPUs, memory, and I/O adapters) and their characteristics (such as type, speed, and size). POST's findings are compared to the entries in the PCD to confirm agreement between the two. Incompatibilities, such as those between part revisions or sizes and between actual and operating frequencies, are detected and handled. Components may be failed out of the configuration to maintain an internally consistent system.
  5. Built-in self tests (LBIST and IBIST) are executed within and between ASICs to confirm individual ASIC operation and the communication paths between them.
  6. lpost tests are downloaded and executed for slot 0 and slot 1 boards.
  7. At the end, POST assembles all of the components that have been successfully configured and have passed all tests into the final domain configuration. This information is then passed to OBP.
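
The sequence above behaves like a filter: a component that fails at any stage is dropped, and only the survivors reach the final configuration handed to OBP. The toy model below illustrates that idea only. The Component class, its flags, and the stage names are invented for this sketch and do not reflect hpost internals.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    """Hypothetical stand-in for a board or device assigned to the domain."""
    name: str
    error_clearable: bool = True   # step 3
    matches_pcd: bool = True       # step 4
    passes_bist: bool = True       # step 5
    passes_lpost: bool = True      # step 6
    failed_at: str = ""            # empty while the component is still good


# Stage label paired with the flag that decides whether a component survives it.
STAGES = [
    ("clear error state", "error_clearable"),
    ("inventory vs PCD", "matches_pcd"),
    ("LBIST/IBIST", "passes_bist"),
    ("lpost tests", "passes_lpost"),
]


def run_post(components: List[Component]) -> List[str]:
    """Run every stage in order and return the names that passed all of them."""
    for stage, flag in STAGES:
        for comp in components:
            if not comp.failed_at and not getattr(comp, flag):
                comp.failed_at = stage  # failed out; later stages skip it
    return [comp.name for comp in components if not comp.failed_at]


boards = [
    Component("SB0"),
    Component("SB2", passes_bist=False),
    Component("IO4", matches_pcd=False),
]
print(run_post(boards))
```

Recording the stage at which each component was failed out mirrors the kind of information a support engineer would look for in the POST log.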

Invoking POST

  1. setkeyswitch: This is the most common invocation of POST.
  2. DR attach: When a slot 0 or slot 1 board is DR'ed into a running domain.
  3. Dstop/Rstop: If a failure causes a correctable error (rstop), POST is responsible for recording the state of the hardware at the time of the error. If a failure causes a dstop of the domain, POST records the current state of the hardware, and hpost is then executed to try to identify the faulty component and configure it out of the domain.
  4. Panic/reboot: After a domain reset, POST must ensure that OBP is delivered a valid set of resources from which to build a device tree.
  5. Manually: In some instances, a support specialist may run POST directly.

POST special cases

1. Split expander: To deal with this, POST examines the PCD for information on all 18 possible domains. If a domain is listed as active, the expander boards associated with all active slots in that domain are considered active. These expander boards and the ASICs on them are assumed to have been tested and configured by a previous POST process. POST will modify the configuration of the split expander, but only in a manner that does not impact the running domain. If POST finds discrepancies or errors when examining the split expander, such as misconfigured ASICs or a domain stop condition, it abandons the slot it is using on that expander and reports it as failed.

If the expander is not in use, POST assumes ownership of the full EXB and configures it. To avoid a race condition in which two different POST processes take ownership of the same expander, hpost creates a lock file ($SMSVAR/.lock/hpost.lock.nn).

2. DR attach of an I/O board: At attach time, DR arranges for a CPU and memory from a slot 0 SB to run hpost. POST creates a transaction/error cage consisting of the loaned processor, the memory, and the I/O board being tested. During POST, the cage prevents transactions from the board under test from routing to any other configured components of the running domain. After POST completes, the CPU and memory are released to SunOS and the I/O board is introduced into the running domain.
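
The lock file mentioned for the split expander case is an instance of a common pattern: create the file atomically and treat "already exists" as "someone else owns the resource". The sketch below shows that general pattern only. It is not taken from hpost, and the file name and temporary directory are placeholders rather than the real $SMSVAR/.lock path.

```python
import os
import tempfile


def try_acquire_lock(lock_path: str) -> bool:
    """Create the lock file atomically; return False if it already exists."""
    try:
        # O_CREAT | O_EXCL makes creation fail if another process got there first.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for troubleshooting
    return True


def release_lock(lock_path: str) -> None:
    """Remove the lock file so another process can take ownership."""
    os.remove(lock_path)


with tempfile.TemporaryDirectory() as lock_dir:
    path = os.path.join(lock_dir, "example.lock")  # placeholder name
    print(try_acquire_lock(path))   # first caller takes ownership
    print(try_acquire_lock(path))   # second caller is refused
    release_lock(path)
```

Because the existence check and the creation happen in one system call, two processes cannot both believe they created the file.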


Internal information and links

There are a few hpost command line options that are useful to know about and understand.

Be aware that running hpost directly will not bring a domain online. POST only configures and tests hardware. It does not download OBP or initiate the SunOS boot process.

Command Option
    Meaning / Usefulness

-d <domain ID/TAG>
    Executes POST against a specific domain.

-
    Lists all command line options.

-?
    Lists all command line options, with verbose output.

-D
    Creates a hardware state dump file. Typically invoked by DSMD when a dstop or rstop has been detected. In cases where a domain is hung and SunOS has not panicked or a domain stop has not occurred, using -D can force a dump of the hardware registers. The resulting dump file may contain useful information for troubleshooting.

-e<code>
    Lists a terse meaning of any error code returned by POST.

-H
    Configures and tests a board being introduced by DR.
    Do not run hpost with this option manually.

-l<number>
    Sets the level of POST testing. The level is also configurable through the level directive in .postrc.

-L
    Ensures that the lpost and hpost files are valid, consistent, and compatible with each other. No POST testing is performed.

-Q
    Quick POST. SMS passes the -Q flag after a SunOS reboot. The existing configuration of the domain is used rather than repeating a full POST. This reduces the reboot time of a domain.
    Do not run hpost with this option manually.

-v<number>
    Sets the verbosity of POST output.

-W
    Clears an rstop condition once the dump file has finished processing.
    Do not run hpost with this option manually.
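
The restrictions in the table can be enforced in a wrapper before hpost is ever called by hand. The helper below is hypothetical: it only builds an argument list and rejects the options marked "Do not run hpost with this option manually". The spacing of each option follows the way it is written in the table, which is an assumption about the real syntax, and the level and verbosity numbers in the example are placeholders.

```python
from typing import List, Optional, Sequence

# Options the table says must not be used when running hpost by hand.
NOT_FOR_MANUAL_USE = {"-H", "-Q", "-W"}


def build_hpost_args(domain: str,
                     level: Optional[int] = None,
                     verbosity: Optional[int] = None,
                     extra: Sequence[str] = ()) -> List[str]:
    """Assemble an hpost argument list without executing anything."""
    blocked = sorted(NOT_FOR_MANUAL_USE.intersection(extra))
    if blocked:
        raise ValueError("not for manual use: " + ", ".join(blocked))
    args = ["hpost", "-d", domain]
    if level is not None:
        args.append("-l{}".format(level))      # written as -l<number> in the table
    if verbosity is not None:
        args.append("-v{}".format(verbosity))  # written as -v<number> in the table
    args.extend(extra)
    return args


# Placeholder values, chosen only to show the shape of the result.
print(build_hpost_args("A", level=16, verbosity=20))
```

Keeping the forbidden options in one set makes the policy easy to audit if the list ever changes.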

Customers often change the level directive in the domain or platform .postrc to level 7 in order to save time. Changing the level to 7 in .postrc is strongly discouraged.
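
A quick audit for this setting can be scripted. The sketch below rests on assumptions about the file format that this document does not confirm: one directive per line, written as "level <number>", with "#" starting a comment, and the last level line taking effect. The function names and the second directive in the example are hypothetical.

```python
from typing import Optional


def find_level_directive(postrc_text: str) -> Optional[int]:
    """Return the last 'level <n>' value found, or None if there is none."""
    level = None
    for raw_line in postrc_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # assumed comment syntax
        parts = line.split()
        if len(parts) == 2 and parts[0] == "level" and parts[1].isdigit():
            level = int(parts[1])
    return level


def uses_discouraged_level(postrc_text: str) -> bool:
    """True when the file sets the level to 7, which is discouraged above."""
    return find_level_directive(postrc_text) == 7


example = "# domain A settings\nlevel 7   # faster boots\nother_directive 1\n"
print(uses_discouraged_level(example))
```

The check only reads text that is passed in, so it can be run against a copy of the file without touching the SC.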

For POST level details and .postrc directives, refer to:
http://esp.west.sun.com/starcat/post/
http://esp.west.sun.com/starcat/post/glossary.html
http://panacea.uk.oracle.com/twiki/pub/Products/ProdDocumentationStarcat/post.pdf
Dead links as of 2011-05-27:
http://has.central.sun.com/starcat/.
http://pts-americas.west/esg/hsg/starcat/binaries/post.pdf



12k, 15K, 20K, 25K, hpost, lpost, ibist, lbist, hostobjs, flashupdate, post
Previously Published As
76758

Change History

Date: 2009-11-18
User Name: Volkmar Grote 117021
Action: Changed audience to "Contract"
Comment: Moved internal links and details into a new Internal Comments section

Date: 2005-05-05
User Name: 71396
Action: Approved
Comment: Publishing.
Version: 4

Date: 2005-05-02
User Name: 71396
Action: Accept
Comment:
Version: 0

Date: 2005-05-02
User Name: 26525
Action: Approved
Comment: I just added a link to PTS's new web page http://pts-platform.uk
Version: 0

Date: 2005-04-29
User Name: 26525
Action: Update Started
Comment: PTS web page is changing to a new location
http://pts-platform.uk
Version: 0

Date: 2004-11-24
User Name: 97961
Action: Approved
Comment: Minor update to title to cover other servers. Publishing without another round of TR.
Version: 2

Date: 2004-11-24
User Name: 97961
Action: Accept
Comment:
Version: 0

Date: 2004-11-24
User Name: 71349
Action: Approved
Comment: Added E20K and E25K to product and changed the title from
POST Overview (12K/15K/20K/25K) to Sun Fire[TM] 12K/15K/E20K/E25K: POST Overview
Version: 0

Date: 2004-11-24
User Name: 71349
Action: Update Started
Comment: Title need to be updated
Version: 0

Date: 2004-06-29
User Name: 25440
Action: Approved
Comment: Edits for clarity (and the table was producing broken XML). Publishing
Version: 0

Date: 2004-06-29
User Name: 25440
Action: Accepted
Comment:
Version: 0

Date: 2004-06-23
User Name: 28745
Action: Approved
Comment: Doc looks fine
Version: 0

Date: 2004-06-22
User Name: 136218
Action: Approved
Comment: Remco, Modified my DOC. Please review it.
Version: 0

Date: 2004-06-10
User Name: 28745
Action: Rejected
Comment: Send and email to the writer with my comments.
Setting it back to draft state for some modification.
Version: 0

Date: 2004-06-09
User Name: 28745
Action: Accepted
Comment:
Version: 0

Date: 2004-06-08
User Name: 136218
Action: Approved
Comment: kindly review and let me know your comments.
Version: 0

Date: 2004-06-07
User Name: 136218
Action: Created
Comment:
Version: 0

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.