Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1018756.1
Update Date: 2011-05-11

Solution Type  Technical Instruction Sure

Solution  1018756.1 :   Sun Enterprise[TM] 10000 / Sun Fire[TM] 12K/15K/E20K/E25K servers: Dynamic Reconfiguration (DR) Cheat Sheets


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
  • Sun Enterprise 10000 Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

Previously Published As
230483


Applies to:

Sun Fire 12K Server
Sun Enterprise 10000 Server
Sun Fire 15K Server
Sun Fire E20K Server
Sun Fire E25K Server
All Platforms
***Checked for relevance on 11-May-2011***

Goal

Dynamic Reconfiguration (DR) has seen a variety of changes over the past years.

Below is a quick guide that can be used to help set up and use DR in the Sun Enterprise 10000 (E10K) and Sun Fire 12K/15K/E20K/E25K server environments.

Solution

Steps to Follow
Sun Enterprise 10000 (E10K)

The method for enabling DR depends on the Solaris[TM] Operating System (OS) release. This applies to all versions of DR.

For Solaris 2.5.1 OS, DR is enabled by setting the OpenBoot PROM (OBP) parameter 'dr-max-mem' to any non-zero number via 'setenv' or 'eeprom'.  See the following examples:
ok  setenv  dr-max-mem 1
or
# eeprom dr-max-mem=1

NOTE:  If 'dr-max-mem' is set to 0, DR attach/detach is DISABLED.  If 'dr-max-mem' is set to any non-zero value, DR attach/detach is ENABLED.  This value denotes the maximum memory configuration permitted for the domain after all DR attaches have been completed.  For example, a value of 16384 would allow for a maximum of 16GB of memory.  However, be careful not to set this variable too high, as it unnecessarily enlarges the kernel and wastes memory that might be better used elsewhere.
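
The NOTE above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and it assumes the value is expressed in megabytes, which is inferred from the 16384-to-16GB example rather than stated outright.

```python
def describe_dr_max_mem(value_mb: int) -> str:
    """Summarise what a dr-max-mem setting means on Solaris 2.5.1."""
    if value_mb < 0:
        raise ValueError("dr-max-mem cannot be negative")
    if value_mb == 0:
        return "DR attach/detach DISABLED"
    # Assumption: the value is in megabytes, so 1024 units make 1 GB.
    return f"DR attach/detach ENABLED, memory ceiling {value_mb / 1024:g} GB"


print(describe_dr_max_mem(0))
print(describe_dr_max_mem(16384))
```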

For Solaris 2.6 OS (similar to 2.5.1), DR is enabled by setting the OBP parameter 'dr-max-mem' to any non-zero number via 'setenv' or 'eeprom'.  See the following examples:
ok  setenv  dr-max-mem 1
or
# eeprom dr-max-mem=1

NOTE:  If 'dr-max-mem' is set to 0, DR attach/detach is DISABLED.  If 'dr-max-mem' is set to any non-zero value, DR attach/detach is ENABLED.  If the value is set specifically to 2, the number of DR kernel pages at boot time is made 5 times larger than the normal value.  Be aware that in environments with large configurations (e.g., terabytes of storage), it is possible to exhaust the kernel resources before the system becomes fully active.  Review Bug ID 4218687 for details.


For Solaris 7 through 10 OS, DR is enabled with a 'kernel_cage_enable' entry in the /etc/system file. Setting this variable to 1 enables DR; setting it to 0 disables it. The 'dr-max-mem' OBP parameter is obsolete in Solaris 7 through 10 OS. The following is an example entry in the /etc/system file to enable DR (lines beginning with '*' are comments):
* DR enabled
set kernel_cage_enable=1
* DR entry complete
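
As a rough way to double-check such an entry, the sketch below scans the text of an /etc/system file and reports whether 'kernel_cage_enable' ends up non-zero. The function name is hypothetical, and the sketch assumes that comment lines start with '*' or '#' and that the last assignment in the file is the one that takes effect.

```python
import re

# Matches an active (non-comment) assignment such as: set kernel_cage_enable=1
_CAGE_RE = re.compile(r"^\s*set\s+kernel_cage_enable\s*=\s*(\S+)")


def kernel_cage_enabled(system_text: str) -> bool:
    """Return True if the /etc/system text sets kernel_cage_enable non-zero."""
    enabled = False
    for line in system_text.splitlines():
        # Comment lines never take effect, even if they contain "set ...".
        if line.lstrip().startswith(("*", "#")):
            continue
        match = _CAGE_RE.match(line)
        if match:
            # Assumption: a later assignment overrides an earlier one.
            enabled = int(match.group(1), 0) != 0
    return enabled


good = "* DR enabled\nset kernel_cage_enable=1\n* DR entry complete\n"
commented_out = "* DR enabled set kernel_cage_enable=1\n"
print(kernel_cage_enabled(good))
print(kernel_cage_enabled(commented_out))
```

Note that a 'set' directive placed on the same line as a leading '*' is treated as part of the comment, so the directive must sit on its own line.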

There are three versions of DR that can be utilized on an E10K platform:
Legacy DR (DR) - This was the initial release of DR, seen in SSP 3.1 through SSP 3.3. Each DR operation consisted of a three-step manual process.
    1. To add a board (ex. SB6):

      ssp:domain% dr
      dr>  init_attach  6
      dr> drshow  6  obp (to verify board inventory)
      dr> complete_attach 6
      dr> exit


    2. To remove a board (ex. SB6):
      NOTE:  Stop 'edd' before the detach so that no Recordstops can occur during the DR operation.
      If a Recordstop occurs during a DR operation, the domain will have to be STOPPED!
      Stop 'edd' with the 'edd_cmd' command, then restart it after DR is finished:

      ssp% edd_cmd -x stop
      ssp:domain% dr
      dr>  drain 6
      dr>  drshow 6  IO  (determine if there is active I/O on board being detached)
      dr>  complete_detach 6
      dr>  reconfig
      dr>  exit
      Restart 'edd':
      ssp% edd_cmd -x start

Automated DR (ADR) - Introduced in SSP 3.3, ADR has a new command structure that lets users script DR operations, so each operation is completed by one command instead of the three required in the previous release.

New Generation DR (ngdr) - Introduced with the Sun Fire 12K/15K and backported to the E10K in SSP 3.4 and SSP 3.5 running Solaris 8 and Solaris 9 OS.  This new command structure also allows for remote DR capabilities.

These automated methods may be used for DR operations:
  1. addboard -d <domain> [-f] [-q] {-b board_number | SB<x>}
  2. moveboard -d <domain> [-f] [-q] {-b board_number | SB<x>}
  3. deleteboard -d <domain> [-f] [-q] {-b board_number | SB<x>}

Adding a board (ex. SB6):
ssp% addboard -b 6 -d domain_name -r 2 -t 600
where -b is the system board (SB) number, -d is the domain name, -r is the number of retries, and -t is the timeout.

Removing a board (ex. SB6):
ssp% deleteboard -b 6 -r 2 -t 900

Moving a board (ex. SB6):
ssp% moveboard -b 6 -d domain_name -r 2 -t 900

If any RT (real-time) processes are running on a domain, they will prevent a DR operation from completing. If DR complains about them, these processes must be stopped for DR to work properly. Use the command:
ssp% ps -eo pid,class,comm | grep RT
to identify which PIDs (process IDs) to kill if necessary. Be aware of which RT processes are running and what their exact function is. Be sure to understand any adverse effects that may arise if these processes are killed manually.
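
For scripted checks, the RT search can be sketched as a parser over captured 'ps' output. This is a minimal sketch that assumes the output comes from 'ps -e -o pid,class,comm' on the domain (PID, scheduling class, command). The function name and the sample process list are hypothetical.

```python
def find_rt_processes(ps_output: str) -> list[tuple[int, str]]:
    """Return (pid, command) for every process in the RT scheduling class."""
    rt_processes = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip the header line and anything malformed
        pid, sched_class, command = fields
        if sched_class == "RT":
            rt_processes.append((int(pid), command.strip()))
    return rt_processes


# Hypothetical sample output, for illustration only.
SAMPLE = """\
  PID  CLS COMMAND
    1   TS /sbin/init
  412   RT /opt/app/bin/rtworker
  530   TS /usr/sbin/cron
"""
print(find_rt_processes(SAMPLE))
```

Matching on the class column avoids false hits from command names that happen to contain the letters "RT".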

Sun Fire 12K/15K/E20K/E25K Servers

Syntax for SMS (System Management Services) 1.x DR commands from the SC (System Controller):
  1. addboard -d <domain_id|domain_tag> [-q] [-f] <SB 0-17 or IO 0-17>
  2. moveboard -d <domain_id|domain_tag> [-q] [-f] <SB 0-17 or IO 0-17>
  3. deleteboard [-q] [-f] <SB 0-17 or IO 0-17>
Examples:
sms> addboard -d A SB10
sms> moveboard -d B SB7
sms> deleteboard SB0

If running 'rcfgadm' (Remote configadm) commands from the SC, the usage may be as follows:
sc0:sms-user:> rcfgadm -d <domain_id|domain_tag> [-f] [-v] -c <function> <APID>
    function  -  assign | unassign, configure | unconfigure, or connect | disconnect
    APIDs     -  can be either logical or physical, and are either static or dynamic.
PHYSICAL EXAMPLES:
/devices/pseudo/dr@0:IO4
/devices/pseudo/dr@0:IO6
/devices/pseudo/dr@0:IO14
/devices/pseudo/dr@0:SB4
/devices/pseudo/dr@0:SB6

LOGICAL EXAMPLES:
IO4, IO6, IO14, SB4, SB6

STATIC AP TYPES:
HPCI, CPU, MCPU, pci-pci/hp

DYNAMIC AP TYPES:
cpu, mem, io
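
The relationship between the physical and logical board APIDs listed above can be illustrated with a short helper. The function name is hypothetical, and the sketch assumes physical board APIDs always carry the '/devices/pseudo/dr@0:' prefix seen in the examples.

```python
PHYSICAL_PREFIX = "/devices/pseudo/dr@0:"


def logical_apid(apid: str) -> str:
    """Return the logical form of a board APID (e.g. IO14 or SB6)."""
    # Assumption: the logical name is whatever follows the dr@0: prefix.
    if apid.startswith(PHYSICAL_PREFIX):
        return apid[len(PHYSICAL_PREFIX):]
    return apid  # already logical


print(logical_apid("/devices/pseudo/dr@0:IO14"))
print(logical_apid("SB6"))
```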

Examples:
sc0:sms-user:> rcfgadm -d a -f -c configure SB6
sc0:sms-user:> rcfgadm -d a -c unconfigure IO14
sc0:sms-user:> rcfgadm -d a -c configure SB6
sc0:sms-user:> rcfgadm -d a -c configure pcisch3:e06B1slot2  <--DR an I/O component (See Below)

Breakdown of specific I/O card to DR:

Example from above: sc0:sms-user:> rcfgadm -d a -c configure pcisch3:e06B1slot2

pcisch<#>:   This represents the pcisch device instance number.  The example shows that the device being configured is pcisch3, instance number 3 of the pcisch device for this domain.  Before configuring a new device instance, run 'grep pcisch /etc/path_to_inst' on the domain to confirm which instances of the device are currently configured.  Choose the next available instance to configure into the domain.

e<#>:   This indicates the Expander Board location of this device.  The example shows that this is e06, indicating the device is located on Expander 06.

B1:   Indicates a slot1-type board.
NOTE:  The board type will always be B1 on a Sun Fire[TM] 12K/15K/E20K/E25K for the I/O devices, because a slot1 board is the only type of board where these devices can be installed.

slot<#>:   Indicates the cassette slot number (1-4) in which the device is located on a slot1 board.  The example above shows slot2.
This is the Bottom Left cassette slot on the I/O Board.
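
The breakdown above can be sketched as a small parser that splits an I/O APID into its fields. The function and key names are hypothetical, and the pattern is inferred from the single example 'pcisch3:e06B1slot2', so other APID forms may not match.

```python
import re

# pcisch<instance>:e<expander>B<board type>slot<cassette slot>
_APID_RE = re.compile(r"^pcisch(\d+):e(\d+)b(\d+)slot(\d+)$", re.IGNORECASE)


def parse_io_apid(apid: str) -> dict[str, int]:
    """Split an I/O attachment point ID into its numeric components."""
    match = _APID_RE.match(apid)
    if match is None:
        raise ValueError(f"unrecognised I/O attachment point: {apid!r}")
    instance, expander, board_type, cassette = (int(g) for g in match.groups())
    return {
        "pcisch_instance": instance,  # device instance number
        "expander": expander,         # expander board location
        "board_type": board_type,     # 1 means a slot1-type board
        "cassette_slot": cassette,    # cassette slot on the I/O board
    }


print(parse_io_apid("pcisch3:e06B1slot2"))
```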

See Technical Instruction Document: 1017493.1 for a diagram.

Useful information gathering commands:
rcfgadm -d a        lists all attachment points except dynamic points.
rcfgadm -d a -al    lists all current configurable hardware information (including dynamic).
rcfgadm -d a -avl   lists all current configurable hardware in verbose mode.

If the 'cfgadm' (configadm) command on the domain is used:
cfgadm [-f] [-v] -c <function> <APID>
This command uses the same syntax rules and examples shown above for 'rcfgadm'. The difference is that 'cfgadm' is executed on the domain itself, not from the SC as 'rcfgadm' is. The '-d <domain_id|domain_tag>' option is not required for 'cfgadm'.
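
To illustrate the relationship between the two commands, the sketch below rewrites an 'rcfgadm' argument list into the equivalent 'cfgadm' form by dropping '-d' and its domain argument. The function name is hypothetical, and the sketch assumes '-d' is always followed by a separate domain ID or tag, as in the examples in this document.

```python
def to_cfgadm(rcfgadm_argv: list[str]) -> list[str]:
    """Translate an rcfgadm argument list into the on-domain cfgadm form."""
    if not rcfgadm_argv or rcfgadm_argv[0] != "rcfgadm":
        raise ValueError("expected an rcfgadm command line")
    result = ["cfgadm"]
    args = iter(rcfgadm_argv[1:])
    for arg in args:
        if arg == "-d":
            next(args, None)  # drop the domain ID/tag that follows -d
            continue
        result.append(arg)
    return result


print(" ".join(to_cfgadm("rcfgadm -d a -c unconfigure IO14".split())))
```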


Reference material:
10K Dynamic Reconfiguration Documentation
Sun Fire[TM] 12K/E25K Dynamic Reconfiguration Documentation


Internal Comments
DR, Dynamic Reconfiguration, SMS, SSP, E10K, e10k, 12k, 15k, 20k, 25k, cfgadm, rcfgadm, addboard, deleteboard, moveboard, dr
Previously Published As 50361

References

<NOTE:1001683.1> - Sun Fire[TM] 12K/15K/E20K/E25K: Location and Relocation of Kernel for DR Operations
<NOTE:1010363.1> - Sun Fire[TM] 12K/15K/E20K/E25K Servers: Dynamic Reconfiguration Considerations
<NOTE:1012349.1> - Kernel Cage Splitting Overview
http://download.oracle.com/docs/cd/E19065-01/servers.10k/816-3627-10/index.html

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.