Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1020243.1
Update Date:2011-05-02
Keywords:

Solution Type  Technical Instruction Sure

Solution  1020243.1 :   T5440 CMP configuration changes  


Related Items
  • Sun SPARC Enterprise T5440 Server
Related Categories
  • GCS>Sun Microsystems>Servers>CMT Servers

PreviouslyPublishedAs
254769


Description
One of the major design changes for the T5440 is that the POST state of each PLX chipset depends on the CMP configuration, which in turn determines which paths are available to the internal and external I/O.

This document covers the process, and potential problems, for changing CMP configurations.

Steps to Follow
The following process is detailed in the Platform Service manual. Access to each PLX is via the local CMP if present; otherwise the upstream port is disabled and communication is PLX <-> PLX, driven by the next lowest numbered CMP (i.e. CMP0 > PLX3 > PLX2 | CMP1 > PLX1 > PLX0 in a 2P configuration). In a 1P configuration all paths are accessed via the CMP0 upstream path.

What does this mean in the field? Whenever we change the configuration we need to ensure that the ILOM/VBSC are aware of which upstream ports should be active, and that the OS is updated with any device path changes.

The ILOM holds the current masks that determine which ports are active. We can force VBSC to update these by rescanning on just the next power-on, or after every power-on, via the ioreconfigure ILOM parameter. The default is never to rescan, even when the CMP configuration changes, whether following a CMP failure or an increase/reduction in the number of modules installed, so engineers need to do this manually whenever changes are made. Any changes also need to be reflected in the OS. We provide a Perl script for this, which must be run while booted off a net install image with the root disk mounted, and before booting the OS after a configuration change, to prevent errors and a possible path_to_inst rebuild.

Note: Both the procedure and the script are covered in the T5440 Service manual. In the Service manual the reconfig.pl script is called reconf.pl, but it is actually the same script.

Note: The script 'reconfig.pl' is available via patch 10264587: I/O Remapping Script for Sun SPARC Enterprise T5440 Server - Solaris SPARC

In summary, after changing the CMP configuration we need to do the following:
1. On the ILOM, set the reconfigure parameter:
-> set /HOST ioreconfigure=nextboot
2. Set 'auto-boot?' to false to stop the host booting on power-up:
# eeprom auto-boot?=false
3. Shut down and power cycle the host:
# init 0
-> stop /SYS
-> start /SYS
-> start /SP/console
4. Boot off the network, mount the root drive and run the reconfig.pl script:
{0} ok boot net -s
Boot device: /pci@500/pci@0/pci@c/network@0  File and args: -s
/pci@500/pci@0/pci@c/network@0: 100 Mbps link up
Requesting Internet Address for 0:14:4f:ec:d9:22
Requesting Internet Address for 0:14:4f:ec:d9:22
/pci@500/pci@0/pci@c/network@0: 100 Mbps link up
SINGLE USER MODE
# mount /dev/dsk/c0t0d0s0 /mnt
# cd /mnt
# /reconfig.pl
replacing /pci@400/pci@0/pci@8/pci@0/pci@8/pci@0/pci@8 with /pci@700 in /etc/path_to_inst
updating /dev symlinks
replacing /pci@400/pci@0/pci@8/pci@0/pci@8 with /pci@500 in /etc/path_to_inst
updating /dev symlinks
replacing /pci@400/pci@0/pci@8 with /pci@600 in /etc/path_to_inst
updating /dev symlinks
#
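
The "replacing X with Y" lines in the reconfig.pl output amount to a prefix rewrite of device paths. The following is a minimal Python sketch of that idea only; it is not the reconfig.pl script, and the function names are hypothetical. The prefix pairs are copied from the output in step 4. It assumes path_to_inst entries take the usual form of a quoted physical path, an instance number and a quoted driver name.

```python
# Illustrative only: this is NOT reconfig.pl, just a sketch of a prefix rewrite.
# Prefix pairs (old -> new) are copied from the reconfig.pl output in step 4.
REMAP = [
    ("/pci@400/pci@0/pci@8/pci@0/pci@8/pci@0/pci@8", "/pci@700"),
    ("/pci@400/pci@0/pci@8/pci@0/pci@8", "/pci@500"),
    ("/pci@400/pci@0/pci@8", "/pci@600"),
]


def remap_path(path, pairs=REMAP):
    """Rewrite the start of a device path using the longest matching old prefix."""
    # Try longer prefixes first so a short prefix never shadows a longer one.
    for old, new in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        # Match only on a whole path component, never part-way through one.
        if path == old or path.startswith(old + "/"):
            return new + path[len(old):]
    return path


def remap_line(line, pairs=REMAP):
    """Rewrite the quoted path of one path_to_inst entry; other lines pass through."""
    if not line.startswith('"'):
        return line  # comment or blank line
    end = line.find('"', 1)
    if end == -1:
        return line  # malformed entry, leave untouched
    return '"' + remap_path(line[1:end], pairs) + line[end:]


if __name__ == "__main__":
    # Hypothetical entry for illustration; instance number and driver are examples.
    print(remap_line('"/pci@400/pci@0/pci@8/pci@0" 18 "pxb_plx"'))
```

Applying the longest prefix first matters here because each old prefix in the list is itself a prefix of the longer ones.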

If you don't reconfigure the upstream PLX ports after increasing the number of CMPs, you will still be able to access all devices, but all I/O will be driven through the single upstream port, which is likely to reduce performance.
If you fail to reconfigure after reducing the number of CMPs, you will lose access to whichever devices were connected via the affected upstream port.
For example, a 4P system is reduced to a 1P with no ioreconfigure. The VBSC tries to access the onboard network through the only active upstream port it has (pci@400 = CMP0); however, the upstream to CMP1 is still held in the configuration, so device access fails:
{0} ok boot net -s
Boot device: /pci@400/pci@0/pci@8/pci@0/pci@8/pci@0/pci@c/network@0  File and args: -s
ERROR: boot-read fail
Can't locate boot device
{0} ok
Perform a quick ioreconfigure and everything is working again:
-> set /HOST ioreconfigure=nextboot
Set 'ioreconfigure' to 'nextboot'
-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS
-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started. To stop, type #.
T5440, No Keyboard
Copyright 2008 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.28.11, 16032 MB memory available, Serial #82630946.
Ethernet address 0:14:4f:ec:d9:22, Host ID: 84ecd922.
{0} ok boot net -s
Boot device: /pci@400/pci@0/pci@8/pci@0/pci@8/pci@0/pci@c/network@0  File and args: -s
/pci@400/pci@0/pci@8/pci@0/pci@8/pci@0/pci@c/network@0: 100 Mbps link up
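
To read boot device paths like the one above, it can help to split them into the upstream CMP and the number of PLX <-> PLX hops. The sketch below is a hedged illustration in Python. The root-to-CMP table is an assumption inferred from the examples in this document (pci@400 = CMP0, with pci@500, pci@600 and pci@700 taken to follow for CMP1 to CMP3), and treating each leading "pci@0/pci@8" pair as one PLX-to-PLX hop is likewise inferred from the reconfig.pl output rather than taken from the Service manual. The function names are hypothetical.

```python
# Assumed mapping, inferred from the examples in this document.
ROOT_TO_CMP = {"pci@400": 0, "pci@500": 1, "pci@600": 2, "pci@700": 3}


def upstream_cmp(path):
    """Return the CMP number whose upstream port the path starts from, or None."""
    root = path.strip("/").split("/", 1)[0]
    return ROOT_TO_CMP.get(root)


def plx_hops(path):
    """Count leading 'pci@0/pci@8' pairs after the root (assumed PLX-to-PLX hops)."""
    parts = path.strip("/").split("/")[1:]
    hops = 0
    while parts[2 * hops:2 * hops + 2] == ["pci@0", "pci@8"]:
        hops += 1
    return hops


if __name__ == "__main__":
    boot_dev = "/pci@400/pci@0/pci@8/pci@0/pci@8/pci@0/pci@c/network@0"
    print(upstream_cmp(boot_dev), plx_hops(boot_dev))
```

Under these assumptions, the onboard network in the 1P example is reached from CMP0 across two PLX hops, whereas the path /pci@500/pci@0/pci@c/network@0 seen earlier is reached directly from CMP1.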

If you booted to Solaris in the failed state, network services and any dependent services would fail to start. If you boot to Solaris[TM] after the ioreconfigure but before running reconfig.pl (via the net install image), you will lose access to devices as above. In addition, when you later perform the OS device path reconfigure, multiple device path entries will be created in the platform path_to_inst, resulting in errors on reboot:
# mount /dev/dsk/c0t0d0s0 /mnt
# cd /mnt
# /reconfig.pl
replacing /pci@600 with /pci@400/pci@0/pci@8 in /etc/path_to_inst
updating /dev symlinks
replacing /pci@700 with /pci@500/pci@0/pci@8 in /etc/path_to_inst
updating /dev symlinks
# ls -lc etc/path_to_inst
-r--r--r--   1 root     root        2687 Nov  6 10:10 etc/path_to_inst
#

On rebooting to check that everything is OK, errors are reported during boot:
WARNING: multiple instance number assignments for '/pci@400/pci@0/pci@8/pci@0' (driver pxb_plx), 18 used
WARNING: multiple instance number assignments for '/pci@400/pci@0/pci@8/pci@0/pci@9' (driver pxb_plx), 19 used
WARNING: multiple instance number assignments for '/pci@400/pci@0/pci@8/pci@0/pci@c' (driver pxb_plx), 20 used
WARNING: multiple instance number assignments for '/pci@400/pci@0/pci@8/pci@0/pci@d' (driver pxb_plx), 21 used
WARNING: multiple instance number assignments for '/pci@500/pci@0/pci@8/pci@0' (driver pxb_plx), 22 used
WARNING: multiple instance number assignments for '/pci@500/pci@0/pci@8/pci@0/pci@9' (driver pxb_plx), 23 used
WARNING: multiple instance number assignments for '/pci@500/pci@0/pci@8/pci@0/pci@c' (driver pxb_plx), 24 used
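
Before rebooting, a path_to_inst file can be checked for the kind of duplication these warnings suggest. The following Python sketch is a rough check under stated assumptions: entries are taken to have the form "physical path" instance "driver", and a conflict is taken to mean the same path and driver listed with more than one instance number. The exact condition the kernel tests is not confirmed here, and the function name and sample data are hypothetical.

```python
import re
from collections import defaultdict

# Assumed entry layout: "physical path" <instance number> "driver"
ENTRY = re.compile(r'^"([^"]+)"\s+(\d+)\s+"([^"]+)"\s*$')


def find_conflicts(text):
    """Return {(path, driver): [instances]} for entries with more than one instance."""
    seen = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # skip comments, blank lines and anything unrecognised
        path, instance, driver = match.group(1), int(match.group(2)), match.group(3)
        if instance not in seen[(path, driver)]:
            seen[(path, driver)].append(instance)
    return {key: insts for key, insts in seen.items() if len(insts) > 1}


if __name__ == "__main__":
    # Hypothetical sample; instance numbers are made up for illustration.
    sample = "\n".join([
        "#path_to_inst_bootstrap_1",
        '"/pci@400/pci@0/pci@8/pci@0" 18 "pxb_plx"',
        '"/pci@400/pci@0/pci@8/pci@0" 4 "pxb_plx"',
        '"/pci@500/pci@0/pci@8/pci@0" 22 "pxb_plx"',
    ])
    for (path, driver), insts in find_conflicts(sample).items():
        print(path, driver, insts)
```

With the root disk mounted on /mnt as in the procedure, the text to check would come from /mnt/etc/path_to_inst.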

The reconfig.pl output shows that the CMP2 and CMP3 upstream paths have been swapped with CMP0 and CMP1 due to the PLX <-> PLX pathing. The best method for clearing the multiple path entries is to rebuild the path_to_inst from scratch:
# echo "#path_to_inst_bootstrap_1" > /etc/path_to_inst
# sync
# sync
# sync
# reboot 

If the customer is using LDOM, this will cause further problems since we will lose any virtual devices. At this time we are unsure of the implications for ZFS.
In summary, customers and field engineers need to follow the correct procedure every time; so far it has proven 100% reliable in reconfiguring the platform and OS correctly. However, we need to be aware of what occurs when things go wrong, and in fairness it is reasonably simple to recover from. Please be aware that customers using software RAID (such as SVM) will need to detach one side of their root mirror prior to running the reconfigure script: once booted from the network OS image, SVM will not be available and the underlying drive, rather than the metadevice, will be mounted. Once the reconfigure and reboot are complete, simply reattach the submirror and allow it to synchronize.

NOTE: A new Solaris command, device_remap, was added in S10U8 and is present in later releases; it provides the functionality of the reconfig.pl script. For more details, refer to the man page for the device_remap command.
 * The reconfig.pl script is supported on all Solaris 10 releases.
 * The device_remap command (script) is supported with S10U8 and later and is not certified on earlier Solaris releases.


Product
Sun SPARC Enterprise T5440 Server

PLX, CMP, ioreconfigure, reconfig.pl

Change History
Date: 2010-12-21
User name: Dencho Kojucharov
Action: Currency check
Comments: audited by Entry-Level SPARC Content Lead
made a few minor updates (added reference to service manual)

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.