Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1010363.1
Update Date:2011-02-17

Solution Type  Technical Instruction

Solution  1010363.1 :   Sun Fire[TM] 12K/15K/E20K/E25K Servers: Dynamic Reconfiguration Considerations  


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
214215


Applies to:

Sun Fire E20K Server
Sun Fire 12K Server
Sun Fire E25K Server
Sun Fire 15K Server
All Platforms

Goal

The sections below are intended to delineate important points for various operations which should be known and understood prior to performing the given operation. This document is not intended as a tutorial on using DR command line syntax, etc. It may be helpful as a starting point for development of site specific DR practices and/or procedures.

Dynamic Reconfiguration (DR) is a powerful and useful tool for allocating and de-allocating resources within a given domain without interruption. It is useful when performing service actions on components in a system. However, some configurations are better suited to maximize the effectiveness of DR than others. Because of this, to fully appreciate the impact of a DR operation on a domain, a detailed understanding of the domain's configuration and application workload is required.

If this document does not address your specific problem, continue to use the My Oracle Support portal to search the knowledge base for a matching Problem/Resolution. If you still need assistance, open a service ticket to obtain direct support from an engineer who will help troubleshoot your specific DR issue to resolution.

Solution

TERMS

When describing DR operations, this document uses the following terms:
See also Document 1018756.1 "Sun Enterprise[TM] 10000 / Sun Fire[TM] 12K/15K/E20K/E25K: Dynamic Reconfiguration (DR) Cheat Sheets".

DR Term      Definition

Attach       Adding a System Board (SB) or IO board to a running domain.
Detach       Removing an SB or IO board from a running domain.
Hot-Plug     Adding a PCI adapter to, or removing one from, a running domain.
Slot 0 DR    DR involving a board that physically resides in Slot 0 of an expander.
Slot 1 DR    DR involving a board that physically resides in Slot 1 of an expander.

ASSUMPTIONS

     It is a basic assumption that both the System Controller (SC) and Domain are running the minimum Solaris[TM] and System Management Services (SMS) software versions and patches required to support DR.

     At the time of this writing, the minimum requirements for DR are:

DR Scenario: Slot 0 DR

     SC OS & SMS Version:

          Solaris 8 2/02 (S8U7) + SMS 1.2 + patches:
          112481-01, 112482-01, 112483-03, 112484-01, 112485-01, 112486-01, 112487-01, 112488-03, 112547-01, 112552-01, 112641-01

     Domain OS Version:

          Solaris 8 2/02 (S8U7) + patches:
          111111-03, 112396-02, 108987-09, 108528-14, 110820-08, 111332-05, 110838-05, 111097-08, 111335-12

          Solaris 9 + patches:
          112840-01, 112841-01

DR Scenario: Slot 1 DR

     SC OS & SMS Version:

          Solaris 8 2/02 (S8U7) + SMS 1.3

     Domain OS Version:

          Solaris 8 2/02 (S8U7) + patches for Slot 0 DR + patches:
          110826-07, 110836-05, 110837-04, 110838-06, 111293-04, 111310-01, 111332-06, 111335-16, 108528-18, 110820-10

          Solaris 9 4/03 (S9U3)

DR Scenario: Hot-Plug

     SC OS & SMS Version:

          SMS 1.1 (or higher)

     Domain OS Version:

          S8U6 (or higher)

Minimum Requirement Notes

     The Sun Fire 15K/12K Dynamic Reconfiguration Installation Guide and Release Notes lists only the minimum set of patches required for DR, but it is recommended to have the most recent patch levels installed.  Where appropriate, this document lists minimum desired patch revisions based on problems customers have experienced. However, it is not all-encompassing and does not replace proper patch management for a Sun Fire[TM] 12K/15K/E20K/E25K system.

     You can manually confirm that the patches listed above are applied by using the following command on the SC or Domain:

               # showrev -p | grep <patchid>

     NOTE:  patchid refers to the patch number without the revision number at the end.
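     The manual check can also be scripted when many patches need to be verified. The following is a minimal sketch, assuming that showrev -p prints one line per installed patch beginning with "Patch: <number>-<revision>". The function name patch_at_least and the sample data (including the package names) are hypothetical and used for illustration only. On Solaris 8 and 9, substitute nawk or /usr/xpg4/bin/awk for awk, because the default /usr/bin/awk does not accept -v.

```shell
#!/bin/sh
# patch_at_least BASE MINREV
# Reads `showrev -p` style output on stdin and succeeds (exit 0) if any
# installed revision of patch BASE is greater than or equal to MINREV.
# Assumption: each installed patch appears as "Patch: NNNNNN-RR ...".
patch_at_least() {
    awk -v base="$1" -v minrev="$2" '
        $1 == "Patch:" {
            split($2, id, "-")      # id[1] = patch number, id[2] = revision
            if (id[1] == base && id[2] + 0 >= minrev + 0) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Hypothetical sample of showrev -p output, for illustration only.
sample='Patch: 112481-01 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWSMSr
Patch: 112488-03 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWSMSop'

if printf '%s\n' "$sample" | patch_at_least 112488 3; then
    echo "112488-03 or later is installed"
else
    echo "112488-03 or later is MISSING"
fi
```

     On a live SC or domain, the same function would be fed from the real command, for example: showrev -p | patch_at_least 112488 3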

     Furthermore, it is assumed that the Sun Fire[TM] 12K/15K hardware has undergone Field Change Orders (FCOs) A0192 and A0193. These FCOs upgrade the AXQ and Schizo ASICs to acceptable levels:

     Slot 1 DR requires AXQ 6.1 and Schizo 2.3 at a minimum (as of this update, January 2005, records indicate that the entire install base has already completed these FCOs).

     NOTE:  These FCOs do not apply to Sun Fire[TM] E20K/E25K.

     Additionally, it is assumed for any attach operations that the hardware being introduced into the system is fault free. POST may detect faults during the attach process which must be corrected by Sun personnel. This document does not discuss diagnosis of failures and/or replacement procedures.

     The Sun Fire[TM] 12K/E20K platform consists of up to 9 sets of System Boards (SB) and I/O Boards (IO).  The Sun Fire[TM] 15K/E25K platform consists of up to 18 sets of SBs and IO.  A domain requires at minimum one SB and one IO Board be in the configuration in order for the domain to function.  The maximum domain configuration is 9 SBs and 9 IO for the 12K/E20K platform, and 18 SBs and 18 IO for the 15K/E25K platform.

     In a minimum configuration (one SB and one IO board) you can NOT REMOVE the SB or the IO board, because the domain cannot function without both of these required resources.  DR would fail in this event anyway, probably reporting a reason such as "operation unsupported".  In this minimum domain configuration, you can ONLY ADD A RESOURCE (SB or IO) to the domain using DR.

     Removing a component from a domain is trickier than adding one.  In a nutshell, removing a resource, especially an SB, requires that the resources remaining in the domain can absorb the load carried by the removed component.  For example, if you have a two-SB domain and need to DR out one of the boards, the remaining board needs enough CPU power and memory to take over the functions of the domain when the other board is gone.  If the remaining board contains 2GB of memory and the board being removed contains 4GB, DR is likely to fail: 4GB of memory contents cannot be relocated into a 2GB space.  If need be, remove the board with less memory instead.  See the DETACH OPERATIONS section of this document for details on this and other pitfalls.

ATTACH OPERATIONS

     Prior to executing an attach operation for a component, review the notes and points listed below to minimize potential problems with the operation. Document 1010760.1 "Sun Fire[TM] 15K/12K/E20K/E25K Servers: What Happens in a DR Slot0 Attach Operation" is a good reference for what to expect during a process like this.

Slot 0 CPU/Memory Board

      Ensure the CPU/Memory board is flash updated to LPOST version 5.13.4 or higher.  

          Lesser versions of LPOST are exposed to Bug 4728549, which may cause the target domain to hang.  As a best practice, the board being introduced should be flashed to the same version used in the rest of the target domain.

          Document 1003372.1 "Sun Fire[TM] 12K/15K/E20K/E25K: Firmware Revisions" contains the firmware support matrix.

          See Technical Instruction 210493 for the step by step process to be used when testing a slot0 board (SB) by using DR without having to have a slot1 (IO) board assigned to the domain.

Slot 1 MaxCPU Board

      Ensure the MaxCPU board is flash updated to LPOST version 5.13.4 or higher.  

           Lesser versions of LPOST are exposed to Bug 4728549 which may cause the target domain to hang.  As a best practice, the board being introduced should be flashed to the same version used in the rest of the target domain.

          Document 1003372.1 "Sun Fire[TM] 12K/15K/E20K/E25K: Firmware Revisions" contains the firmware support matrix.

          NOTE:  MaxCPU is NOT supported in Sun Fire[TM] E20K/E25K.

Slot 1 HPCI/HPCI+ Board

      Can the application(s) in the target domain tolerate 1 less processor?

          IO boards do not have processors or physical memory. POST requires a processor and a small amount of memory to execute its tests. Before POST runs, Solaris "loans" POST a processor and memory for testing. During POST execution, this processor and memory are not available to the domain.

      Are there processes bound to all Slot 0 CPUs on the domain?

          The logic Solaris uses to select a "loaner" processor for IO attach operations is to step  through, from lowest to highest, all the Slot 0 CPUs in the system. Slot 1 CPUs (i.e., MaxCPU) are ignored. If a CPU has processes bound to it, that CPU cannot be off-lined and attach logic considers it unavailable. If all CPUs in the domain have bound processes, selection of a "loaner" processor fails.  

          DR will not automatically rebind processes to other CPUs. This must be done by an administrator. Whether to unbind/rebind the processes, and to which of the remaining CPUs in the domain, is a decision that must be made by someone knowledgeable of the application(s) on the domain. To locate and rebind bound processes, use the pbind command.
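          As an aid to locating bound processes, the following minimal sketch filters the output of pbind -q for a set of processor IDs. It assumes that pbind -q reports each bound process as "process id <pid>: <cpu>"; the function name bound_pids, the PIDs, and the processor IDs in the sample are hypothetical. On Solaris 8 and 9, substitute nawk or /usr/xpg4/bin/awk for awk.

```shell
#!/bin/sh
# bound_pids CPU_ID...
# Reads `pbind -q` style output on stdin and prints the PIDs that are bound
# to any of the listed processor IDs, one PID per line.
# Assumption: bound processes are reported as "process id PID: CPU".
bound_pids() {
    awk -v cpus=" $* " '
        $1 == "process" && $2 == "id" {
            pid = $3
            sub(/:$/, "", pid)      # strip the trailing colon from the PID
            if (index(cpus, " " $4 " ") > 0) print pid
        }
    '
}

# Hypothetical sample of pbind -q output, for illustration only.
sample='process id 4501: 32
process id 4502: 35
process id 4777: 64'

# List the processes bound to (hypothetical) processors 32 through 35.
printf '%s\n' "$sample" | bound_pids 32 33 34 35
```

          On a live domain the function would read from pbind -q directly. Each PID found can then be rebound with pbind -b <processor_id> <pid> or unbound with pbind -u <pid>, once someone who knows the application has decided which is appropriate.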

      Are the PCI adapters in the board qualified for hot-plug?

          For Sun adapters, a list of qualified adapters is maintained by Sun Microsystems Marketing and Sales staff.  Contact your sales representative for details.

      Are the PCI adapters in the board known, good adapters?

          It must be noted that POST does no testing, stressing or verification of the PCI adapters present in the IO board. If the adapters have not been stress tested by other means (SunVTS[TM] software, etc.) a card with a fault may be introduced into the system.

      Does the IO board contain 3rd party adapters?

          Sun does not qualify all vendor PCI adapters. Refer to the THIRD PARTY STATEMENT below.

      Followup Configuration:  

          After I/O devices are attached into a domain, followup configuration is likely required (network plumbing, file system creation, etc.).

DETACH OPERATIONS

Prior to executing a detach operation for a component, review the notes and points listed below to minimize potential problems with the operation.

Document 1003582.1 "Sun Fire[TM] 12K/15K/E20K/E25K: What Happens in a DR Slot0 Detach Operation" is also a good reference for what to expect to happen during this process.

Slot 0 CPU/Memory Board

     Does the board being detached have processes bound to its CPUs?

          CPUs with bound processes cannot be detached. DR will not automatically rebind processes to other CPUs. This must be done by an administrator.  Whether to unbind/rebind the processes, and to which of the remaining CPUs in the domain, is a decision that must be made by someone knowledgeable of the application(s) on the domain. To locate and rebind bound processes, use the pbind command.

     Can the application(s) on the domain tolerate fewer processors and less memory?

          Detaching a CPU/Memory board reduces both the memory capacity and the processing power of the domain. This may have an impact on domain application(s). A prior workload baseline may help determine whether application performance will suffer with fewer resources.

          To avoid the tradeoff, if available, a different CPU/Memory board could be attached to the domain prior to detaching a board. This would also provide CPUs to shift bound processes to.

     Does the board being detached contain Intimate Shared Memory (ISM) pages?

          ISM is used extensively in database applications (Oracle, Informix, Sybase, etc.). ISM pages cannot be paged out, and therefore must be relocated to other physical memory as part of a detach process. ISM allocations can be seen on a live system via the ipcs command.

          With S8U3 and S9, support for Dynamic ISM was introduced. This allows ISM segments to be resized dynamically, reducing potential roadblocks to DR operations. The database/application must also support and be configured to use Dynamic ISM. Oracle 9i is the first database to support Dynamic ISM. Refer to Document 1018855.1 "DISM Troubleshooting For Oracle9i and Later Releases" for more information.

          During the detach process, database performance may be impacted as ISM pages are relocated. The detach process itself may also be lengthy. Refer to Bug 4632219 for details.  Improvements were made to the Solaris 8 and Solaris 9 kernel in the following patch releases:

Solaris 8: 117000-05 and 117350-05

Solaris 9: 117171-08

          These kernel patches allow DR threads to take precedence over user application threads.  This allows DR to jump to the beginning of a queue for memory page requests thus forcing all pending readers and writers to wait until the page is relocated.
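          To get a rough idea of how much shared memory might need to be relocated, the segment sizes reported by ipcs can be totalled. The following is a minimal sketch, assuming ipcs -mb prints one line per segment that starts with "m" and ends with the segment size in bytes (SEGSZ). Note that ipcs lists all System V shared memory segments, not only those attached as ISM, and it does not show which board holds the pages, so the result is only an upper bound. The function name shm_total_mb and the sample data are hypothetical.

```shell
#!/bin/sh
# shm_total_mb
# Reads `ipcs -mb` style output on stdin and prints the combined size of all
# shared memory segments in megabytes.
# Assumption: segment lines start with "m" and end with SEGSZ in bytes.
shm_total_mb() {
    awk '
        $1 == "m" { total += $NF }
        END { printf "%.0f\n", total / 1048576 }
    '
}

# Hypothetical sample of ipcs -mb output, for illustration only.
sample='IPC status from <running system> as of Mon Jan 10 10:00:00 GMT 2005
T         ID      KEY        MODE        OWNER    GROUP      SEGSZ
Shared Memory:
m        100   0x0c6629c9 --rw-r-----   oracle      dba 2147483648
m        101   0x0c6629ca --rw-r-----   oracle      dba  536870912'

printf '%s\n' "$sample" | shm_total_mb
```

          On a live domain the function would be fed from the real command: ipcs -mb | shm_total_mb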

 

     Does the board being detached contain kernel memory?

          The memory used for the kernel cannot be paged out. Therefore, when detaching kernel memory, the OS must be temporarily suspended while kernel memory is relocated to another CPU/Memory board.

          Kernel memory is reported by cfgadm as permanent memory. Refer to Symptom Resolution Document 1001683.1 to determine where permanent memory is located and if that memory is kernel memory.  

          Solaris[TM] 9 kernel patch 118558-05 together with the platmod patch 117124-07 alters the behavior of the kernel cage on the Sun Fire[TM] 12K/15K/E20K/E25K by splitting the cage across more than one board depending on the size of the domain.  This change in kernel cage behavior is only available for Sun Fire[TM] 12K/15K/E20K/E25K servers and offers performance improvements to this platform.  Unfortunately, the tradeoff to this enhancement could affect (and in some way limit) the use of DR Detach in this configuration.

          Prior to splitting the kernel cage there was (ideally) one Board in each domain that would contain kernel memory.  If this board needed to be DR detached from the configuration, the Solaris[TM] OS would temporarily suspend while the kernel memory was reallocated to a resource (different board) which was to remain in the domain.  With the addition of split kernel cage, it is now possible that multiple boards in the configuration will contain kernel memory.  This means that there are now more boards in the domain that would need the Solaris[TM] OS to suspend in order to reallocate their kernel memory if they were DR detached.

          Please refer to Document 1012349.1 "Kernel Cage Splitting Overview" for details on this enhancement.
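          Because cfgadm reports kernel memory as permanent memory, the boards that hold it can be picked out of the verbose listing. The following is a minimal sketch, assuming cfgadm -av prints memory attachment points as "<board>::memory ... <n> KBytes total, <n> KBytes permanent". The function name permanent_memory_boards and the sample values are hypothetical; Document 1001683.1 remains the authoritative procedure.

```shell
#!/bin/sh
# permanent_memory_boards
# Reads `cfgadm -av` style output on stdin and prints "<ap_id> <KBytes>" for
# every memory attachment point that reports permanent memory.
# Assumption: the Information column contains "<n> KBytes permanent".
permanent_memory_boards() {
    awk '
        $1 ~ /::memory$/ {
            for (i = 2; i + 2 <= NF; i++)
                if ($(i + 1) == "KBytes" && $(i + 2) == "permanent")
                    print $1, $i
        }
    '
}

# Hypothetical sample of cfgadm -av output, for illustration only.
sample='Ap_Id                Receptacle   Occupant     Condition  Information
SB0::memory          connected    configured   ok         base address 0x0, 4194304 KBytes total, 668072 KBytes permanent
SB1::memory          connected    configured   ok         base address 0x2000000000, 4194304 KBytes total'

printf '%s\n' "$sample" | permanent_memory_boards
```

          On a live domain the function would read from cfgadm -av directly. With a split kernel cage, more than one board may be listed.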

If kernel memory is present, there are some additional items to consider prior to executing the detach operation:

     Is a sufficiently equipped "target" board for permanent memory available?

          Permanent memory must be relocated to another board as a single, contiguous slice of physical memory. This requires another board in the system that can receive the permanent memory. Refer to Document 1001683.1 for how DR selects a target board for kernel relocation.

     Are there real time (RT) processes on the system?

          The most common real time process is NTP (Network Time Protocol) which is real time by default. But, other application(s) may be scheduled as real time.

          Since the OS must be temporarily suspended during a permanent memory detach, running processes will no longer be real time. In order to perform the detach, all real time processes must be changed to a non-real time scheduling class. See the priocntl man page for details.  Whether or not a process can be temporarily changed to a non-real time scheduling class must be determined by an administrator with knowledge of the application(s) requirements. Time sensitive applications may not tolerate a suspension, in which case a detach of permanent memory is not possible.

          Note: Checking for RT threads is done in Solaris[TM] 8 only! Solaris[TM] 9 and beyond does NOT perform this check. Refer to Bug 4396562 for details.
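          Real time processes can be located by listing the scheduling class of every process. The following is a minimal sketch, assuming the output of ps -e -o pid,class,comm (a header line followed by PID, class, and command columns). The function name rt_pids and the sample process list are hypothetical.

```shell
#!/bin/sh
# rt_pids
# Reads `ps -e -o pid,class,comm` style output on stdin and prints the PIDs
# of all processes in the RT (real time) scheduling class.
rt_pids() {
    awk 'NR > 1 && $2 == "RT" { print $1 }'
}

# Hypothetical sample of ps output, for illustration only.
sample='  PID  CLS COMMAND
    1   TS /etc/init
  312   RT /usr/lib/inet/xntpd
  415   TS /usr/sbin/cron'

printf '%s\n' "$sample" | rt_pids
```

          On a live domain the function would be fed from ps -e -o pid,class,comm. A process can then be moved to the time-sharing class with priocntl -s -c TS -i pid <pid>, but only after an administrator has confirmed that the application tolerates it.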

     Are QFE cards present in the system?

          Ensure the qfe patch level is NOT 108806-14. A regression causing a Dstop (Domain Stop) and/or "send mondo timeout" was introduced in this patch level. Refer to Bug 4727494 for details.

     Are QGE cards present in the system?

          Ensure the minimum set of PCI patches is installed in domains running the Solaris[TM] 8 or 9 Operating System (Solaris OS). In particular, patch 110900-11 (the /platform/sun4u/kernel/misc/sparcv9/pcicfg.e patch) must be present. Refer to Bug 4879904 for details.

     Are Fibre Channel cards in fabric mode present in the system?

          Ensure that the appropriate fctl/fp/fcp driver patch is installed. For Solaris[TM] 8, patch 111095-13 (or higher). For Solaris[TM] 9, patch 113040-04 (or higher). See Bug 4727209 for details.

     Is MPxIO active in the system?

          Ensure the appropriate kernel level is applied to the system to avoid BugID# 4649851. This BugID# is addressed in Solaris[TM] 8 Operating System Kernel Update (KU)-15 (and higher) and Solaris[TM] 9 Operating System KU-01 (and higher).

     Are 3rd party adapters and/or driver software present in the domain?

          Sun does not qualify all vendor PCI adapters and/or 3rd party driver software with DR. Refer to the THIRD PARTY STATEMENT below.

     Is the domain part of a Cluster?

          Some clustering software will not support detach operations on permanent memory without tuning of heartbeat threads. At the time of this writing, if Sun Cluster [TM] is  installed, a DR operation on permanent memory will abort. See Document 1012186.1 "Sun Fire[TM] Server: How to Use Dynamic Reconfiguration (DR) in a Sun Cluster[TM] 3.x Environment" for details.

          Refer to the VERITAS Cluster Server Application Note: Sun Fire 12K/15K Dynamic Reconfiguration (pdf) for details on DR and Veritas Cluster Server configurations.

Slot 1 MaxCPU Board

     Does the board being detached have processes bound to its CPUs?

          CPUs with bound processes cannot be detached. DR will not automatically rebind processes to other CPUs. This must be done by an administrator. Whether to unbind/rebind the processes, and to which of the remaining CPUs in the domain, is a decision that must be made by someone knowledgeable of the application(s) on the domain. To locate and rebind bound processes, use the pbind command.

     Can the application(s) on the domain tolerate fewer processors?

          Detaching a MaxCPU board reduces the domain's processing power. This may have an impact on domain application(s).  A prior workload baseline may help determine whether application performance will suffer with fewer resources.

          To avoid the tradeoff, if available, a different MaxCPU board could be attached to the domain prior to detaching a board. This would also provide CPUs to shift bound processes to.

          NOTE:  MaxCPU is NOT supported in Sun Fire[TM] E20K/E25K.

Slot 1 HPCI Board

Note: When removing PCI cassette cartridges from a SF12K or a SF15K, you must first disconnect the cartridge from the domain, regardless of the LED status indicators on the cartridge. Follow the procedure below:

View the status of the cartridge

# cfgadm

The correct status is:   unknown  disconnected unconfigured

example:

pcisch0:e08b1slot1  unknown   disconnected unconfigured

If it shows a different status, such as unknown/connected/unconfigured (which is possible with an empty PCI cassette), for example:

pcisch0:e08b1slot1 unknown connected unconfigured

use cfgadm to disconnect it prior to removal.

# cfgadm -c disconnect pcisch0:e08b1slot1

# cfgadm pcisch0:e08b1slot1

Ap_Id               Type     Receptacle    Occupant      Condition
pcisch0:e08b1slot1  unknown  disconnected  unconfigured  unknown

You can now physically remove the cartridge from the domain. The domain will continue to run when the cartridge is re-inserted.
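The status check in this procedure can be wrapped in a small guard so that a cartridge is never pulled while it is still connected. The following is a minimal sketch, assuming the cfgadm column layout shown above (Ap_Id, Type, Receptacle, Occupant, Condition). The function name safe_to_remove and the second attachment point in the sample are hypothetical. On Solaris 8 and 9, substitute nawk or /usr/xpg4/bin/awk for awk.

```shell
#!/bin/sh
# safe_to_remove AP_ID
# Reads cfgadm output on stdin and succeeds (exit 0) only if AP_ID is listed
# with Receptacle "disconnected" and Occupant "unconfigured".
safe_to_remove() {
    awk -v ap="$1" '
        $1 == ap && $3 == "disconnected" && $4 == "unconfigured" { ok = 1 }
        END { exit(ok ? 0 : 1) }
    '
}

# Sample cfgadm output; the pcisch1 entry is hypothetical.
sample='Ap_Id               Type     Receptacle    Occupant      Condition
pcisch0:e08b1slot1  unknown  disconnected  unconfigured  unknown
pcisch1:e08b1slot0  unknown  connected     unconfigured  unknown'

for ap in pcisch0:e08b1slot1 pcisch1:e08b1slot0; do
    if printf '%s\n' "$sample" | safe_to_remove "$ap"; then
        echo "$ap: safe to remove"
    else
        echo "$ap: run cfgadm -c disconnect first"
    fi
done
```

On a live domain the function would read from cfgadm <ap_id> instead of the sample text.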

     Are the PCI adapters in the IO board qualified for DR?

          For Sun adapters, a list of qualified adapters is maintained by Sun Microsystems Marketing and Sales staff.  Contact your sales representative for details.

     For each PCI adapter in the IO board, does an alternate path to its storage/network exist?

          Detaching an IO board obviously removes pathways to storage devices and networks.  An alternate path to the storage/network must be maintained to support running application(s). This is typically accomplished by multi-pathing software (MPxIO, IPMP, etc.).

          It is also required that the multi-pathing software be DR safe. For MPxIO, ensure the appropriate kernel level is applied to the system to avoid Bug 4649851. This bug is addressed in Solaris[TM] 8 Operating System KU-15 (and higher) and Solaris[TM] 9 Operating System KU-01 (and higher).

          Sun does not qualify all vendor multi-pathing software. Refer to the THIRD PARTY STATEMENT below.

     Can the application tolerate single pathways to storage/networks?

          For I/O intensive application(s), ensure the bandwidth provided by a single pathway to storage/networks is sufficient for the running application(s).

     Are any PCI adapters in the IO board 3rd party?

          Sun does not qualify all vendor PCI adapters. Refer to the THIRD PARTY   STATEMENT below.

     Is the domain part of a Sun Cluster?

          Sun Cluster has additional restrictions for DR operations on global devices, quorum devices, private cluster interconnects, and public network interfaces. Refer to the Sun Cluster 3.1 System Administration Guide or Sun Cluster 3.0 12/01 System Administration Guide for details on DR and Sun Cluster configurations.

          See the Veritas support document, VERITAS Cluster Server Application Note: Sun Fire 12K/15K Dynamic Reconfiguration for details on VCS and DR compatibility.

HOT-PLUG OPERATIONS

Prior to executing a hot-plug operation for an adapter, review the notes and points listed below to minimize potential problems with the operation.

Follow the procedure detailed above when removing empty I/O cassettes.  Failure to use this procedure can result in domain outages due to bug

Adding An Adapter

     Is the PCI adapter qualified for hot-plug?

          For Sun adapters, a list of qualified adapters is maintained by Sun Microsystems Marketing and Sales staff.  Contact your sales representative for details.

     Is the PCI adapter a known, good adapter?

          It must be noted that the hot-plug procedure does no testing, stressing or verification of the PCI adapter. If the adapter has not been stress tested by other means (SunVTS, etc.) a fault may be introduced into the system.

     Is the adapter 3rd party?

          Sun does not qualify all vendor PCI adapters. Refer to the THIRD PARTY  STATEMENT below.

     Is Bug 4496757 addressed?

          This bug can cause a panic on adapter insertion. Ensure that an appropriate patch is applied prior to the hot-plug operation. Solaris[TM] 8 patch 110900-06 (or higher) addresses this BugID#.

     Followup Configuration

          After an adapter is hot-plugged into a domain, followup configuration is likely required (network plumbing, file system creation, etc.). Also be aware of BugID# 4721698. After an adapter is added, the GDCD is not updated so reboots (hpost -Q) do not recognize the newly added adapter.

Removing An Adapter

     Is the PCI adapter qualified for hot-plug?

          For Sun adapters, a list of qualified adapters is maintained by Sun Microsystems Marketing and Sales staff.  Contact your sales representative for details.

     Does an alternate path to this adapter's storage/network exist?

          Detaching an adapter obviously removes a pathway to storage devices and/or networks.  An alternate path to the storage/network must be maintained to support running application(s). This is typically accomplished by multi-pathing software (MPxIO, IPMP, etc.).

          It is also required that the multi-pathing software be DR safe. For MPxIO, ensure the appropriate kernel level is applied to the system to avoid Bug 4649851. This bug is addressed in Solaris[TM] 8 KU-15 (and higher) and Solaris[TM] 9 KU-01 (and higher).

          Sun does not qualify all vendor multi-pathing software. Refer to the THIRD PARTY STATEMENT below.

     Can the application tolerate a single pathway to the storage/network?

          For I/O intensive application(s), ensure the bandwidth provided by a single pathway to the storage/networks is sufficient for the running application(s).

     Is the PCI adapter 3rd party?

          Sun does not qualify all vendor PCI adapters. Refer to the THIRD PARTY STATEMENT below.

Background Information:

THIRD PARTY STATEMENT

          If a 3rd party adapter is being attached, detached, or hot-plugged, it is the responsibility of the owner of the system to ensure the adapter, as well as any associated device drivers and firmware, are DR safe. This also applies to 3rd party software drivers (e.g., multipathing software) if a detach requires an OS suspension. Consult with the vendor of the adapter to ensure their driver stack and adapter firmware level are DR compliant.

     Some 3rd party component resources:

          Legacy JNI driver and HBA compatibility information can be found here.

          Emulex driver and HBA compatibility information can be found here.

          Veritas Cluster Server compatibility information can be found here.

ADDITIONAL KNOWN ISSUES/LIMITATIONS

  • Bug 4785231: If a psradm command is issued before running cfgadm, it may hang all cfgadm processes, effectively preventing any DR operations. Workaround: Run cfgadm before psradm.   Fix integrated in: Solaris[TM] 10.

  • Bug 4797110: A hotplug of an adapter into a Slot 1 board that is simultaneously being detached may cause a panic.

  • Bug 4672974: Attaching a Slot 1 board during an OS quiesce fails. Workaround: Serialize DR operations within the domain.   Fix integrated in: Solaris 10, and Solaris 9 (u2_07)

  • Bug 6352919: DR addboard of panther board caused domain panic in prom_hotaddcpu() call

FEEDBACK

This is a living document. As features/requirements change, all attempts to keep this document current will be made.  Please contact your Sun Support Service Representative with questions or suggestions on how we might improve the information contained in this document.

REFERENCES

Sun Fire 15K/12K Dynamic Reconfiguration Installation Guide and Release Notes

Sun Fire High-End Systems Dynamic Reconfiguration (DR) User Guide (pdf) or (HTML)

Document 1012186.1 "Sun Fire[TM] Server: How to Use Dynamic Reconfiguration (DR) in a Sun Cluster[TM] 3.x Environment"

Document 1018756.1 "Sun Enterprise[TM] 10000 / Sun Fire[TM] 12K/15K/E20K/E25K: Dynamic Reconfiguration (DR) Cheat Sheets"

Document 1012284.1 "Sun Fire[TM] 12K/15K/E20K/E25K: Fastest Way to Move System Boards (SB) Between Domains"

Document 1006214.1 "Sun Fire[TM] 12K/15K/E20K/E25K Servers: How to Replace a System Board Using Dynamic Reconfiguration"

Document 1012349.1 "Kernel Cage Splitting Overview"

Document 1001683.1 "Sun Fire[TM] 12K/15K/E20K/E25K: Location and Relocation of Kernel for DR Operations"

Document 1018855.1 "DISM Troubleshooting For Oracle9i and Later Releases"

Document 1007568.1 "Sun Fire[TM] 12K/15K/E20K/E25K: Testing a single slot 0 board with no slot 1 board in a domain"

Document 1005025.1 "IPMP with Dynamic Reconfiguration (DR)"

Dynamic Reconfiguration for High-End Servers: Part 1 Planning Phase (pdf)

Dynamic Reconfiguration for High-End Servers: Part 2 Implementation Phase (pdf)

Sun Cluster 3.1 System Administration Guide

Sun Cluster 3.0 12/01 System Administration Guide

VERITAS Cluster Server Application Note: Sun Fire[TM] 12K/15K Dynamic Reconfiguration (pdf)


Internal Only Tools and Resources:

SCAT Tool
A separate core analysis tool, called SUNWscat
(Solaris Crash Analysis Tool), can be downloaded from
http://www.oracle.com/technetwork/indexes/downloads/index.html
This tool contains a method of mapping ISM pages
to the physical address space which is assigned to each board.
In this manner one can map ISM pages to a board
in order to predict the operation's effect on the application.
Document 1017710.1 "Sun Fire[TM] Servers : Dynamic Reconfiguration and Intimate Shared Memory"

Internal only references:
Bug 4946459: Unable to release memory during the DR operation (related to FAB 200462 "CPU/Memory board memory drain
during Dynamic Reconfiguration may take a long time to complete without patches for Solaris 8 and 9").


I/O Matrix (IO Matrix for Systems Group servers and workstations, 06-20-2006)
I/O Wiki (Cross Platform IO Support)
Dynamic Reconfiguration Technical Training
Dynamic Intimate Shared Memory (DISM)
Dynamic Reconfiguration with Oracle9i and Solaris 8

Domain and SC patch check (PTS) tools :  dmpatch.pl  scpatch.pl
EIS DR Checklist for Sun Fire 12K/15K & E25K/E20K

Keywords
15K, 12K, 20K, 25K, E20k, E25k, SF15K, SF12K, Sun Fire 15K
Enterprise Server, Sun Fire 12K, Dynamic Reconfiguration, DR
best practices, ISM, DISM, Oracle, EIS, Checklist, MPO

Previously Published As 49667

  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.