Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1010722.1
Update Date: 2011-02-09
Keywords:

Solution Type  Problem Resolution Sure

Solution  1010722.1 :   Sun Fire[TM] 12K/15K/E20K/E25K: Error message "cfgadm: Hardware specific failure: unconfigure SB2: Failed to off-line: dr@0:SB2::cpu2"  


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
214797


Applies to:

Sun Fire 12K Server
Sun Fire 15K Server
Sun Fire E20K Server
Sun Fire E25K Server
All Platforms
Sun SPARC Sun OS

Symptoms

A dynamic reconfiguration (DR) operation on a system board (SB) fails with the cfgadm error shown in the title. For DR to succeed, no real-time thread or process should be bound to a CPU on that SB.

Cause

A real-time class thread or process may cause the DR operation to fail.

Solution

When a CPU has a process bound to it, DR fails with a message like the one in the example below.

Example

    # cfgadm -c disconnect SB2
    cfgadm: Hardware specific failure: unconfigure SB2: Failed to off-line: dr@0:SB2::cpu2

The console log for the above error is shown below:

    Oct 26 10:03:02 2004 Oct 26 10:03:24 v4u-15ka-b dr: WARNING: dr_pre_release_cpu:thread(s) bound to cpu 66

To find out which process is bound to CPU 66, use the pbind command as below:

    # pbind
    process id 1902: 66

The output shows that process ID 1902 is bound to CPU 66, which is on SB2, so DR of SB2 fails.
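The step from CPU 66 to SB2 can be checked with simple arithmetic. The following is a minimal sketch, assuming that processor IDs on these platforms are allocated in blocks of 32 per system board. That allocation is an assumption here (it is consistent with CPU 66 being cpu2 on SB2 in the example above, but this document does not state it), and `cpu_to_board` is a hypothetical helper name. It also assumes a POSIX shell such as ksh rather than the original Bourne shell.

```shell
#!/bin/sh
# Hypothetical helper: map a Solaris processor ID to a system board slot.
# ASSUMPTION: processor IDs are allocated in blocks of 32 per board,
# so the board number is the integer part of (processor ID / 32).
cpu_to_board() {
    echo "SB$(( $1 / 32 ))"
}

# Example: cpu_to_board 66 prints SB2
```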

Use the command below to unbind the process from that CPU:

    # pbind -u 1902
    process id 1902: was 66, now not bound
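When several processes are bound to the same CPU, the lookup and the unbind can be combined. The following is a minimal sketch, assuming every binding line printed by pbind has the form shown above ("process id 1902: 66"). `bound_pids` is a hypothetical helper name, and on Solaris it may be necessary to use nawk in place of awk for the `-v` option.

```shell
#!/bin/sh
# Hypothetical helper: read pbind-style output on stdin and print the IDs of
# the processes bound to the CPU given as the first argument.
# ASSUMPTION: every binding line looks like "process id 1902: 66".
bound_pids() {
    awk -v cpu="$1" '
        $1 == "process" && $2 == "id" && $4 == cpu {
            pid = $3
            sub(/:$/, "", pid)   # strip the trailing colon from "1902:"
            print pid
        }'
}

# Usage on a live domain (unbinds every process bound to CPU 66):
#   for pid in `pbind | bound_pids 66`; do pbind -u "$pid"; done
```

The helper only filters text, so it can be tried against saved pbind output before anything is unbound.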

After the process is unbound, the DR operation should succeed.

Note that an attempt to DR out the last CPU of a processor set fails in the same way.

Use psrset to review the current processor sets, and make sure each set includes at least one CPU from a different board to avoid this problem, which appears as follows:

    f15ka-dom-c# cfgadm -c disconnect SB17
    Nov 25 12:34:05 f15ka-dom-c dr: WARNING: Failed to off-line: dr@0:SB17::cpu0
    cfgadm: Hardware specific failure: unconfigure SB17: Failed to off-line: dr@0:SB17::cpu0

See the psrset(1M) man page for further details on maintaining processor sets and on adding or removing processes or processors from a processor set.
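Processor sets that would block DR of a given board can be spotted by checking whether all of a set's CPUs sit on that board. The following is a minimal sketch under two assumptions that this document does not confirm: that psrset prints one line per set in the form "user processor set 1: processors 544 545", and that processor IDs are allocated in blocks of 32 per system board. `sets_confined_to_board` is a hypothetical helper name, and on Solaris nawk may be needed in place of awk.

```shell
#!/bin/sh
# Hypothetical helper: read "psrset -i" style output on stdin and print the
# IDs of processor sets whose CPUs all sit on the board number given as $1.
# ASSUMPTIONS: lines look like "user processor set 1: processors 544 545",
# and processor IDs are allocated in blocks of 32 per board.
sets_confined_to_board() {
    awk -v board="$1" '
        $2 == "processor" && $3 == "set" && $5 == "processors" && NF >= 6 {
            confined = 1
            for (i = 6; i <= NF; i++)
                if (int($i / 32) != board) confined = 0
            if (confined) {
                id = $4
                sub(/:$/, "", id)   # strip the trailing colon from "1:"
                print id
            }
        }'
}

# Usage on a live domain (lists sets that live entirely on SB17):
#   psrset -i | sets_confined_to_board 17
```

A set reported by this check would need a CPU from another board assigned to it (for example with `psrset -a`), or its CPUs on the board released, before the board is disconnected.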

Finally, on older versions of Solaris (Solaris 8), threads or processes running in the real-time (RT) scheduling class prevent Solaris from suspending the system. As a result, a system board that holds permanent (non-pageable) memory, such as kernel memory or Dynamic Intimate Shared Memory (DISM), cannot be removed with DR while such threads or processes are present.

See the priocntl(1) man page for details on reassigning thread and process scheduling classes to resolve this issue.
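Real-time processes can be located from ps output before they are reassigned. The following is a minimal sketch, assuming the input comes from `ps -e -o pid,class,comm`, so that column 1 is the process ID and column 2 is the scheduling class. `rt_pids` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -e -o pid,class,comm" style output on stdin
# and print the PIDs of processes in the real-time (RT) scheduling class.
# ASSUMPTION: column 1 is the PID and column 2 is the class name.
rt_pids() {
    awk '$2 == "RT" { print $1 }'
}

# Usage on a live domain (moves each RT process to the time-sharing class):
#   for pid in `ps -e -o pid,class,comm | rt_pids`; do
#       priocntl -s -c TS -i pid "$pid"
#   done
```

Moving a process out of the RT class changes how it is scheduled, so it is worth confirming with the application owner before running the loop and restoring the original class once the DR operation has completed.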



Additional Information

See also:

Document 1003332.1: Sun Fire[TM] Midframe/Midrange Servers: CPU/Memory Board Dynamic Reconfiguration (DR) Considerations
Document 1010363.1: Sun Fire[TM] 12K/15K/E20K/E25K Servers: Dynamic Reconfiguration Considerations


Product
Sun Fire E25K Server
Sun Fire E20K Server
Sun Fire 15K Server
Sun Fire 12K Server

Internal Comments
Bug & Solution Set

Internal Reference
For more details, see Change Request 4941560 "modify bound process warning message to include thread #, process ID & name"

pbind, off-line, cfgadm, rcfgadm, deleteboard, unconfigure, 12k, 15k, 20k, 25k
Previously Published As 78878

  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.