Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1019467.1
Update Date:2011-04-11
Keywords:

Solution Type  Technical Instruction Sure

Solution  1019467.1 :   How To Diagnose Missing CPUs and/or Memory on sun4v Platforms


Related Items
  • Sun SPARC Enterprise T5440 Server
  • Sun SPARC Enterprise T1000 Server
  • Sun SPARC Enterprise T5220 Server
  • Sun SPARC Enterprise T5240 Server
  • Sun SPARC Enterprise T2000 Server
  • Sun SPARC Enterprise T5140 Server
  • Sun SPARC Enterprise T5120 Server
Related Categories
  • GCS>Sun Microsystems>Servers>CMT Servers

PreviouslyPublishedAs
239925


Applies to:

Sun SPARC Enterprise T2000 Server
Sun SPARC Enterprise T1000 Server
Sun SPARC Enterprise T5120 Server
Sun SPARC Enterprise T5140 Server
Sun SPARC Enterprise T5220 Server
All Platforms


Goal

Description


This document describes one possible reason why CPUs (cores/threads) and/or memory may not be displayed after booting the Solaris Operating System on sun4v platforms.

It also provides steps to recover these CPUs or memory, if needed.


Symptoms


The following Solaris commands do not report all of the physical CPUs (cores/threads) or memory DIMMs installed in the system:


prtdiag(1M),
mpstat(1M),
prtconf(1M)

This gives the impression that the system has fewer CPUs or less memory than it should have.

The following examples are taken from a Sun Fire T2000. Note that only 4 CPUs and 1024 MB of memory are shown:

From OBP:



Sun Fire T200, No Keyboard
Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.28.1, 1024 MB memory available, Serial #69080228.
Ethernet address 0:14:4f:1e:14:a4, Host ID: 841e14a4.
{0} ok ls
f0283234 pci@7c0
f027ab00 pci@780
f027a9cc cpu@3
f027a89c cpu@2
f027a76c cpu@1
f027a63c cpu@0
f0277024 virtual-devices@100
f023fa78 virtual-memory
f023f46c memory@m0,8000000
f022d23c aliases
f022d1c4 options
f022d07c openprom
f022d008 chosen
f022cf90 packages
      

From the output of prtdiag -v :


System Configuration:  Sun Microsystems  sun4v Sun Fire T200
Memory size: 1024 Megabytes
================================ Virtual CPUs ================================
CPU ID Frequency Implementation         Status
------ --------- ---------------------- -------
0      1000 MHz  SUNW,UltraSPARC-T1     on-line
1      1000 MHz  SUNW,UltraSPARC-T1     on-line
2      1000 MHz  SUNW,UltraSPARC-T1     on-line
3      1000 MHz  SUNW,UltraSPARC-T1     on-line
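
As a quick cross-check, the number of on-line virtual CPUs can be counted directly from the prtdiag output and compared with what the Service Processor reports. The following is only a minimal sketch: the function name count_online_vcpus is hypothetical, and it assumes the "Virtual CPUs" table layout shown above, where the last column holds the status.

```shell
# Count the virtual CPUs that prtdiag(1M) reports as on-line.
# Reads prtdiag -v style text on standard input.
# (count_online_vcpus is a hypothetical helper name, not a Solaris command.)
count_online_vcpus() {
    awk '$NF == "on-line" { n++ } END { print n + 0 }'
}

# Sample rows copied from the prtdiag output shown above.
sample_prtdiag='CPU ID Frequency Implementation         Status
------ --------- ---------------------- -------
0      1000 MHz  SUNW,UltraSPARC-T1     on-line
1      1000 MHz  SUNW,UltraSPARC-T1     on-line
2      1000 MHz  SUNW,UltraSPARC-T1     on-line
3      1000 MHz  SUNW,UltraSPARC-T1     on-line'

printf '%s\n' "$sample_prtdiag" | count_online_vcpus
```

On a live system the same function could be fed with: prtdiag -v | count_online_vcpus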
     


However, on the Service Processor (SP), the following commands run from the ALOM account will show all the CPUs and memory physically installed in the system:

sc> showcomponent
sc> showfru 


Systems such as the Sun Fire T2000 and T5240 use the sun4v architecture, which supports virtualization, and can therefore be affected by this issue.

A common cause of this condition is a previously installed LDom (Logical Domain) configuration that has not been completely cleared or removed.
Because this configuration is stored on the Service Processor, it can continue to restrict which components are available to the system.

Solution

Related Document(s)

For details on LDoms setup, please refer to the Logical Domains (LDoms) ‘Ldom Version’ Administration Guides, available at Oracle VM for SPARC Documentation.

For details on the hardware configurations of the relevant sun4v platforms, please refer to the system-specific documentation.


Steps to Follow


The following steps verify that the system does NOT have an 'active' LDom configuration currently running.


This check is needed 'before' setting the system back to "factory-default".
(NB: Reset ONLY if the LDom configuration is no longer in use, or is to be reconfigured.)

Steps to Reset LDoms to factory-default

[1.] Confirm whether any existing LDoms configurations are actually in use.
     

[1-A.] Verify if the LDoms Manager is installed on the system.

Run the following command to check whether the LDoms Manager package SUNWldm is installed on the system.

# pkginfo |grep SUNWldm
application SUNWldm                          Logical Domains Manager


If the package is installed, its binaries are located in /opt/SUNWldm.

If the SUNWldm package does not exist on the system, the LDoms Manager is not installed.
(The configuration stored on the Service Processor is still in effect, however, which is why CPUs and/or memory are missing in the first place.)

If LDoms Manager is not installed, skip Step [1-B.] and proceed to Step [2.]
However, if you want to find out what is configured under LDoms, install the LDoms Manager package and review the available configurations with the ldm(1M) command.
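
The decision in Step [1-A.] can also be scripted. The sketch below is illustrative only: it relies on the -q option of pkginfo(1), which suppresses output and reports the result through the exit status, and the function name ldm_manager_installed is hypothetical.

```shell
# Report whether the LDoms Manager package is installed.
# pkginfo -q prints nothing; a zero exit status means the package exists.
# (ldm_manager_installed is a hypothetical helper name.)
ldm_manager_installed() {
    pkginfo -q SUNWldm
}

if ldm_manager_installed 2>/dev/null; then
    echo "SUNWldm installed: continue with Step [1-B.]"
else
    echo "SUNWldm not installed: skip to Step [2.]"
fi
```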


[1-B.] Verify that there are no other configured or active Guest Domains on the system.

If LDoms Manager is installed, some Guest Domains may still be active and in use.
To check for Guest Domains, log in as root and run the following command:

# /opt/SUNWldm/bin/ldm list


Below is an example showing that there are 2 Guest Domains configured, and bound:

# /opt/SUNWldm/bin/ldm list
NAME             STATE    FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active   -n-cv   SP      4     1G       0.3%  22h 59m
ldom2            bound    -----   5000    4     1G       
ldom3            bound    -----   5001    4     1G


Below is an example showing 2 active Guest Domains, with the Operating System running:

# /opt/SUNWldm/bin/ldm list
NAME             STATE    FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active   -n-cv   SP      4     1G       3.6%  22h 36m
ldom2            active   -n---   5000    4     1G        24%  11s
ldom3            active   -n---   5001    4     1G        32%  11s


Below is an example with no Guest Domains configured. Only the Primary Domain entry is shown:

# /opt/SUNWldm/bin/ldm list
NAME             STATE    FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active   -n-cv   SP      4     1G       0.3%  22h 59m
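
The three cases above can be told apart by filtering out the header and the primary domain. The following is a minimal sketch, not part of the LDoms tooling: the function name list_guest_domains is hypothetical, and it assumes the column layout shown in the examples (name first, state second).

```shell
# Print the name and state of every domain other than primary.
# Reads `ldm list` style text on standard input.
# (list_guest_domains is a hypothetical helper name.)
list_guest_domains() {
    awk 'NF >= 2 && $1 != "NAME" && $1 != "primary" { print $1, $2 }'
}

# Sample copied from the "configured, and bound" example above.
sample_ldm_list='NAME             STATE    FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active   -n-cv   SP      4     1G       0.3%  22h 59m
ldom2            bound    -----   5000    4     1G
ldom3            bound    -----   5001    4     1G'

guests=$(printf '%s\n' "$sample_ldm_list" | list_guest_domains)
if [ -n "$guests" ]; then
    echo "Guest Domains found, confirm they are unused before going further:"
    echo "$guests"
else
    echo "Only the primary domain is present."
fi
```

On a live system the input would come from: /opt/SUNWldm/bin/ldm list | list_guest_domains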


***CAUTION***
When Guest Domains are displayed, be aware that they could still be in use by other parties.
Please take EXTRA care to find out from the System Administrator or organization who configured these Guest Domains, and whether these services are still in use or needed.

DO NOT PROCEED FURTHER if you are unsure whether the configured Domains are still in use.




[2.] Resetting the LDoms configuration back to default.

Once Step [1.] above has confirmed that the LDoms configurations on the system are no longer needed, the system configuration can be reset to the factory default settings.

The LDoms configuration for factory default is named factory-default.

There are 2 possible ways to reset the configuration of the system.

If the SUNWldm package is installed and verified, the system can be reset through LDoms Manager using the ldm(1M) command.

However, if the system has been freshly (re)installed with Solaris without LDoms Manager, the configuration can be reset from the Service Processor.
This can be done from either the ALOM or the ILOM prompt (depending on the specific platform/configuration).
Steps [2-A.] and [2-B.] below illustrate the 2 methods respectively.



[2-A.] Resetting through existing LDoms Manager installed on the system.

List out the current LDom configuration, as well as other available configurations saved on the system.
The following example shows that the current configuration is named prod:

# /opt/SUNWldm/bin/ldm list-spconfig
factory-default
prod [current]


To set to factory-default, use the following command:

# /opt/SUNWldm/bin/ldm set-spconfig factory-default


Once set, confirm that factory-default is the configuration that will be used at the next power on by running the following command:

# /opt/SUNWldm/bin/ldm list-spconfig
factory-default [next]
prod [current]

NB - MANDATORY STEP:
For the change above to take effect, the system must be power cycled (powered off and then powered on).

On booting up, the system will use the factory-default configuration, and all the hardware components on the system should now be visible.
The OBP banner will present all the available CPUs and memory, and mpstat(1M), prtdiag(1M), etc. will show the fully configured cores/threads and memory.

ldm(1M) will show the current effective configuration as factory-default.

# /opt/SUNWldm/bin/ldm list-spconfig
factory-default [current]
prod
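
The [current] and [next] markers in the list-spconfig output can be read programmatically, for example to confirm which configuration is active before and after the power cycle. This is only a sketch under the assumption that each line has the form "name [marker]" as in the examples above. The function name spconfig_with_marker is hypothetical.

```shell
# Print the configuration carrying the given marker ("current" or "next").
# Reads `ldm list-spconfig` style text on standard input.
# (spconfig_with_marker is a hypothetical helper name.)
spconfig_with_marker() {
    while read -r name marker; do
        if [ "$marker" = "[$1]" ]; then
            echo "$name"
        fi
    done
}

# Sample copied from the output shown after set-spconfig, before the power cycle.
sample_spconfig='factory-default [next]
prod [current]'

current=$(printf '%s\n' "$sample_spconfig" | spconfig_with_marker current)
next=$(printf '%s\n' "$sample_spconfig" | spconfig_with_marker next)
echo "current=$current next=$next"
```

After a successful power cycle, the same check against live output would be expected to report factory-default as the current configuration.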


After the configuration has been reset to factory-default, remove the other, inactive LDoms configurations from the system using the remove-spconfig subcommand of ldm(1M).
Please refer to the LDoms Administration Guide for further details on LDoms configuration tasks.
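
To see which saved configurations would be candidates for removal, the list-spconfig output can be turned into a dry-run list of commands. The sketch below only prints commands and executes nothing. The function name plan_spconfig_removal is hypothetical, and it assumes that unmarked lines are inactive configurations, as in the example output above.

```shell
# Print one `ldm remove-spconfig` command for each saved configuration that is
# neither factory-default nor marked [current] or [next]. Nothing is executed.
# (plan_spconfig_removal is a hypothetical helper name.)
plan_spconfig_removal() {
    while read -r name marker; do
        if [ -z "$name" ] || [ "$name" = "factory-default" ]; then
            continue
        fi
        if [ -n "$marker" ]; then
            continue    # still marked [current] or [next]: leave it alone
        fi
        echo "/opt/SUNWldm/bin/ldm remove-spconfig $name"
    done
}

# Sample copied from the output shown after the power cycle.
sample_after_reset='factory-default [current]
prod'

printf '%s\n' "$sample_after_reset" | plan_spconfig_removal
```

Review the printed commands before running any of them by hand. A removed configuration cannot be selected again from the Service Processor.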


[2-B.] Resetting through the Service Processor.

The LDoms configuration can be reset through the SP before booting the OS on the system.
Ensure you are connected to the Serial Management Port of the Service Processor.

For systems with ALOM configured on the Service Processor, use the following syntax:

sc> bootmode config="factory-default"
sc> poweroff

   Wait for the system to power off.
   The system can then be powered on.

sc> poweron -c
OR
sc> poweron
sc> console -f

For systems with ILOM, use the following syntax:

-> set /HOST/bootmode config="factory-default"
-> stop /SYS
Are you sure you want to stop /SYS (y/n)? 

   Wait for machine to power off.
   We can then power on the system.

-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS
-> start /SP/console


Once the machine is powered on, its configuration should be reset to the default, with all the hardware components visible.

Please refer to the following documentation for further details on ALOM and ILOM commands:

- Advanced Lights Out Management (ALOM) CMT vX.X Guide
- Sun Integrated Lights Out Manager X.X Supplement
- Platform specific Service Manuals 


Data Collection for further troubleshooting

If some CPUs and/or memory are still not visible after resetting the system to factory-default with the steps above, determine how many CPU cores the system should be seeing, then contact Oracle Support with the following information:

- The latest explorer output using the latest version of explorer script.
- The console session log of the attempt to reset the system configuration to factory-default.

Please refer to 1002383.1 Sun[TM] Explorer Data Collector for the latest Explorer Script.



Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.