Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1009124.1
Update Date: 2010-05-27

Solution Type  Problem Resolution Sure

Solution 1009124.1: Sun Fire[TM] 12K/15K/E20K/E25K: showdevices takes a long time to return


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

Previously Published As
212618


Symptoms
The showdevices -d <domain> command on a Sun Fire 12K/15K/E20K/E25K System Controller can appear to hang, taking upwards of 20 minutes to complete depending on the size of the domain. No errors are returned, and the output appears to be complete.

A truss of the dcs daemon on the domain in question shows it is blocking for 60 seconds at a time in door_call().



Resolution
It is possible that the IP address to which the domain's hostname resolves does not match any configured network interface.

When showdevices is run, it connects to the dcs daemon on the domain, which in turn starts the rcm_daemon. The dcs makes requests to the rcm_daemon via a door using door_call().

The rcm_daemon uses a series of pluggable modules to gather device information from the domain. One of these is a Solaris[TM] Volume Manager (SVM) module, which uses the SVM libraries to gather SVM information. To do this, it connects to rpc.metad, an RPC service on the domain that is used by various SVM commands.

When the SVM library attempts to connect to rpc.metad, it gets the hostname of the domain using the uname(2) system call. In most circumstances, this hostname resolves to an IP address on the system and the connection is made quickly.

The problem occurs when the IP address that is looked up does not exist on the domain or is incorrect, for example when the wrong IP address is in /etc/hosts. Depending on the network configuration, the RPC connection then attempts to reach the rpcbind service (the service that rpcinfo queries) on a nonexistent host, with a timeout of 60 seconds. Once this timeout expires, the RPC connection returns an error and the SVM library continues. Because the timeout occurs once for every disk device lookup by rcm_daemon, showdevices can take a long time to complete on a large domain.
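As a rough illustration of how the delay scales, the following sketch multiplies a device count by the 60-second timeout. The function name estimate_delay_seconds and the device count of 20 are hypothetical, and the estimate assumes exactly one full timeout per disk device and ignores all other work done by showdevices:

```shell
# Rough estimate only: assumes one full 60-second RPC timeout per disk
# device lookup and nothing else contributing to the run time.
# estimate_delay_seconds is a hypothetical helper, not a Sun utility.
estimate_delay_seconds() {
    expr "$1" \* 60
}

# Hypothetical domain with 20 disk devices.
secs=`estimate_delay_seconds 20`
echo "$secs seconds, about `expr $secs / 60` minutes"
```

Under these assumptions, a domain with about 20 disk devices would account for the 20-minute delays described in the Symptoms section.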

To check what IP address the RPC is connecting to, you can use the following command on the domain:

   $ getent hosts `uname -n`

For example:

   f15ka-dom-c$ getent hosts `uname -n`
   10.15.2.43      f15ka-dom-c loghost

If the IP address returned is not configured on one of the domain's network interfaces, you will see the above symptoms.

Correct the /etc/hosts or name service entry for the hostname returned by uname -n, or check that the interface that should have that IP address is correctly configured.
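The comparison between the resolved address and the configured interfaces can also be scripted. The following is a minimal sketch, not a Sun-supplied tool: the function names ip_is_configured and check_hostname_ip are hypothetical, and the parsing assumes Solaris-style ifconfig -a output in which each IPv4 address appears on a line beginning with "inet <address>":

```shell
# Sketch only: assumes Solaris-style `ifconfig -a` output, where each IPv4
# address appears on a line of the form "inet <address> netmask ...".

# ip_is_configured ADDRESS   (hypothetical helper)
# Reads ifconfig output on standard input. Returns 0 if ADDRESS is listed
# as an inet address, 1 otherwise.
ip_is_configured() {
    want=$1
    while read key addr rest; do
        if [ "$key" = "inet" ] && [ "$addr" = "$want" ]; then
            return 0
        fi
    done
    return 1
}

# check_hostname_ip   (hypothetical helper)
# Resolves `uname -n` with getent and reports whether the resulting
# address is configured on a local interface.
check_hostname_ip() {
    host=`uname -n`
    addr=`getent hosts "$host" | awk '{ print $1; exit }'`
    if [ -z "$addr" ]; then
        echo "$host does not resolve to an address"
        return 1
    fi
    if ifconfig -a | ip_is_configured "$addr"; then
        echo "$addr ($host) is configured on a local interface"
    else
        echo "$addr ($host) is NOT configured on any local interface"
        return 1
    fi
}
```

After defining these functions in a shell on the domain, run check_hostname_ip. A "NOT configured" result corresponds to the condition that causes the symptoms above.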



Additional Information
A truss of the rcm_daemon on the domain will show it sleeping in poll() three times during each 60-second iteration.

The three polls have timeouts of 15 seconds, 30 seconds, and slightly less than 15 seconds, for example:

   poll(0x000487A4, 1, 15000)      (sleeping...)
   .....
   poll(0x000487A4, 1, 30000)      (sleeping...)
   .....
   poll(0x000487A4, 1, 14848)      (sleeping...)

This occurs in the RPC functions of libnsl during the initial connection to the rpcbind daemon (the service that rpcinfo queries).
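As a quick arithmetic check, the three poll() timeouts in the truss output add up to just under 60 seconds, which lines up with the 60-second blocks seen in door_call() on the dcs daemon. A minimal sketch using expr (the variable name total_ms is illustrative):

```shell
# Sum of the three poll() timeouts from the truss output, in milliseconds.
total_ms=`expr 15000 + 30000 + 14848`
echo "$total_ms ms per iteration"

# Difference from a full 60 seconds (60000 ms).
echo "`expr 60000 - $total_ms` ms short of 60 seconds"
```

The total is 59848 ms, 152 ms short of 60 seconds.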



Product
Sun Fire 15K Server
Sun Fire 12K Server
Sun Fire E25K Server
Sun Fire E20K Server

Keywords: showdevices, dcs, rcm_daemon, rpcinfo, hang, slow, door_call, SUNW_svm_rcm.so
Previously Published As
74727

Updated by the ESG Knowledge Content Team 4/2010

  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.