Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1009921.1
Update Date:2011-04-06
Keywords:

Solution Type  Troubleshooting Sure

Solution  1009921.1 :   Troubleshooting the DSCP service on Sun SPARC(R) Enterprise Mx000 (OPL) servers  


Related Items
  • Sun SPARC Enterprise M9000-64 Server
  • Sun SPARC Enterprise M9000-32 Server
  • Sun SPARC Enterprise M8000 Server
  • Sun SPARC Enterprise M4000 Server
  • Sun SPARC Enterprise M5000 Server
  • Sun SPARC Enterprise M3000 Server
Related Categories
  • GCS>Sun Microsystems>Servers>OPL Servers

PreviouslyPublishedAs
213600


Applies to:

Sun SPARC Enterprise M4000 Server - Version: Not Applicable and later [Release: N/A and later]
Sun SPARC Enterprise M5000 Server - Version: Not Applicable and later [Release: N/A and later]
Sun SPARC Enterprise M8000 Server - Version: Not Applicable and later [Release: N/A and later]
Sun SPARC Enterprise M3000 Server - Version: Not Applicable and later [Release: N/A and later]
Sun SPARC Enterprise M9000-32 Server - Version: Not Applicable and later [Release: N/A and later]
All Platforms

Purpose

This resolution path addresses problems where the DSCP service on an OPL Mx000 server is not started, which prevents data from flowing between the XSCF and the domain.

Symptoms

This issue can manifest in several ways, as shown below:

XSCF> showdevices -d 01
Can't get device information from DomainID 1.  <=====
XSCF> deleteboard -c unassign 09-0
Start unconfiguring XSB from domain.
XSB#09-0 will be unassigned from domain immediately. Continue [y|n] :y
DR failed. Domain (DomainID 1) cannot communicate via DSCP path.         <======

The snapshot file @scf@cli@usr@bin@showdevices_-v_-d_?.err will contain:
SNAPSHOT MSG: Command Timeout: Exit Signal: 15

DSCP is used by FMA to transfer data between fmd on the domain and the XSCFU.  If DSCP is not properly configured, the FMA logs and /var/adm/messages on the domain will record failures.  The fmd logs will consume large amounts of space recording ereport.fm.fmd.module errors, and no FMA messages are propagated to the XSCFU.

Last Review Date

April 6, 2011

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

Steps to Follow
Please validate that each troubleshooting step below is true for your environment. The steps will provide instructions or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.
Note:  To complete the resolution steps, the Solaris[TM] domain that cannot be reached via the DSCP service must be rebooted.

1.  Verify via Solaris that the required packages for the DSCP service are installed on the Solaris domain

-  pkginfo | grep SUNWdscp  indicates that the SUNWdscpr and SUNWdscpu packages are installed on the domain.

-  pkginfo | grep SUNWppp  indicates that the SUNWpppd, SUNWpppdr, SUNWpppdu, and SUNWpppdt packages are installed.

-  pkginfo | grep SUNWsckm  indicates that the SUNWsckmr and SUNWsckmu packages are installed.

-  pkginfo | grep SUNWdcs  indicates that the SUNWdcsr and SUNWdcsu packages are installed.

- A reboot may be required following the installation of any missing packages.

Reference : <Document: 1011376.1> - Sun SPARC[R] Enterprise Mx000 Servers (OPL) packages for minimal Solaris[TM] Installation
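
The four pkginfo checks in step 1 can be combined into a single pass. The sketch below is illustrative only: missing_dscp_pkgs is a hypothetical helper name, the package list is taken from step 1 above, and the code assumes a POSIX shell such as ksh or bash (the legacy /bin/sh may not accept the $( ) syntax).

```shell
#!/bin/sh
# Packages named in step 1 of this document.
REQUIRED_PKGS="SUNWdscpr SUNWdscpu SUNWpppd SUNWpppdr SUNWpppdu SUNWpppdt SUNWsckmr SUNWsckmu SUNWdcsr SUNWdcsu"

# missing_dscp_pkgs (hypothetical helper): reads `pkginfo` output on
# standard input and prints each required package that is not listed.
missing_dscp_pkgs() {
    installed=$(cat)
    for pkg in $REQUIRED_PKGS; do
        # -w matches the package name as a whole word, so SUNWpppd
        # does not match a line that only lists SUNWpppdr.
        if ! printf '%s\n' "$installed" | grep -w "$pkg" >/dev/null; then
            printf '%s\n' "$pkg"
        fi
    done
}

# Usage on the domain:  pkginfo | missing_dscp_pkgs
```

No output means that all ten packages were found. Any package name printed should be installed before continuing with step 2.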

2.  Verify on the problematic Solaris domain that  ifconfig -a  displays an sppp0 interface and that the interface  flags  show the interface as  RUNNING .

If sppp0 exists but is not  RUNNING , then create an escalation to assist in the resolution.

....

         sppp0: flags=10010008d1<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1500 index
         inet 192.168.224.3 --> 192.168.224.1 netmask ffffff00
         ether 0
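
The check in step 2 can be expressed as a small filter over the ifconfig output. This is a sketch under stated assumptions: sppp0_state is a hypothetical helper name, and the code assumes the interface line has the "sppp0: flags=" form shown in the example above.

```shell
#!/bin/sh
# sppp0_state (hypothetical helper): reads `ifconfig -a` output on
# standard input and prints one of: running, not-running, missing.
sppp0_state() {
    flags_line=$(grep 'sppp0: flags=' | head -n 1)
    case "$flags_line" in
        "")         echo "missing" ;;       # no sppp0 interface at all
        *RUNNING*)  echo "running" ;;       # interface exists and is RUNNING
        *)          echo "not-running" ;;   # exists, but RUNNING flag absent
    esac
}

# Usage on the domain:  ifconfig -a | sppp0_state
```

A result of not-running corresponds to the escalation case described in step 2.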

3.  Verify on the problematic Solaris domain that  svcs -l dscp  displays  enabled true  and  state online .

To enable the dscp service, type  svcadm enable dscp .

Example:

         # svcs -l dscp
         fmri         svc:/platform/sun4u/dscp:default
         name         DSCP Service
         enabled      true
         state        online
         next_state   none
         ....
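
The same two fields ( enabled  and  state ) are checked for dscp here and for sckmd and dcs in later steps, so one filter can serve all three. The following is a minimal sketch: svc_ok is a hypothetical helper name, and it assumes a POSIX awk (on Solaris 10, nawk could be substituted).

```shell
#!/bin/sh
# svc_ok (hypothetical helper): reads `svcs -l <service>` output on
# standard input and succeeds (exit status 0) only if the output
# contains both "enabled true" and "state online".
svc_ok() {
    awk '$1 == "enabled" && $2 == "true"   { e = 1 }
         $1 == "state"   && $2 == "online" { s = 1 }
         END { exit !(e && s) }'
}

# Usage on the domain:
#   svcs -l dscp | svc_ok || echo "dscp is not enabled and online"
```

Because the first field is compared exactly, the  next_state  line is not mistaken for the  state  line.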

Note : The absence of the appropriate SUNWppp* packages, the absence of the sppp0 interface, or a disabled DSCP service, as described above, will be reported on the domain via  ereport.fm.fmd.module  in the 'fmdump -e' output. Further investigation using 'fmdump -V' will report the following signatures :

msg = xport - dscpBind on accept socket failed for dev:///sp0 : rv = 5

 and

msg = xport - dscpAddr on client socket failed for dev:///sp0 : rv = 5 


4.  Verify on the XSCF that DSCP is configured.

Example:

Display DSCP information.

XSCF> showdscp
 
DSCP Configuration:
 
Network: 192.168.244.0
Netmask: 255.255.255.0
 
Location      Address
----------   ---------
XSCF         192.168.244.1
Domain #00   192.168.244.2
Domain #01   192.168.244.3
Domain #02   192.168.244.4
Domain #03   192.168.244.5

If DSCP is not configured, the  showdscp  output will state this:

XSCF> showdscp

ERROR: DSCP is not configured. Please use setdscp.

In that case, DSCP should be configured using  setdscp .

*Note: all domains should be powered down when DSCP is configured on the XSCF.*

From the setdscp man page:

"setdscp is intended for initial configuration only.  Domains
should not be powered on when running this command.

Note -  You are required to  reboot  the  Service  Processor
        after modifying the DSCP IP address assignment using
        this command, and before the IP addresses you speci-
        fied are used."

Example:

XSCF> setdscp
DSCP network  [192.168.244.0  ] > 192.168.2.0
 
DSCP netmask  [255.255.255.0  ] > 255.255.255.0
 
XSCF address  [192.168.2.1  ] > 192.168.2.1
Domain #00 address  [192.168.2.2  ] > 192.168.2.2
.
.
.
Commit these changes to the database? [y|n]:y

XSCF> rebootxscf
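
Once DSCP is configured, the address that  showdscp  lists for a domain should match the local address of the sppp0 interface on that domain. The sketch below extracts the address for one domain from saved  showdscp  output. It is an illustration only: dscp_addr_for_domain is a hypothetical helper name, the code is meant to run on a workstation against captured output rather than in the XSCF shell, and it assumes the "Domain #NN   address" layout shown in the example above.

```shell
#!/bin/sh
# dscp_addr_for_domain (hypothetical helper): reads saved `showdscp`
# output on standard input and prints the address assigned to a domain.
#   $1 = two-digit domain ID as printed by showdscp, e.g. 01
dscp_addr_for_domain() {
    awk -v id="#$1" '$1 == "Domain" && $2 == id { print $3 }'
}

# Usage (showdscp.txt is a hypothetical capture file):
#   dscp_addr_for_domain 01 < showdscp.txt
```

An empty result means the domain ID was not present in the captured output.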

5. Verify on the problematic Solaris domain that  svcs -l sckmd  shows  enabled true  and  state online .

To enable  sckmd  services, from the Solaris domain type  svcadm enable sckmd .

Example:

         # svcs -l sckmd
         fmri         svc:/platform/sun4u/sckmd:default
         name         key management daemon
         enabled      true
         state        online
         next_state   none

Note : The absence of the appropriate packages for sckmd, or a disabled sckmd service, will be reported on the domain via  ereport.fm.fmd.module  in the 'fmdump -e' output. Further investigation using 'fmdump -V' will report the following signature :

msg = Failed to write C_HELLO to dev:///sp0: Transport endpoint is not connected
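
The three 'fmdump -V' signatures quoted in this document each point back to specific steps. As a rough aid, the sketch below maps them to those steps. classify_dscp_msg is a hypothetical helper name, and the mapping covers only the signatures quoted in this document; it is not a complete list of possible DSCP-related messages.

```shell
#!/bin/sh
# classify_dscp_msg (hypothetical helper): reads `fmdump -V` text on
# standard input and prints, once each, the steps of this document
# that the recognised message signatures point to.
classify_dscp_msg() {
    while IFS= read -r line; do
        case "$line" in
            *"dscpBind on accept socket failed"*|*"dscpAddr on client socket failed"*)
                echo "steps 1-3: SUNWppp* packages, sppp0 interface, dscp service" ;;
            *"Failed to write C_HELLO"*)
                echo "step 5: sckmd packages and service" ;;
        esac
    done | sort -u
}

# Usage on the domain:  fmdump -V | classify_dscp_msg
```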

6.  Verify on the problematic Solaris domain that the  ipseckey dump  command returns security key information in addition to  Dump succeeded for SA type 0. 

If  ipseckey dump  does not return any keys, check that the domain OS is Solaris 10 Update 8 (S10U8) or later, or that patch 140589-01 or later is installed.

Example:
         # ipseckey dump
         Base message (version 2) type DUMP, SA type AH.
         Message length 136 bytes, seq=1, pid=766.
         SA: SADB_ASSOC spi=0xff01, replay=0, state=MATURE
         SA: Authentication algorithm = hmac-md5
         SA: flags=0x80000000 < X_USED >
         SRC: Source address (proto=0/<unspecified>)
         SRC: AF_INET: port 0, 192.168.224.3 <unknown>.
         .....
         Dump succeeded for SA type 0.
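
Since  Dump succeeded for SA type 0.  is printed even when no keys are present, it can help to count the security associations explicitly. This is a minimal sketch: count_sas is a hypothetical helper name, and it assumes each association is introduced by one "SA: SADB_ASSOC" line, as in the example above.

```shell
#!/bin/sh
# count_sas (hypothetical helper): reads `ipseckey dump` output on
# standard input and prints the number of "SA: SADB_ASSOC" lines.
count_sas() {
    # grep -c prints 0 and returns non-zero when nothing matches;
    # "|| true" keeps the helper's exit status at 0 in that case.
    grep -c 'SA: SADB_ASSOC' || true
}

# Usage on the domain:  ipseckey dump | count_sas
```

A count of 0 corresponds to the "does not return any keys" case described in step 6.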

Reboot (via shutdown, init 6, reboot, or some other orderly process) the problematic Solaris domain that has been unable to use the DSCP interface.

7.  Verify on the problematic domain that  svcs -l dcs  shows  enabled true  and  state online .

To enable  dcs  services, from the Solaris domain type  svcadm enable dcs .

Example:

         # svcadm enable dcs
         # svcs -l dcs
         fmri         svc:/platform/sun4u/dcs:default
         name         domain configuration server
         enabled      true
         state        online
         next_state   none

8.  Verify on the problematic domain that a  ping  from the domain to the XSCF via the DSCP interface is working ( ping  shows  alive ):

- The  ifconfig -a  command shows information about the sppp0 interface.  On the Solaris domain, locate the IP address following the  "-->" .  This is the XSCF end of the point-to-point (ppp) link from this domain, and it is the address to ping.  In the example below, you would ping 192.168.224.1.

- As this is a ppp style interface, the two IP addresses should be on the same subnet (i.e. 192.168.224.x) and have the same netmask, which is controlled by the XSCF  setdscp  command.

IP Address Example:

         ....
         sppp0: flags=10010008d1<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1500 index
         inet 192.168.224.3 --> 192.168.224.1 netmask ffffff00
         ether 0

Ping Example:
         # ping 192.168.224.1
         192.168.224.1 is alive
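
The two points in step 8 (finding the address after "-->" and confirming that both ends share a subnet) can be sketched as follows. The helper names dscp_peer, ip_to_int and same_subnet are hypothetical, and the code assumes a POSIX shell and awk and the ifconfig layout shown in the example above, with the netmask printed in hex.

```shell
#!/bin/sh
# dscp_peer (hypothetical helper): reads `ifconfig -a` output on standard
# input and prints the address after "-->" on the sppp0 inet line.
dscp_peer() {
    awk '/sppp0: flags=/ { found = 1; next }
         found && $1 == "inet" {
             for (i = 2; i < NF; i++)
                 if ($i == "-->") { print $(i + 1); exit }
         }
         found && /flags=/ { found = 0 }   # next interface started'
}

# ip_to_int: convert dotted-quad a.b.c.d to a single integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# same_subnet: succeed if two addresses share a network.
#   $1, $2 = dotted-quad addresses; $3 = hex netmask, e.g. ffffff00
same_subnet() {
    a=$(ip_to_int "$1"); b=$(ip_to_int "$2"); m=$(( 0x$3 ))
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Usage on the domain:
#   peer=$(ifconfig -a | dscp_peer)
#   same_subnet 192.168.224.3 "$peer" ffffff00 && ping "$peer"
```

With the values from the example, both addresses reduce to the 192.168.224.0 network under netmask ffffff00, so the subnet check passes.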

The XSCF command  showdevices -d <rebooted domain>  run against the problematic Solaris domain should now succeed.

At this point, if you have validated that each troubleshooting step above is true for your environment and the issue still exists, further troubleshooting is required. For additional support, contact Oracle Support.


 Internal Comments

Reference: 6821108 DR and "showdevices" don't work after XSCF reboot

The DSCP service is used by the Solaris[TM] domain and the XSCF on an Mx000 system to communicate with each other.  While these processes should start automatically on boot, there may be instances where a customer has disabled the functionality inappropriately or unintentionally.  When this occurs, DR operations and commands such as showdevices will not function correctly.  DSCP also depends on another component, sckmd, to perform header authentication of packets going back and forth between the domain and the XSCF.  It is possible to start DSCP without sckmd (and vice versa), so it is important to check that both are operating properly.


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.