Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-72-1006233.1
Update Date:2010-12-27
Keywords:

Solution Type  Problem Resolution Sure

Solution  1006233.1 :   Sun Fire[TM] Servers: Interpreting System Management Services (SMS) Failed Power Supply Messages


Related Items
  • Sun Fire E25K Server
  • Sun Fire E20K Server
  • Sun Fire 12K Server
  • Sun Fire 15K Server
Related Categories
  • GCS>Sun Microsystems>Servers>High-End Servers

PreviouslyPublishedAs
208741


Applies to:

Sun Fire E20K Server
Sun Fire E25K Server
Sun Fire 12K Server
Sun Fire 15K Server
All Platforms
***Checked for relevance on 27-Dec-2010***

Symptoms

A failed power supply must be replaced to ensure proper platform operation.
Accurate interpretation of the error message is critical.

Cause

Each member of the high-end platform family (12K/15K/E20K/E25K) contains six AC-to-DC power supplies. When a power supply failure is detected, contact your authorized Sun[TM] Service representative to have the failed supplies replaced.

Solution

The SMS "showenvironment -p powers" command displays the status of the six power supplies. Partial output from this command is shown in the example below:

Note: Command output may vary slightly depending on the type of power supply installed in the platform.

POWER  UNIT  AC0  AC1  DC0  DC1  FAN0  FAN1
-----  ----  ---  ---  ---  ---  ----  ----
PS0    OK    OK   OK   ON   ON   OK    FAIL
PS1    OK    OK   OK   ON   ON   OK    OK
PS2    OK    OK   OK   ON   ON   OK    OK
PS3    OK    OK   OK   ON   ON   OK    OK
PS4    OK    OK   OK   ON   ON   OK    OK
PS5    FAIL  OK   OK   ON   ON   OK    OK
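Status output of this kind, with one power supply per row, can be scanned mechanically for fields that are not healthy. The following Python sketch is illustrative only and is not part of SMS: the helper name find_faults is hypothetical, the sample text simply repeats the example output, and treating OK and ON as the only healthy values is an assumption based on that example.

```python
# Illustrative sketch, not an SMS tool. SAMPLE repeats the example output;
# on a real system controller the text would come from the command itself.
SAMPLE = """\
POWER  UNIT  AC0  AC1  DC0  DC1  FAN0  FAN1
-----  ----  ---  ---  ---  ---  ----  ----
PS0    OK    OK   OK   ON   ON   OK    FAIL
PS1    OK    OK   OK   ON   ON   OK    OK
PS2    OK    OK   OK   ON   ON   OK    OK
PS3    OK    OK   OK   ON   ON   OK    OK
PS4    OK    OK   OK   ON   ON   OK    OK
PS5    FAIL  OK   OK   ON   ON   OK    OK
"""

# Assumption: these are the only values that indicate a healthy field.
HEALTHY = {"OK", "ON"}


def find_faults(text):
    """Return (supply, column, value) for every field that is not healthy."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header = rows[0]
    faults = []
    for fields in rows[1:]:
        # Skip the dashed separator row and any row of unexpected width.
        if set(fields[0]) == {"-"} or len(fields) != len(header):
            continue
        supply = fields[0]
        for column, value in zip(header[1:], fields[1:]):
            if value not in HEALTHY:
                faults.append((supply, column, value))
    return faults


if __name__ == "__main__":
    for supply, column, value in find_faults(SAMPLE):
        print(f"{supply}: {column} is {value}")
```

Run against the sample, the sketch reports the FAN1 field of PS0 and the UNIT field of PS5.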

This example shows PS0 has a failed fan and PS5 has a failed unit status. Messages are logged in the SMS platform messages file (/var/opt/SUNWSMS/adm/platform/messages) at the time of failure. Example messages for the above failures are shown below: 

Example message 1:

May 1 23:17:12 2004 sc0 esmd[1363]: [1926 17158283261375019 ERR Equipment.cc 604] A power supply failure has been noted on PS at PS5.  For N+1 redundancy, the system configuration requires 19034.00 watts. The power supplies are providing 20000.00 watts.
May 1 23:17:14 2004 sc0 esmd[1363]: [1929 17158284966232407 NOTICE Patrols.cc 1876] PS at PS5 breaker has been tripped: ecode = 0

In the above message, SMS has detected a unit fail status on PS5 and has tripped its breakers. The remaining power supplies are providing 20,000 watts for a system that requires 19,034 watts for N+1 redundancy. The system is still within N+1 power redundancy.

Example message 2:

May 24 06:32:36 2004 sc0 esmd[1363]: [1925 19085207450945051 ERR Equipment.cc 531] An internal fan failure has been noted in PS at PS0, which is being shutdown. For N+1 redundancy, the system configuration requires 18311.00 watts. The power supplies are providing 16000.00 watts.
May 24 06:32:37 2004 sc0 esmd[1363]: [1929 19085208483622553 NOTICE Patrols.cc 1916] PS at PS0 breaker has been tripped: ecode = 0

In the above message, SMS has detected a failed fan internal to PS0.

For such a failure, SMS must shut down the power supply to prevent overheating, and it trips the failed supply's breakers. In this failure, the system requires 18,311 watts for N+1 redundancy, but the remaining supplies are providing only 16,000 watts. The system has fallen below the power required for N+1 redundancy. An additional power supply failure would put the system into a brownout condition, and the results are undefined. A domain crash is likely.
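The comparison made in both examples (watts provided versus watts required for N+1 redundancy) can be sketched in Python as follows. This is an illustration, not an SMS utility: the function name n_plus_1_margin and the regular expression are hypothetical and assume the message wording matches the two examples above.

```python
import re

# Assumption: the wording matches the esmd examples in this document.
WATTS_RE = re.compile(
    r"requires\s+(?P<required>\d+(?:\.\d+)?)\s+watts\."
    r"\s+The power supplies are providing\s+(?P<provided>\d+(?:\.\d+)?)\s+watts"
)

# Message text copied from example messages 1 and 2.
MSG1 = (
    "May 1 23:17:12 2004 sc0 esmd[1363]: [1926 17158283261375019 ERR "
    "Equipment.cc 604] A power supply failure has been noted on PS at PS5.  "
    "For N+1 redundancy, the system configuration requires 19034.00 watts. "
    "The power supplies are providing 20000.00 watts."
)
MSG2 = (
    "May 24 06:32:36 2004 sc0 esmd[1363]: [1925 19085207450945051 ERR "
    "Equipment.cc 531] An internal fan failure has been noted in PS at PS0, "
    "which is being shutdown. For N+1 redundancy, the system configuration "
    "requires 18311.00 watts. The power supplies are providing 16000.00 watts."
)


def n_plus_1_margin(message):
    """Return provided minus required watts, or None if no figures are found.

    A negative result means the system is below N+1 power redundancy.
    """
    match = WATTS_RE.search(message)
    if match is None:
        return None
    return float(match.group("provided")) - float(match.group("required"))


if __name__ == "__main__":
    for name, message in (("message 1", MSG1), ("message 2", MSG2)):
        margin = n_plus_1_margin(message)
        state = "within" if margin >= 0 else "below"
        print(f"{name}: margin {margin:+.2f} W, {state} N+1 redundancy")
```

For example message 1 the margin is 20000.00 - 19034.00 = +966.00 watts, so N+1 redundancy is maintained. For example message 2 it is 16000.00 - 18311.00 = -2311.00 watts, which is the shortfall described above.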

Resolution

 

Contact your authorized Sun[TM] Service representative to have any failed power supplies replaced as soon as the failure is detected.


@ N+1, redundancy
 Previously Published As 77254

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.