Sun Microsystems, Inc.   Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Sun Fire[tm] 6800 Server: Repair Procedures

This section provides the most common repair procedures for the Sun Fire 6800, which are included in the Sun Fire 6800/4810/4800/3800 Systems Service Manual (805-7363).

Clustered Hardware Note: When repairing a clustered system, replace server components in this order: switch the data services over to the functioning server, halt the host to be serviced, power down that host, and then perform the hardware procedure to replace the component. After the procedure, switch the logical hosts back to their default masters.

The following table provides the page numbers for each procedure.

Component Tips
Air Intake Screen
  • Periodic maintenance requires that the air intake screen be inspected and/or cleaned once every 3 months. Keep spare air intake screens onsite so that replacements are available when a screen needs cleaning.
Centerplane (Power)
  • A total of 13 Sun Fire 6800 power distribution centerplanes with date code "08-03" may experience a thermal event (FCO A0220-1).
Compact PCI (cPCI) I/O Assembly
  • Sun Fire servers (3800/4800/4810/6800) may panic during Dynamic Reconfiguration (DR) operations on PCI and cPCI I/O boards (FIN I0840-1).
  • Sun Fire x800 systems are subject to a "panic" problem at boot time when a cPCI Dual FC Network Adapter (FRU or X-option) is installed for the first time (FIN I0708-1).
Compact PCI (cPCI) Card
  • Sun Fire x800 systems are subject to a "panic" problem at boot time when a cPCI Dual FC Network Adapter (FRU or X-option) is installed for the first time (FIN I0708-1).
CPU/Memory Board
  • Sun Fire 4800/E4900/6800/E6900 systems with US-IV+ boards have restrictions that limit the number of domains (FAB 1000283.1).
  • The Vcore setting for UltraSPARC IV 1350MHz boards should be 1.25V, but some earlier firmware releases power on and operate the board at an incorrect voltage (FIN I1166-1).
  • CPU/Memory Board FRUs for Sun Fire 12K/15K and Sun Fire 3800-6800 systems are not enabled for Capacity On Demand (COD) use (FIN I0912-1).
  • Guidelines for understanding and diagnosing UltraSPARC III Level 2 (L2) SRAM Cache Memory Errors (FIN I0887-1).
  • UltraSPARC III and III+ based platforms could be susceptible to UCC errors that may cause system panics (FIN I0856-1).
  • Issue with diagnosing "send mondo" panics (FIN I0765-1).
  • 900MHz CPUs operating at 750MHz (FIN I0759-1).
  • Under certain conditions, the flashupdate -u command can create incompatible firmware versions between boards, rendering the system unusable until the problem is corrected (FIN I0731-1).
  • Loose EMI spring fingers on chassis may damage CPU/Memory Boards (FIN I0720-1).
  • A limited set of Sun Fire System Boards may be vulnerable to Uncorrectable Errors in L2 SRAM (FCO A0248-1).
CPU/Memory Board EMI Spring Finger Clip
  • Loose EMI spring fingers on chassis may damage CPU/Memory Boards (FIN I0720-1)
Disk Drive
  • Improved firmware for Seagate 10K.6 disk drives will reduce the incidence of unexpected outages due to a spindle motor issue (FIN I1136-1).
Memory
  • Best Practices Guide for Memory Errors for diagnosing UltraSPARC III memory errors now available (FIN I1018-1).
  • Diagnosing Main Memory errors versus L2SRAM errors on UltraSPARC III and UltraSPARC III Cu systems (FIN I0954-1).
  • On certain Sun systems, a small number of 512MB and 1GB Micron DIMMs may fail prematurely with Uncorrectable Memory Error (UE) messages (FAB 103004, formerly FCO A0285-1).
  • A sub-population of DIMMs that shipped between 2001 and 2002 on the below platforms are showing significantly lower reliability than expected (FCO A0253-1).
  • Systems containing 256MB Samsung B-die DIMMs, having a module date code between 0115 and 0127 (built between weeks 15 and 27 of 2001), may experience Uncorrectable Memory Errors (UE). This can lead to System Panics (FCO A0223-1).
PCI I/O Assembly
  • The QFE network interface reports excessive input packet errors when running back-to-back stress tests (FIN I1138-1).
Redundant Transfer Switch (RTS)
  • Sun Fire 6800 servers may encounter power errors when the system RTS/RTU units are configured from a single power source (FIN I1133-1).
  • A weak fuse in the Serengeti 3800/48x0/6800 standby RTS blows when the load from the primary RTS is switched over (FCO A0206-3).
Redundant Transfer Unit (RTU)
  • Sun Fire 6800 servers may encounter power errors when the system RTS/RTU units are configured from a single power source (FIN I1133-1).
System Controller Board
  • Sun Fire 3800/4800/4810/6800, V1280 and Netra 1280 systems may experience false data parity error messages (FIN I1020-1).
  • Sun Fire System Controllers with 5.11.x firmware could experience a loss of network settings during a firmware upgrade (FIN I0891-1).
  • Sun Fire 3800/4800/4810/6800 domains can hang when System Controller firmware is downgraded from 5.13.0, 5.13.1, or 5.13.2 to 5.12.7 or lower (FIN I0890-1).
  • Systems with redundant SCs may experience failed domains after clock failure (FIN I0762-1)
  • Unexpected behavior from SCs due to downrev firmware (FIN I0756-1)
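
Two of the tips above state numeric ranges that are easy to misread: the Samsung B-die DIMM date codes 0115 through 0127 (FCO A0223-1) and the System Controller firmware downgrade from 5.13.0, 5.13.1, or 5.13.2 to 5.12.7 or lower (FIN I0890-1). The following Python sketch shows one way to check a value against those ranges. The function names are hypothetical, the date code is assumed to be a four-digit YYWW string with the range treated as inclusive, and firmware versions are assumed to be dotted integers. It is an illustration only; the FCO and FIN documents remain the authority on which parts and versions are affected.

```python
import re
from typing import Tuple


def samsung_date_code_in_range(date_code: str) -> bool:
    """Return True if a YYWW date code lies in 0115..0127 (assumed inclusive)."""
    if not re.fullmatch(r"\d{4}", date_code):
        raise ValueError("expected a four-digit YYWW date code")
    year, week = int(date_code[:2]), int(date_code[2:])
    # Year 01 (2001), weeks 15 through 27, as stated in the tip.
    return year == 1 and 15 <= week <= 27


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted version such as '5.13.1' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


# Versions named in the tip as the starting point of a risky downgrade.
AFFECTED_FROM = {(5, 13, 0), (5, 13, 1), (5, 13, 2)}
# "5.12.7 or lower" is the downgrade target range named in the tip.
DOWNGRADE_CEILING = (5, 12, 7)


def downgrade_may_hang(current: str, target: str) -> bool:
    """Return True if a downgrade matches the pattern described in FIN I0890-1."""
    return (parse_version(current) in AFFECTED_FROM
            and parse_version(target) <= DOWNGRADE_CEILING)
```

For example, samsung_date_code_in_range("0120") and downgrade_may_hang("5.13.1", "5.12.7") both match the stated ranges, while a date code of "0128" or a downgrade target of "5.12.8" falls outside them.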
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.