Sun Microsystems, Inc.   Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Sun Fire[tm] 12K / 15K Server: Repair Procedures

This section provides tips (when available) for the most common repair procedures for the Sun Fire 12K / 15K. The repair procedures are located in the Sun Fire 15K System Service Manual (806-3512).

Clustered Hardware Note: When repairing a clustered system, first switch the data services over to the functioning server, halt the host to be serviced, and power it down before performing the hardware procedure to replace the component. After the procedure, switch the logical hosts back to their default masters.

FRU Replacement Procedure Note: Sun Fire 15K/12K failure logs must be returned with defective FRUs to ensure correct problem diagnosis and FRU repair. When returning F15K and F12K FRUs which have functionally failed, the appropriate files must be captured and sent in. See FIN I0847-1 for details.

The following table provides the available tips for each procedure.

Component Tips
CPU Board (Slot 0)
  • Reworked UltraSPARC IV 1.2GHz system boards may be downclocked to 1050MHz due to a misprogrammed SEEPROM (FIN I1178-1).
  • The Vcore setting for UltraSPARC IV 1350MHz boards should be 1.25V, but some earlier firmware releases power on and operate the board at an incorrect voltage (FIN I1166-1).
  • A special procedure must be followed when replacing this FRU (details).
  • When boards are inserted into Slot 0 or Slot 1 of a Sun Fire 12K/15K chassis, the board LEDs may fail to illuminate (FIN I0980-1).
  • Replacement of 900MHz System Boards by 1200MHz boards in Sun Fire 12K/15K platforms may fail if the proper installation procedure is not followed (FIN I0958-1).
  • CPU/Memory Board FRUs for Sun Fire 12K/15K and Sun Fire 3800-6800 systems are not enabled for Capacity On Demand (COD) use (FIN I0912-1).
  • Guidelines for understanding and diagnosing UltraSPARC III Level 2 (L2) SRAM Cache Memory Errors (FIN I0887-1).
  • UltraSPARC III and III+ based platforms could be susceptible to UCC errors that may cause system panics (FIN I0856-1).
  • Sun Fire CPU/Memory Boards may become unusable when components are knocked off by improper handling (FIN I0855-1).
  • LEDs may not light upon cold start (FIN I0826-1).
  • Issue with diagnosing "send mondo" panics (FIN I0765-1).
  • 900MHz CPUs operating at 750MHz (FIN I0759-1).
CPU Board DIMM (Slot 0)
  • A special procedure must be followed when replacing this FRU (details).
  • The Best Practices Guide for Memory Errors, for diagnosing UltraSPARC III memory errors, is now available (FIN I1018-1).
  • Diagnosing Main Memory errors versus L2SRAM errors on UltraSPARC III and UltraSPARC III Cu systems (FIN I0954-1).
  • Systems containing 256MB Samsung B-die DIMMs with a module date code between 0115 and 0127 (built between weeks 15 and 27 of 2001) may experience Uncorrectable Memory Errors (UE), which can lead to system panics (FCO A0223-1).
Centerplane Support Board
  • A special procedure must be followed when replacing this FRU (details).
  • The "thermcal" utility must be executed following replacement of this part (FIN I1071-1).
  • A limited number of Sun Fire 15K/20K/25K Expander boards and CSBs are experiencing higher-than-expected rates of loss of redundant DC power supplies (FCO A0267-1).
Disk Drive
  • On the System Control Peripheral board, the upper disk (target ID 3) connects to the J2 SCSI backplane connector, and the lower disk (target ID 2) connects to the J3 SCSI backplane connector.
Fan Tray
  • A special procedure must be followed when replacing this FRU (details).
Hot-Swap PCI (hsPCI) Assembly (Slot 1)
  • A special procedure must be followed when replacing this FRU (details).
  • When boards are inserted into Slot 0 or Slot 1 of a Sun Fire 12K/15K chassis, the board LEDs may fail to illuminate (FIN I0980-1).
  • Some pcisch driver panics on F15K systems are unrelated to failed hardware (FIN I0852-1).
  • LEDs may not light upon cold start (FIN I0826-1).
  • Sun Fire 15K domains may panic due to problem with Schizo 2.2 ASICs on hsPCI I/O Boards (FIN I0820-1).
  • A Crystal-2A card (or other PCI adapter card) in Slot 1 of the hsPCI I/O board (5.0V cassette) may not be recognized upon boot or may unexpectedly fail to initialize (FCO A0246-1).
  • The 3.3V hsPCI Cassette inaccurately determines the speed of the PCI adapter and sets it to 33 MHz (FCO A0218-3).
  • hsPCI boards having Schizo 2.2 ASICs can suffer a panic due to a timing race condition (FCO A0193-1).
MaxCPU Board (Slot 1)
  • When boards are inserted into Slot 0 or Slot 1 of a Sun Fire 12K/15K chassis, the board LEDs may fail to illuminate (FIN I0980-1).
  • Issue with diagnosing "send mondo" panics (FIN I0765-1).
  • A limited set of Sun Fire System Boards may be vulnerable to Uncorrectable Errors in L2 SRAM (FCO A0248-1).
Memory
  • On certain Sun systems, a small number of 512MB and 1GB Micron DIMMs may experience premature failures with Uncorrectable Memory Error (UE) messages (FAB 1000920.1, formerly FCO A0285-1).
  • A sub-population of DIMMs that shipped between 2001 and 2002 on the platforms listed in the FCO is showing significantly lower reliability than expected (FCO A0253-1).
PCI Cassette (Slot 1)
  • A special procedure must be followed when replacing this FRU (details).
  • Some pcisch driver panics on F15K systems are unrelated to failed hardware (FIN I0852-1).
  • EMI cap screws in Sun Fire 12K/15K PCI Cassettes may tighten over time, making it difficult to remove PCI I/O cards (FCO A0249-1).
  • The 3.3V hsPCI Cassette inaccurately determines the speed of the PCI adapter and sets it to 33 MHz (FCO A0218-3).
PCI Cassette Card
  • A special procedure must be followed when replacing this FRU (details).
  • PCI adapters in F12K/15K domains may intermittently fail following a reset, reboot, or "setkeyswitch on" operation (FIN I0877-1).
  • UltraSPARC III systems may panic due to a conflict between the Schizo 2.4 ASIC and the GigaSwift Ethernet PCI Card (FIN I0874-1).
  • Some pcisch driver panics on F15K systems are unrelated to failed hardware (FIN I0852-1).
  • Cauldron cards (501-5727-04 and below) with the "BC" bridge chip may cause a system panic during boot on Sun Fire 12K/15K systems. The symptom will appear as a PCI SERR which causes the OS to panic (FCO A0219-1).
  • The 3.3V hsPCI Cassette inaccurately determines the speed of the PCI adapter and sets it to 33 MHz (FCO A0218-3).
Power Centerplane
  • A special procedure must be followed when replacing this FRU (details).
  • When a Power Centerplane Board is replaced in the field there is the potential for the 48VDC cable to be routed so that it shorts against a ground lug (FIN I1061-1).
Power Module
  • A special procedure must be followed when replacing this FRU (details).
Power Supply
  • A special procedure must be followed when replacing this FRU (details).
System Control (SC) Board
  • Running the `prtdiag` command on the System Controller of Sun Fire 12K/15K/E20K/E25K can cause a segmentation fault (FIN I1135-1).
  • System Controllers (SCs) built prior to September 2002 may not be able to save a crash dump due to a misconfiguration issue (FIN I0878-1).
  • Sun Fire 15K systems running SMS 1.1 or SMS 1.2 do not have the osdTimeDeltas file in the file propagation list. If a failover occurs, this could result in a rollback of a domain's Time Of Day. Such a shift in time can have unforeseen effects on applications (FIN I0785-1).
  • The Sun Fire 15K System Controller should be configured with at least 2 Gbytes of swap space (FIN I0748-1).
System Control (SC) CPU Board
  • A special procedure must be followed when replacing this FRU (details).
  • The MAC address of the SC can change when a new CP1500 board is installed, which will prevent the system from booting (FIN I0771-1).
  • When a CP1500 CPU board is inserted, the firmware must be updated (FIN I0761-1).
  • SMS may shut off the system controller board, causing a failover to the redundant system controller board and a loss of redundancy (FCO A0217-1).
System Control (SC) Peripheral Board
  • A special procedure must be followed when replacing this FRU (details).
System Expander Board
  • A special procedure must be followed when replacing this FRU (details).
  • Slot 1 Dynamic Reconfiguration (DR) on F12K/F15K platforms is not supported and must not be performed on domains containing AXQ 6.0 or Schizo 2.2 ASICs (FIN I0933-1).
  • Due to a bug in the AXQ ASIC which resides on the System Expander Board of the F15K, a Solaris panic can occur if too many outstanding noncacheable Programmed I/O transactions are awaiting service (FIN I0763-1).
  • Sun Fire 15K domains containing Expanders with hardware dash revision 16 or less may dstop due to bug 4505200 (FIN I0822-1).
  • LEDs may not light upon cold start (FIN I0826-1).
  • System Expander Boards having AXQ 6.0 ASICs may experience a deadlock condition (FCO A0192-1).
Top Cap Frame Manager Assembly
  • A special procedure must be followed when replacing this FRU (details).
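
The DIMM date-code window cited under CPU Board DIMM (FCO A0223-1) lends itself to a quick check when sorting through module labels. The following is only a minimal sketch: the function name `is_affected_date_code` is hypothetical, the four-digit YYWW reading is inferred from the tip's own parenthetical (weeks 15 through 27 of 2001), and treating both ends of the range as inclusive is an assumption. The FCO remains the authority on which modules are affected.

```python
def is_affected_date_code(code: str) -> bool:
    """Return True if a YYWW module date code falls in the 0115-0127 window.

    Hypothetical helper, not part of any Sun tool.
    """
    # Assumption: a date code is exactly four ASCII digits, YY followed by WW.
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        return False
    year, week = int(code[:2]), int(code[2:])
    # Assumption: weeks 15 and 27 of 2001 are both included in the range.
    return year == 1 and 15 <= week <= 27


if __name__ == "__main__":
    for sample in ("0114", "0115", "0127", "0128"):
        print(sample, is_affected_date_code(sample))
```

The check covers only the date code; the tip also restricts the affected population to 256MB Samsung B-die DIMMs, which must be confirmed separately from the module label.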

  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.