Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-73-1000095.1
Update Date: 2010-09-01
Keywords:

Solution Type  FAB (standard) Sure

Solution  1000095.1 :   X4100 and X4200 May Encounter Unscheduled System Reboots Due to Double-Bit Uncorrectable Memory Errors  


Related Items
  • Sun Fire X4100 Server
  • Sun Fire X4200 Server
Related Categories
  • GCS>Sun Microsystems>Sun FAB>Standard>Reactive

Previously Published As
200113


Product
Sun Fire X4100 Server
Sun Fire X4200 Server

Bug Id
<SUNBUG: 6364001>

Part
  • Part No: 540-6497
  • Part Description: 2GB ECC Registered DIMM Module

Impact

A small proportion of X4100 and X4200 systems have been experiencing unscheduled reboots.


Contributing Factors

A reboot can occur at any time when there is heavy traffic between the CPU and the DIMMs.


Symptoms

The BIOS Event Log (DMI) will show "Sync flood error" just prior to the reboot. The System Event Log (SEL) of the ILOM, if interrogated with ipmitool (available on the Resource CD), will show messages similar to these:

e00 | 03/21/2006 | 04:58:39 | OEM #0xfb |
f00 | 03/21/2006 | 04:58:49 | Memory | Memory Device Disabled | CPU 0 DIMM 0
1000 | 03/21/2006 | 04:58:55 | System Firmware Progress | Motherboard initialization
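
To illustrate how the symptom can be spotted in a saved SEL dump, here is a minimal Python sketch. It assumes the log was captured as text with one record per line and pipe-separated fields, as ipmitool typically prints for "sel list". The names SelEntry, parse_sel_line and disabled_dimms are hypothetical helpers introduced only for this example and are not part of ipmitool or any Sun tool.

```python
from typing import List, NamedTuple, Optional


class SelEntry(NamedTuple):
    """One SEL record, split into its pipe-separated fields (assumed layout)."""
    record_id: str
    date: str
    time: str
    sensor: str
    event: str
    detail: str


def parse_sel_line(line: str) -> Optional[SelEntry]:
    """Split one SEL line on '|'; return None for blank or malformed lines."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 4 or not fields[0]:
        return None
    # Pad short records (for example OEM entries) out to six fields.
    fields += [""] * (6 - len(fields))
    return SelEntry(*fields[:6])


def disabled_dimms(sel_text: str) -> List[SelEntry]:
    """Return the Memory records that report 'Memory Device Disabled'."""
    hits = []
    for line in sel_text.splitlines():
        entry = parse_sel_line(line)
        if entry and entry.sensor == "Memory" and "Memory Device Disabled" in entry.event:
            hits.append(entry)
    return hits


# Sample text modelled on the records quoted in this bulletin.
SAMPLE_SEL = """\
e00 | 03/21/2006 | 04:58:39 | OEM #0xfb |
f00 | 03/21/2006 | 04:58:49 | Memory | Memory Device Disabled | CPU 0 DIMM 0
1000 | 03/21/2006 | 04:58:55 | System Firmware Progress | Motherboard initialization
"""

if __name__ == "__main__":
    for entry in disabled_dimms(SAMPLE_SEL):
        print(entry.date, entry.time, entry.detail)
```

A disabled-DIMM record on its own does not prove this issue; it is only a candidate to check against the "Sync flood error" entry in the BIOS Event Log.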

 


Root Cause

DDR1 memory on these platforms may mishandle transitions into or out of PowerDown mode, triggering uncorrectable ECC errors that cause system reboots. BIOS 034 and earlier enable PowerDown mode (self-refresh/low-power mode) on the DIMMs with the wrong topology setting for these systems.


Workaround

 


Resolution

Upgrade to BIOS 036 or later. Statistically, BIOS 036 reduces the probability of an unscheduled reboot with certain registered DIMMs and increases stability. BIOS 036 disables PowerDown mode per AMD's recommendation. BIOS 036 can be obtained from the following website:

http://www.sun.com/servers/entry/x4100/downloads.jsp

Note: Some corner-case registered DIMMs with poor noise immunity, coupled with corner-case noisy motherboards, may not be fixed by BIOS 036. If the issue continues, the CFE or field representative should escalate the case, and the assigned engineer should refer to the TSC VSP - X4100/X4200 website (listed below) for further remediation actions.

http://systems-tsc.uk/twiki/bin/view/Products/ProdIssuesSunFireX4100
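
The BIOS revisions named in this bulletin can be expressed as a simple decision rule. The Python sketch below assumes the three-digit BIOS revision (for example "036") has already been read from the system by some other means; how that number appears in the full BIOS version string is not covered here. The function name bios_action is hypothetical. Revision 035 is not mentioned in this bulletin, so the sketch reports it as unknown rather than guessing.

```python
def bios_action(bios_rev: int) -> str:
    """Classify a BIOS revision against the levels named in this bulletin."""
    if bios_rev <= 34:
        # BIOS 034 and earlier enable PowerDown mode on the DIMMs.
        return "upgrade"
    if bios_rev >= 36:
        # BIOS 036 and later disable PowerDown mode.
        return "ok"
    # Revisions between 034 and 036 are not described in the bulletin.
    return "unknown"


if __name__ == "__main__":
    for rev in ("034", "035", "036"):
        print(rev, bios_action(int(rev)))
```

A result of "ok" only means that PowerDown mode is disabled; per the Note above, some DIMM and motherboard combinations may still need escalation.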


Previously Published As
102619
Internal Comments


When a field engineer replaces a motherboard or other FRU, it is recommended that BIOS 036 or later be loaded.



Upgrading to BIOS 036 or later should be the first step in resolving memory-related issues.



Customers should be advised to upgrade their LSI firmware/MPT BIOS firmware if moving to BIOS 036.



Sun-supplied vendor DIMMs meet all of the JEDEC specifications and are not faulty in their own right.



Refer to Product Notes 1.2.1 (819-1162-21) and Release Note Supplement 1.2.1 (819-4344-10) for further information.



Hardware Remediation Details

 


Related Information
  • Manual: Product Notes 1.2.1 - PN 819-1162-21
  • URL: http://systems-tsc.uk/twiki/bin/view/Products/ProdIssuesSunFireX4100
    http://www.sun.com/servers/entry/x4100/downloads.jsp

Internal Contributor/submitter
[email protected]

Internal Eng Business Unit Group
NSG (Network Systems Group)

Internal Eng Responsible Engineer
[email protected]

Internal Services Knowledge Engineer
[email protected]

Internal Escalation ID
1-14109922, 1-13950402, 1-15145059, 1-15344836, 1-15612351, 1-15844911, 1-16558422, 1-17641524

Internal Kasp FAB Legacy ID
102619

Internal Sun Alert & FAB Admin Info
Critical Category:
Significant Change Date:
Avoidance: Upgrade
Responsible Manager: [email protected]
Original Admin Info: null

Product_uuid
54e2ac49-df71-11d9-89e6-080020a9ed93|Sun Fire X4100 Server
c6e795ef-df6f-11d9-89e6-080020a9ed93|Sun Fire X4200 Server

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.