Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-73-1000859.1
Update Date: 2011-03-17
Keywords:

Solution Type  FAB (standard) Sure

Solution  1000859.1 :   PCI and PCI+ IO Assemblies in Sun Fire 4800/4810/6800 or Sun Fire E4900/E6900 systems may fail to power up if one or more QGE cards are installed.  


Related Items
  • Sun Fire E6900 Server
  • Sun Fire 6800 Server
  • Sun Fire E4900 Server
  • Sun Fire 4800 Server
  • Sun Fire 4810 Server
Related Categories
  • GCS>Sun Microsystems>Sun FAB>Standard>Reactive

PreviouslyPublishedAs
201140


Product
Sun Fire 4800 Server
Sun Fire 4810 Server
Sun Fire 6800 Server
Sun Fire E6900 Server
Sun Fire E4900 Server

Bug Id
<SUNBUG: 6237685>

Part
  • Part No: 501-6522-07 (and lower)
  • Part Description: Quad GigaSwift Ethernet UTP
Xoption
  • Xoption Number: X4444A
  • Xoption Description: Sun Quad GigaSwift Ethernet

Impact

If one or more Quad GigaSwift Ethernet UTP (QGE) adapters are installed in a PCI or PCI+ IO Assy, the IO Assy may fail to power up.  This prevents domains from using any resources in that IO Assy.


Contributing Factors

This affects Sun Fire 4800, 4810, 6800, E4900, and E6900 systems where a QGE card, Option X4444A, is installed in a PCI or PCI+ IO Assy.  The issue occurs only if additional HBAs/NICs are installed in the PCI or PCI+ IO Assy along with the QGE card(s).  It has been seen in configurations with 7 or 8 total cards in an IO Assy.

This issue does not affect production domains that are already booted.  Once the system has booted successfully, the issue is not seen unless the system is later shut down and then powered on again.

Since the PCI or PCI+ IO Assy will fail before the affected domain is booted, it is not possible to identify the affected QGE card using prtdiag(1M).  However, the following excerpt from typical prtdiag output shows a QGE card installed in slot 4.

  ========================= IO Cards =========================
                                Bus  Max
            IO    Port Bus      Freq Bus  Dev,
FRU Name    Type  ID  Side Slot MHz  Freq Func State    Name                             Model
----------  ---- ---- ---- ---- ---- ---- ---- ----- --------------------------------  --------------
/N0/IB7/P1  PCI   27   B    4    33   33  1,0  ok    pci-pci8086,b154.0/pci (pci)      pci-bridge
/N0/IB7/P1  PCI   27   B    4    33   33  0,0  ok    pci-pci8086,b154.0/network        (netw+ pci-bridge
/N0/IB7/P1  PCI   27   B    4    33   33  0,0  ok    network-pci100b,35.30             SUNW,pci-qge
/N0/IB7/P1  PCI   27   B    4    33   33  1,0  ok    network-pci100b,35.30             SUNW,pci-qge
/N0/IB7/P1  PCI   27   B    4    33   33  4,0  ok    pci-pci8086,b154.0/network        (netw+ pci-bridge
/N0/IB7/P1  PCI   27   B    4    33   33  2,0  ok    network-pci100b,35.30             SUNW,pci-qge
/N0/IB7/P1  PCI   27   B    4    33   33  3,0  ok    network-pci100b,35.30             SUNW,pci-qge
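For sites that keep prtdiag output captured while a domain was still up, the QGE slots can be picked out of the IO Cards section mechanically. The following is a minimal sketch only, not a Sun-supplied tool. It assumes the column layout shown in the excerpt above (FRU name first, slot fifth, model last) and the model string SUNW,pci-qge; the function name find_qge_slots is hypothetical.

```python
QGE_MODEL = "SUNW,pci-qge"  # model string as it appears in the excerpt above


def find_qge_slots(prtdiag_text):
    """Return a sorted list of (fru_name, slot) pairs that host a QGE card.

    Assumes whitespace-separated rows in the layout of the excerpt:
    FRU name, IO type, port ID, bus side, slot, ..., model (last field).
    """
    found = set()
    for line in prtdiag_text.splitlines():
        fields = line.split()
        # Data rows start with an FRU path such as /N0/IB7/P1;
        # header and separator rows are skipped.
        if len(fields) < 10 or not fields[0].startswith("/N"):
            continue
        if fields[-1] == QGE_MODEL:
            # A QGE card reports four network functions, so the same
            # (FRU, slot) pair repeats; the set collapses the repeats.
            found.add((fields[0], fields[4]))
    return sorted(found)
```

Fed the excerpt above, the function reports a single entry for /N0/IB7/P1, slot 4. Because the affected IO Assy fails before the domain boots, this only helps with output saved from an earlier successful boot.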


Symptoms

When this occurs during power-on of a PCI or PCI+ IO Assy, a message similar to the following will appear.

   nspga:A> poweron all
   /N0/SB0: powered on
   /N0/SB2: powered on
   Mar 08 10:54:00 nspga Domain-A.SC: sun.serengeti.HpuFailedException: PCI I/O Board at /N0/IB6
   /N0/IB6: powered on
   /N0/IB8: powered on
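When reviewing saved System Controller console logs, the failing board can be extracted from the exception line. The following is a minimal sketch, assuming the message format matches the example above (the exception name, a description, the word "at", then the FRU path). The function name failed_boards is hypothetical, and other log formats are not handled.

```python
import re

# Matches lines such as:
#   ... sun.serengeti.HpuFailedException: PCI I/O Board at /N0/IB6
HPU_FAILURE = re.compile(
    r"sun\.serengeti\.HpuFailedException:\s*(?P<desc>.*?)\s+at\s+(?P<fru>/N\d+/\S+)"
)


def failed_boards(console_text):
    """Return (fru_path, description) for each HpuFailedException line."""
    results = []
    for line in console_text.splitlines():
        match = HPU_FAILURE.search(line)
        if match:
            results.append((match.group("fru"), match.group("desc")))
    return results
```

Note that in the example above the SC still prints "/N0/IB6: powered on" after the exception, so the exception line, not the status line, is the one to search for.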

Root Cause

It has been determined that the initial power surge during power-on of the QGE card can cause the DC-DC converter on the PCI or PCI+ IO Assy to shut down.  The QGE card draws high power during power-up initialization; combined with the other cards in the IO Assy, the cumulative surge may be enough to trip the converter.  However, if the DC-DC converter "rides through" the initial power-up, the issue is not encountered under nominal operation.

The issue has been addressed with an update to the QGE card (ECO WO_31157) which limits the initial current spike at poweron.  The part number has been dash rolled from -07 to -08.  GSAP 3122 was implemented to purge P/N 501-6522-07 and lower from Service's inventory.


Workaround

The workaround is to remove cards that are not being used (if such cards can be identified).  This decreases the cumulative power surge and avoids overloading the DC-DC converter on the PCI or PCI+ IO Assy.  To ensure the power draw issue does not affect IO Assy operation, place the QGE card in an IO Assy with fewer populated slots.


Resolution

If customers encounter this issue, replace any offending QGE Card with the fixed version.

  • Replace 501-6522-07 (and lower) with 501-6522-08 (or higher)

This is an Upon Failure remediation and should only be performed if this specific issue has been identified at a customer site.
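
The dash-level rule above (-07 and lower affected, -08 and higher fixed) can be applied to an inventory list with a small helper. This is an illustrative sketch only: the function name qge_needs_replacement is hypothetical, and it assumes part numbers are written in the 501-6522-NN form used in this document.

```python
import re

# Base part number and first fixed dash level, both taken from this document.
QGE_PART = re.compile(r"^501-6522-(\d{2})$")
FIRST_FIXED_DASH = 8


def qge_needs_replacement(part_number):
    """Return True for an affected QGE card, False for a fixed one,
    and None if the part number is not a 501-6522 QGE card."""
    match = QGE_PART.match(part_number.strip())
    if match is None:
        return None
    return int(match.group(1)) < FIRST_FIXED_DASH
```

A True result only identifies a card that is eligible for replacement; per the Upon Failure policy above, replacement is still limited to sites where this specific issue has been identified.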


Comments

This issue was fully evaluated as an FCO candidate via the official FCO process. However, the FCO was rejected due to the very low expected failure rate, and the fact that it has not been reported at any customer site.

If a customer adds NICs/HBAs to an IO Assy, this could change the initial power-up surge characteristics and may affect the load conditions for the DC-DC converter. This could possibly lead to the condition described in this FIN.


References
  • ECO: WO_31157
  • GSAP: 3122

Modification History
Date: 04-MAY-2006
. Updated technical fix in Issue Description.
. Added final resolution (HW replacement) to Corrective Action.
Previously Published As
101753
Contacts
Internal Contributor/submitter: Kevin Siebenthal
Internal Eng Business Unit Group: KE Authors
Internal Eng Responsible Engineer: Ron Emerick
Internal Services Knowledge Engineer: Pete Stauffer
Internal Kasp FAB Legacy ID 101753
Product_uuid
29d3a694-0a18-11d6-92da-df959df44cdd|Sun Fire 4800 Server
29d6f808-0a18-11d6-8aa8-943929fbbdd8|Sun Fire 4810 Server
29da7938-0a18-11d6-8a41-9ed1ad6d6779|Sun Fire 6800 Server
4fe39727-0599-11d8-84cb-080020a9ed93|Sun Fire E6900 Server
bed24aa9-0598-11d8-84cb-080020a9ed93|Sun Fire E4900 Server

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.