Document Audience:INTERNAL
Document ID:A0230-1
Title:B1600 systems configured with one or more Sun Fire B10n Content Load Balancing blades may experience a hardware issue resulting in a "watchdog timeout" event.
Copyright Notice:Copyright © 2007 Sun Microsystems, Inc. All Rights Reserved
Update Date:Wed Jun 23 00:00:00 MDT 2004

___________________________________________________________________

***  Sun Confidential:  Internal Use and Authorized VARs Only  ***
__________________________________________________________________

This message including any attachments is confidential information
of Sun Microsystems, Inc.  Disclosure, copying or distribution is
prohibited without permission of Sun.  If you are not the intended
recipient, please reply to the sender and then delete this message.
__________________________________________________________________


                             FIELD CHANGE ORDER
            (For Authorized Distribution by Enterprise Services)
            
FCO #: A0230-1
Status: inactive
Synopsis: B1600 systems configured with one or more Sun Fire B10n Content Load Balancing blades may experience a hardware issue resulting in a "watchdog timeout" event.
Date: Jun/23/2004
SunAlert: No
Top FIN/FCO Report: No
Products Reference: Puma B10n Load Balancing Blade
Product Category: Server / System Component
Product Affected: 
Systems Affected:

Mkt_ID   Platform   Model   Description        
------   --------   -----   -----------        
-        A44        ALL     Sun Fire B1600       



X-Options Affected:

Mkt_ID      Platform    Model    Description       
------      --------    -----    -----------	
x8080A      A44         All      Sun Fire B10n Load Balancing Blade
Parts Affected: 
Part Number                  Description
-----------                  -----------
540-5593-03 (or lower)       Load Balancing Blade - B10n


(SCSI Devices)
Type   Vendor    Model     SerialNumber(Min)   SerialNumber(Max)   Firmware
----   ------    -------   -----------------   -----------------   --------
N/A
References: 
LEAP: 2557
ECO:  WO_28736
Issue Description: 
Sun Fire B1600 systems configured with one or more Sun Fire B10n Content
Load Balancing blades may experience a hardware issue resulting in a "watchdog
timeout" event.  If the blade is configured in a high availability failover
configuration, the standby unit will take over.

Error messages can be observed at either the SC console or the B10n console.
At the SC console, "watchdog timeout" events may be observed if the Network
Processor Unit (NPU) fails:

   Login:LOM event:   Offset: +0h2m31s host watchdog timeout modified
   LOM event:   Offset: +0h3m43s host FAULT: watchdog triggered
   LOM event:   Offset: +0h3m43s host reset
   LOM event:   Offset: +0h3m43s Svc_Reqd LED state change: ON
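
When a captured SC console log is available, the LOM events above can be
picked out mechanically.  The following is a minimal Python sketch, not a Sun
tool; the function names are hypothetical and the line format is assumed to
match the sample shown above:

```python
import re

# Matches SC console lines such as:
#   LOM event:   Offset: +0h3m43s host FAULT: watchdog triggered
# search() is used so a prefix such as "Login:" on the same line is tolerated.
LOM_EVENT = re.compile(r"LOM event:\s+Offset:\s+\+(\d+)h(\d+)m(\d+)s\s+(.*)$")


def parse_lom_events(console_text):
    """Return a list of (offset_seconds, message) tuples from SC console text."""
    events = []
    for line in console_text.splitlines():
        match = LOM_EVENT.search(line)
        if match:
            hours, minutes, seconds = (int(g) for g in match.groups()[:3])
            offset = hours * 3600 + minutes * 60 + seconds
            events.append((offset, match.group(4).strip()))
    return events


def watchdog_faults(events):
    """Return the offsets (in seconds) at which a watchdog fault was logged."""
    return [offset for offset, message in events
            if "watchdog triggered" in message]


# Sample text copied from the SC console output in this FCO.
SAMPLE = """\
Login:LOM event:   Offset: +0h2m31s host watchdog timeout modified
LOM event:   Offset: +0h3m43s host FAULT: watchdog triggered
LOM event:   Offset: +0h3m43s host reset
LOM event:   Offset: +0h3m43s Svc_Reqd LED state change: ON
"""

if __name__ == "__main__":
    for offset in watchdog_faults(parse_lom_events(SAMPLE)):
        print(f"watchdog fault logged at +{offset}s")
```

Only the "watchdog triggered" line is treated as a fault here; the "watchdog
timeout modified" line is parsed but not flagged.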

The following shows an initialization error.  The BSC initialization fails,
and an error message is then sent to the console for each subsequent
initialization failure until the SC issues a system reset:

   */ Copyright ) 2003 Sun Microsystems, Inc.

   Copyright 1984-2001  Wind River Systems, Inc.

   Booting SunFire B10n Blade
   Bootrom Build Date: Oct 16 2003, 20:00:53

   Press any key to choose configuration file option...
   0
   Press any key to choose boot image...
   0
   auto-booting...

   Booting Image /RFA0/BOOTIMAGE/boot_image_1 ...3134320

   Initializing RDRAM              ...  Done
   Initializing SDRAM ECC          ...  Done
   Initializing BSC Interface      ...  ERROR[-1]:BSC Initialize failed

   muxDevLoad failed for device entry 0!
   muxDevLoad failed for device entry 1!
   Invalid device "tffs=0,00"
   Driver not initialized: Not starting applications

   LOM event:   Offset: +1h7m2s host reset
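
To find the failing step in a captured boot log, the "Initializing ..." lines
can be filtered for any status other than "Done".  This is a minimal Python
sketch under the assumption that the bootrom prints these lines in the format
shown above; the function name is hypothetical:

```python
import re

# Matches bootrom progress lines such as:
#   Initializing BSC Interface      ...  ERROR[-1]:BSC Initialize failed
INIT_LINE = re.compile(r"^\s*Initializing\s+(.+?)\s+\.\.\.\s+(.*)$")


def init_failures(console_text):
    """Return (step_name, status) for every init step that did not report Done."""
    failures = []
    for line in console_text.splitlines():
        match = INIT_LINE.match(line)
        if match and match.group(2).strip() != "Done":
            failures.append((match.group(1), match.group(2).strip()))
    return failures


# Sample text copied from the B10n boot output in this FCO.
SAMPLE = """\
   Initializing RDRAM              ...  Done
   Initializing SDRAM ECC          ...  Done
   Initializing BSC Interface      ...  ERROR[-1]:BSC Initialize failed

   muxDevLoad failed for device entry 0!
"""

if __name__ == "__main__":
    for step, status in init_failures(SAMPLE):
        print(f"{step}: {status}")
```

Follow-on messages such as "muxDevLoad failed" are deliberately ignored, since
they are consequences of the first failed step rather than separate steps.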

At the B10n console, the following messages may appear if the NPU is not
responding:

   RDRAM: Rambus configuration failed

   Initialization of Lookup Pool Failed

   Initialization of Lookup table Failed

   Driver not initialized: Not starting applications

If POST or diagnostics are run, messages such as the following may be seen.

A few of the diagnostic and POST error prints are listed below.  These are
specific to the devices and tell the user which module failed while reading
from or writing to a particular register (the name and address of the
register are displayed):

   diag_gmac.c:641:       DiagPrintf("%16s: *ERROR*", basePtr[i].regName);
   diag_misc.c:7488:      DiagPrintf("ERROR READING BOARD CONFIGURATION DATA.\n");
   diag_misc.c:8071:      printf("***ERROR*** Bad SPD rev level %02d for
                          device %02d\n",
   diag_omac.c:982:       DiagPrintf("%16s: *ERROR*", regTablePtr[i].regName);
   diag_pio.h:278:        ERROR --> RTC NVRAM overflow!!!
   diag_ppe.c:3754:       DiagPrintf("ERROR: ICC load failed.\n");
   diag_ppe.c:3765:       DiagPrintf("ERROR: Overall ICC image
                          wouldn't load cleanly, aborting scan.\n");
   diag_ppe.c:3784:       DiagPrintf("ERROR: ICC load of PHINT failed.\n");
   diag_rdram.c:176:      DiagPrintf("\n\nERROR reading
                          device 0x%02X register 0x%02X (%s), aborting.\n",
   diag_rdram.c:1802:     DiagPrintf("\n\nERROR writing device 0x%02X
                          register 0x%02X (%s), aborting.\n",
   diagnostics.c:981:     {"ERRORLOG",    diagErrorLog,   FALSE},
   diagnostics.c:1198:    DiagPrintf("COMMAND COMPLETED, ERROR.\n");
   diagnostics.c:1248:    DiagPrintf("COMMAND COMPLETED, ERROR.\n");
   diagnostics.c:1258:    DiagPrintf("COMMAND COMPLETED, ERROR.\n");
   diagnostics.c:1268:    DiagPrintf("COMMAND COMPLETED, ERROR.\n");
   diagnostics.c:1317:    DiagPrintf("    DIAG> ERRORLOG [T|V|N|F]
                          Display (T|V), chk cnt (N) or flush (F).\n\n");
   diagnostics.c:1547:    DiagPrintf("COMMAND COMPLETED, ERROR.\n");
   diagnostics.c:2449:    DiagPrintf("ERROR: POST Exiting with Unkown
                          ChipType %d, 6 is expected.\n", diagApiRec.hostChipType);
   errorlog.c:386:        "ERROR: %02x%02x%02d%02d%02d %02d%02d %d %02d
                          %08x%08x %08x%08x %08x%08x %08x%08x",
   errorlog.c:401:        "ERROR: #%03d.%03d, %02d-%s-%02d,%02d:%02d, Agent %d,
                          %s.\n       %s\n       Parameters: %08x%08x, %08x%08x\n
                          %08x%08x, %08x%08x\n",
   errorlog.c:415:        sprintf(buffer, "ERROR: Agent %d, %s %s\n",

To determine the dash level of the blade, run the service controller
console command "showfru sX" (where X is the slot number of the B10n):

   sc>showfru s6

   SEGMENT: SD
   /ManR/UNIX_Timestamp32: Thu Apr 17 23:24:13 UTC 2003
   /ManR/Fru_Description: SUNW,Sun Fire B10n, IQ4, RD512MB, VR4, SD512MB
   /ManR/Manufacture_Loc: Milpitas, CA, USA
   /ManR/Sun_Part_No: 5405593
   /ManR/Sun_Serial_No: 000005
   /ManR/Vendor_Name: Solectron
   /ManR/Initial_HW_Dash_Level: 01
   /ManR/Initial_HW_Rev_Level: 01
   /ManR/Fru_Shortname: SF B10n
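
If showfru output is collected from many chassis, the check against this FCO
can be scripted.  The sketch below is illustrative only: the function names
are hypothetical, and it assumes that the /ManR/Initial_HW_Dash_Level field
shown above reflects the dash level currently installed:

```python
import re

# Matches showfru fields such as:
#   /ManR/Sun_Part_No: 5405593
FIELD = re.compile(r"^\s*/ManR/(\w+):\s*(.*)$")

AFFECTED_PART = "5405593"   # 540-5593, as printed by showfru (no dashes)
FIXED_DASH_LEVEL = 4        # 540-5593-04 is the corrected part per this FCO


def parse_showfru(text):
    """Return a dict of /ManR/ field names to values from showfru output."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def is_affected(fields):
    """Return True/False for a 540-5593 FRU, or None if it cannot be decided."""
    part = fields.get("Sun_Part_No", "").replace("-", "")
    dash = fields.get("Initial_HW_Dash_Level", "")
    if part != AFFECTED_PART or not dash.isdigit():
        return None
    return int(dash) < FIXED_DASH_LEVEL


# Sample text copied from the showfru output in this FCO.
SAMPLE = """\
SEGMENT: SD
/ManR/UNIX_Timestamp32: Thu Apr 17 23:24:13 UTC 2003
/ManR/Fru_Description: SUNW,Sun Fire B10n, IQ4, RD512MB, VR4, SD512MB
/ManR/Sun_Part_No: 5405593
/ManR/Initial_HW_Dash_Level: 01
"""

if __name__ == "__main__":
    print(is_affected(parse_showfru(SAMPLE)))
```

A result of None means the output did not describe a 540-5593 FRU or the dash
level could not be read, and the blade should be checked by hand.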

An internal-only customer list, showing external and internal customers on
separate tabs, can be viewed at the following URL:

   http://sdpsweb.central/FIN_FCO/FCO/FCO_A0230-1_Dir/CustomerList.sxc

Root cause analysis has determined that a critical component, the Network
Processor Unit (NPU) (Sun part number 100-7643-01), has a potential long-term
reliability failure mode, based on high temperature storage tests performed
by the manufacturer at 150 degrees Celsius.

Corrective action was made available on March 26, 2004 via ECO# WO_28736 by
releasing the new part 540-5593-04.  Corrective action was made available in
Sun Services on March 31, 2004 via LEAP# 2557 by changing the Minimal
Acceptable Level (MAL) from 540-5593-03 to 540-5593-04.
Parts Affected: 
June 30, 2006
Implementation: 
 ---
|   |   MANDATORY (Fully Pro-Active)
 ---

 ---
| X |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan)
 ---

 ---
|   |   UPON FAILURE
 ---
Replacement Time Estimate: 
0.5 hours
Special Considerations: 
Please mark all Defective Material Tags with "FCO A0230-1 - Do Not Screen".
Corrective Action: 
At all external customers per the above Customer List, proactively do the 
following:

   - replace all 540-5593-03 (or below) with 540-5593-04 (or above)


On Sun internal systems, implement this FCO reactively only (Upon Failure).


IMPORTANT: Please follow the process steps below when replacing the
           540-5593.

Installing an upgraded B10n board at the customer site
======================================================


The upgraded B10n blade has the following:
------------------------------------------

1. The 1.0 BSC firmware
2. Two B10n boot images - version 1.2.3 and 1.1.2_diag.
   The default boot image is 1.2.3.
3. The B10n bootrom, version 1.2.3

To export the configuration from the old board:
-----------------------------------------------

1. Go to the /RFA0 directory

        puma{admin}# cd /

2. Tar the CONFIG directory:

        puma{admin}# tar lbconfig.tar CONFIG

3. Export the config tar file:

        puma{admin}# export file
        The FTP server address:
        The source directory path: type [cr] to use current directory:
        (null) source path, using current directory
        The source file name: lbconfig.tar
        The destination directory path:
        The destination file name: lbconfig.tar
        The user name:
        The user password:

        export file succeed!
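
Before the old board is removed, it may be worth confirming on the FTP server
that the exported file is readable and contains the CONFIG directory.  The
Python sketch below is run on the FTP server, not on the blade.  It assumes
that the B10n "tar" command writes a standard tar archive, which this FCO does
not state; the function names and the sample entry name are hypothetical:

```python
import io
import posixpath
import tarfile


def archive_has_dir(tar_bytes, top_dir="CONFIG"):
    """Return True if the tar archive holds top_dir or any entry beneath it."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for name in archive.getnames():
            # Normalize so "./CONFIG/x" and "CONFIG/x" are treated alike.
            normalized = posixpath.normpath(name)
            if normalized == top_dir or normalized.startswith(top_dir + "/"):
                return True
    return False


def make_sample_archive(names):
    """Build a small in-memory tar archive, for demonstration only."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in names:
            payload = b"placeholder"
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    # "CONFIG/example.cfg" is a made-up entry name used only for this demo.
    sample = make_sample_archive(["CONFIG/example.cfg"])
    print(archive_has_dir(sample))
```

In practice the bytes would come from the exported file itself, for example
open("lbconfig.tar", "rb").read() in the FTP destination directory.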

To import the configuration to the upgraded board:
--------------------------------------------------

1. Power off the old board and remove it. Plug in the upgraded board.

2. The board comes up with an empty configuration with the B10n 1.2
   application image running.

3. Configure the network interface. Optionally, configure the management
   VLAN (if applicable).

4. Go to the /RFA0 directory

        puma{admin}# cd /

5. Import the old (1.0/1.1.x) configuration.

        puma{admin}# import file
        The FTP server address:
        The source directory path:
        The source file name: lbconfig.tar
        The destination directory path:
        (null) path, using current directory...
        The destination file name: lbconfig.tar
        The user name:
        The user password:

        import file succeed!

6. Untar the configuration file.

        puma{admin}# untar lbconfig.tar

7. Reboot the B10n blade to get the imported configuration.

        puma{admin}# reboot


NOTE: To run traffic with the B10n 1.2 application image, the blade server
module must be updated to version 1.2.

To update the B100s blade server module to version 1.2:
-------------------------------------------------------

1. Download the 1.2 version of the blade server module software from the
   following site:

   http://wwws.sun.com/software/download/network.html

2. Unzip the file:

        # /usr/bin/unzip SunFire_B10n-1_2-Solaris-ServerModule.zip

3. Install the blade server module software packages:

        # cd /Solaris_8/Packages
        # pkgadd -d .

4. Restart the blade server module:

        # /etc/init.d/clbctl stop
        # /etc/init.d/clbctl start
Comments: 
None

------------------------------------------------------------------------------
Billing Type: 
 Warranty: Sun will provide parts at no charge under Warranty
           Service. On-Site Labor Rates are based on how the
           system was initially installed.

 Contract: Sun will provide parts at no charge. On-Site Labor Rates
           are based on the type of service contract.

 Non Contract: Sun will provide parts at no charge. Installation by
               Sun is available based on the On-Site Labor Rates
               defined in the Price List.

--------------------------------------------------------------------------
Implementation Footnote: 
________________________

i)   In the case of Mandatory FCOs, Sun Services will attempt to contact
     all known customers to recommend the part upgrade.

ii)  For controlled proactive swap FCOs, Sun Services mission critical
     support teams will initiate proactive swap efforts for their respective
     accounts, as required.

iii) For Replace upon Failure FCOs, Sun Services partners will implement
     the necessary corrective actions as and when they are required.

--------------------------------------------------------------------------

All released FINs and FCOs can be accessed using your favorite network
browser as follows:

SunSolve Internal Access:
_______________________

* Access the SunSolve Online URL at http://sunsolve.Central/

* From there, select the appropriate link to browse the FIN or FCO index.

Internet Access:
_______________

* Access the top level URL of  https://spe.sun.com

FIN/FCO Homepage Access:
_________________________

* Access the top level URL of http://sdpsweb.Central/FIN_FCO/index.html

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.

To submit either a FIN or an FCO, refer to the following URLs for templates
and instructions:

*  For FCO: http://pronto.central/fco.html
*  For FIN: http://pronto.central/fin.html

--------------------------------------------------------------------------
General:
________

Send questions or comments to [email protected]

---------------------------------------------------------------------------