Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1009309.1
Update Date:2011-05-26

Solution Type  Troubleshooting Sure

Solution  1009309.1 :   Proactive setup/troubleshooting of a Sun Fire[TM] 280R  


Related Items
  • Sun Fire 280R Server
Related Categories
  • GCS>Sun Microsystems>Servers>Entry-Level Servers

Previously Published As
212887


Applies to:

Sun Fire 280R Server
All Platforms

Purpose

This document describes how to set up a Sun Fire[TM] 280R system so that, if trouble arises, Sun support can troubleshoot it as effectively and efficiently as possible.

Last Review Date

February 8, 2011

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details



Steps to Follow

1) Patches

  • Make sure the system is up to date with patches. An up-to-date system has two advantages: availability improves, and problems are easier to diagnose.

See appendix A for the recommended Solaris 8 patches.

See appendix B for the recommended Solaris 9 patches.

See appendix C for the recommended Solaris 10 patches.

NOTE: The patch versions shown in appendices A, B and C are not the latest (last updated: April 21, 2005). To get the latest version of a patch, log in to My Oracle Support (MOS), select the Patches & Updates tab, and search for the patch ID shown in the appendix.

* To get an overview of the state of the system, a Sun Checkup can be requested (this requires an Explorer output).
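To see whether a patch from the appendices is already installed, and at which revision, the output of showrev -p can be filtered. The following is only a minimal sketch, not a Sun-supplied tool. The function name installed_patch_rev is hypothetical, the "Patch: <id>-<rev> ..." line format is assumed, and a POSIX shell and awk are assumed (on Solaris, for example ksh with nawk or /usr/xpg4/bin/awk).

```shell
#!/bin/sh
# installed_patch_rev ID
# Reads "showrev -p" style output on stdin and prints the highest installed
# revision of patch ID, or "none" when the patch is not listed.
# Assumed line format: "Patch: 108528-29 Obsoletes: ... Requires: ... Packages: ..."
installed_patch_rev() {
    awk -v id="$1" '
        $1 == "Patch:" {
            split($2, p, "-")
            if (p[1] == id && (!found || p[2] + 0 > max)) {
                max = p[2] + 0
                found = 1
            }
        }
        END {
            if (found) printf "%02d\n", max
            else print "none"
        }
    '
}

# Usage on a live system (108528 is the Solaris 8 KUP from appendix A):
#   showrev -p | installed_patch_rev 108528
```

Firmware patches such as 118323 are not reported by showrev; the OBP version has to be checked separately.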

2) Open Boot Prom (OBP)

  • The OBP version is very important for these systems and should be at 4.16.4 (Patch 118323-01).

  • Recommended settings for this version:

diag-switch? = true

diag-level = max

diag-script = normal

auto-boot? = true

diag-device =

error-reset-recovery = sync

* With diag-switch? set to true, booting can take a long time, especially if the system holds a lot of memory. If this is not acceptable, set it to false.

* With diag-script set to normal, OBDiag tests all devices expected to be present in the baseline configuration, which excludes PCI cards.

* With error-reset-recovery set to sync, OBP invokes a sync, which creates a crash dump, after an XIR or a RED State exception.
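The recommended values can be compared against a running system by filtering the output of the Solaris eeprom command, which prints OBP variables as name=value lines. The sketch below is illustrative only. The function name check_obp is hypothetical, a POSIX shell and awk are assumed, and the variables diag-switch? and auto-boot? are written with the trailing question mark that OBP uses in their names.

```shell
#!/bin/sh
# check_obp
# Reads "name=value" lines (as printed by eeprom) on stdin and reports every
# recommended variable whose value differs. Prints "OK" when nothing differs.
check_obp() {
    awk -F= '
        BEGIN {
            want["diag-switch?"] = "true"
            want["diag-level"] = "max"
            want["diag-script"] = "normal"
            want["auto-boot?"] = "true"
            want["error-reset-recovery"] = "sync"
        }
        ($1 in want) {
            seen[$1] = 1
            if ($2 != want[$1]) {
                printf "%s is %s, recommended %s\n", $1, $2, want[$1]
                bad = 1
            }
        }
        END {
            for (k in want)
                if (!(k in seen)) { printf "%s not reported\n", k; bad = 1 }
            if (!bad) print "OK"
        }
    '
}

# Usage on a live system (as root):
#   eeprom | check_obp
```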

3) Configure the Remote System Controller (RSC)

  • Three packages need to be installed (available on the Solaris Supplemental CD):

SUNWrsc On the host system - RSC

SUNWrscd On the host system - RSC user guide

SUNWrscj On a client - RSC gui

  • Configure the RSC:

# /usr/platform/`uname -i`/rsc/rsc-config

  • To get the console messages and to reach the ok-prompt via the RSC, set the following parameters in the OBP:

input-device = rsc-console

output-device = rsc-console

diag-out-console = true
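The settings listed above can also be applied from a running Solaris with the eeprom command instead of at the ok-prompt. The following is a cautious sketch rather than a fixed procedure. The function name set_rsc_console and its dry-run argument are hypothetical, and by default the commands are only printed so they can be reviewed first.

```shell
#!/bin/sh
# set_rsc_console [RUNNER]
# Applies the RSC console settings with eeprom.
# RUNNER defaults to "echo", so the commands are only printed.
# Pass an empty string as RUNNER to really execute them (as root).
set_rsc_console() {
    runner=${1-echo}
    for setting in input-device=rsc-console \
                   output-device=rsc-console \
                   diag-out-console=true
    do
        $runner eeprom "$setting" || return 1
    done
}

# Dry run (prints the three eeprom commands):
#   set_rsc_console
# Real run:
#   set_rsc_console ""
```

Make sure the RSC itself is configured and reachable before redirecting the console to it.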

4) Enable the watchdog reset mechanism

  • Add the following setting in /etc/system:

     set watchdog_enable=1

* A reboot is necessary to activate the setting.
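Tunables in /etc/system are written as "set name=value" lines. A small helper can add such a line only when it is not already present, keeping a backup of the file. This is a sketch under assumptions: the function name add_system_setting is hypothetical, a POSIX shell is assumed, and the setting is assumed to contain no regular-expression metacharacters.

```shell
#!/bin/sh
# add_system_setting FILE SETTING
# Appends "set SETTING" to FILE unless an identical line is already there.
# FILE would be /etc/system on a live host; a copy is kept as FILE.bak.
add_system_setting() {
    file=$1
    line="set $2"
    if grep "^${line}\$" "$file" >/dev/null 2>&1; then
        echo "already present: $line"
        return 0
    fi
    cp -p "$file" "$file.bak" || return 1
    echo "$line" >> "$file" || return 1
    echo "added: $line"
}

# Example (as root), followed by a reboot:
#   add_system_setting /etc/system watchdog_enable=1
```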

5) Configure Solaris to save a crash dump to disk after a panic

  • How much space is needed for the dump device 

Crash dumps vary in size based on the memory configuration of the system and how much of that memory was in use. On systems with relatively small amounts of RAM (up to 5 GB), a guideline is to allow 35% of the amount of RAM per crash dump. For larger amounts of RAM, 2 GB is usually sufficient.
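The sizing guideline above can be expressed as a small calculation. This is only a sketch of that rule of thumb: the function name dump_size_mb is hypothetical, sizes are in megabytes, 5 GB is taken as 5120 MB, and a POSIX shell is assumed.

```shell
#!/bin/sh
# dump_size_mb RAM_MB
# Prints a suggested dump-device size in MB for a host with RAM_MB of memory:
# 35% of RAM for systems up to 5 GB (5120 MB), otherwise 2 GB (2048 MB).
dump_size_mb() {
    ram_mb=$1
    if [ "$ram_mb" -le 5120 ]; then
        echo $(( ram_mb * 35 / 100 ))
    else
        echo 2048
    fi
}

# Usage on a live system; prtconf reports "Memory size: <n> Megabytes":
#   dump_size_mb "$(prtconf | awk '/^Memory size/ { print $3 }')"
```

For example, a system with 4096 MB of RAM gives 1433 MB per crash dump, and a system with 8192 MB gives 2048 MB.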

  • Configuring the dump device

Crash dumps are enabled by default, and unless the dumpadm command was used to change it, the dump device is the primary swap partition (the first one listed by the swap -l command). If the dump device is a regular partition (begins with /dev/dsk), and is of sufficient size, no further configuration is necessary.

  • Configuring a DiskSuite-encapsulated swap partition as the dump device

If the swap partitions are encapsulated by DiskSuite, you must use the name of the encapsulated partition, not one of the raw partitions it is made from. The output from dumpadm should look something like this one:

Dump device: /dev/md/dsk/d1 (swap)

If you are using the primary swap partition, use this dumpadm command to configure it:

# dumpadm -d swap

  • Configuring a Veritas Volume Manager encapsulated swap partition as the dump device

If there is a spare partition with sufficient space, use the "dumpadm -d" command to configure that as the dump device. If the only available space for the dump device is an encapsulated Veritas partition, you must provide the path of the original disk device name, rather than the Veritas encapsulated path name for a dedicated device. For example:

Dump device: /dev/dsk/c6t0d0s1 (dedicated)

rather than:

Dump device: /dev/vx/dsk/swapvol (dedicated)
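The kind of dump device currently configured can be read from the "Dump device:" line that dumpadm prints. The sketch below classifies that path according to the cases described above. It is illustrative only: the function name classify_dump_device and the wording of its messages are hypothetical, and a POSIX shell and awk are assumed.

```shell
#!/bin/sh
# classify_dump_device
# Reads dumpadm output on stdin and prints the configured dump device
# together with a short verdict based on its path.
classify_dump_device() {
    dev=$(awk '/Dump device:/ { print $3 }')
    case $dev in
        /dev/vx/*)  echo "$dev: Veritas volume path, use the underlying /dev/dsk device" ;;
        /dev/md/*)  echo "$dev: DiskSuite metadevice" ;;
        /dev/dsk/*) echo "$dev: regular partition" ;;
        *)          echo "${dev:-unknown}: not recognised" ;;
    esac
}

# Usage on a live system (as root):
#   dumpadm | classify_dump_device
```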

* For more info on setting up a dump device see:

Technical Instruction Document 1004803.1 - Collecting System Crash Dump Images on Solaris[TM] 7 and later

Technical Instruction Document 1017485.1 - Determining Approximate Crash Dump File Size

6) Configure an external loghost for the message files

  • It can be useful to also send the messages logged in /var/adm/messages to an external loghost. Do NOT log only to the external host; keep logging locally as well.

* For information on the syslog mechanism see the following documents on sunsolve:

Technical Instruction Document 1007237.1 - Setting up and debugging logging to remote hosts

Technical Instruction Document 1004455.1 - Working with the Solaris[TM] Operating Environment messaging and logging daemon
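Whether a syslog configuration logs both locally and to a remote host can be checked roughly by looking at the action field of each line: a path means a local file and "@host" means a remote loghost. The sketch below is a simplification with a hypothetical function name, check_log_targets. It assumes a POSIX shell and awk, and it does not interpret the m4 ifdef constructs that a Solaris /etc/syslog.conf may contain.

```shell
#!/bin/sh
# check_log_targets FILE
# Scans a syslog.conf-style FILE and reports whether messages go to a local
# file, to a remote host ("@host" action), or both.
check_log_targets() {
    awk '
        /^[ \t]*#/ || NF < 2 { next }   # skip comments and blank or short lines
        $NF ~ /^@/  { rem = 1 }         # action starts with @ : remote loghost
        $NF ~ /^\// { loc = 1 }         # action starts with / : local file
        END {
            if (loc && rem) print "local and remote"
            else if (rem)   print "remote only"
            else if (loc)   print "local only"
            else            print "no file or host targets"
        }
    ' "$1"
}

# Usage on a live system:
#   check_log_targets /etc/syslog.conf
```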

7) When the system is not stable

  • Enable full firmware diagnostics. Change these settings in OBP:

diag-switch? = true

diag-script = all

test-args = verbose, subtests

* With diag-script set to all, OBDiag tests all devices, including PCI cards.

  • When hangs can be expected, enable the deadman kernel. Add these settings in /etc/system:

set snooping=1

set snoop_interval=9000000

* A reboot is necessary to activate the setting.

* Enabling the deadman kernel costs performance, so do not leave it enabled by default.

* Technical Instruction Document 1004530.1 - KERNEL: How to enable deadman kernel code.
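Because the deadman settings should not stay enabled once the investigation is over, they can be commented out again in /etc/system, where a leading "*" marks a comment line. The following is a minimal sketch with a hypothetical function name, disable_deadman. It assumes a POSIX shell and assumes the lines are written exactly as shown above, without extra spaces.

```shell
#!/bin/sh
# disable_deadman FILE
# Comments out the "set snooping" and "set snoop_interval" lines in FILE
# (normally /etc/system). The original is kept as FILE.deadman.
# A reboot is still needed for the change to take effect.
disable_deadman() {
    file=$1
    cp -p "$file" "$file.deadman" || return 1
    # "&" re-inserts the matched text after the "* " comment marker
    sed -e 's/^set snoop/* &/' "$file.deadman" > "$file"
}

# Example (as root):
#   disable_deadman /etc/system
```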

8) Configure a console loghost

  • When problems occur but no relevant errors appear in the messages file, the errors will probably be visible on the console.

  • For example, a Red State exception is only reported on the console.

  • Connect a laptop (or other system) to the RSC network port, make a telnet connection to the RSC, and capture the console messages.

  • If for some reason the RSC is not used, capture the console messages from the ttyA serial port.

* For more info on setting up console logging:

Technical Instruction 1008702.1 - Console Logging Options to capture Fatal Reset output for Sun systems.

9) What to do when the system hangs

- Determine exactly what is hanging (system, RSC, or network)

a) The main system with Solaris 

  • is it pingable 

  • can you telnet to it 

  • run explorer

  • check for crash dumps

  • get status of system LEDs

b) The RSC card 

  • is it pingable 

  • can you telnet to it 

  • get output of:

     consolehistory

     showenvironment

     loghistory

     version

  • log in to the console (from the RSC)

     - when Solaris is up and running

          • get status of system LEDs

          • check for crash dumps

          • run explorer

     - when the console is at the ok-prompt

          • get status of system LEDs

          • get output of printenv

          • issue a sync

          • run explorer

          • check for the crash dump

     - when there is no output from the console

          • get status of system LEDs

          • turn keyswitch to diagnostics position

          • check if the console registers this change

          • perform a sync; if this fails, send an XIR from the rsc> prompt (rsc> xir)

c) The network 

  • is it pingable 

  • telnet/rlogin to system and RSC

  • try to connect with a directly attached terminal

* When there is no response from the system at all

  • Press the system Power button for five seconds.

  • This causes an immediate hardware shutdown.

  • Wait at least 30 seconds, then power on the system

10) What to do when the system has panicked and automatically rebooted

  • check for a crash dump

  • run explorer

11) What to do when the system has panicked and sits at the ok-prompt

  • get status of system LEDs

  • get output of printenv

  • issue a sync

  • run explorer

  • check for the crash dump

12) What information will Sun usually ask for in these situations

  • Explorer

  • Crash dump

  • Console logging

* Run explorer as follows:

# /opt/SUNWexplo/bin/explorer -w fru,default

The output file of the explorer will be located in /opt/SUNWexplo/output.

* The crash dump consists of two files: unix.(nr) and vmcore.(nr) located in /var/crash/`uname -n`.
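Before sending a crash dump, it helps to confirm that both files of each pair are present. The sketch below lists the dump numbers found in the crash directory and flags incomplete pairs. The function name list_crash_dumps is hypothetical and a POSIX shell is assumed.

```shell
#!/bin/sh
# list_crash_dumps [DIR]
# Lists each dump number found in DIR and says whether both files of the
# pair (unix.N and vmcore.N) are present.
# DIR defaults to /var/crash/<hostname>.
list_crash_dumps() {
    dir=${1:-/var/crash/$(uname -n)}
    for f in "$dir"/unix.* "$dir"/vmcore.*; do
        [ -f "$f" ] || continue          # skip unmatched glob patterns
        echo "${f##*.}"                  # keep only the dump number
    done | sort -n -u | while read -r n; do
        if [ -f "$dir/unix.$n" ] && [ -f "$dir/vmcore.$n" ]; then
            echo "$n complete"
        else
            echo "$n incomplete"
        fi
    done
}

# Usage on a live system:
#   list_crash_dumps
```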

Appendix A

The following patches are offered by Sun as acceptable additions to Solaris 8 HW 02/02 or earlier. Note that Solaris 8 HW 02/02 comes with many additional patches pre-installed that may not be contained in earlier Solaris 8 releases.

Highly Recommended Platform Patches (in this order)

111111-04  SunOS 5.8: /usr/bin/nawk patch (required for KUP)

112396-02  SunOS 5.8: /usr/bin/fgrep patch (required for KUP)

108987-16  SunOS 5.8: Patch for patchadd and patchrm (required for KUP)

111310-01  SunOS 5.8: /usr/lib/libdhcpagent.so.1 patch (required for KUP)

109882-06  SunOS 5.8: eri header files patch

108528-29  SunOS 5.8: kernel update patch (KUP)

117000-05  SunOS 5.8: Kernel Update Patch (KUP) (108528-29 required)

117350-22  SunOS 5.8  Kernel Patch

116903-01  SunOS 5.8  pcihp, pcipsy,pcisch Patch ( Obsoleted by 117000-05)

110723-07  SunOS 5.8: /kernel/drv/sparcv9/eri patch

110460-32  SunOS 5.8: fruid/PICL plug-ins patch (obsoleted by patch 108528-29)

109888-26  SunOS 5.8: platform drivers patch (obsoleted by patch 108528-29)

Highly Recommended Firmware Updates (require manual installation)

118323-01   Hardware/PROM: Sun Fire 280R OBP_4.16.4,POST_4.16.3,OBDIAG_4.16.4

111228-01   Hardware/CPU: SF280R FRU Update for early release 750MHz CPUs

114708-05   Hardware/Disk ST373307FC 73GB 10k Drive Firmware

109962-14   Hardware/Disk: Seagate FC-AL 36GB/73GB Disk Drive firmware

111649-03   Hardware/DVD: Toshiba DVD 1401 firmware

Supplemental Platform Software Patches

111792-11  SunOS 5.8 PICL plugins patch

109873-26  SunOS 5.8: prtdiag and platform libprtdiag_psr.so.1 patch (Obsoleted by 111792-11)

111500-09  RSC 2.2 bug fixes

118956-01   SunVTS5.1 Patch Set 9(Preferred version)

MPxIO/Leadville Driver Stack Patches

SUNWsan    SAN 3.0/3.1/3.2 foundation kit (SFK) packages

111412-16   SunOS 5.8: Sun StorEdge Traffic Manager patch

111413-15   SunOS 5.8: luxadm, liba5k and libg_fc patch

111095-20   SunOS 5.8: fctl/fp/fcp/usoc driver patch

111096-10   SunOS 5.8: fcip driver patch

111097-17   SunOS 5.8: qlc driver patch

111847-08   SunOS 5.8: SAN foundation kit patch

Non-Platform-Specific Extra Kernel Drivers

108434-18   32-Bit Shared library patch for C++

108435-18   64-Bit Shared library patch for C++

108693-25   Solstice DiskSuite 4.2.1: Product patch

108806-18   SunOS 5.8: Sun Quad FastEthernet qfe driver

108813-16   SunOS 5.8: Sun Gigabit Ethernet 3.0(Obsoleted by 117000-05)

109657-11   SunOS 5.8  isp driver

108989-02   SunOS 5.8: /usr/kernel/sys/acctctl and /usr/kernel/sys/exacctsys

108993-44   SunOS 5.8: LDAP2 client, libc, libthread and libnsl libraries patch

109147-33   SunOS 5.8: Linker patch

109234-09   SunOS 5.8: Apache Security and NCA Patch(obsoleted by 108528-29)

109793-25   SunOS 5.8: su driver patch

109815-20   SunOS 5.8: se, acebus, pcf8574, pcf8591 and scsb patch

109873-26   SunOS 5.8: prtdiag and platform libprtdiag_psr.so.1 patch

110165-04   SunOS 5.8: /usr/bin/sed patch

110386-03   SunOS 5.8: RBAC Feature Patch

110835-07   SunOS 5.8: platform/sun4u/kernel/misc/sparcv9/gptwo_cpu patch

110934-21   SunOS 5.8: pkgtrans, pkgadd, pkgchk and libpkg.a patch

110955-05   SunOS 5.8: /kernel/strmod/timod patch

111023-03   SunOS 5.8: /kernel/fs/mntfs and /kernel/fs/sparcv9/mntfs patch

111325-02   SunOS 5.8: /usr/lib/saf/ttymon patch

111562-02   SunOS 5.8: /usr/lib/librt.so.1 patch

111588-05   SunOS 5.8: /kernel/drv/ws and /kernel/fs/specfs patch

111624-05   SunOS 5.8: /usr/sbin/inetd patch

116962-06   SunOS 5.8  pcisch and pcipsy driver

111846-08  SunOS 5.8: cfgadm fp plug-in library patch

111883-30   SunOS 5.8: Sun GigaSwift Ethernet 1.0 driver patch

112438-03   SunOS 5.8: /kernel/drv/random patch

112991-01   SunOS 5.8: /usr/sbin/prtvtoc patch

113355-09   Sun Crypto Accelerator 1000 v1.1 Patch

115297-02   Sun Crypto Accelerator 1000 v1.1 Patch

* Last Updated: April 19, 2005

Appendix B

The following patches are offered by Sun as acceptable additions to Solaris 9 FCS or later. Note that Solaris 9 HW 4/03 comes with many additional patches pre-installed that may not be contained in earlier Solaris 9 releases.

Minimum Required Platform Patches (in this order)

112233-12   SunOS 5.9: Kernel Patch

117171-17   SunOS 5.9  Kernel Patch (requires 112233) 

118558-06   SunOS 5.9  Kernel Patch

112904-09   SunOS 5.9: tcp Patch (Obsoleted by 112233-11)

112951-11   SunOS 5.9: patchadd and patchrm Patch

112965-05   SunOS 5.9: patch /kernel/drv/sparcv9/eri

113447-25   SunOS 5.9  libprtdiag_psr patch (Obsoleted by 118558-05)

113221-03   SunOS 5.9: libprtdiag_psr.so.1 Patch(obsoleted by 113447)

114126-03   SunOS 5.9: todds1287 patch (Obsoleted by 118558-02)

Minimum Required Firmware Updates (require manual installation)

118323-01   Hardware/PROM: Sun Fire 280R OBP_4.16.4,POST_4.16.3,OBDIAG_4.16.4

111228-01   Hardware/CPU: SF280R FRU Update for early release 750MHz CPUs

109962-14   Hardware/Disk: Seagate FC-AL 36GB/73GB Disk Drive firmware

111649-03   Hardware/DVD: Toshiba DVD 1401 firmware

Supplemental Platform Software Patches

112913-01  SunOS 5.9: fruadm Patch

113388-02  RSC 2.2.1 bug fixes

116363-07  RSC 2.2.2 Bug Fixes for updates 5,6, and 7.

118956-01  SunVTS5.1 Patch Set 9 (preferred version)

MPxIO/Leadville Driver Stack Patches

SUNWsan    SAN 3.0/3.1/3.2 foundation kit (SFK) packages

111847-08  SAN foundation kit patch

113039-08  SunOS 5.9: Sun StorEdge Traffic Manager patch

113046-01  SunOS 5.9: fcp Patch

113040-12  SunOS 5.9: fctl/fp/fcp driver patch

113041-08  SunOS 5.9: fcip driver patch

113042-10  SunOS 5.9: qlc driver patch

113049-01  SunOS 5.9: luxadm & liba5k.so.2 Patch

113043-09  SunOS 5.9: luxadm, liba5k and libg_fc patch

113044-05  SunOS 5.9: cfgadm fp plug-in library patch

Non-Platform-Specific Extra Kernel Drivers

111711-12    32-Bit Shared library patch for C++

111712-12    64-Bit Shared library patch for C++

112764-07    SunOS 5.9: Sun Quad FastEthernet qfe driver

112817-23   SunOS 5.9: Sun GigaSwift Ethernet 1.0 driver patch

112874-31   SunOS 5.9: patch libc

112839-08   SunOS 5.9: patch libthread.so.1

112955-03   SunOS 5.9: patch kernel/fs/autofs kernel/fs/sparcv9/autofs

112963-19   SunOS 5.9: linker patch

113077-14    SunOS 5.9: patch su driver

113146-06   SunOS 5.9: Apache Security Patch

113355-09   Sun Crypto Accelerator 1000 v1.1 Patch

113361-08   SunOS 5.9: Sun Gigabit Ethernet 3.0 (GEM) driver patch (obsoleted by 112233)

114863-01   SunOS 5.9: /platform/sun4u/kernel/misc/forthdebug Patch (obsoleted by 112233)

115297-02   Sun Crypto Accelerator 1000 v1.1 Patch

*Last Updated:  March 3, 2005

Appendix C

The following patches are offered by Sun as acceptable additions to Solaris 10 FCS or later.

Minimum Required Platform Patches (in this order)

118822-01   SunOS 5.10: Kernel Patch

119130-04   SunOS 5.10: Sun Fibre Channel Device Drivers

Minimum Required Firmware Updates (require manual installation)

118323-01   Hardware/PROM: Sun Fire 280R OBP_4.16.4,POST_4.16.3,OBDIAG_4.16.4

111228-01   Hardware/CPU: SF280R FRU Update for early release 750MHz CPUs

109962-14   Hardware/Disk: Seagate FC-AL 36GB/73GB Disk Drive firmware

111649-03   Hardware/DVD: Toshiba DVD 1401 firmware

Supplemental Platform Software Patches

118962-01   SunVTS6.0 Patch Set 1 (preferred version)

Non-Platform-Specific Extra Kernel Drivers

119374-01   SunOS 5.10: sd Patch

*Last Updated:  April 21, 2005  



Product
Sun Fire 280R Server

Keywords
SF280R, troubleshooting, setup, OBP, RSC, watchdog, panic, Littleneck, hang
Previously Published As
82171

Change History
Date: 2011-02-08
User name: Dencho Kojucharov
Action: Currency check & update
Comments: audited by Entry-Level SPARC Content Lead
Converted to AWIZ format, updated docs & patch links

Date: 2005-08-30
User Name: 111868
Action: Approved
Comment: Verified Metadata
Verified audience as per FvF http://kmo.central/howto/FvF.html
Checked review date
Checked for TM - corrected
Publishing
Version: 4
Date: 2005-08-30
User Name: 111868
Action: Accept
Comment:
Version: 0


  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.