Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1008702.1
Update Date:2011-05-23
Keywords:

Solution Type  Technical Instruction Sure

Solution  1008702.1 :   Console Logging Options to capture Fatal Reset output for Sun systems  


Related Items
  • Sun Enterprise 4500 Server
  • Sun Enterprise 5500 Server
  • Sun Enterprise 3500 Server
  • Sun Enterprise 6500 Server
Related Categories
  • GCS>Sun Microsystems>Servers>Midrange Servers

PreviouslyPublishedAs
211946


Description
Console Logging Options to capture Fatal Reset output

This document describes how to log the console output from Sun systems.



Steps to Follow
Use the following procedures to log the console output to help diagnose Fatal Resets on Sun systems.

Purpose of Console Logging

The purpose of console logging is to capture console messages, which are used to improve the quality and timeliness of problem diagnosis. By default, Fatal Reset details and POST output after a Fatal Reset are directed to serial port A (ttya).

For many system failures, serial port data is the only output available. In some failure modes, the Solaris [TM] Operating Environment has already terminated and no software is running on the system that can log messages to the usual file system locations. Capturing diagnostic and failure data via serial console logging therefore provides additional diagnostic information and reduces the number of "unexplained system reboots".

Fatal Resets bring a system down extremely fast. Components other than the failing item often detect the error as well, and the speed of the crash often leaves these "error artifacts" in component registers. The PROM can subsequently misinterpret these artifacts, indicate the wrong component as the cause of the reset, and offline a good component as a result. Serial console logging allows analysis of the Fatal Reset output to help ensure that the actual defective FRU is replaced, and not a good component incorrectly reported as failed.

The following sections outline possible console logging options. Note that there may be other software and hardware vendors with equivalent products. The functionality of these other products is likely similar to what is discussed below.  When selecting a console server, ensure that it supports port buffering and that the buffer size is at least 200K per port.  The larger the buffer size, the less likely that important data is overwritten.

Console Logging Options - Data Logging Console Servers

Traditional terminal servers do not have console logging capability. One replacement is a console server device from Lantronix. The Lantronix console server is equivalent to a traditional network-based terminal server, except that it has added memory which is used as a "wrap around" message buffer.

As console messages are output from a Sun system, they are stored in this memory. As the memory fills up, the oldest messages are overwritten. One can connect to this console server via the network and display the contents of the memory buffer for a specific system in order to retrieve the stored console messages.
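The wrap-around behavior described above can be illustrated with a short Python sketch. It is not how any particular console server is implemented; the class name ConsoleRingBuffer and the 16-byte capacity are hypothetical and chosen only to make the overwrite visible.

```python
from collections import deque


class ConsoleRingBuffer:
    """Fixed-size byte buffer that discards the oldest data when full."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen drops items from the left (oldest) end
        # automatically once the capacity is reached.
        self._buf = deque(maxlen=capacity)

    def write(self, data: bytes) -> None:
        # Extending with a bytes object appends one integer per byte.
        self._buf.extend(data)

    def dump(self) -> bytes:
        # Return whatever is still held, oldest byte first.
        return bytes(self._buf)


buf = ConsoleRingBuffer(capacity=16)  # tiny capacity, for illustration only
buf.write(b"boot message\n")
buf.write(b"Fatal Reset\n")
print(buf.dump())
```

Only the most recent 16 bytes survive, which is why a larger per-port buffer makes it less likely that the start of a Fatal Reset report is lost.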

More information can be found at: http://www.lantronix.com/products/cs/index.html

Console Logging Options - Centralized Console Control

A centralized console control solution is available from SIE Computing Solutions.

This solution allows a single Sun workstation to serve as a console access and logging point. Hardware supporting multiple serial ports is installed in the Sun system, and system consoles are controlled and monitored via these ports. The workstation can both grant console access and log all console activity on its local disk for review at any time.

More information can be found at:

http://sie-cs.com/products/details.php?Product=114

Console Logging Options - Tip line to ttya

This may be one of the least expensive console logging options, but it can create challenges when attempting to monitor multiple systems. The system that performs the monitoring function must be up and operational, or logging of the other system's console is lost.

To enable this console logging mode, take a null modem serial cable (described below) and connect one end to the monitored system's ttya port, then connect the other end of the cable to any serial port on the monitoring system.

Once the cable is connected, a user on the monitoring system can issue the tip command and be connected to the other system's console. Note that prior to issuing the tip command, the user must enable some form of logging, for instance by using the "log to file" option of an xterm session.
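As one possible form of logging, the following Python sketch copies console lines to a log and prefixes each with a timestamp. It is an illustrative alternative to xterm logging, not a procedure from this document. The function name log_console is hypothetical, and the sample input stands in for the serial device.

```python
import io
from datetime import datetime
from typing import BinaryIO, Callable, TextIO


def log_console(src: BinaryIO, dst: TextIO,
                now: Callable[[], datetime] = datetime.now) -> int:
    """Copy console lines from src to dst, prefixing each with a timestamp.

    Returns the number of lines written.
    """
    count = 0
    for raw in src:  # iterating a binary stream yields newline-terminated lines
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        dst.write(f"{now().isoformat(timespec='seconds')} {text}\n")
        dst.flush()  # flush each line so little is lost if the logger dies
        count += 1
    return count


# Stand-in for the serial device. On a real monitoring host the source might
# be open("/dev/term/b", "rb", buffering=0), assuming the port is already
# configured for the console's serial settings.
sample = io.BytesIO(b"ok boot\r\nFatal Reset\r\n")
out = io.StringIO()
log_console(sample, out, now=lambda: datetime(2011, 5, 23, 12, 0, 0))
print(out.getvalue(), end="")
```

Timestamps are added by the monitoring host, so they record when the output was received rather than when the monitored system produced it.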

Using TIP

Have the system console of the monitored system redirected to another system.

The basic steps:

Hook a null modem cable between serial port A of the monitored machine and one of the serial ports of the healthy machine. The port (a or b) on the healthy machine depends on the hardwire entry in the /etc/remote file on the healthy system.

Here is the hardwire entry in /etc/remote that uses port b on the healthy machine:

hardwire:\
        :dv=/dev/term/b:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:

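A null modem cable in its most basic form is an RS-232 serial cable with minimal pin connections. Using DB-25 pin numbering, transmit and receive (pins 2 and 3) are crossed and signal ground (pin 7) is wired straight through, as follows: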

2 ------ 3

3 ------ 2

7 ------ 7

A standard serial cable with a null modem adapter from an electronics store will work as well.

There should be an entry for hardwire already in /etc/remote. It comes with the default OS. If one is not there, you can always copy it from another Solaris system.
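When checking which port and speed the hardwire entry selects, the entry can be split into its fields. The Python sketch below is a minimal parser written for this illustration only; the function name parse_remote_entry is hypothetical, and it does not handle escaped colons or other less common syntax.

```python
def parse_remote_entry(entry: str) -> dict:
    """Split one /etc/remote entry into its name and capability fields."""
    # Join backslash-continued lines, then split on the ':' field separator.
    flat = entry.replace("\\\n", "")
    fields = [f.strip() for f in flat.split(":")]
    result = {"name": fields[0]}
    for field in fields[1:]:
        if not field:
            continue  # empty field between adjacent colons
        # Capability names are two characters, followed by '#' for a
        # numeric value or '=' for a string value.
        key, sep, value = field[:2], field[2:3], field[3:]
        if sep == "#":
            result[key] = int(value)
        elif sep == "=":
            result[key] = value
        else:
            result[field] = True  # boolean capability with no value
    return result


ENTRY = "hardwire:\\\n\t:dv=/dev/term/b:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:"
parsed = parse_remote_entry(ENTRY)
print(parsed["dv"], parsed["br"])
```

Here dv is the serial device that tip opens and br is the baud rate, so changing the port means editing the dv field.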

Now open a command-tool on the healthy system. Sometimes tip behaves better with a shell-tool, but you lose scrolling (this window will be your buffer).

Type in: tip hardwire

You should see a "connected" message in this command-tool window.

NOTE: you will get the "connected" message regardless of the presence of the serial cable. "Connected" just means your tip session is talking to the serial port, not to another system.

Serial console logging using RMC

RMC - Remote Management Control

Sun products use several RMC cards.  These include:

ALOM   Advanced Lights Out Manager

RSC   Remote System Control

These cards allow serial as well as network access to the console ports. They have built-in buffers, so some console output may be captured. For pinouts and additional information on how to use these cards, see <Document: 1005844.1>.

Serial console logging using non-Sun system

Serial console logging can also be done using a laptop or other PC-type system running a terminal emulator program. The cabling requirements are identical to those for a tip session (see "Using TIP"). Serial parameters are 9600 8n1, i.e. 9600 baud, 8 data bits, no parity bit and 1 stop bit. Set the terminal program to emulate a VT100 or similar terminal. Logging to disk is configured within the emulator program and is usually referred to as either session logging or session capture. For systems running Windows, a program named Tera Term is available that works with fewer problems than HyperTerminal.
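The serial parameters above also give a rough idea of how much history a capture buffer holds. The Python sketch below works through the arithmetic, assuming one start bit per character (standard for asynchronous serial) and reading the 200K-per-port minimum mentioned earlier as 200 x 1024 bytes. The function names are hypothetical.

```python
def bytes_per_second(baud: int, data_bits: int = 8,
                     parity_bits: int = 0, stop_bits: int = 1) -> float:
    """Characters per second on an async serial line (1 start bit per frame)."""
    frame_bits = 1 + data_bits + parity_bits + stop_bits
    return baud / frame_bits


def seconds_until_wrap(buffer_bytes: int, baud: int = 9600) -> float:
    """Time for continuous output at full line rate to fill the buffer."""
    return buffer_bytes / bytes_per_second(baud)


rate = bytes_per_second(9600)          # 9600 8N1 -> 10 bits per character
wrap = seconds_until_wrap(200 * 1024)  # assumed meaning of "200K"
print(f"{rate:.0f} bytes/s, buffer wraps after about {wrap:.0f} s")
```

At 9600 8n1 the line carries at most 960 characters per second, so a 200K buffer covers a little over three and a half minutes of continuous output. Real console output is bursty, so the buffer usually holds far more elapsed time than that.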

Recommended NVRAM settings:

Bring the system to the OBP level from the command line using the "shutdown" or "init 0" command. Either command runs all RC shutdown scripts, syncs the file systems and then drops the system to the OK prompt. DO NOT use a Stop-A key press. The following commands can be executed from the OK prompt, or from the command line using the "eeprom <variable=parameter>" command.

at OK prompt                           # eeprom                    Description

setenv diag-level max                  diag-level=max              system will run extended POST
printenv boot-device                   boot-device                 determine what your boot device is
setenv diag-device <your boot-device>  diag-device=<your bootdev>  prevent attempting a net boot with diags on
setenv error-reset-recovery sync       error-reset-recovery=sync   force a sync reboot if the system drops to OK
setenv diag-switch? true               diag-switch?=true
reset-all                              reboot or init 6            system has to reset for changes to take effect
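When the same settings must be applied to several systems, the eeprom command lines can be generated rather than typed. The Python sketch below only builds the command strings and does not run them. The function name eeprom_commands and the example boot device "disk" are hypothetical. It uses the variable name diag-switch?, because the question mark is part of the OpenBoot variable name, and quotes it so that the shell does not treat the question mark as a wildcard.

```python
import shlex


def eeprom_commands(boot_device: str) -> list:
    """Build eeprom command lines for the recommended NVRAM settings."""
    settings = {
        "diag-level": "max",
        "diag-device": boot_device,  # value reported by "printenv boot-device"
        "error-reset-recovery": "sync",
        "diag-switch?": "true",
    }
    # shlex.quote adds single quotes only where the shell needs them.
    return ["eeprom " + shlex.quote(f"{name}={value}")
            for name, value in settings.items()]


cmds = eeprom_commands("disk")  # "disk" is an example value only
for line in cmds:
    print(line)
```

The system still has to be rebooted (for example with "init 6") after these commands are run for the changes to take effect.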



Product
Solaris
Computer Systems


Internal Comments

Also see web based decoder for diagnosis http://panacea.uk.oracle.com/twiki/bin/view/Tools/ToolDecodeFatalResetDecoder for a
tip, console, logging, capture, null, modem, alom, rsc, rmc, serial port

Attachments

This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.