Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1001917.1
Update Date:2008-07-15
Keywords:

Solution Type  Technical Instruction Sure

Solution  1001917.1 :   Collecting crash dumps from Sun StorEdge[TM] T3+/6020/6120 arrays  


Related Items
  • Sun Storage 6320 System
  • Sun Storage T3 Array
  • Sun Storage T3+ Array
  • Sun Storage 6020 Array
  • Sun Storage 6120 Array
Related Categories
  • GCS>Sun Microsystems>Storage - Disk>Modular Disk - 6xxx Arrays
  • GCS>Sun Microsystems>Storage - Disk>Modular Disk - Other

Previously Published As
202659


Description
This document describes the procedure for recovering crash dump data from a Sun StorEdge[TM] T3+/6020/6120 array after a controller crash.

These crash dump files are useful to Sun Engineers in determining the cause of the controller crash.

Steps to Follow
Collecting crash dumps from Sun StorEdge[TM] T3+/6020/6120 arrays

T3+ and 6x20/T4 arrays running firmware version 3.00 and later will automatically generate a "crash dump" whenever the array controller abnormally reboots (including when the "abort" command is run).

On reboot, the controller console and syslog will include a message that a dump file has been generated and saved to one of the disks in the array:

Feb 20 23:40:22 ROOT[1]: N: u1ctr Saving dump image to u1d2
Feb 20 23:41:27 ROOT[1]: N: u1ctr Dump copied 3809 blocks. Compressed ratio 68.83 / 1
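
If the array's syslog is forwarded to a host, the disk holding a new dump can be picked out of the log text. The following is a minimal sketch only: the function name dump_disks is hypothetical, and it assumes the message wording matches the example above exactly.

```shell
# dump_disks: read syslog text on stdin and print the disk (uXdX) named in
# each "Saving dump image to" message. Hypothetical helper, not an array command.
dump_disks() {
    sed -n 's/.*Saving dump image to \(u[0-9][0-9]*d[0-9][0-9]*\).*/\1/p'
}
```

For example, "dump_disks < array-syslog.txt", where array-syslog.txt is a hypothetical file containing the forwarded messages.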

Listing available crash dumps

The "savecore" command on the array is used to manage crash dumps. The "savecore list" command lists every available crash dump, the physical disk it is saved on, when the crash occurred, the size of the dump, and the panic string for the crash.

T3:/:<15>savecore list
u1d1: Jan 26 18:54.33 2004 : 7.99MB : u1ctr XOR: Flags=B--M--- Cntr=0x01 Synd=0xC9 Addr=0x84000000
u1d2: Feb 20 23:38.12 2004 : 1.85MB : CPU machine check exception (2)
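
The disk name and dump size are the two fields needed later to name the files on the host. As an illustration, the listing can be reduced to those two fields with a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk). This is a sketch that assumes the " : " field layout shown above; list_dumps is a hypothetical name, and the input is "savecore list" output captured to a file on the host.

```shell
# list_dumps: read "savecore list" output on stdin and print
# "<disk> <size in MB>" for each dump. Lines that do not start with a
# uXdX disk name (such as wrapped panic strings) are ignored.
list_dumps() {
    awk -F' : ' '/^u[0-9]+d[0-9]+: / {
        disk = $1
        sub(/:.*/, "", disk)      # keep only the leading uXdX
        size = $2
        sub(/MB$/, "", size)      # strip the unit, leaving the number
        print disk, size
    }'
}
```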

Preparing the host to gather the crash dump

Copying the crash dump to a host is done using the Trivial File Transfer Protocol (tftp). Most platforms have a tftp server available, but only the Solaris[TM] tftp server is discussed below.

The procedure to set up tftp on Solaris hosts is available in <Document: 1004474.1>.

For security reasons, the Solaris tftp server does not allow a client to create files. It only allows a client to write to existing files that are world-writeable. Thus, before uploading the dump, you must first create a zero-length, world-writeable copy of the file on the destination host.

The name of the file will vary, depending on the type of array and which physical disk the dump is stored on in the array. The file must be of the form "core.uXdX.tXz.1". The "uXdX" refers to the physical disk the dump is stored on in the array, while the "tXz" refers to the type of array, being either "t3z" for the T3+, or "t4z" for the T4/6020/6120 arrays.

If the dump is greater than 16MB in size, a second file "core.uXdX.tXz.2" must also be manually created. This is due to a TFTP limitation that no single file can be greater than 16MB. The second file is simply a continuation of the first file, beyond the 16MB mark.

Using the example above from a T3+ array:

u1d2: Feb 20 23:38.12 2004 : 1.85MB : CPU machine check exception (2)
^^^^                         ^^^^^^
This dump is stored on disk u1d2 and is 1.85 MB in size.

Create the zero-length, world-writeable file on the destination machine:

# touch /tftpboot/core.u1d2.t3z.1
# chmod 666 /tftpboot/core.u1d2.t3z.1

As this dump is less than 16MB, only the .1 file is required. If it had been larger, we would also need to create a .2 file:

# touch /tftpboot/core.u1d2.t3z.2
# chmod 666 /tftpboot/core.u1d2.t3z.2
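
The naming rule and the 16MB split can be combined into one helper so that the correct files are created for any dump. The function below is a sketch: the name make_core_files and its argument order are hypothetical, the size is the figure reported by "savecore list", and the array type (t3z or t4z) must be supplied by the operator.

```shell
# make_core_files DIR DISK TYPE SIZE_MB
# Creates DIR/core.DISK.TYPE.1 with mode 666, and also the .2 file when
# SIZE_MB is greater than 16. Hypothetical helper, run on the tftp host.
make_core_files() {
    dir=$1 disk=$2 atype=$3 size_mb=$4
    touch "$dir/core.$disk.$atype.1" && chmod 666 "$dir/core.$disk.$atype.1" || return 1
    # Shell arithmetic is integer-only, so awk compares the decimal size.
    if awk -v s="$size_mb" 'BEGIN { exit !(s > 16) }'; then
        touch "$dir/core.$disk.$atype.2" && chmod 666 "$dir/core.$disk.$atype.2" || return 1
    fi
}
```

For the example dump, "make_core_files /tftpboot u1d2 t3z 1.85" creates only the .1 file.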

Copying a crash dump to a host

To transfer the dump from the array to the destination host, use the "savecore upload" command on the array. Additional command arguments are: the name of the disk containing the dump, and the IP address (or hostname if DNS is correctly configured) of the destination host.

T3:/:<18>savecore upload u1d2 192.168.0.1
Uploading core.u1d2.t3z.1..OK

(If no IP or hostname is specified, then the value set in the array's "tftphost" variable will be used automatically. This value can be determined by use of the array "set" command.)

Verify that the core was uploaded:

# ls -l /tftpboot/core*
-rw-rw-rw-   1 root     other    1949914 Feb 20 14:49 /tftpboot/core.u1d2.t3z.1
-rw-rw-rw-   1 root     other          0 Feb 20 14:48 /tftpboot/core.u1d2.t3z.2

In the example above, a zero-length core.u1d2.t3z.2 file is visible; this was created by the second "touch" command shown earlier, and would not normally exist for a dump smaller than 16MB.
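
The same check can be scripted by testing that the .1 file is no longer zero-length. This is a minimal sketch with a hypothetical function name, check_upload. It only confirms that some data arrived, not that the dump is complete.

```shell
# check_upload DIR DISK TYPE
# Succeeds if DIR/core.DISK.TYPE.1 exists and is non-empty.
# Hypothetical helper, run on the tftp host after "savecore upload".
check_upload() {
    f="$1/core.$2.$3.1"
    if [ -s "$f" ]; then
        echo "OK: $f contains data"
    else
        echo "EMPTY OR MISSING: $f" >&2
        return 1
    fi
}
```

For the example dump, the call would be "check_upload /tftpboot u1d2 t3z".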

These file(s) can then be transferred to Sun for analysis.

Common errors

If any errors occur during the transfer, a verbose error message is returned. The most common errors are either communication problems with the tftp server (array network connection not configured correctly, tftp daemon not enabled, inetd not restarted, etc.), or destination files that do not exist or have incorrect permissions. Examples:

T3:/:<16>savecore upload u1d2 192.168.0.1
core.u1d2.t3z.1: tftp open error(10060002)

The TFTP server did not respond within the timeout window. Check the network connections and verify that the TFTP server is running on the host.

T3:/:<17>savecore upload u1d2 192.168.0.1
core.u1d2.t3z.1: tftp open error(10060001)

This error occurs when the target file is absent or does not have the correct permissions (0666).
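
As a quick reference, the two error codes above can be mapped to their likely causes. The sketch below covers only the codes documented in this article; savecore_hint is a hypothetical name, and any other code is reported as unrecognised.

```shell
# savecore_hint MESSAGE
# Prints the likely cause for a "tftp open error" line from "savecore upload".
# Only the two codes described in this article are recognised.
savecore_hint() {
    case $1 in
        *10060002*) echo "no response from tftp server: check network and tftp daemon" ;;
        *10060001*) echo "target file missing or not mode 0666" ;;
        *) echo "unrecognised error"; return 1 ;;
    esac
}
```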

Crash Dump Housekeeping

Each physical disk in the array can store only a single crash dump. If all of the disks already contain a crash dump, then no further dumps will be collected. It is thus good practice to remove all dumps that are no longer required and/or have been transferred to the host. This is done using the "savecore remove" command.

T3:/:<20>savecore remove u1d1
u1d1: dump removed
T3:/:<21>savecore remove u1d2
u1d2: dump removed

If a dump is accidentally deleted, it may be restored using the "savecore restore" command, as long as it has not been overwritten by another dump.

T3:/:<23>savecore restore u1d2
u1d2: dump restored
T3:/:<24>savecore list
u1d2: Feb 20 23:38.12 2004 : 1.85MB : CPU machine check exception (2)


Product
Sun StorageTek T3+/6X20 Controller Firmware 3.1
Sun StorageTek T3+ Array
Sun StorageTek T3 Array
Sun StorageTek 6320 System
Sun StorageTek 6120 Array
Sun StorageTek 6020 Array

Internal Comments
Core files can be analysed by PTS/PDE using the t3adb/t3act programs.



Details on analysis are available at http://webhome.east.sun.com/skeebler/t3/doc/Crashdumps/txadb.pdf

savecore, t3, t3b, t3+, 6020, 6120, 6320
Previously Published As
74285

Change History
Date: 2007-06-01
User Name: 71396
Action: Approved
Comment: Performed final review of article.
No changes required.
Publishing.
Version: 6
Date: 2007-06-01
User Name: 71396
Action: Accept
Comment:
Version: 0
Date: 2007-06-01
User Name: 100761
Action: Approved
Comment: Doc 19272 link is appropriate, publish it.
- Srinivas.
Version: 0
Date: 2007-05-31
User Name: 100761
Action: Accept
Comment:
Version: 0




Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.