Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-75-1187199.1
Update Date:2011-05-09
Keywords:

Solution Type  Troubleshooting Sure

Solution  1187199.1 :   VTL - How to start troubleshooting hardware driver decompression issues  


Related Items
  • Sun StorageTek VTL Plus Storage Appliance
Related Categories
  • GCS>Sun Microsystems>Storage - Tape>Tape Virtualization


When VTL tries to decompress (read) a virtual tape and receives a nonzero error code from the HiFN hardware compression driver, what are the first troubleshooting steps?

In this Document
  Purpose
  Last Review Date
  Instructions for the Reader
  Troubleshooting Details


Applies to:

Sun StorageTek VTL Plus Storage Appliance - Version: 2.0 - Build 1590d to 2.0 - Build 1656 - Release: 2.0 to 2.0
Information in this document applies to any platform.

Purpose

When VTL tries to decompress (read) a virtual tape and receives a nonzero error code from the HiFN hardware compression driver (h9630), what are the first troubleshooting steps?

Note: Refer to Knowledge doc ID 1124273.1 for the list of HiFN hardware compression driver return codes.

Last Review Date

August 25, 2010

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details


1. Check whether the read/decompression was retried for the same virtual tape and whether the retry succeeded.

    a. Most HiFN hardware compression/decompression driver errors (rc = -2 through -6) are temporary, and a retry may succeed.
         - A quick way to tell whether a decompression error is temporary is to compare the 'ActualSize' and 'RetSize' values in the error message.  If they are equal, the error is temporary and a retry will succeed in most cases.  Refer to the following sample message:
Jun 7 19:45:21 xxxxxxxx genunix: [ID 459543 kern.notice] <6>[fffffe800751ec80] TLE_ERROR: READ:Tape:20004077, Decompression failed -6, CompSize 254504, ActualSize 262144, RetSize 262144   
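
The ActualSize/RetSize comparison can be scripted when many messages need to be reviewed. The following is a minimal Python sketch, not a supported tool. It assumes every TLE_ERROR line uses the same field layout as the sample message above; the function name and the returned keys are hypothetical.

```python
import re

# Field layout copied from the sample TLE_ERROR message; other VTL builds
# may format this line differently (assumption).
TLE_ERROR_RE = re.compile(
    r"TLE_ERROR: READ:Tape:(?P<tape>\d+), Decompression failed (?P<rc>-?\d+), "
    r"CompSize (?P<comp>\d+), ActualSize (?P<actual>\d+), RetSize (?P<ret>\d+)"
)


def classify_decompression_error(line):
    """Return a summary dict for a TLE_ERROR line, or None if it does not match."""
    m = TLE_ERROR_RE.search(line)
    if m is None:
        return None
    rc = int(m.group("rc"))
    actual = int(m.group("actual"))
    ret = int(m.group("ret"))
    return {
        "tape": m.group("tape"),
        "rc": rc,
        # Step 1a: rc = -2 through -6 are mostly temporary errors.
        "rc_in_temporary_range": -6 <= rc <= -2,
        # Step 1a: equal sizes suggest a temporary error worth retrying.
        "sizes_match": actual == ret,
    }
```

A result with "sizes_match" set to True only suggests that a retry is likely to succeed; the retry outcome still has to be confirmed in the logs.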
    
    b. RC=-1 can indicate an actual hardware failure, but even these errors are sometimes temporary: the card either recovers automatically or recovers when the server is rebooted.
        i. Check whether subsequent compression/decompression requests were successful.  If so, the card recovered.

           Note: Two compression cards can be installed (each with 4 processors), so verify that the card that reported the original error recovered and that the retry did not simply go through the second card.

        ii. If the card did not recover automatically, reboot the VTL server and check for the following messages during bootup (this sample shows 2 cards, instances 0 and 1, each with 4 processors):


                          >>>************ Start sample messages *************<<<
Jun 11 09:46:59 cunyvtl-bottom h9630vtl_drv: [ID 259371 kern.notice] NOTICE: h9630 card type = EXPRESS
Jun 11 09:46:59 cunyvtl-bottom h9630vtl_drv: [ID 879293 kern.notice] NOTICE: h9630vtl_proc_reset() Proc reset complete: instance=0, proc_index=0
Jun 11 09:46:59 cunyvtl-bottom h9630vtl_drv: [ID 879293 kern.notice] NOTICE: h9630vtl_proc_reset() Proc reset complete: instance=0, proc_index=1
Jun 11 09:46:59 cunyvtl-bottom h9630vtl_drv: [ID 879293 kern.notice] NOTICE: h9630vtl_proc_reset() Proc reset complete: instance=0, proc_index=2
Jun 11 09:46:59 cunyvtl-bottom h9630vtl_drv: [ID 879293 kern.notice] NOTICE: h9630vtl_proc_reset() Proc reset complete: instance=0, proc_index=3
Jun 11 09:46:59 cunyvtl-bottom h9630vtl_drv: [ID 449429 kern.notice] NOTICE: Supported Interrupt Priority = 1
Jun 11 09:46:59 cunyvtl-bottom h9630vtl_drv: [ID 950684 kern.notice] NOTICE: h9630vtl_attach() POST Passed, instance=0

Jun 11 09:46:59 cunyvtl-bottom h9630vtl_drv: [ID 259371 kern.notice] NOTICE: h9630 card type = EXPRESS
Jun 11 09:47:00 cunyvtl-bottom h9630vtl_drv: [ID 879293 kern.notice] NOTICE: h9630vtl_proc_reset() Proc reset complete: instance=1, proc_index=0
Jun 11 09:47:00 cunyvtl-bottom h9630vtl_drv: [ID 879293 kern.notice] NOTICE: h9630vtl_proc_reset() Proc reset complete: instance=1, proc_index=1
Jun 11 09:47:00 cunyvtl-bottom h9630vtl_drv: [ID 879293 kern.notice] NOTICE: h9630vtl_proc_reset() Proc reset complete: instance=1, proc_index=2
Jun 11 09:47:00 cunyvtl-bottom h9630vtl_drv: [ID 879293 kern.notice] NOTICE: h9630vtl_proc_reset() Proc reset complete: instance=1, proc_index=3
Jun 11 09:47:00 cunyvtl-bottom h9630vtl_drv: [ID 449429 kern.notice] NOTICE: Supported Interrupt Priority = 1
                          >>>************* End sample messages *************<<<

     If the above messages are seen, the issue was temporary and no further action is required.

     Note: Refer to doc ID 1124273.1 for the list of HiFN hardware compression driver return codes.
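
The bootup check in step 1.b.ii can be approximated with a short script. This is a rough Python sketch, assuming the driver messages match the sample text above; the function name, the returned keys, and the default of 4 processors per card are illustrative choices. Because the sample does not show a "POST Passed" line for instance 1, the sketch reports that flag separately instead of requiring it.

```python
import re
from collections import defaultdict

# Patterns copied from the sample h9630vtl_drv boot messages (assumption:
# other driver builds use the same wording).
RESET_RE = re.compile(r"Proc reset complete: instance=(\d+), proc_index=(\d+)")
POST_RE = re.compile(r"h9630vtl_attach\(\) POST Passed, instance=(\d+)")


def summarize_boot_messages(lines, procs_per_card=4):
    """Summarize, per card instance, which processors reported a reset."""
    resets = defaultdict(set)
    post_passed = set()
    for line in lines:
        m = RESET_RE.search(line)
        if m:
            resets[int(m.group(1))].add(int(m.group(2)))
            continue
        m = POST_RE.search(line)
        if m:
            post_passed.add(int(m.group(1)))

    summary = {}
    for instance in sorted(set(resets) | post_passed):
        seen = resets.get(instance, set())
        summary[instance] = {
            "procs_reset": sorted(seen),
            "all_procs_reset": seen == set(range(procs_per_card)),
            "post_passed": instance in post_passed,
        }
    return summary
```

An instance that is missing from the summary, or that has "all_procs_reset" set to False, is a candidate for closer inspection of the full boot log.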

2. If errors recur on the same vtape(s), check for errors during the last write to that vtape.
    a. Check the backup client logs to determine the time of the last write.
    b. Check the backup logs for errors that occurred during that time.
    c. Check the VTL messages logs for errors that occurred during that time.
    d. Check the disk array event log for errors that occurred during that time.
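
Once the time of the last write is known, the log review in step 2 amounts to filtering each log for entries near that time. The following Python sketch shows one way to do this for syslog-style lines such as the samples in this document. It assumes timestamps of the form "Jun 7 19:45:21" with English month abbreviations and no year, so the year is taken from the last-write time unless supplied. The function name and the 30-minute default window are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Leading syslog-style timestamp, e.g. "Jun 7 19:45:21" or "Jun 11 09:46:59".
STAMP_RE = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})")


def lines_near(lines, last_write, window_minutes=30, year=None):
    """Return the lines whose timestamp is within the window around last_write."""
    # The log lines carry no year, so one has to be assumed.
    year = last_write.year if year is None else year
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        m = STAMP_RE.match(line)
        if m is None:
            continue  # skip lines without a recognizable timestamp
        month, day, clock = m.groups()
        stamp = datetime.strptime(
            f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
        )
        if abs(stamp - last_write) <= window:
            hits.append(line)
    return hits
```

Backup application logs and disk array event logs usually use their own timestamp formats, so the pattern would need to be adapted for each log source.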

3. If no issues were found in the logs for the last write, dump the virtual tape header and send it to VTL engineering for review to determine whether it is corrupted.
    - Contact VTL Support for the procedure for dumping a virtual tape.

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.