Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1011590.1
Update Date:2011-05-27
Keywords:

Solution Type  Technical Instruction Sure

Solution  1011590.1 :   How to check for Windows platform disk errors and online/offline status  


Related Items
  • Sun Fire X4600 M2 Server
  • Sun Blade X8420 Server Module
  • Sun Fire X4200 M2 Server
  • Sun Blade X6220 Server Module
  • Sun Fire V20z Compute Grid Rack System
  • Sun Java Workstation W2100z
  • Sun Ultra 20 Workstation
  • Sun Fire V65x Server
  • Sun Fire V60x Server
  • Sun Blade X6450 Server Module
  • Sun Fire X4440 Server
  • Sun Ultra 20 M2 Workstation
  • Sun Blade X6275 Server Module
  • Sun Fire X2200 M2 Server
  • Sun Fire X4170 Server
  • Sun Fire X4600 Server
  • Sun Fire X4150 Server
  • Sun Blade X8450 Server Module
  • Sun Fire X4275 Server
  • Sun Fire X4100 Server
  • Sun Fire X4450 Server
  • Sun Blade X8440 Server Module
  • Sun Fire X4500 Server
  • Sun Fire V20z Server
  • Sun Blade X6250 Server Module
  • Sun Fire X4270 Server
  • Sun Blade X6270 Server Module
  • Sun Blade X6420 Server Module
  • Sun Ultra 40 Workstation
  • Sun Fire X4100 M2 Server
  • Sun Java Workstation W1100z
  • Sun Fire V40z Server
  • Sun Fire X4540 Server
  • Sun Fire V60x Compute Grid Rack System
  • Sun Fire X4200 Server
  • Sun Ultra 40 M2 Workstation
  • Sun Fire X2100 Server
  • Sun Fire X4250 Server
  • Sun Fire X2100 M2 Server
Related Categories
  • GCS>Sun Microsystems>Servers>x64 Servers

PreviouslyPublishedAs
215899





Description
This document describes how to check disk status and disk errors on Windows platforms, and how to manage the related storage devices.

Steps to Follow
How to check for Windows platform disk errors and online/offline status

Overview

There are too many different kinds of disks and arrays to cover each of them in detail.

Storage devices can, however, be divided into two broad classes:

  • Stand-alone disks - JBOD and internal drives not controlled by RAID.
  • RAID-controlled storage

Stand-alone disks - JBOD and non-RAID storage

This is the simplest case. When an event occurs, the driver associated with the chip that controls your disk takes care of all communication with the OS and the user.

Common events are:

  • read/write errors
  • catastrophic disk failures
  • I/O interruption
  • Fibre channel link online/offline
  • Communication timeouts

Because there is no redundancy, any of these events has a direct impact on the applications that use the device or resource. For this reason, expect additional application errors as well.

RAID volumes

The nature of a hardware RAID controller is to hide direct disk access and create a pseudo volume for mount/read/write access.

RAID 1 and RAID 5 volumes provide redundancy and tolerate the failure of a single disk drive. If your OS is not well configured, this redundancy can also hide problems occurring at the lower hardware layer.

If a disk is under the control of a hardware RAID device, the driver specific to your controller manages every operation on the logical volume itself.

In case of a disk failure, a Fibre Channel link going offline/online, a hotplug event, etc., the driver takes care of all communication with the OS and the user.

A number of packages for viewing RAID array disk configurations are available from the RAID controller vendors' download sites:

- For Nvidia RAID, download and install Nvidia MediaShield.

- For LSI RAID, download and install LSI MegaRAID Storage Manager.

- For Adaptec RAID (including Sun STK SAS HBAs), download and install Sun StorageTek RAID Manager (StorMan).

These are third-party packages, so refer to the documentation distributed with each package. Note: Sun may supply the appropriate storage manager packages for platform integrated/onboard RAID controllers, or for RAID Host Bus Adapter cards purchased with the system.

Check the Tools & Drivers disc that shipped with the system, or the Sun download pages for the product.

In addition to software management, a hardware RAID controller always gives you the option of rebooting the platform and entering the BIOS disk setup utility.

This requires no additional software and permits a quick low-level check of the disk status.

- For Nvidia RAID, press <F10> when prompted during the initial stages of a reboot.

- For LSI Platform RAID, press <CTRL><C> when prompted during the initial stages of a reboot.

- For LSI MegaRAID, press <CTRL><M> when prompted during the initial stages of a reboot.

- For Adaptec RAID (including Sun STK SAS HBAs), press <CTRL><A> when prompted during the initial stages of a reboot.

- For external storage arrays, out-of-band management software is usually available for the OS or on the storage itself, for example a web interface running on the storage controller board.

How to read generated event logs

In any Windows release, the first thing to do when you have or suspect a storage problem is to check the Windows Event Viewer.

  • In Windows XP: Click Start, and then click Control Panel. Click Performance and Maintenance, then click Administrative Tools.
    Double-click Computer Management. In the console tree, click Event Viewer.
  • In Windows Server 2003 and 2008: Click Start, then Programs, Administrative Tools, Computer Management. In the console tree, click Event Viewer.

In Event Viewer, check the System log.
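On Windows Server 2008 and later, the System log can also be read from the command line with the bundled wevtutil tool (it is not present on Windows XP or Windows Server 2003). The following Python sketch is not part of the original procedure. It only illustrates how such a query might be built, and it assumes that the provider names taken from the examples in this document (disk, Ftdisk, Ntfs, lsi_sas, nvgts) match the ones on your system; newer Windows releases and other controller drivers may log under different names.

```python
import subprocess

# Assumption: provider names taken from the example events in this document.
# Adjust the list to the drivers actually installed on your system.
DISK_SOURCES = ("disk", "Ftdisk", "Ntfs", "lsi_sas", "nvgts")


def build_wevtutil_query(sources, max_events=50):
    """Return the wevtutil argument list that reads recent System events."""
    # XPath filter that matches any of the given provider names.
    providers = " or ".join("@Name='%s'" % name for name in sources)
    xpath = "*[System[Provider[%s]]]" % providers
    return [
        "wevtutil", "qe", "System",
        "/q:" + xpath,
        "/c:%d" % max_events,  # limit the number of events returned
        "/rd:true",            # newest events first
        "/f:text",             # human-readable output
    ]


def query_system_log(sources=DISK_SOURCES, max_events=50):
    """Run wevtutil and return its text output (Windows only)."""
    cmd = build_wevtutil_query(sources, max_events)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Show the command that would be executed; call query_system_log() on the
# Windows system itself to actually read the log.
print(" ".join(build_wevtutil_query(DISK_SOURCES, 20)))
```

Keeping the command construction separate from its execution makes it easy to review the filter before running it on a production system.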

How to check and manage storage resources

An easy way to inspect your generic "logical" storage resources is, again, the "Computer Management" GUI.

See the previous section for how to start it.

In the console tree, go to Storage --> Disk Management.

If management software specific to your storage or chipset is available, use it.
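The same online/offline information shown by Disk Management can also be listed from a command prompt with the bundled diskpart utility. The Python sketch below is an illustration only, not part of the original procedure: it writes a small diskpart script containing the read-only commands "list disk" and "list volume" and runs it with "diskpart /s". The helper names are hypothetical, and diskpart must be run with administrator rights.

```python
import os
import subprocess
import tempfile


def diskpart_script():
    """Return a read-only diskpart script that lists disks and volumes."""
    return "list disk\nlist volume\nexit\n"


def diskpart_command(script_path):
    """Return the argument list that runs diskpart with a script file."""
    return ["diskpart", "/s", script_path]


def list_disks():
    """Run the script and return diskpart's text output (Windows only)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Write the script to a temporary file that diskpart can read.
        with os.fdopen(fd, "w") as handle:
            handle.write(diskpart_script())
        result = subprocess.run(diskpart_command(path),
                                capture_output=True, text=True, check=True)
        return result.stdout
    finally:
        os.remove(path)


# Show the script that would be run; call list_disks() on the Windows
# system itself to get the actual disk and volume status.
print(diskpart_script())
```

The "list disk" output includes a Status column for each disk, which is the quickest way to spot a disk that has gone offline.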

How to collect additional system configuration information

Windows offers a couple of useful tools designed to collect system configuration data such as the OS version, drivers, events, etc.

These are Msinfo32 (bundled with Windows) and the Microsoft Product Support Reporting Tools, which must be downloaded from the Microsoft website.

msinfo32:

You can start it manually: [Start]->[Run]->msinfo32
Then export the system information to a text file:
File->Export...
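The export can also be done without the GUI, because msinfo32 accepts a /report switch that writes the same information to a text file. The Python sketch below is an illustration only: the function names and the output path are hypothetical, and on some Windows releases msinfo32.exe is not on the search path, so the full path to the executable may be needed.

```python
import subprocess


def msinfo32_report_command(report_path, executable="msinfo32"):
    """Return the argument list that saves a text report to report_path."""
    return [executable, "/report", report_path]


def export_system_info(report_path):
    """Run msinfo32 and wait until the report is written (Windows only)."""
    subprocess.run(msinfo32_report_command(report_path), check=True)
    return report_path


# Hypothetical output location; pick any writable path on the system.
print(" ".join(msinfo32_report_command(r"C:\temp\sysinfo.txt")))
```

Generating the report can take several minutes on a large system, so the call is best run outside of peak hours.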

MPS report (Microsoft Product Support's Reporting Tools):

It is a compressed software package that contains one or more scripts and other utilities that you can use
to capture critical system, diagnostic, and configuration information about your system.
For a detailed description and download links, see: http://www.microsoft.com/downloads/details.aspx?familyid=cebf3c7c-7ca5-408f-88b7-f9c79b7306c0.

See also "How to collect useful configuration information about my system?" section in <Document: 1007054.1>  “How to handle Microsoft Windows Panics on x64 platforms”.

References

Article ID: 308427 - How to view and manage event logs in Event Viewer in Windows XP:
http://support.microsoft.com/kb/308427/en-us

<Document: 1008396.1>  - How to Identify Optical and Hard Disk Firmware Revisions for Checking of Known Issues

Examples

The following are examples of errors logged into the "Event Viewer - System" when a disk failure occurs.

Disk failure of an internal SAS drive in a Sun Fire X4100 running Windows Server 2003 (no RAID)

12/20/2007      1:37:11 PM      Ftdisk  Warning Disk    57      N/A     X4100-W2K3-R2   The system failed to flush data to the transaction log. Corruption may occur.
12/20/2007      1:37:11 PM      Ftdisk  Warning Disk    57      N/A     X4100-W2K3-R2   The system failed to flush data to the transaction log. Corruption may occur.
12/20/2007      1:37:10 PM      Application Popup       Information     None    26      N/A     X4100-W2K3-R2   Application popup: Windows - Delayed Write Failed : Windows was unable to save all the data for the file \Device\HarddiskVolume6\AC-PKC-5.0_BN59\Extras\App Only.msi. The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.
12/20/2007      1:37:10 PM      Ntfs    Warning None    50      N/A     X4100-W2K3-R2   {Delayed Write Failed} Windows was unable to save all the data for the file . The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.
12/20/2007      1:37:09 PM      Ntfs    Error   Disk    55      N/A     X4100-W2K3-R2   The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume Extradisk.
12/20/2007      1:37:09 PM      PlugPlayManager Error   None    12      N/A      X4100-W2K3-R2   The device 'SEAGATE ST973401LSUN72G SCSI Disk Device' (SCSI\Disk&Ven_SEAGATE&Prod_ST973401LSUN72G&Rev_0556\5&e7b2690&0&000300) disappeared from the system without first being prepared for removal.
12/20/2007      1:37:09 PM      Disk    Error   None    15      N/A     X4100-W2K3-R2   The device, \Device\Harddisk1, is not ready for access yet.
12/20/2007      1:37:09 PM      Ftdisk  Warning Disk    57      N/A     X4100-W2K3-R2   The system failed to flush data to the transaction log. Corruption may occur.
12/20/2007      1:37:09 PM      lsi_sas Error   None    11      N/A     X4100-W2K3-R2   The driver detected a controller error on \Device\RaidPort0.

Disk failure of an internal SATA drive in a Sun Ultra 20 M2 running Windows Server 2003 (NVIDIA RAID 1 configured)

12/19/2007	7:43:46 PM	NVRAIDSERVICE	Warning	None	999	N/A	TESTSYS	Disk HDS728080PLA380 has been removed from array NVIDIA  Mirroring   74.53G.
12/19/2007	7:33:17 PM	NVRAIDSERVICE	Warning	None	1003	N/A	TESTSYS	Disk HDS728080PLA380 is gone or has been removed on port SATA 1.1.
12/19/2007	7:33:17 PM	NVRAIDSERVICE	Information	None	1004	N/A	TESTSYS	Array NVIDIA  Mirroring   74.53G searching for spare disk.
12/19/2007	7:33:17 PM	NVRAIDSERVICE	Information	None	1001	N/A	TESTSYS	New disk detected: HDS728080PLA380.
12/19/2007	7:33:17 PM	NVRAIDSERVICE	Warning	None	999	N/A	TESTSYS	Disk HDS728080PLA380 has been removed from array NVIDIA  Mirroring   74.53G.
12/19/2007	7:33:16 PM	NVRAIDSERVICE	Error	None	1006	N/A	TESTSYS	Access failure: Critical error on disk HDS728080PLA380 (Port: SATA 1.1).

Disk failure of an internal SATA drive in a Sun Ultra 20 M2 running Windows Server 2003 (no RAID)

12/20/2007      1:42:22 PM      Ftdisk  Warning Disk    57      N/A     TESTSYS The system failed to flush data to the transaction log. Corruption may occur.
12/20/2007      1:42:22 PM      Ftdisk  Warning Disk    57      N/A     TESTSYS The system failed to flush data to the transaction log. Corruption may occur.
12/20/2007      1:42:21 PM      PlugPlayManager Error   None    12      N/A     TESTSYS The device 'HDS72808 0PLA380 SCSI Disk Device' (SCSI\Disk&Ven_HDS72808&Prod_0PLA380&Rev_PF2O\4&34808028&0&010100) disappeared from the system without first being prepared for removal.
12/20/2007      1:42:21 PM      Disk    Warning None    51      N/A     TESTSYS An error was detected on device \Device\Harddisk1 during a paging operation.
12/20/2007      1:42:21 PM      Disk    Warning None    51      N/A     TESTSYS An error was detected on device \Device\Harddisk1 during a paging operation.
12/20/2007      1:42:21 PM      Disk    Warning None    51      N/A     TESTSYS An error was detected on device \Device\Harddisk1 during a paging operation.
12/20/2007      1:42:21 PM      Disk    Warning None    51      N/A     TESTSYS An error was detected on device \Device\Harddisk1 during a paging operation.
12/20/2007      1:42:21 PM      Disk    Warning None    51      N/A     TESTSYS An error was detected on device \Device\Harddisk1 during a paging operation.
12/20/2007      1:42:17 PM      nvgts   Error   None    9       N/A     TESTSYS The device, \Device\Scsi\nvgts1, did not respond within the timeout period.
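When a long log export has to be reviewed, entries like the ones above can be split into fields and filtered by script. The Python sketch below is an illustration only. It assumes the column order seen in the examples (date, time, source, type, category, event ID, user, computer, description), separated by tabs or runs of spaces; the field names are chosen for this example and are not part of any Windows tool.

```python
import re

# Assumed column order, as seen in the exported examples of this document.
EVENT_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4})\s+"
    r"(?P<time>\d{1,2}:\d{2}:\d{2} [AP]M)\s+"
    r"(?P<source>.+?)\s+"                       # may contain spaces
    r"(?P<level>Error|Warning|Information)\s+"
    r"(?P<category>\S+)\s+"
    r"(?P<event_id>\d+)\s+"
    r"(?P<user>\S+)\s+"
    r"(?P<computer>\S+)\s+"
    r"(?P<message>.*)$"
)


def parse_event(line):
    """Return a dict of fields for one exported event line, or None."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    event["event_id"] = int(event["event_id"])
    return event


def errors_only(lines):
    """Return the parsed events whose type is Error."""
    events = (parse_event(line) for line in lines)
    return [e for e in events if e is not None and e["level"] == "Error"]


# Two lines copied from the first example in this document.
sample = [
    r"12/20/2007      1:37:09 PM      Disk    Error   None    15      N/A     X4100-W2K3-R2   The device, \Device\Harddisk1, is not ready for access yet.",
    r"12/20/2007      1:37:09 PM      Ftdisk  Warning Disk    57      N/A     X4100-W2K3-R2   The system failed to flush data to the transaction log. Corruption may occur.",
]

for event in errors_only(sample):
    print(event["time"], event["source"], event["event_id"], event["message"])
```

Reading the Error entries oldest first usually points to the failing layer, since in the examples above the controller driver (lsi_sas, nvgts) reports the problem before the Disk, Ntfs and Ftdisk entries follow.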

Previously published as:
91572

Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.