Sun Microsystems, Inc.  Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition

Asset ID: 1-71-1003122.1
Update Date:2011-05-31
Keywords:

Solution Type  Technical Instruction Sure Solution

1003122.1 :   Veritas Volume Manager - Procedure to Replace Internal FibreChannel (FC) Disks controlled by VxVM


Related Items
  • Sun Fire V480 Server
  • Sun Fire 280R Server
  • Veritas Volume Manager (VxVM) Software
  • Sun Fire V880z Visualization Server
  • Solaris Cluster
  • Sun Fire V890 Server
  • Sun Fire V880 Server
  • Sun Fire V490 Server
Related Categories
  • GCS>Sun Microsystems>Storage Software>Data Management Software - Disk
  • GCS>Sun Microsystems>Servers>Entry-Level Servers
  • GCS>Sun Microsystems>Enterprise Computing>High Availability Clustering

PreviouslyPublishedAs
204286


Description
A specific procedure must be used when replacing one of the internal disks in a system with internal fibre drives (Sun Fire[TM] 280R, Sun Fire[TM] V480, Sun Fire[TM] V490, Sun Fire[TM] V880, Sun Fire[TM] V890), especially if the disk is under Veritas Volume Manager (VxVM) control.

Although these disks are hot-swappable, the procedure below should be used so that VxVM is aware that the drive is being replaced. This document applies to Veritas Volume Manager (VxVM) 3.x and above. It also assumes that you are running one of the following:

   Solaris[TM] 8, 9 or 10 Operating System (OS)
   Solaris[TM] 7 OS with kernel patch 106541-08 or higher

One of these releases is required because the procedure relies on the devfsadm command.

Failure to follow this procedure could result in a duplicate entry for the replaced disk in VxVM. This is most notable when running a 'vxdisk list' command.

For example:

  # vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            error
c1t1d0s2     sliced    -            -            error

The extra device will disappear after the next reboot, which seems to be the only way to remove it. Therefore, it is best to prevent the duplicate device from being created in the first place. This is accomplished by the following procedure. Steps 9a - 9c pertain only to Sun[TM] Cluster 3.x installations.
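
The duplicate entry described above can also be detected by filtering the output of 'vxdisk list' instead of reading the listing by eye. The following is a minimal sketch and not part of the original procedure: the function name find_duplicate_devices is made up for illustration, and it assumes the column layout shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not a VxVM utility): print each DEVICE name that
# appears more than once in 'vxdisk list' output read from standard input.
find_duplicate_devices() {
    awk '$1 != "DEVICE" && $1 != "-" && NF > 0 { seen[$1]++ }
         END { for (dev in seen) if (seen[dev] > 1) print dev }'
}

# Sample input copied from the listing above.
find_duplicate_devices <<'EOF'
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            error
c1t1d0s2     sliced    -            -            error
EOF
```

For the sample input this prints c1t1d0s2. On a live system the function would be used as 'vxdisk list | find_duplicate_devices'; no output means no duplicate device names were found.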

If the disk is not under VxVM control, you can skip the VxVM-specific steps (2 through 4, 10 through 12, and 14).



Steps to Follow
NOTE: All data on these devices should have been backed up.



Before replacing any disk under VxVM control, it should be in either a 'failed' or 'removed' state:

  # vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            online
-            -         disk01       rootdg       failed was:c1t1d0s2

If the disk does not show up as "failed was", as shown above, then you should run 'vxdiskadm' and choose option #4 to remove the disk for replacement. After running 'vxdiskadm', the output should look like this:

  # vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            online
-            -         disk01       rootdg       removed was:c1t1d0s2
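
The check that the disk is in a 'failed' or 'removed' state can be scripted. The sketch below is an illustration only: replacement_state is a hypothetical function name, it relies on the 'was:<device>' text shown in the listings above, and it needs an awk that supports the -v option (on Solaris, nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk).

```shell
#!/bin/sh
# Hypothetical helper: read 'vxdisk list' output on standard input and print
# "failed" or "removed" when a "<state> was:<device>" record exists for the
# device named in $1; print "not-ready" otherwise.
replacement_state() {
    awk -v dev="$1" '
        $1 == "-" && $NF == ("was:" dev) { state = $(NF - 1) }
        END {
            if (state != "failed" && state != "removed") state = "not-ready"
            print state
        }'
}

# Sample input copied from the listing above.
replacement_state c1t1d0s2 <<'EOF'
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            online
-            -         disk01       rootdg       removed was:c1t1d0s2
EOF
```

For the sample input this prints removed. A result of not-ready would indicate that 'vxdiskadm' option #4 still needs to be run for that device.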

NOTE:
If this is a root-disk or root-mirror, record the following information about
the disk being removed before this operation. It is needed later to change nvramrc.

  • WWN
     For example,
     # ls -al /dev/rdsk/c1t0d0s0
     lrwxrwxrwx   1 root     root          74 Mar  6  2003 c1t0d0s0 -> ../../devices/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0:a,raw
  • devalias and boot-device in nvramrc
     For example,
     # eeprom nvramrc
     devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0:a
     devalias mirrdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19838,0:a
     boot-device=rootdisk mirrdisk
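
Recording the WWN is less error-prone if it is extracted from the device path rather than copied by hand. A minimal sketch follows; wwn_from_path is a hypothetical name, and the pattern assumes the 'ssd@w<16 hex digits>,<lun>' form seen in the device path above.

```shell
#!/bin/sh
# Hypothetical helper: print the 16-hex-digit WWN embedded in a device path.
# Prints nothing if the path does not contain "ssd@w" followed by exactly
# 16 hexadecimal digits and a comma.
wwn_from_path() {
    printf '%s\n' "$1" |
        sed -n 's/.*ssd@w\([0-9a-fA-F]\{16\}\),.*/\1/p'
}

# Link target copied from the ls output above.
wwn_from_path '../../devices/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0:a,raw'
```

For the sample path this prints 21000004cfa19920. The same function can be applied to a devalias path from 'eeprom nvramrc' to confirm that an alias and a disk carry the same WWN; empty output points to a malformed or mistyped WWN.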
1. If this is a root-disk or root-mirror, use the dumpadm command to ensure
   that the dump device is not on the failed disk. If it is, move it to the
   good side of the mirror, for example:
     # dumpadm -d /dev/dsk/c1t0d0s1
2. If vxdiskadm option 4 was used to remove the disk for replacement, instruct
   VxVM to re-read the device tree by running the command
     # vxdctl enable
3. Put the disk into the "offline" state with the following command:
     # vxdisk offline c1t1d0s2
4. Verify the disk has been marked "offline" with "vxdisk list":
     # vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            offline
-            -         disk01       rootdg       removed was:c1t1d0s2
5. Once Veritas has recognized the disk as offline and ready for replacement,
   you need to tell the operating system. This is done as follows:
     # /usr/sbin/luxadm remove_device /dev/rdsk/c1t1d0s2
   This will produce output similar to the following:
     WARNING!!! Please ensure that no file systems are mounted on these device(s).
     All data on these devices should have been backed up.
     The list of devices which will be removed is:
     1: Device name: /dev/rdsk/c1t1d0s2 Node WWN: 20000020371b1f31
        Device Type: Disk device
        Device Paths: /dev/rdsk/c1t1d0s2
     Please verify the above list of devices and then enter c or <CR> to
     Continue or q to Quit. [Default: c]:c
     stopping: /dev/rdsk/c1t1d0s2.... Done
     offlining: /dev/rdsk/c1t1d0s2.... Done
     The drives are now off-line and spun down.
     Physically remove the disk and press the Return key.
     Hit <Return> after removing the device(s).
     <date> <systemname> picld[87]: Device DISK1 removed
     Device: /dev/rdsk/c1t1d0s2
     No FC devices found. - /dev/rdsk/c1t1d0s2

NOTE: The picld daemon notifies the system that the disk has been removed.

If no errors are printed, continue to step 6.

Otherwise, if you receive any errors during this step:

  • physically pull the bad disk from the host
  • run the commands:
     # vxdisk rm c1t1d0s2
     # luxadm -e offline /dev/rdsk/c1t1d0s2
  • if the disk is multipathed, run 'luxadm -e offline' on the second path as well.
6. Initiate the devfsadm cleanup routines by entering the following command:
     # /usr/sbin/devfsadm -C -c disk
   By default, devfsadm attempts to load every driver in the system and
   attach these drivers to all possible device instances. It then creates
   device special files in the /devices directory and logical links in /dev.
   With the "-c disk" option, devfsadm only updates disk device files. This
   saves time and is important on systems that have tape devices attached,
   because rebuilding the tape device entries could cause undesirable results
   on non-Sun hardware.
   The -C option cleans up the /dev directory and removes any dangling
   logical links.
   This should remove all the device paths for this particular disk. This can
   be verified with:
     # ls -ld /dev/dsk/c1t1d*
   This should return no devices.
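
The 'ls' check above can be turned into a test that a script can act on. This is a sketch under stated assumptions: count_device_links is a hypothetical name, and the code needs a POSIX shell (on Solaris, /usr/xpg4/bin/sh or ksh, because the original /bin/sh lacks $(( )) arithmetic).

```shell
#!/bin/sh
# Hypothetical helper: count the entries in directory $1 whose names start
# with the controller/target prefix $2 (for example c1t1d).
count_device_links() {
    dir=$1
    prefix=$2
    count=0
    for link in "$dir"/"$prefix"*; do
        # An unmatched glob stays literal, so confirm that the entry exists.
        # The -h test also counts dangling symbolic links.
        if [ -e "$link" ] || [ -h "$link" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Intended use after the devfsadm cleanup:
#   count_device_links /dev/dsk c1t1d
```

A result of 0 corresponds to 'ls' returning no devices. Any other value means links for the removed disk are still present and the cleanup should be investigated before continuing.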
7. Verify that the reference to this disk is gone by running the commands
     # vxdisk list     (only if the disk is under VxVM control)
     # format
   It is now safe to physically replace the disk.
8. After replacing the disk, create the necessary entries in the Solaris OS
   device tree with one of the following commands:
     # devfsadm
   or
     # /usr/sbin/luxadm insert_device <enclosure_name,sx>
   where sx is the slot number.
   NOTE: In many cases, luxadm insert_device does not require the enclosure
   name and slot number.
   Use the following to find the slot number:
     # luxadm display <enclosure_name>
   To find the <enclosure_name> use:
     # luxadm probe
   Run "ls -ld /dev/dsk/c1t1d*" to verify that the new device paths have been
   created.
   NOTE: After the disk is inserted and devfsadm (or luxadm) is run, the new
   disk is assigned a different ssd instance number from the old one. This
   change is expected and can be ignored.
   For example, an error is reported on the old disk (ssd3):
     WARNING: /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0 (ssd3):
     Error for Command: read(10)                Error Level: Retryable
     Requested Block: 15392944                  Error Block: 15392958
   After the new disk is inserted, it appears as ssd10:
     picld[287]: [ID 727222 daemon.error] Device DISK0 inserted
     qlc: [ID 686697 kern.info] NOTICE: Qlogic qlc(2): Loop ONLINE
     scsi: [ID 799468 kern.info] ssd10 at fp2: name w21000011c63f0c94,0, bus address ef
     genunix: [ID 936769 kern.info] ssd10 is  /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0
     scsi: [ID 365881 kern.info]    <SUN72G cyl 14087 alt 2 hd 24 sec 424>
     genunix: [ID 408114 kern.info] /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0 (ssd10) online
9. Label the disk using the format command.
   If the disk is under VxVM control, be sure to write an SMI label (Solaris 9
   4/03 OS or later):
     # format -e /dev/rdsk/c1t1d0s2
     ...
     format> l
     [0] SMI Label
     [1] EFI Label
     Specify Label type[1]: 0
     Auto configuration via format.dat[no]? no
     Auto configuration via generic SCSI-2[no]? yes
     Ready to label disk, continue? yes
   If the disk is not under VxVM control, label it according to local
   requirements; if there are none, it can be labeled with a standard VTOC.

Steps 9a - 9c are only required if this is a system running Sun Cluster.

For Sun Cluster releases earlier than 3.2:

9a. /usr/cluster/bin/scdidadm -C

9b. /usr/cluster/bin/scdidadm -r

9c. /usr/cluster/bin/scgdevs

For Sun Cluster 3.2:

9a. /usr/cluster/bin/cldevice clear      (run on each node, or use "-n <node,...>" to specify nodes)

9b. /usr/cluster/bin/cldevice refresh    (run on each node, or use "-n <node,...>" to specify nodes)

9c. /usr/cluster/bin/cldevice populate   (run on any node)

Note: These commands may report errors for c0t0d0, which is the CD-ROM/DVD drive on Sun Fire V480, V880, and similar systems.

10. Instruct VxVM to re-read the device tree by running the command
    # vxdctl enable
11. The disk will remain in the "offline" state until the new disk is
    initialized.
    To initialize it, use the command line first:
    # vxdisksetup -i c1t1d0
    Then, use 'vxdiskadm' and choose option #5 to replace the failed or removed disk.
    - OR -
    Run 'vxdiskadm' and choose option #5 to initialize it and replace the
    failed or removed disk. In this case, vxdiskadm reports that "Access is
    disabled" for the new disk (because it is still "offline") and asks
    whether you wish to "enable access" to it. Answer 'yes' to this question.
12. The disk should now be online and functional, within the operating system
    and VxVM. Confirm this with "vxdisk list".
    NOTE: Do not reboot the system or perform step 15 (modify nvramrc) until
    synchronization is complete. If the system is rebooted before then, it
    cannot boot from the new disk. Check the progress of the synchronization
    with "vxtask list":
          # vxtask list
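
Waiting for the synchronization to finish can be automated by polling 'vxtask list'. The sketch below is an illustration that rests on an assumption about the output format: 'vxtask list' is assumed to print one header line beginning with TASKID followed by one line per running task, so any other non-empty line is treated as a task in progress. The name tasks_running is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the 'vxtask list' output on
# standard input contains at least one non-empty line other than the TASKID
# header; fail (exit status 1) otherwise.
tasks_running() {
    awk '$1 != "TASKID" && NF > 0 { found = 1 }
         END { if (found) exit 0; exit 1 }'
}

# Intended use on the live system (commented out because it needs VxVM):
#   while vxtask list | tasks_running; do
#       sleep 60          # polling interval is an arbitrary choice
#   done
#   echo "No VxVM tasks are running."
```

The loop treats every listed task as a reason to wait, which is deliberately conservative; it does not distinguish the resynchronization from other VxVM tasks.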
13. If the dump device was moved in step 1, move it back, for example:
          # dumpadm -d /dev/dsk/c1t1d0s1
14. If this was a root-disk or a root-mirror, be sure to run the /etc/vx/bin/vxbootsetup command. The vxbootsetup utility configures a disk by writing a boot track at the beginning of the disk and by creating physical disk partitions in the UNIX VTOC that match the mirrors of the root, swap, /usr and /var volumes.
    # /etc/vx/bin/vxbootsetup -g rootdg rootdisk
15. If this was a root-disk or root-mirror, then ensure the nvram aliases are
    updated so you can boot.
          # ls -al /dev/rdsk/<new disk>s0
    example:
          # ls -al /dev/rdsk/c1t1d0s0
    Compare the WWN in the ls output with the root disk and root mirror alias
    entries in the NVRAM (eeprom nvramrc).
    NOTE: The devalias entries in nvramrc must be changed from the removed
    disk's device path to the new disk's device path.
          For example,
          - List nvramrc before modifying it (removed disk information).
            # eeprom nvramrc
            devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0:a
            devalias mirrdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19838,0:a
          - List the new disk information.
            # ls -al /dev/rdsk/c1t0d0s0
            lrwxrwxrwx   1 root     root          74 Mar  6  2003 c1t0d0s0 -> ../../devices/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0:a,raw
          - Modify nvramrc.
            (This example is written for the Bourne shell.)
            # eeprom nvramrc='devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0:a[enter]
            devalias mirrdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19838,0:a'[enter]
          - List nvramrc after modifying it.
            # eeprom nvramrc
            devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0:a
            devalias mirrdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19838,0:a

NOTE:
If this is a root-disk or rootmirror, the device path contains the WWN of the
new disk. It is necessary to update the nvramrc devalias entries to the new
device path, so the system will be able to boot from the newly-replaced rootdisk
or rootmirror.
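
The devalias text for the new disk can be derived from the /dev/rdsk link target instead of being typed by hand. The following sketch only prints the entry and does not call eeprom. The function names are hypothetical, the conversion assumes the link target has the '../../devices/<path>,raw' form shown in the ls output above, and a POSIX shell is assumed for the $( ) command substitution.

```shell
#!/bin/sh
# Hypothetical helper: convert a /dev/rdsk link target into the device path
# used in a devalias entry by dropping the leading "../../devices" and the
# trailing ",raw".
obp_path_from_link() {
    printf '%s\n' "$1" | sed -e 's|^.*/devices/|/|' -e 's|,raw$||'
}

# Hypothetical helper: print a devalias line for alias $1 and link target $2.
devalias_line() {
    printf 'devalias %s %s\n' "$1" "$(obp_path_from_link "$2")"
}

# Link target of the new disk, copied from the ls output above.
devalias_line rootdisk '../../devices/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0:a,raw'
```

For the sample link target this prints 'devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0:a'. The printed line should be reviewed against the existing nvramrc contents before it is passed to eeprom, since nvramrc must contain every alias that is to be kept, not only the changed one.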



Product
VERITAS Volume Manager 3.0.2 Software
Sun Fire 280R Server
Sun Fire V880 Server
Sun Fire V480 Server
VERITAS Volume Manager 4.0 Software
VERITAS Volume Manager 3.5 Software
VERITAS Volume Manager 3.2 Software

Internal Comments
Procedure to Manually remove duplicate device issues

If the customer ends up with duplicate entries for the same device in vxdisk list, use the following procedure. If it does not remove the duplicate entries, have the customer perform a reconfiguration reboot.

  1. First, check that there are no old WWN entries in /dev/dsk/, /dev/rdsk/, /devices/<physical_path_to_device>, or /etc/path_to_inst. If there are old WWN entries, follow the appropriate sections from < Solution: 204444 >  to remove them.

  2. When finished, follow <Document: 1002285.1> to remove the duplicate entries under VxVM.

  3. If there are still duplicate disk IDs under VxVM, perform a reconfiguration reboot:


      # touch /reconfigure
      # init 6

VxVM, Volume Manager, suncluster, cluster, upgrade, upgrading, configure, configured, configuration, disk, failed, replace, replace disk, 480, 490, 280, 880, 890, V880, V890, V480, V490, 280R
Previously Published As
40842

Change History
Date: 2007-11-09
User Name: 31620
Action: Approved
Comment: Verified Metadata - ok
Verified Keywords - ok
Verified still correct for audience - currently set to contract
Audience left at contract as per FvF at
http://kmo.central/howto/content/voyager-contributor-standards.html
Checked review date - currently set to 2008-07-03
Checked for TM - ok as presented
Publishing under the current publication rules of 18 Apr 2005:
Checked for the word normalized - not present
Version: 54
Date: 2007-11-05
User Name: 31620
Action: Accept
Comment:
Version: 0

Date: 2007-11-05
User Name: 116529
Action: Approved
Comment: Looks fine, minor change. Please publish.
Version: 0

Date: 2007-11-05
User Name: 116529
Action: Accept
Comment:
Version: 0


Attachments
This solution has no attachment
  Copyright © 2011 Sun Microsystems, Inc.  All rights reserved.