Replacing Fiber Channel Drives

Copied from:

https://yavin.homelinux.org/index.php?id=23

Systemgeek’s Resource Site

Document ID: 40842
Title: Veritas Volume Manager – Procedure to Replace Internal FibreChannel (FC) Disks controlled by VxVM
This procedure requires one of the following operating system releases:
Solaris[TM] 9 Operating System (OS),
Solaris[TM] 8 OS, or
Solaris[TM] 7 with kernel patch 106541-08 or higher.

This is required to get the functionality of the devfsadm command. Failure to follow this procedure could result in a duplicate entry for the replaced disk in VxVM. This is most notable when running a ‘vxdisk list’ command.
For example:
# vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            error
c1t1d0s2     sliced    -            -            error

The extra device disappears after the next reboot, which seems to be the only way to remove it. It is therefore best to prevent the duplicate device from being created in the first place, which the following procedure accomplishes. Steps 9a – 9c pertain only to Sun[TM] Cluster 3.x installations. If the disk is not under VxVM control, you can skip the VxVM-specific steps (2-4 and 10-11).
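The duplicate entry described above can be spotted mechanically. The following is a minimal sketch, not part of the original procedure: the function name find_duplicate_devices is hypothetical, and it assumes the `vxdisk list` column layout shown in the example (device name in the first column, one header line).

```shell
# Print every device name that appears more than once in `vxdisk list` output.
# Reads the listing on stdin; the header line and "-" placeholder rows are skipped.
find_duplicate_devices() {
    awk 'NR > 1 && $1 != "-" { seen[$1]++ }
         END { for (dev in seen) if (seen[dev] > 1) print dev }'
}

# Usage on a live system (not run here):
#   vxdisk list | find_duplicate_devices
```

An empty result means no device is listed twice.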
Document Body:
NOTE: All data on these devices should have been backed up.
Before replacing any disk under VxVM control, it should be in either a ‘failed’ or ‘removed’ state:
# vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            online
-            -         disk01       rootdg       failed was:c1t1d0s2

If the disk does not show up as “failed was”, as shown above, then you should run ‘vxdiskadm’ and choose option #4 to remove the disk for replacement. After running ‘vxdiskadm’, the output should look like this:
# vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            online
-            -         disk01       rootdg       removed was:c1t1d0s2
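The precondition above (the disk must show as “failed was:” or “removed was:”) can be checked before going further. This is a sketch under the assumption that the status and the “was:<device>” marker are the last two fields of the row, as in the listings above; replacement_state is a hypothetical helper name.

```shell
# Print the replacement state ("failed" or "removed") recorded for a device,
# based on the "was:<device>" marker in `vxdisk list` output read on stdin.
# Exits non-zero if no such row exists for the device.
replacement_state() {
    awk -v dev="$1" '
        $NF == ("was:" dev) && ($(NF-1) == "failed" || $(NF-1) == "removed") {
            print $(NF-1)
            found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Usage on a live system (not run here):
#   vxdisk list | replacement_state c1t1d0s2 || echo "run vxdiskadm option 4 first"
```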

Cautions:
If this is a root-disk or root-mirror, record the following information about the disk being removed before you begin.
You need this information later to change nvramrc.
- WWN

For example,

# ls -al /dev/rdsk/c1t0d0s0
lrwxrwxrwx 1 root root 74 Mar 6 2003 c1t0d0s0 -> ../../devices/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0:a,raw

- devalias and boot-device in nvramrc

For example,

# eeprom nvramrc

devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0:a
devalias mirrdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19838,0:a

boot-device=rootdisk mirrdisk
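Because the WWN has to be copied by hand between the ls output and nvramrc, it can help to extract it rather than retype it. The sketch below assumes the device path contains the form ssd@w followed by a 16-digit hexadecimal WWN, as in the examples above; wwn_from_path is a hypothetical helper name.

```shell
# Print the 16-hex-digit WWN embedded in a device path or symlink listing.
# Prints nothing if no well-formed "ssd@w<16 hex digits>," component is found.
wwn_from_path() {
    printf '%s\n' "$1" | sed -n 's/.*ssd@w\([0-9a-fA-F]\{16\}\),.*/\1/p'
}

# Usage on a live system (not run here):
#   wwn_from_path "$(ls -l /dev/rdsk/c1t0d0s0)"
```

An empty result is a hint that the path was mistyped or truncated, since a WWN with fewer than 16 digits does not match.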
1. If this is a root-disk or root-mirror, use the dumpadm command to ensure that the
dump device is not on the failed disk. If it is, move it to the good side of
the mirror. For example:

# dumpadm -d /dev/dsk/c1t0d0s1
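The check in step 1 can be scripted. This is a sketch only: it assumes dumpadm, run without arguments, prints a line of the form “Dump device: /dev/dsk/c1t1d0s1 (swap)”, and the helper names dump_device and dump_on_disk are hypothetical.

```shell
# Print the configured dump device from `dumpadm` output read on stdin.
dump_device() {
    awk -F': *' '/Dump device/ { split($2, f, " "); print f[1] }'
}

# Succeed if the dump device (dumpadm output on stdin) is a slice of the
# given disk, for example c1t1d0.
dump_on_disk() {
    dev=$(dump_device)
    case "$dev" in
        /dev/dsk/"$1"s[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live system (not run here):
#   dumpadm | dump_on_disk c1t1d0 && echo "move the dump device first"
```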

2. If vxdiskadm option 4 is used to remove the disk for replacement, instruct VxVM to re-read the device tree by running the command

# vxdctl enable

3. Put the disk into the “offline” state with the following command:

# vxdisk offline c1t1d0s2

4. Verify the disk has been marked “offline” with “vxdisk list”:

# vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    -            -            offline
-            -         disk01       rootdg       removed was:c1t1d0s2

5. Once Veritas has recognized the disk as offline and ready for replacement, you need to tell the operating system. This is done as follows:

# /usr/sbin/luxadm remove_device /dev/rdsk/c1t1d0s2

This will produce output similar to the following:
WARNING!!! Please ensure that no file systems are mounted on these device(s).
All data on these devices should have been backed up.
The list of devices which will be removed is:
1: Device name: /dev/rdsk/c1t1d0s2 Node WWN: 20000020371b1f31
Device Type: Disk device
Device Paths: /dev/rdsk/c1t1d0s2
Please verify the above list of devices and then enter 'c' or <CR> to
Continue or 'q' to Quit. [Default: c]:c
stopping: /dev/rdsk/c1t1d0s2....Done
offlining: /dev/rdsk/c1t1d0s2....Done
The drives are now off-line and spun down.
Physically remove the disk and press the Return key.

Hit <Return> after removing the device(s).

picld[87]: Device DISK1 removed
Device: /dev/rdsk/c1t1d0s2
No FC devices found. – /dev/rdsk/c1t1d0s2

Note: The picld daemon notifies the system that the disk has been removed.
If no errors are printed, continue on to step 6.
Otherwise, if you receive any errors during this step, physically pull the bad disk from the host and run the following commands:
# vxdisk rm c1t1d0s2
# luxadm -e offline /dev/rdsk/c1t1d0s2

If the disk is multipathed, run ‘luxadm -e offline’ on the second path as well.
6. Initiate devfsadm cleanup subroutines by entering the following command:

# /usr/sbin/devfsadm -C -c disk

By default, devfsadm attempts to load every driver in the system and attach these drivers to all possible device instances. It then creates device special files in the /devices directory and logical links in /dev. With the “-c disk” option, devfsadm only updates disk device files. This saves time and is important on systems that have tape devices attached, because rebuilding these tape devices could cause undesirable results on non-Sun hardware.
The -C option cleans up the /dev directory, and removes any lingering logical links to the device link names.
This should remove all the device paths for this particular disk. This can be verified with:
# ls -ld /dev/dsk/c1t1d*

This should return no devices.
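The same verification can be expressed as a count, which is easier to use in a script than reading ls output. This is an illustrative sketch; count_disk_links is a hypothetical helper, and the directory is passed as an argument so the function can be tried against a scratch directory.

```shell
# Count the entries in a directory whose names start with the given disk name.
# Dangling symbolic links are counted too, since they are what -C should remove.
count_disk_links() {
    dir=$1
    disk=$2
    count=0
    for link in "$dir/$disk"*; do
        # An unmatched glob stays literal; the -e/-L tests filter that case out.
        if [ -e "$link" ] || [ -L "$link" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Usage on a live system (not run here):
#   [ "$(count_disk_links /dev/dsk c1t1d)" -eq 0 ] && echo "device links removed"
```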
7. Verify that the reference to this disk is gone by running the commands

# vxdisk list (if the disk is under vxvm control)
# format

It is now safe to physically replace the disk.
8. After replacing the disk, create the necessary entries in the Solaris OS device tree with one of the following commands:

# devfsadm

or
# /usr/sbin/luxadm insert_device <enclosure_name>,sx

where sx is the slot number.
Note: In many cases, luxadm insert_device does not require the enclosure name and slot number.
Use the following to find the slot number:
# luxadm display <enclosure_name>

To find the <enclosure_name> you can use:
# luxadm probe

Run “ls -ld /dev/dsk/c1t1d*” to verify that the new device paths have been created.
Cautions:
After you insert the disk and run devfsadm (or luxadm), the disk receives a new ssd instance number. This change is expected and can be ignored.
For example, suppose the error occurred on the following disk (ssd3):

WARNING: /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0 (ssd3):
Error for Command: read(10) Error Level: Retryable
Requested Block: 15392944 Error Block: 15392958

(After inserting disk)

picld[287]: [ID 727222 daemon.error] Device DISK0 inserted
qlc: [ID 686697 kern.info] NOTICE: Qlogic qlc(2): Loop ONLINE
scsi: [ID 799468 kern.info] ssd10 at fp2: name w21000011c63f0c94,0, bus address ef
genunix: [ID 936769 kern.info] ssd10 is /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0
scsi: [ID 365881 kern.info]
genunix: [ID 408114 kern.info] /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0 (ssd10) online
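To see which ssd instance the new disk received, the “ssdN is <path>” messages can be reduced to an instance-to-WWN table. This is a sketch only: it assumes each message sits on a single line in the system log (for example /var/adm/messages) in the format shown above, and ssd_instance_map is a hypothetical helper name.

```shell
# From kernel messages on stdin, print "<ssd instance> <WWN>" for every
# "ssdN is <device path>" line that contains a 16-hex-digit WWN.
ssd_instance_map() {
    sed -n 's/.*\(ssd[0-9][0-9]*\) is .*ssd@w\([0-9a-fA-F]\{16\}\),.*/\1 \2/p'
}

# Usage on a live system (not run here):
#   ssd_instance_map < /var/adm/messages
```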

9. Label the disk using format. If the disk is under VxVM control, be sure to
write an SMI label (Solaris 9 4/03 OS or later):

# format -e /dev/rdsk/c1t1d0s2

format> l
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Auto configuration via format.dat[no]? no
Auto configuration via generic SCSI-2[no]? yes
Ready to label disk, continue? yes

If the disk is not under VxVM control, you can label it to suit your requirements; otherwise, it can be labelled with a standard VTOC.
Steps 9a – 9c are only required if this is a system running Sun Cluster 3.x:
9a. /usr/cluster/bin/scdidadm -C
9b. /usr/cluster/bin/scdidadm -r
9c. /usr/cluster/bin/scgdevs
Note: It’s possible to get errors from c0t0d0, which is the CD-ROM/DVD drive on 480, 880, etc.
10. Instruct VxVM to re-read the device tree by running the command
# vxdctl enable

11. The disk will remain in the “offline” state until you initialize the new disk.
To initialize it, you can use the command line first:
# vxdisksetup -i c1t1d0

Then, use ‘vxdiskadm’ and choose option #5 to replace the failed or removed disk.
- or -

Run ‘vxdiskadm’ and choose option #5 to initialize it and replace the failed or removed disk. If you choose to run ‘vxdiskadm’ and choose option #5, you will be told that “Access is disabled” for this new disk (because it is still “offline”), and will be asked whether or not you wish to “enable access” to it. Answer ‘yes’ to this question.
12. Your disk should now be online, and functional within the operating system and VxVM. You can confirm this with “vxdisk list”.
Cautions:
Do not reboot the system or perform step 14 (modify nvramrc) until synchronization has completed.
If the system is rebooted before then, it cannot boot from the new disk, and the devalias cannot be modified.
You can monitor the synchronization with “vxtask list”:
# vxtask list
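Waiting for synchronization can be scripted by counting the rows that “vxtask list” reports. This is a sketch under two assumptions not stated in the original: that the output starts with a single header line, and that any listed task is worth waiting for (the command lists all VxVM tasks, not only the resynchronization). pending_tasks is a hypothetical helper name.

```shell
# Count the task rows in `vxtask list` output read on stdin.
# The first line is treated as a header; blank lines are ignored.
pending_tasks() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Usage on a live system (not run here):
#   while [ "$(vxtask list | pending_tasks)" -gt 0 ]; do sleep 60; done
```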

13. If you had to move the dump device (swap) in step 1, move it back. For example:
# dumpadm -d /dev/dsk/c1t1d0s1
14. If this was a root-disk or root-mirror, then ensure the nvram aliases are
updated so you can boot.
# ls -al /dev/rdsk/<device>s0
For example:
# ls -al /dev/rdsk/c1t1d0s0

Compare the WWN from the ls output with the appropriate root alias entries in the NVRAM (eeprom nvramrc); look at the rootmirror or rootdisk entries.
Cautions:
To change a devalias in nvramrc, replace the removed disk's information with the new disk's information.
For example,

- List before you modify nvramrc.
(removed disk information)

# eeprom nvramrc

devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19920,0:a
devalias mirrdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19838,0:a

- List the new disk information

# ls -al /dev/rdsk/c1t0d0s0
lrwxrwxrwx 1 root root 74 Mar 6 2003 c1t0d0s0 -> ../../devices/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0:a,raw

- Modify nvramrc

(This example is written for the Bourne shell)

# eeprom nvramrc='devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0:a[enter]
devalias mirrdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19838,0:a'[enter]

- List after you modify nvramrc.

# eeprom nvramrc
devalias rootdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c63f0c94,0:a
devalias mirrdisk /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cfa19838,0:a

If this is a root-disk or root-mirror, then the device path contains the WWN of the new disk. It is necessary to update the nvramrc devalias entries to the new device path so the system will be able to boot from the newly replaced root-disk or root-mirror.
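Retyping a long device path is error-prone, so the new devalias lines can be generated by substituting only the WWN. The following is a minimal sketch, assuming the old and new WWNs are 16-digit hexadecimal strings taken from the ls output; replace_wwn is a hypothetical helper name, and its output should be reviewed before it is passed to eeprom.

```shell
# Rewrite devalias lines read on stdin, replacing the old WWN with the new one
# wherever it appears as "ssd@w<WWN>,". Other lines pass through unchanged.
replace_wwn() {
    old=$1
    new=$2
    sed "s|ssd@w$old,|ssd@w$new,|"
}

# Example with the WWNs from this document:
#   replace_wwn 21000004cfa19920 21000011c63f0c94 < saved_nvramrc.txt
```

Only the alias that pointed at the removed disk changes; the alias for the surviving mirror keeps its WWN.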