Solaris disk upgrade

This web page details how to upgrade a pair of mirrored 18GB disks under ODS to 146GB disks. Although I have detailed my own system specs, you can use this document for any other ODS configuration you have. One point I will make is that I will NOT be extending any of the filesystems; you can use the extra space to break up the existing filesystems if required.

The following is what I have

One V210 Sun server
Solaris 8 (02/04) not patched
Two 18GB disks mirrored via ODS (the ones we will be upgrading)
Two new shiny 146GB disks
4 filesystems including swap (/, /var, /opt and swap), I will not be extending these
4 metadevices (d0, d10, d30 and d40)

The following plan will be actioned

Collect System Information
Obtain the disk names (will use "disk A" and "disk B" throughout our plan)
Obtain the disk serial numbers (so we know which one to pull out when needed)
Obtain ODS information

Upgrade disk B
Detach one side of the mirror (disk B)
Remove any ODS databases on disk B
Physically remove disk B
Insert the new 146GB disk (disk B)
Create the partitions on the new disk (disk B)
Create the ODS database on the new disk (disk B)
Create the metadevices on the new disk (disk B)
Create the boot block on the new disk (disk B)
Sync the disks and check everything

Server integrity check
Boot the server off the new disk (disk B)

Upgrade disk A
Detach the mirror (disk A)
Remove any ODS databases on disk A
Physically remove disk A
Insert new 146GB disk (disk A)
Create the partitions on the new disk (disk A)
Create the ODS database on the new disk (disk A)
Create the metadevices on the new disk (disk A)
Create the boot block on the new disk (disk A)
Sync the disks and check everything

Final server integrity check
Boot the server off the new disk (disk A)
Final check
Have a well earned beer!!!!!!

Backout Plan
When it all goes pear shaped

Now you could probably do this without rebooting the server, but I am a big fan of checking the total integrity of all the changes, and this means checking the boot block. There is nothing worse than rebooting a server at a later date only to find that someone forgot to install the boot block (now where did I put those Solaris CDs?). I am also presuming that you have a bit of knowledge of ODS; if not, it is worth reading up on it before you start.

Implementing the plan

Collect System and ODS information
Obtain system information

Before you begin you might want to capture the following information to a file and print it off

   # cat /etc/vfstab
   # cat /etc/system
   # metastat -p
   # metastat
   # metadb -i
   # iostat -En
   # echo | format
   # format -> partition -> print        (for each disk)
   # eeprom                              (check that rootdisk and rootmir are set up)
   # cat /etc/path_to_inst
   # df -k
   # ifconfig -a
   # netstat -rn
   # cat /etc/passwd

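The capture step above can be scripted so that nothing gets forgotten. The following is only a minimal sketch: the helper names `capture` and `collect_all` and the report path are assumptions made for illustration, and the interactive format -> partition -> print step still has to be done by hand.

```shell
# capture REPORT CMD [ARGS...]
# Append a header line plus the command's output (stdout and stderr) to REPORT.
capture() {
    capture_report=$1
    shift
    {
        echo "===== $* ====="
        "$@" 2>&1
        echo
    } >> "$capture_report"
}

# collect_all REPORT
# Run the non-interactive commands from the list above.
collect_all() {
    for f in /etc/vfstab /etc/system /etc/path_to_inst /etc/passwd; do
        capture "$1" cat "$f"
    done
    capture "$1" metastat -p
    capture "$1" metastat
    capture "$1" metadb -i
    capture "$1" iostat -En
    capture "$1" sh -c 'echo | format'
    capture "$1" eeprom
    capture "$1" df -k
    capture "$1" ifconfig -a
    capture "$1" netstat -rn
}

# Example (hypothetical report path):
#   collect_all /var/tmp/pre-upgrade.txt
```

Keeping a copy of the report off the server as well as on paper means it is still readable if both disks end up unbootable.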
Obtain the disk names

   # echo | format
   Searching for disks...done

   AVAILABLE DISK SELECTIONS:
          0. c0t0d0 SUN18G cyl 7506 alt 2 hd 19 sec 248
             /pci@1f,4000/scsi@3/sd@0,0
          1. c0t8d0 SUN18G cyl 7506 alt 2 hd 19 sec 248
             /pci@1f,4000/scsi@3/sd@8,0
   Specify disk (enter its number): Specify disk (enter its number):

   Disk A: c0t0d0
   Disk B: c0t8d0

Note: you can use whatever method you like to remember the disk devices

Obtain the disk serial numbers

   # iostat -En
   c0t0d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
   Vendor: SEAGATE Product: ST318203LSUN18G Revision: 034A Serial No: LR3940710000U009
   Size: 18.11GB <18110967808 bytes>
   Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
   Illegal Request: 0 Predictive Failure Analysis: 0
   c0t8d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
   Vendor: SEAGATE Product: ST318404LSUN18G Revision: 4203 Serial No: 3BT25ND600002127
   Size: 18.11GB <18110967808 bytes>
   Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
   Illegal Request: 0 Predictive Failure Analysis: 0

   Disk A = c0t0d0 = LR3940710000U009
   Disk B = c0t8d0 = 3BT25ND600002127

Physically go and read the front of the disks. My disks have the following serial numbers

   disk A = 0020772-9936394071      (you can see that 394071 is disk A)
   disk B = 0020772-0101T25ND6      (you can see that T25ND6 is disk B)

So now we know which physical disks we need to pull when required; you could go and label the disks A and B.

If you cannot determine the disks via the serial numbers you can also use format -> analyze -> read to make the LED flash, which also confirms that you have the right disk

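Pulling the device-to-serial mapping out of the iostat output by eye is easy to get wrong, so here is a small sketch that does it with awk. The function name `serials` is hypothetical, and it assumes the "Soft Errors:" and "Serial No:" layout shown above.

```shell
# serials: read `iostat -En` output on stdin, print "device serial" pairs.
serials() {
    awk '
        /Soft Errors:/ { dev = $1 }      # first line of each device block
        /Serial No:/ {
            for (i = 1; i <= NF - 2; i++)
                if ($i == "Serial" && $(i + 1) == "No:")
                    print dev, $(i + 2)
        }
    '
}

# Demonstration using the output captured above.
serials <<'EOF'
c0t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST318203LSUN18G Revision: 034A Serial No: LR3940710000U009
Size: 18.11GB <18110967808 bytes>
c0t8d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST318404LSUN18G Revision: 4203 Serial No: 3BT25ND600002127
Size: 18.11GB <18110967808 bytes>
EOF
```

On the live server the same function would be fed with `iostat -En | serials`.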
Next we need the ODS configuration

   # df -k
   Filesystem            kbytes    used   avail capacity  Mounted on
   /dev/md/dsk/d0       7063913  859009 6134265    13%    /
   /proc                      0       0       0     0%    /proc
   fd                         0       0       0     0%    /dev/fd
   mnttab                     0       0       0     0%    /etc/mnttab
   /dev/md/dsk/d30      4036062    9780 3985922     1%    /var
   swap                 2259080      16 2259064     1%    /var/run
   swap                 2259400     336 2259064     1%    /tmp
   /dev/md/dsk/d40      4036062    4808 3990894     1%    /opt

   # swap -l
   swapfile             dev  swaplo  blocks    free
   /dev/md/dsk/d10     85,10     16 2049696 2049696

   # metastat -p
   d0 -m d1 d2 1            (root)
   d1 1 1 c0t0d0s0          disk A
   d2 1 1 c0t8d0s0          disk B
   d10 -m d11 d12 1         (swap)
   d11 1 1 c0t0d0s1         disk A
   d12 1 1 c0t8d0s1         disk B
   d30 -m d31 d32 1         (/var)
   d31 1 1 c0t0d0s3         disk A
   d32 1 1 c0t8d0s3         disk B
   d40 -m d41 d42 1         (/opt)
   d41 1 1 c0t0d0s4         disk A
   d42 1 1 c0t8d0s4         disk B

So we have the following

   /    = d0  (which is made up of the d1 and d2 submirrors)
   swap = d10 (which is made up of the d11 and d12 submirrors)
   /var = d30 (which is made up of the d31 and d32 submirrors)
   /opt = d40 (which is made up of the d41 and d42 submirrors)

Now we know which metadevices are on which disks

   disk A: d1, d11, d31 and d41
   disk B: d2, d12, d32 and d42

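The "which metadevices are on which disk" summary can also be derived mechanically from metastat -p. This is a rough sketch, assuming a POSIX awk (on Solaris that would typically be nawk rather than /usr/bin/awk) and simple one-slice submirrors as in my configuration; the function name `submirrors_by_disk` is made up for the example.

```shell
# submirrors_by_disk: read `metastat -p` output on stdin and print
# one line per disk listing the simple metadevices built on it.
submirrors_by_disk() {
    awk '
        # Submirror lines look like: d1 1 1 c0t0d0s0
        NF == 4 && $2 != "-m" {
            disk = $4
            sub(/s[0-9]+$/, "", disk)        # strip the slice suffix
            members[disk] = members[disk] " " $1
        }
        END { for (d in members) print d ":" members[d] }
    ' | sort
}

# Demonstration using my configuration.
submirrors_by_disk <<'EOF'
d0 -m d1 d2 1
d1 1 1 c0t0d0s0
d2 1 1 c0t8d0s0
d10 -m d11 d12 1
d11 1 1 c0t0d0s1
d12 1 1 c0t8d0s1
d30 -m d31 d32 1
d31 1 1 c0t0d0s3
d32 1 1 c0t8d0s3
d40 -m d41 d42 1
d41 1 1 c0t0d0s4
d42 1 1 c0t8d0s4
EOF
```

For the sample input this prints one line for c0t0d0 (d1 d11 d31 d41) and one for c0t8d0 (d2 d12 d32 d42), which matches the list worked out by hand.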
Prepare for the worst
Create a backout /etc/vfstab file (will be used in the backout plan)

In the event that it all goes pear shaped, we create an /etc/vfstab that uses the underlying disks

   # cd /etc
   # cp vfstab SaveMyJob.vfstab

Edit the SaveMyJob.vfstab file and change the metadevice paths to the standard path names; use the "metastat -p" output above to find these

   #device             device               mount     FS      fsck   mount     mount
   #to mount           to fsck              point     type    pass   at boot   options
   #
   #/dev/dsk/c1d0s2    /dev/rdsk/c1d0s2     /usr      ufs     1      yes       -
   fd                  -                    /dev/fd   fd      -      no        -
   /proc               -                    /proc     proc    -      no        -
   swap                -                    /tmp      tmpfs   -      yes       -

   # OLD SETTINGS
   #/dev/md/dsk/d10    -                    -         swap    -      no        -
   #/dev/md/dsk/d0     /dev/md/rdsk/d0      /         ufs     1      no        -
   #/dev/md/dsk/d30    /dev/md/rdsk/d30     /var      ufs     1      no        -
   #/dev/md/dsk/d40    /dev/md/rdsk/d40     /opt      ufs     2      yes       -

   # NEW SETTINGS
   /dev/dsk/c0t0d0s0   /dev/rdsk/c0t0d0s0   /         ufs     1      no        -
   /dev/dsk/c0t0d0s1   -                    -         swap    -      no        -
   /dev/dsk/c0t0d0s3   /dev/rdsk/c0t0d0s3   /var      ufs     1      no        -
   /dev/dsk/c0t0d0s4   /dev/rdsk/c0t0d0s4   /opt      ufs     2      yes       -

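If you would rather not hand-edit the backout vfstab, the substitution can be sketched with awk. This assumes a POSIX awk and hard-codes my disk A mapping (d0, d10, d30 and d40 onto c0t0d0 slices 0, 1, 3 and 4); the function name `to_plain_vfstab` is hypothetical. Whole fields are compared, so d0 is never mistaken for the start of d10. Unlike my hand-edited file it does not keep the old entries as comments.

```shell
# to_plain_vfstab: read a vfstab on stdin and print it with the metadevice
# paths replaced by the disk A slices (mapping taken from "metastat -p").
to_plain_vfstab() {
    awk '
        BEGIN {
            OFS = "\t"
            slice["d0"]  = "c0t0d0s0"     # /
            slice["d10"] = "c0t0d0s1"     # swap
            slice["d30"] = "c0t0d0s3"     # /var
            slice["d40"] = "c0t0d0s4"     # /opt
            for (md in slice) {
                map["/dev/md/dsk/"  md] = "/dev/dsk/"  slice[md]
                map["/dev/md/rdsk/" md] = "/dev/rdsk/" slice[md]
            }
        }
        /^#/ { print; next }              # leave comment lines untouched
        {
            for (i = 1; i <= NF; i++)
                if ($i in map) $i = map[$i]
            print
        }
    '
}

# Example:
#   to_plain_vfstab < /etc/vfstab > /etc/SaveMyJob.vfstab
```

Always read the generated file before relying on it; a wrong root entry here is exactly what the backout plan is supposed to protect against.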
Create a backout /etc/system file (will be used in the backout plan)

In the event that it all goes pear shaped, we create an /etc/system that does not load the ODS stuff; we will use this file in the backout plan

   # cd /etc
   # cp system SaveMyJob.system

Edit the SaveMyJob.system file and remove or comment out the ODS stuff

   * Begin MDD root info (do not edit)
   * forceload: misc/md_trans
   * forceload: misc/md_raid
   * forceload: misc/md_hotspares
   * forceload: misc/md_sp
   * forceload: misc/md_stripe
   * forceload: misc/md_mirror
   * forceload: drv/pcipsy
   * forceload: drv/glm
   * forceload: drv/sd
   * rootdev:/pseudo/md@0:0,0,blk
   * End MDD root info (do not edit)
   * Begin MDD database info (do not edit)
   * set md:mddb_bootlist1="sd:7:16 sd:7:1050 sd:7:2084 sd:63:16 sd:63:1050"
   * set md:mddb_bootlist2="sd:63:2084"
   * End MDD database info (do not edit)

Upgrade Disk B
Detach/remove metadevices from disk B

Start removing disk B from ODS

   # metadetach d0 d2
   # metadetach d10 d12
   # metadetach d30 d32
   # metadetach d40 d42
   # metastat

   # metaclear d2
   # metaclear d12
   # metaclear d32
   # metaclear d42

Confirm that the disk B metadevices have been detached and removed

   # metastat

Check and remove any ODS databases

See if any ODS databases are on disk B

   # metadb -i
           flags           first blk       block count
        a m  p  luo        16              1034            /dev/dsk/c0t0d0s7
        a    p  luo        1050            1034            /dev/dsk/c0t0d0s7
        a    p  luo        2084            1034            /dev/dsk/c0t0d0s7
        a    p  luo        16              1034            /dev/dsk/c0t8d0s7
        a    p  luo        1050            1034            /dev/dsk/c0t8d0s7
        a    p  luo        2084            1034            /dev/dsk/c0t8d0s7
    o - replica active prior to last mddb configuration change
    u - replica is up to date
    l - locator for this replica was read successfully
    c - replica's location was in /etc/lvm/mddb.cf
    p - replica's location was patched in kernel
    m - replica is master, this is replica selected as input
    W - replica has device write errors
    a - replica is active, commits are occurring to this replica
    M - replica had problem with master blocks
    D - replica had problem with data blocks
    F - replica had format problems
    S - replica is too small to hold current data base
    R - replica had device read errors

Yes, we have ODS database information on disk B

   # metadb -d c0t8d0s7

Confirm they have been removed

   # metadb -i

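Before deleting replicas from one disk it is worth confirming how many live on each slice, so you know the other disk still holds copies. A minimal sketch, assuming the metadb layout shown above where each replica line ends with its /dev/dsk path; the function name `replicas_per_slice` is made up for the example.

```shell
# replicas_per_slice: read `metadb` output on stdin and print
# "count slice" for every slice that holds state database replicas.
replicas_per_slice() {
    awk '
        $NF ~ /^\/dev\/dsk\// { count[$NF]++ }
        END { for (s in count) print count[s], s }
    ' | sort -k 2
}

# Demonstration using the output shown above.
replicas_per_slice <<'EOF'
flags first blk block count
a m p luo 16 1034 /dev/dsk/c0t0d0s7
a p luo 1050 1034 /dev/dsk/c0t0d0s7
a p luo 2084 1034 /dev/dsk/c0t0d0s7
a p luo 16 1034 /dev/dsk/c0t8d0s7
a p luo 1050 1034 /dev/dsk/c0t8d0s7
a p luo 2084 1034 /dev/dsk/c0t8d0s7
EOF
```

For the sample this reports 3 replicas on c0t0d0s7 and 3 on c0t8d0s7. On the live server it would be fed with `metadb | replicas_per_slice`.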
Now remove the physical disk

Using the serial number obtained above you should remove the following disk (in my case)

   Disk B = c0t8d0 = 0020772-0101T25ND6

Note: you can also use format -> analyze -> read (a read-only test) to make the LED flash, which also confirms that you have the right disk

Insert the new 146GB disk

Insert the new disk and wait approx 30 seconds for the disk to spin up

Now update Solaris so that it can see the new disk

   # devfsadm -c disk
   # echo | format
   Searching for disks...done

   AVAILABLE DISK SELECTIONS:
          0. c0t0d0 SUN18G cyl 7506 alt 2 hd 19 sec 248
             /pci@1f,4000/scsi@3/sd@0,0
          1. c0t8d0 SUN146G cyl 14087 alt 2 hd 24 sec 848
             /pci@1f,4000/scsi@3/sd@8,0
   Specify disk (enter its number): Specify disk (enter its number):

Success, we have the 146GB disk installed

Create the partitions on the new disk

First we must obtain the filesystem sizes from disk A. You have to do this manually as the cylinder sizes will be different (don't use prtvtoc). The figure at the end of each line is the rounded size to aim for on the new disk.

disk A
===============================================================================
   Current partition table (original):
   Total disk cylinders available: 7506 + 2 (reserved cylinders)

   Part      Tag    Flag     Cylinders        Size            Blocks
     0       root    wm       0 - 3044        6.84GB    (3045/0/0) 14348040     7GB
     1       swap    wu    3478 - 3912     1000.84MB    (435/0/0)   2049720     1GB
     2     backup    wm       0 - 7505       16.86GB    (7506/0/0) 35368272
     3        var    wm    4000 - 5739        3.91GB    (1740/0/0)  8198880     4GB
     4 unassigned    wm    5741 - 7480        3.91GB    (1740/0/0)  8198880     4GB
     5 unassigned    wm       0               0         (0/0/0)           0
     6 unassigned    wm       0               0         (0/0/0)           0
     7 unassigned    wm    7481 - 7485       11.50MB    (5/0/0)       23560     10MB

Now create the partitions on the new disk (disk B) using format

disk B
===============================================================================
   Current partition table (unnamed):
   Total disk cylinders available: 14087 + 2 (reserved cylinders)

   Part      Tag    Flag     Cylinders        Size            Blocks
     0       root    wm       0 -   721       7.01GB    (722/0/0)    14694144
     1       swap    wu     722 -   825       1.01GB    (104/0/0)     2116608
     2     backup    wu       0 - 14086     136.71GB    (14087/0/0) 286698624
     3        var    wm     826 -  1238       4.01GB    (413/0/0)     8405376
     4 unassigned    wm    1239 -  1651       4.01GB    (413/0/0)     8405376
     5 unassigned    wu       0               0         (0/0/0)             0
     6 unassigned    wu       0               0         (0/0/0)             0
     7 unassigned    wm    1652 -  1653      19.88MB    (2/0/0)         40704

Don't forget to label the disk (format -> label) to write out the partition table

Note: you might have noticed that the partition sizes on the new disk (disk B) are slightly larger. I always make them larger to make sure that there are no partition size issues; this does not mean the filesystem size will grow

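Because the two disks have different geometries, the cylinder counts cannot simply be copied across. The arithmetic can be sketched as follows: one cylinder holds heads x sectors blocks, so the minimum cylinder count for a slice is its block count divided by that figure, rounded up. The function name `cyls_needed` is hypothetical, and the `$(( ))` arithmetic assumes a POSIX shell (on Solaris 8 that means ksh rather than /bin/sh).

```shell
# cyls_needed BLOCKS HEADS SECTORS
# Smallest number of whole cylinders that holds BLOCKS blocks on a disk
# with the given geometry (blocks per cylinder = heads * sectors).
cyls_needed() {
    per_cyl=$(( $2 * $3 ))
    echo $(( ($1 + per_cyl - 1) / per_cyl ))
}

# Disk A slice sizes in blocks, new disk geometry "hd 24 sec 848"
cyls_needed 14348040 24 848    # slice 0 (/)
cyls_needed 2049720  24 848    # slice 1 (swap)
cyls_needed 8198880  24 848    # slice 3 (/var), same size as slice 4 (/opt)
cyls_needed 23560    24 848    # slice 7 (ODS database)
```

This prints 705, 101, 403 and 2. The values I actually used (722, 104, 413 and 2) are all at or above these minimums, which is what leaves the new slices slightly larger than the originals.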
Create the ODS database on disk B

Create 3 copies of the ODS database on the new disk (disk B)

   # metadb -a -c 3 c0t8d0s7
   # metadb -i

Create the metadevices on disk B

Now that we have partitioned disk B we are ready to create the ODS metadevices

   # metainit d2 1 1 c0t8d0s0
   # metainit d12 1 1 c0t8d0s1
   # metainit d32 1 1 c0t8d0s3
   # metainit d42 1 1 c0t8d0s4

Now attach the new metadevices and the re-sync operation will start

   # metattach d0 d2
   # metattach d10 d12
   # metattach d30 d32
   # metattach d40 d42

Go and have a cup of coffee at this stage, as the time it takes to resync the disks depends on the size of the filesystems

Make sure we are all OK

   # metastat

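Rather than re-running metastat by hand during the coffee break, the wait can be sketched as a small polling loop. This assumes that metastat reports an in-flight resync with a line containing "Resync in progress" (check this against your own output first); the function names are made up for the example.

```shell
# is_resyncing: read `metastat` output on stdin; succeed if any
# mirror still reports a resync in progress.
is_resyncing() {
    grep 'Resync in progress' > /dev/null
}

# wait_for_resync: poll metastat once a minute until no resync is reported.
wait_for_resync() {
    while metastat | is_resyncing; do
        sleep 60
    done
    echo "resync complete"
}

# Example:
#   wait_for_resync && metastat
```

The loop only tells you the resync has stopped, not that it succeeded, so still read the final metastat output before moving on.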
Now install the boot block on disk B

   # installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t8d0s0

Server Integrity Check
Now perform some reboots

I always like to check the server integrity

disk A check
================================================================================
   # prtconf -pv | grep 'bootpath'       (make a note of the disk we booted from)
   # shutdown -i6 -g0 -y                 (reboot as normal)
   # metastat
   # prtconf -pv | grep 'bootpath'
   # shutdown -i6 -g0 -y

disk B check
================================================================================
   # shutdown -i0 -g0 -y
   OBP> boot rootmir                     (boot off the new disk, disk B)
   # prtconf -pv | grep 'bootpath'       (make sure we booted from disk B, should be different from above)
   # metastat

Upgrade disk A
Detach/remove metadevices from disk A

Now that we are confident that the new disk (disk B) is working correctly we can replace disk A

Make sure we have booted from disk B

   # prtconf -pv | grep 'bootpath'

Also make sure that ODS is OK (I did see some re-syncing going on after the reboot)

   # metastat

Now start to remove disk A from ODS

   # metadetach d0 d1
   # metadetach d10 d11
   # metadetach d30 d31
   # metadetach d40 d41
   # metastat

   # metaclear d1
   # metaclear d11
   # metaclear d31
   # metaclear d41

Confirm that the disk A metadevices have been detached and removed

   # metastat

Check and remove any ODS databases

See if any ODS databases are on disk A

   # metadb -i
           flags           first blk       block count
        a m  p  luo        16              1034            /dev/dsk/c0t0d0s7
        a    p  luo        1050            1034            /dev/dsk/c0t0d0s7
        a    p  luo        2084            1034            /dev/dsk/c0t0d0s7
        a    p  luo        16              1034            /dev/dsk/c0t8d0s7
        a    p  luo        1050            1034            /dev/dsk/c0t8d0s7
        a    p  luo        2084            1034            /dev/dsk/c0t8d0s7
    o - replica active prior to last mddb configuration change
    u - replica is up to date
    l - locator for this replica was read successfully
    c - replica's location was in /etc/lvm/mddb.cf
    p - replica's location was patched in kernel
    m - replica is master, this is replica selected as input
    W - replica has device write errors
    a - replica is active, commits are occurring to this replica
    M - replica had problem with master blocks
    D - replica had problem with data blocks
    F - replica had format problems
    S - replica is too small to hold current data base
    R - replica had device read errors

Yes, we have ODS database information on disk A

   # metadb -d c0t0d0s7

Confirm they have been removed

   # metadb -i

Now remove the physical disk

Using the serial number obtained above you should remove the following disk (in my case)

   Disk A = c0t0d0 = 0020772-9936394071

Note: you can also use format -> analyze -> read (a read-only test) to make the LED flash, which also confirms that you have the right disk

Insert the new 146GB disk

Insert the new disk and wait approx 30 seconds for the disk to spin up

Now update Solaris so that it can see the new disk

   # devfsadm -c disk
   # echo | format
   Searching for disks...done

   AVAILABLE DISK SELECTIONS:
          0. c0t0d0 SUN146G cyl 14087 alt 2 hd 24 sec 848
             /pci@1f,4000/scsi@3/sd@0,0
          1. c0t8d0 SUN146G cyl 14087 alt 2 hd 24 sec 848
             /pci@1f,4000/scsi@3/sd@8,0
   Specify disk (enter its number): Specify disk (enter its number):

Success, we have the 146GB disk installed; also note that both disks are now 146GB

Create the partitions on the new disk

Because the two disks are now physically the same, we can use the prtvtoc command to make life easier for ourselves

   # prtvtoc /dev/rdsk/c0t8d0s2 | fmthard -s - /dev/rdsk/c0t0d0s2

Create the ODS database on disk A

Create 3 copies of the ODS database on the new disk (disk A)

   # metadb -a -c 3 c0t0d0s7
   # metadb -i

Create the metadevices on disk A

Now that we have partitioned disk A we are ready to create the ODS metadevices

   # metainit d1 1 1 c0t0d0s0
   # metainit d11 1 1 c0t0d0s1
   # metainit d31 1 1 c0t0d0s3
   # metainit d41 1 1 c0t0d0s4

Now attach the new metadevices and the re-sync operation will start

   # metattach d0 d1
   # metattach d10 d11
   # metattach d30 d31
   # metattach d40 d41

Go and have another cup of coffee, as the time it takes to resync the disks depends on the size of the filesystems

Make sure we are all OK

   # metastat

Now install the boot block on disk A

   # installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0

Server Integrity Check
Now perform some reboots

I always like to check the server integrity

disk B check
================================================================================
   # prtconf -pv | grep 'bootpath'       (make a note of the disk we booted from)
   # shutdown -i6 -g0 -y                 (reboot as normal)
   # metastat
   # prtconf -pv | grep 'bootpath'
   # shutdown -i6 -g0 -y

disk A check
================================================================================
   # shutdown -i0 -g0 -y
   OBP> boot rootdisk                    (boot off the new disk, disk A)
   # prtconf -pv | grep 'bootpath'       (make sure we booted from disk A, should be different from above)
   # metastat

Feel free to cut any corners if you wish, especially if you do not want to reboot the server, but if you can get downtime it is always best to check the integrity of the server and to test any changes you have made.

What to do when it all goes pear shaped (Backout Plan)

This all depends on when you hit the problem, but hopefully the two disks that have been removed are still fine. I managed to restore both disks just to prove it can be done.

Backout Plan

The first thing to do is to shut down the server, if it is not down already

# shutdown -i5 -g0 -y

Now remove the new disks and insert one, and only one, of the original 18GB disks (we don't want to corrupt both)

Start the server and perform a stop-A to drop to the OBP prompt

Insert a bootable Solaris CD and boot into single user mode

OBP> boot cdrom -s

Once at the command prompt we need to mount the disk

   # mount /dev/dsk/c0t0d0s0 /mnt
   # df -k
   # ls -l /mnt

Now we need to copy those backout files to /etc/vfstab and /etc/system (if you don't have the files then simply edit the system and vfstab files)

   # cd /mnt/etc
   # cp SaveMyJob.system system         (if you did not create the files simply edit the system file)
   # cp SaveMyJob.vfstab vfstab         (if you did not create the files simply edit the vfstab file)

   # cat system                         (always double check)
   # cat vfstab                         (always double check)

Now unmount the disk, otherwise you will have to perform an fsck at the next reboot

   # cd /
   # umount /mnt

Reboot the server

   # reboot

Hopefully the system will start. Check that everything is OK and up to date.

Now you have two options

1. You can insert the other original mirror disk and then repartition it (prtvtoc), recreate the metadevices (metainit) and then mirror (metattach). Don't forget the ODS databases (metadb) and the boot block (installboot).

2. Use another spare disk (if you have one, and make sure it's the same size) and then repartition it (prtvtoc), recreate the metadevices (metainit) and then mirror (metattach). Don't forget the ODS databases (metadb) and the boot block (installboot).

The advantage of option 2 is that if you mess it up, you still have the other mirror disk to fall back on. If you corrupt both, then a full restore is required.

Please feel free to email me any constructive criticism you have with this page or mistakes that I have made.