Solaris Hard Disk Upgrade

This web page details how to upgrade a pair of mirrored 18GB disks to 146GB disks under ODS. Although I have detailed my own system's specs, you can use this document for any other ODS configuration you have. One point I will make is that I will NOT be extending any of the filesystems; you can use the extra space to break up the existing filesystems if required.

The following is what I have

The following plan will be actioned

Now you could probably do this without rebooting the server, but I am a big fan of checking the total integrity of all the changes, and this means checking the boot block. There is nothing worse than rebooting a server at a later date only to find that someone forgot to install the boot block (now where did I put those Solaris CDs?). I am also presuming that you have a bit of knowledge regarding ODS; if not, have a look here.

Implementing the plan

Collect System and ODS information
Obtain system information

Before you begin you might want to capture the following information to a file and print off

cat /etc/vfstab
cat /etc/system
metastat -p
metastat
metadb -i
iostat -En
echo | format
format -> partition -> print     ## for each disk
eeprom                           ## check that the rootdisk and rootmir aliases are set up
cat /etc/path_to_inst
df -k
ifconfig -a
netstat -rn
cat /etc/passwd

Obtain the disk names

# echo|format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
   0. c0t0d0 SUN18G cyl 7506 alt 2 hd 19 sec 248
      /pci@1f,4000/scsi@3/sd@0,0
   1. c0t8d0 SUN18G cyl 7506 alt 2 hd 19 sec 248
      /pci@1f,4000/scsi@3/sd@8,0
Specify disk (enter its number): Specify disk (enter its number):

Disk A: c0t0d0
Disk B: c0t8d0

Note: you can use whatever method you like to remember the disk devices

Obtain the Disk serial numbers
# iostat -En
c0t0d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST318203LSUN18G  Revision: 034A Serial No: LR3940710000U009
Size: 18.11GB <18110967808 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c0t8d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST318404LSUN18G  Revision: 4203 Serial No: 3BT25ND600002127
Size: 18.11GB <18110967808 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

Disk A = c0t0d0 = LR3940710000U009
Disk B = c0t8d0 = 3BT25ND600002127

Physically go and read the front of the disks. My disks have the following serial numbers

disk A = 0020772-9936394071     you can see that 394071 is disk A
disk B = 0020772-0101T25ND6     you can see that T25ND6 is disk B

So now we know which physical disks we need to pull when required; you could go and label the disks A and B. If you cannot determine the disks via the serial numbers, you can also use format -> analyze -> read to make the LED flash, which also confirms that you have the right disk.

Next we need the ODS configuration
# df -k
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d0       7063913  859009 6134265    13%    /
/proc                      0       0       0     0%    /proc
fd                         0       0       0     0%    /dev/fd
mnttab                     0       0       0     0%    /etc/mnttab
/dev/md/dsk/d30      4036062    9780 3985922     1%    /var
swap                 2259080      16 2259064     1%    /var/run
swap                 2259400     336 2259064     1%    /tmp
/dev/md/dsk/d40      4036062    4808 3990894     1%    /opt


# swap -l
swapfile           dev    swaplo     blocks    free
/dev/md/dsk/d10    85,10      16     2049696   2049696

# metastat -p
d0 -m d1 d2 1         (root)
  d1 1 1 c0t0d0s0            disk A
  d2 1 1 c0t8d0s0            disk B
d10 -m d11 d12 1      (swap)
  d11 1 1 c0t0d0s1           disk A
  d12 1 1 c0t8d0s1           disk B
d30 -m d31 d32 1      (/var)
  d31 1 1 c0t0d0s3           disk A
  d32 1 1 c0t8d0s3           disk B
d40 -m d41 d42 1      (/opt)
  d41 1 1 c0t0d0s4           disk A
  d42 1 1 c0t8d0s4           disk B

So we have the following

/    = d0  ( which is made up of d1 and d2 submirrors)
swap = d10 ( which is made up of d11 and d12 submirrors)
/var = d30 ( which is made up of d31 and d32 submirrors)
/opt = d40 ( which is made up of d41 and d42 submirrors)

Now we know what metadevices are on what disks

disk A: d1, d11, d31 and d41
disk B: d2, d12, d32 and d42

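If you have more metadevices than I do, the disk-to-submirror list can be pulled out of "metastat -p" rather than worked out by eye. The following is only a sketch: the function name submirrors_on is my own invention, the sample text is the configuration from this page with the hand-written annotations removed, and it assumes every submirror is a simple one-slice concat (the "1 1 slice" form). On Solaris itself you will probably need nawk or /usr/xpg4/bin/awk in place of awk, as the old /usr/bin/awk does not accept -v.

```shell
#!/bin/sh
# Hypothetical helper: list the one-slice submirrors that live on a given disk.
# Reads "metastat -p" style text on stdin; the disk name (e.g. c0t8d0) is $1.
submirrors_on() {
    awk -v disk="$1" '
        # Concat lines look like: d2 1 1 c0t8d0s0
        $2 == 1 && $3 == 1 && index($4, disk "s") == 1 {
            printf "%s%s", sep, $1
            sep = " "
        }
        END { print "" }
    '
}

# Sample input: the configuration from this page, annotations removed.
sample='d0 -m d1 d2 1
d1 1 1 c0t0d0s0
d2 1 1 c0t8d0s0
d10 -m d11 d12 1
d11 1 1 c0t0d0s1
d12 1 1 c0t8d0s1
d30 -m d31 d32 1
d31 1 1 c0t0d0s3
d32 1 1 c0t8d0s3
d40 -m d41 d42 1
d41 1 1 c0t0d0s4
d42 1 1 c0t8d0s4'

echo "disk A: $(printf '%s\n' "$sample" | submirrors_on c0t0d0)"
echo "disk B: $(printf '%s\n' "$sample" | submirrors_on c0t8d0)"
```

On a live system the same function would be fed with "metastat -p | submirrors_on c0t8d0" instead of the sample text.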
Prepare for the worst

Create a backout /etc/vfstab file

Will be used in the backout plan

In the event that all goes pear shaped, we create a /etc/vfstab that supports the underlying disks

# cp /etc/vfstab /etc/SaveMyJob.vfstab

Edit SaveMyJob.vfstab file and change the metadevices paths to the standard path names, use the above "metastat -p" command to find these

#device         device          mount           FS      fsck    mount   mount
#to mount       to fsck         point           type    pass    at boot options
#
#/dev/dsk/c1d0s2 /dev/rdsk/c1d0s2 /usr          ufs     1       yes     -
fd      -       /dev/fd fd      -       no      -
/proc   -       /proc   proc    -       no      -
swap          -                       /tmp    tmpfs   -       yes     -


# OLD SETTINGS
#/dev/md/dsk/d10        -                       -       swap    -       no      -
#/dev/md/dsk/d0         /dev/md/rdsk/d0         /       ufs     1       no      -
#/dev/md/dsk/d30        /dev/md/rdsk/d30        /var    ufs     1       no      -
#/dev/md/dsk/d40        /dev/md/rdsk/d40        /opt    ufs     2       yes     -


# NEW SETTINGS
/dev/dsk/c0t0d0s0   /dev/rdsk/c0t0d0s0   /      ufs   1   no   -
/dev/dsk/c0t0d0s1   -                    -      swap  -   no   -
/dev/dsk/c0t0d0s3   /dev/rdsk/c0t0d0s3   /var   ufs   1   no   -
/dev/dsk/c0t0d0s4   /dev/rdsk/c0t0d0s4   /opt   ufs   2   yes  -
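If you would rather not hand-edit the file, the rewrite from metadevice paths to plain slices can be sketched with awk. This is an illustration only: to_plain_slices is a made-up name, the d-number to slice mapping is the disk A mapping from this page (change it to suit your own "metastat -p" output), and any line it rewrites comes out with single spaces between the fields, which vfstab accepts.

```shell
#!/bin/sh
# Hypothetical helper: rewrite metadevice paths in vfstab text to plain slices.
# Reads vfstab text on stdin and writes the converted text to stdout.
to_plain_slices() {
    awk '
        BEGIN {
            map["d0"]  = "c0t0d0s0"   # /
            map["d10"] = "c0t0d0s1"   # swap
            map["d30"] = "c0t0d0s3"   # /var
            map["d40"] = "c0t0d0s4"   # /opt
        }
        /^#/ { print; next }          # leave comment lines untouched
        {
            # Only the first two fields (device to mount, device to fsck) hold paths.
            for (i = 1; i <= 2; i++) {
                n = split($i, part, "/")
                # part[4] is "dsk" or "rdsk", part[n] is the metadevice name
                if ($i ~ /^\/dev\/md\/r?dsk\// && (part[n] in map))
                    $i = "/dev/" part[4] "/" map[part[n]]
            }
            print
        }
    '
}

# Demo on two of the OLD SETTINGS lines
printf '%s\n' \
    '/dev/md/dsk/d10 - - swap - no -' \
    '/dev/md/dsk/d0 /dev/md/rdsk/d0 / ufs 1 no -' | to_plain_slices
```

Always compare the result with the original by eye before relying on it, as this file is your way back if things go wrong.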

Create a backout /etc/system file

Will be used in the backout plan

In the event that all goes pear shaped, we create a /etc/system that does not load ODS stuff, we will use this file in the backout plan

# cp /etc/system /etc/SaveMyJob.system

Edit SaveMyJob.system file and remove or comment out the ODS stuff

* Begin MDD root info (do not edit)
* forceload: misc/md_trans
* forceload: misc/md_raid
* forceload: misc/md_hotspares
* forceload: misc/md_sp
* forceload: misc/md_stripe
* forceload: misc/md_mirror
* forceload: drv/pcipsy
* forceload: drv/glm
* forceload: drv/sd
* rootdev:/pseudo/md@0:0,0,blk
* End MDD root info (do not edit)
* Begin MDD database info (do not edit)
* set md:mddb_bootlist1="sd:7:16 sd:7:1050 sd:7:2084 sd:63:16 sd:63:1050"
* set md:mddb_bootlist2="sd:63:2084"
* End MDD database info (do not edit)

Upgrade Disk B
Detach/remove metadevices from Disk B

Start removing disk B from ODS

# metadetach d0 d2
# metadetach d10 d12
# metadetach d30 d32
# metadetach d40 d42

# metastat

# metaclear d2
# metaclear d12
# metaclear d32
# metaclear d42

Confirm that the disk B metadevices have been detached and removed

# metastat

Check and remove any ODS databases

See if any ODS databases are on disk B

# metadb -i
        flags           first blk       block count
     a m  p  luo        16              1034            /dev/dsk/c0t0d0s7
     a    p  luo        1050            1034            /dev/dsk/c0t0d0s7
     a    p  luo        2084            1034            /dev/dsk/c0t0d0s7
     a    p  luo        16              1034            /dev/dsk/c0t8d0s7
     a    p  luo        1050            1034            /dev/dsk/c0t8d0s7
     a    p  luo        2084            1034            /dev/dsk/c0t8d0s7
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors

Yes we have ODS database information on disk B
# metadb -d c0t8d0s7

Confirm they have been removed
# metadb -i
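The "metadb -i" listing is easy to misread once there are a few slices in it, so here is a rough sketch that tallies the replicas per slice. The function name replicas_per_slice is hypothetical, and the sample text is the listing from this page (before the delete) with the flag legend left off.

```shell
#!/bin/sh
# Hypothetical helper: count state database replicas per slice.
# Reads "metadb -i" style text on stdin; prints "<slice> <count>" per slice.
replicas_per_slice() {
    awk '
        # Replica lines end with the slice path; header and legend lines do not.
        $NF ~ /^\/dev\/dsk\// {
            slice = $NF
            sub(/^\/dev\/dsk\//, "", slice)
            if (!(slice in count)) order[++n] = slice
            count[slice]++
        }
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
    '
}

# Sample input: the replica listing from this page, legend omitted.
sample=$(cat <<'EOF'
        flags           first blk       block count
     a m  p  luo        16              1034            /dev/dsk/c0t0d0s7
     a    p  luo        1050            1034            /dev/dsk/c0t0d0s7
     a    p  luo        2084            1034            /dev/dsk/c0t0d0s7
     a    p  luo        16              1034            /dev/dsk/c0t8d0s7
     a    p  luo        1050            1034            /dev/dsk/c0t8d0s7
     a    p  luo        2084            1034            /dev/dsk/c0t8d0s7
EOF
)

printf '%s\n' "$sample" | replicas_per_slice
```

After the "metadb -d" the slice you removed should no longer appear in the tally at all; on a live system feed it with "metadb -i | replicas_per_slice".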

Now remove the physical disk

using the serial number above you should remove the following disk (in my case)

Disk B = c0t8d0 = 0020772-0101T25ND6

Note: you can also use format -> analyze -> read to make the LED flash, which also confirms that you have the right disk

Insert the new 146GB disk

Insert the new disk and wait approx 30 seconds for the disk to spin up

Now update Solaris so that it can see the new disk

# devfsadm -c disk

# echo|format
Searching for disks...done

      AVAILABLE DISK SELECTIONS:
      0. c0t0d0 SUN18G cyl 7506 alt 2 hd 19 sec 248
         /pci@1f,4000/scsi@3/sd@0,0
      1. c0t8d0 SUN146G cyl 14087 alt 2 hd 24 sec 848
         /pci@1f,4000/scsi@3/sd@8,0
Specify disk (enter its number): Specify disk (enter its number):

Success, we have the 146GB disk installed

Create the partitions on the new disk

First we must obtain the filesystem sizes from disk A, you have to do this manually as the cylinder sizes will be different (don't use prtvtoc)

disk A
===============================================================================

Current partition table (original):
Total disk cylinders available: 7506 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 - 3044        6.84GB    (3045/0/0) 14348040    7GB
  1       swap    wu    3478 - 3912     1000.84MB    (435/0/0)   2049720    1GB
  2     backup    wm       0 - 7505       16.86GB    (7506/0/0) 35368272
  3        var    wm    4000 - 5739        3.91GB    (1740/0/0)  8198880    4GB
  4 unassigned    wm    5741 - 7480        3.91GB    (1740/0/0)  8198880    4GB
  5 unassigned    wm       0               0         (0/0/0)           0
  6 unassigned    wm       0               0         (0/0/0)           0
  7 unassigned    wm    7481 - 7485       11.50MB    (5/0/0)       23560    10MB

Now create the partitions on the new disk (disk B) using format

disk B
===============================================================================

Current partition table (unnamed):
Total disk cylinders available: 14087 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 -   721       7.01GB    (722/0/0)    14694144
  1       swap    wu     722 -   825       1.01GB    (104/0/0)     2116608
  2     backup    wu       0 - 14086     136.71GB    (14087/0/0) 286698624
  3        var    wm     826 -  1238       4.01GB    (413/0/0)     8405376
  4 unassigned    wm    1239 -  1651       4.01GB    (413/0/0)     8405376
  5 unassigned    wu       0               0         (0/0/0)             0
  6 unassigned    wu       0               0         (0/0/0)             0
  7 unassigned    wm    1652 -  1653      19.88MB    (2/0/0)         40704

Don't forget to label the disk to write out the information

Note: you might have noticed that the partition sizes on the new disk (disk B) are slightly larger. I always make them larger to make sure that there are no partition size issues; this does not mean the filesystem size will grow.

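Because the two disks have different geometries, the cylinder counts cannot simply be copied across. As a rough check, the smallest number of cylinders a slice needs on the new disk is its old block count divided by the new disk's blocks per cylinder (heads x sectors), rounded up. The sketch below uses the block counts from the disk A table and the "hd 24 sec 848" geometry of the 146GB disk; the function name min_cyls is made up for this example, and it needs a shell with $(( )) arithmetic (ksh or bash rather than the old Bourne /bin/sh).

```shell
#!/bin/sh
# Hypothetical helper: smallest number of cylinders on the new disk that holds
# at least as many 512-byte blocks as the old slice.
#   $1 = block count of the old slice, $2 = heads, $3 = sectors per track
min_cyls() {
    spc=$(( $2 * $3 ))                  # blocks per cylinder on the new disk
    echo $(( ($1 + spc - 1) / spc ))    # round up to a whole cylinder
}

# Old block counts from disk A; new geometry is hd 24, sec 848
for slice in "root 14348040" "swap 2049720" "var 8198880" "opt 8198880" "metadb 23560"; do
    set -- $slice                       # split into name ($1) and blocks ($2)
    echo "$1: at least $(min_cyls "$2" 24 848) cylinders"
done
```

The cylinder counts I actually used for disk B (722, 104, 413, 413 and 2) are all at or above these minimums, which is why the new slices come out slightly larger than the old ones.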
Create the ODS database on disk B

Create 3 copies of the ODS database on the new disk (disk B)

# metadb -a -c 3 c0t8d0s7

# metadb -i

Create the metadevices on disk B

Now we have partitioned disk B we are ready to create the ODS metadevices

# metainit d2 1 1 c0t8d0s0
# metainit d12 1 1 c0t8d0s1
# metainit d32 1 1 c0t8d0s3
# metainit d42 1 1 c0t8d0s4

Now attach the new metadevices and the re-sync operation will start

# metattach d0 d2
# metattach d10 d12
# metattach d30 d32
# metattach d40 d42

Go and have a cup of coffee at this stage, as the time it takes to resync the disks depends on the size of the filesystems

Make sure we are all OK
# metastat
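If you would rather not keep re-running metastat by hand, the check can be scripted. What follows is a sketch under assumptions: mirrors_settled is a hypothetical name, and the two sample texts are abridged illustrations of the metastat layout as I remember it ("State:" lines plus a "Resync in progress" line), not output captured from this server, so check them against your own metastat output first.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when no metadevice in "metastat" style
# text is resyncing or in any state other than Okay.
mirrors_settled() {
    awk '
        BEGIN { busy = 0 }
        /Resync in progress/      { busy = 1 }
        /State:/ && $2 != "Okay"  { busy = 1 }
        END { exit busy }
    '
}

# Illustrative (assumed) samples of the metastat layout
resyncing='d0: Mirror
    Submirror 0: d1
      State: Okay
    Submirror 1: d2
      State: Resyncing
    Resync in progress: 15 % done'

settled='d0: Mirror
    Submirror 0: d1
      State: Okay
    Submirror 1: d2
      State: Okay'

for name in resyncing settled; do
    eval "text=\$$name"
    if printf '%s\n' "$text" | mirrors_settled; then
        echo "$name sample: all Okay"
    else
        echo "$name sample: not finished yet"
    fi
done
```

On a live system you could wait with something like "until metastat | mirrors_settled; do sleep 60; done" before moving on to the boot block.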

Now install the boot block on disk B
# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t8d0s0

Server Integrity Check
Now perform some reboots

I always like to check the server integrity

disk A check
================================================================================
# prtconf -pv | grep 'bootpath'           make a note of the disk we booted from
# shutdown -i6 -g0 -y                     reboot as normal
# metastat
# prtconf -pv | grep 'bootpath'
# shutdown -i6 -g0 -y

disk B check
================================================================================
# shutdown -i0 -g0 -y                     boot off the new disk (disk B)
OBP> boot rootmir
# prtconf -pv | grep 'bootpath'           make sure we booted from disk B (should be different from above)
# metastat
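Comparing two long device paths by eye is error prone, so here is a small sketch that pulls the SCSI target number out of the bootpath line and maps it back to disk A or disk B. The function name boot_target is hypothetical, and the sample bootpath lines are illustrative guesses based on the device paths in the format output above (targets 0 and 8), not lines captured from this server.

```shell
#!/bin/sh
# Hypothetical helper: pull the SCSI target number out of a bootpath line
# such as:  bootpath:  '/pci@1f,4000/scsi@3/disk@8,0:a'
boot_target() {
    sed -n "s/.*bootpath:.*@\([0-9a-f]*\),[0-9a-f]*:[a-h]'.*/\1/p"
}

# Map the target number back to the disk labels used on this page.
which_disk() {
    case "$(boot_target)" in
        0) echo "booted from disk A (c0t0d0)" ;;
        8) echo "booted from disk B (c0t8d0)" ;;
        *) echo "unrecognised boot device" ;;
    esac
}

# Illustrative (assumed) bootpath lines
echo "    bootpath:  '/pci@1f,4000/scsi@3/disk@0,0:a'" | which_disk
echo "    bootpath:  '/pci@1f,4000/scsi@3/disk@8,0:a'" | which_disk
```

On the server itself this would be run as "prtconf -pv | grep bootpath | which_disk".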

Upgrade disk A
Detach/remove metadevices from Disk A

Now we are confident that the new disk (disk B) is working correctly we can replace disk A

Make sure we have booted from disk B
# prtconf -pv | grep 'bootpath'

Also make sure that ODS is OK (I did see some re-syncing going on after I rebooted)
# metastat

Now start to remove disk A from ODS

# metadetach d0 d1
# metadetach d10 d11
# metadetach d30 d31
# metadetach d40 d41

# metastat

# metaclear d1
# metaclear d11
# metaclear d31
# metaclear d41

Confirm that the disk A metadevices have been detached and removed

# metastat

Check and remove any ODS databases

See if any ODS databases are on disk A

# metadb -i
        flags           first blk       block count
     a m  p  luo        16              1034            /dev/dsk/c0t0d0s7
     a    p  luo        1050            1034            /dev/dsk/c0t0d0s7
     a    p  luo        2084            1034            /dev/dsk/c0t0d0s7
     a    p  luo        16              1034            /dev/dsk/c0t8d0s7
     a    p  luo        1050            1034            /dev/dsk/c0t8d0s7
     a    p  luo        2084            1034            /dev/dsk/c0t8d0s7
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors

Yes we have ODS database information on disk A
# metadb -d c0t0d0s7

Confirm they have been removed
# metadb -i

Now remove the physical disk

using the serial number above you should remove the following disk (in my case)

Disk A = c0t0d0 = 0020772-9936394071

Note: you can also use format -> analyze -> read to make the LED flash, which also confirms that you have the right disk

Insert the new 146GB disk

Insert the new disk and wait approx 30 seconds for the disk to spin up

Now update Solaris so that it can see the new disk

# devfsadm -c disk

# echo|format
Searching for disks...done

      AVAILABLE DISK SELECTIONS:
      0. c0t0d0 SUN146G cyl 14087 alt 2 hd 24 sec 848
         /pci@1f,4000/scsi@3/sd@0,0
      1. c0t8d0 SUN146G cyl 14087 alt 2 hd 24 sec 848
         /pci@1f,4000/scsi@3/sd@8,0
Specify disk (enter its number): Specify disk (enter its number):

Success we have the 146GB disk installed, also note both disks are now 146GB
    
Create the partitions on the new disk

Because the disks are the same physically, we can use the prtvtoc command to make life easier for ourselves

# prtvtoc /dev/rdsk/c0t8d0s2 | fmthard -s - /dev/rdsk/c0t0d0s2

Create the ODS database on disk A

Create 3 copies of the ODS database on the new disk (disk A)

# metadb -a -c 3 c0t0d0s7

# metadb -i

Create the metadevices on disk A

Now we have partitioned disk A we are ready to create the ODS metadevices

# metainit d1 1 1 c0t0d0s0
# metainit d11 1 1 c0t0d0s1
# metainit d31 1 1 c0t0d0s3
# metainit d41 1 1 c0t0d0s4

Now attach the new metadevices and the re-sync operation will start

# metattach d0 d1
# metattach d10 d11
# metattach d30 d31
# metattach d40 d41

Go and have another cup of coffee as the time it takes to resync the disks depends on the size of the filesystems

Make sure we are all OK
# metastat

Now install the boot block on disk A
# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0

Server Integrity Check
Now perform some reboots

I always like to check the server integrity

disk B check
================================================================================
# prtconf -pv | grep 'bootpath'           make a note of the disk we booted from
# shutdown -i6 -g0 -y                     reboot as normal
# metastat
# prtconf -pv | grep 'bootpath'
# shutdown -i6 -g0 -y

disk A check
================================================================================
# shutdown -i0 -g0 -y                     boot off the new disk (disk A)
OBP> boot rootdisk
# prtconf -pv | grep 'bootpath'           make sure we booted from disk A (should be different from above)
# metastat

Feel free to cut any corners if you wish, especially if you do not want to reboot the server, but if you can get downtime it is always best to check the integrity of the server and to test any changes you have made.

What to do when it all goes pear shaped (Backout Plan)

This all depends on when you get the problem, but hopefully the two disks that have been removed are still fine, I managed to restore both disks just to prove it can be done.

Backout Plan

The first thing to do is to shutdown the server if not already

   # shutdown -i5 -g0 -y

Now remove the new disks and insert one, and only one, of the original disks (we don't want to corrupt both)

Start the server and perform a stop-A to drop to the OBP prompt

Insert a bootable Solaris CD and boot into single user mode

   OBP> boot cdrom -s

Once at the command prompt we need to mount the disk

   # mount /dev/dsk/c0t0d0s0 /mnt
   # df -k
   # ls -l /mnt

Now we need to copy those backout files to /etc/vfstab and /etc/system (if you don't have the files then simply edit the system and vfstab files)

   # cd /mnt/etc
   # cp SaveMyJob.system system         (if you did not create the files simply edit the system file)
   # cp SaveMyJob.vfstab vfstab         (if you did not create the files simply edit the vfstab file)

   # cat system                         (always double check)
   # cat vfstab                         (always double check)

Now unmount the disk, otherwise you will have to perform a fsck at the next reboot

   # cd /
   # umount /mnt

Reboot the server

   # reboot

Hopefully the system should start; check that everything is OK and up to date.

Now you have two options

1. You can insert the other mirror disk and then repartition (prtvtoc), recreate the metadevices (metainit) and then mirror (metattach); don't forget the metadb (metadb) and boot blocks (installboot).

2. Use another spare disk (if you have one, and make sure it's the same size) and then repartition (prtvtoc), recreate the metadevices (metainit) and then mirror (metattach); don't forget the metadb (metadb) and boot blocks (installboot).

The advantage of option 2 is that if you mess this up you still have the other mirror disk to fall back on; if you corrupt both, a full restore is required.

Please feel free to email me any constructive criticism you have with this page or mistakes that I have made.