Tuesday, 2 December 2014

SoftRAID + New server: Reloading server

In this post we will see how to set up software RAID on a server once the OS is installed and booted.
1. Install the OS from an ISO over IPMI onto the sda drive.
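2. yum install mdadm initramfs-tools (initramfs-tools is a Debian package name; on a Debian-based system, which the mkinitramfs and /etc/modules steps below assume, use apt-get install instead of yum)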
3. # Add raid1 or raid0 to /etc/modules
echo "raid1" >> /etc/modules
(for raid0 add that module)
echo "raid0" >> /etc/modules
4. # Regenerate initrd.img file
mkinitramfs -o /boot/test -r /dev/mapper/root (replace /dev/mapper/root with your actual LVM root partition)
5. # Rename old img file (replace x.x.x with the kernel version you are using)
mv /boot/initrd.img-x.x.x /boot/initrd.img-x.x.x.original
6. # Rename new img file
mv /boot/test /boot/initrd.img-x.x.x
7. grub-install --no-floppy /dev/sda
grub-install --no-floppy /dev/sdb
8. # Replace the UUID with (hd0,0) in /boot/grub/menu.lst if your file has the UUID in it.
# This most likely involves replacing the root UUID=XXXXX line with root (hd0,0)
# vi /boot/grub/menu.lst
9. Replace the UUIDs with the proper md devices in /etc/fstab if your file has UUIDs in it.
# Now do the same thing for the /etc/fstab file.
# Replace the UUID=XXXXXXXXX /boot ext3 defaults 0 1 line with /dev/md0 /boot ext3 defaults 0 1
# vi /etc/fstab
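As a rough sketch of the /etc/fstab edit in step 9, the shell function below prints a copy of an fstab-style file with the UUID= device of the /boot entry swapped for /dev/md0. The function name fstab_boot_to_md0 is made up for this illustration. It only prints the result, so the output can be reviewed before anything is overwritten.

```shell
# Hypothetical helper (not part of any package): print the given fstab-style
# file with the UUID=... device of the /boot entry replaced by /dev/md0.
# Nothing is written back to the file; review the output first.
fstab_boot_to_md0() {
    awk '
        # Only touch the line whose mount point is /boot and whose device is a UUID
        $2 == "/boot" && $1 ~ /^UUID=/ { $1 = "/dev/md0" }
        { print }
    ' "$1"
}
```

A possible way to use it: fstab_boot_to_md0 /etc/fstab > /tmp/fstab.new, then compare the two files with diff before copying the new one into place. Note that awk rewrites the changed line with single spaces between fields; all other lines are printed untouched.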
10. Clone the partition table from 1st drive to 2nd
sfdisk -d /dev/sda | sfdisk --force /dev/sdb
11. Create md devices with second drive only
mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdb2
12. Save new mdconf file
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
13. # Get boot device setup
mkfs.ext3 /dev/md0
mkdir /mnt/md0
mount /dev/md0 /mnt/md0
cp -ax /boot/* /mnt/md0
umount /mnt/md0
umount /boot; mount /dev/md0 /boot
sfdisk --change-id /dev/sda 1 fd
mdadm --add /dev/md0 /dev/sda1
14. # Setup data device
pvcreate /dev/md1
vgextend pve /dev/md1
# this step takes a LONG time
pvmove /dev/sda2 /dev/md1
vgreduce pve /dev/sda2
sfdisk --change-id /dev/sda 2 fd
mdadm --add /dev/md1 /dev/sda2
15. Rebuilding an array currently requires us to manually rebuild the partition table on the fresh hard drive. We need to know which drive is still active and which one is the new one. To see this let’s run the following command:
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[0]
104320 blocks [2/1] [U_]
md1 : active raid1 sdb2[0]
1052160 blocks [2/1] [U_]
md2 : active raid1 sdb3[0]
243039232 blocks [2/1] [U_]
unused devices: <none>
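To read this, look at the first array, md0. Only sdb1 is listed as a member, and [2/1] [U_] means that one of the two mirror members is active while the second slot is empty, so sdb is the drive that is still in use. If you are adding back a partition on a drive that is not empty, you may have to keep track of which drive each partition is on. For the current purposes we will assume that we are installing a fresh, unpartitioned drive.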
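The same reading of /proc/mdstat can be scripted. The sketch below is only an illustration (the function name degraded_arrays is made up): it prints the name of every array whose status brackets contain an underscore, that is, a missing member. It assumes the two-line layout shown above, with the md name on one line and the [U_] flags on the next.

```shell
# Hypothetical helper: list md arrays that are missing a member.
# Reads the file given as $1, or /proc/mdstat when no argument is passed.
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md0 : active raid1 sdb1[0]"
        /^md[0-9]+ :/ { name = $1 }
        # Status flags look like [UU] (healthy) or [U_] (one member missing)
        /\[[U_]+\]/ {
            if (match($0, /\[[U_]+\]/)) {
                flags = substr($0, RSTART, RLENGTH)
                if (index(flags, "_") > 0) print name
            }
        }
    ' "${1:-/proc/mdstat}"
}
```

Run against the output shown in step 15, this would list md0, md1 and md2, one per line.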
16. We need to see what the current drive is partitioned as, so we can duplicate the same partition table on the new drive:
# fdisk -l /dev/sdb
Disk /dev/sdb: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
/dev/sdb2              14         144     1052257+  fd  Linux raid autodetect
/dev/sdb3             145       30401   243039352+  fd  Linux raid autodetect
… and just to verify that /dev/sda is blank …
# fdisk -l /dev/sda
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
17. Now we need to edit the partition table on /dev/sda to match exactly what we see on /dev/sdb.
# fdisk /dev/sda
The number of cylinders for this disk is set to 30401.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n
Command action
e   extended
p   primary partition (1-4)
p
Partition number (1-4): 1
For the first and last cylinder values, just refer to the existing partition table, which lists “Start” and “End” values for each partition. If you enter these exactly as fdisk -l shows them for the existing drive, the new partitions will come out exactly the same.
First cylinder (1-30401, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-30401, default 30401): 13
Command (m for help): p
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          13      104391   83  Linux
Command (m for help): n
Command action
e   extended
p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (14-30401, default 14): 14
Last cylinder or +size or +sizeM or +sizeK (14-30401, default 30401): 144
Command (m for help): p
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          13      104391   83  Linux
/dev/sda2              14         144     1052257+  83  Linux
Command (m for help): n
Command action
e   extended
p   primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (145-30401, default 145): 145
Last cylinder or +size or +sizeM or +sizeK (145-30401, default 30401): 30401
18. We need to flag the boot partition as bootable, or this drive won’t be very useful if the other one dies.
Command (m for help): a
Partition number (1-4): 1
19. Now we need to set the partition type of all of the partitions to ‘fd’, which is the hex code for a Linux raid autodetect partition.
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): fd
Changed system type of partition 2 to fd (Linux raid autodetect)
Command (m for help): t
Partition number (1-4): 3
Hex code (type L to list codes): fd
Changed system type of partition 3 to fd (Linux raid autodetect)
20. Let’s look at the partition table we created; it should be identical to the one above from the existing hard drive.
Command (m for help): p
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   fd  Linux raid autodetect
/dev/sda2              14         144     1052257+  fd  Linux raid autodetect
/dev/sda3             145       30401   243039352+  fd  Linux raid autodetect
21. If it looks good, then we use w to write the table to the disk and exit.
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
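To double-check by script that the two tables really match, one option is to compare sfdisk -d dumps of both drives (sfdisk -d is the same dump command used in step 10). The sketch below assumes MBR tables and sdX-style device names; the function names layout_of and same_layout are made up for this example. It works on dump files rather than on the disks directly.

```shell
# Hypothetical helpers: compare two "sfdisk -d" dump files while ignoring
# which disk they came from.

# Print only the partition lines of a dump, with the disk name replaced by
# "part" so that /dev/sda1 and /dev/sdb1 both become part1.
layout_of() {
    grep '^/dev/' "$1" | sed 's#^/dev/[a-z]*\([0-9]\)#part\1#'
}

# Succeed (exit status 0) when both dumps describe the same partitions.
same_layout() {
    [ "$(layout_of "$1")" = "$(layout_of "$2")" ]
}
```

A possible usage: sfdisk -d /dev/sdb > /tmp/sdb.dump; sfdisk -d /dev/sda > /tmp/sda.dump; same_layout /tmp/sdb.dump /tmp/sda.dump && echo "layouts match".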
22. I usually run ‘cat /proc/mdstat’ again so I can see the partitions and compare as I add the newly created partitions back into the raid array.
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[0]
104320 blocks [2/1] [U_]
md1 : active raid1 sdb2[0]
1052160 blocks [2/1] [U_]
md2 : active raid1 sdb3[0]
243039232 blocks [2/1] [U_]
unused devices: <none>
23. Now we have to add each of the partitions back into its array, one at a time. I run ‘cat /proc/mdstat’ again after each addition to make sure it worked.
# mdadm /dev/md0 --add /dev/sda1
mdadm: added /dev/sda1
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[1] sdb1[0]
104320 blocks [2/2] [UU]
md1 : active raid1 sdb2[0]
1052160 blocks [2/1] [U_]
md2 : active raid1 sdb3[0]
243039232 blocks [2/1] [U_]
unused devices: <none>
# mdadm /dev/md1 --add /dev/sda2
mdadm: added /dev/sda2
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[1] sdb1[0]
104320 blocks [2/2] [UU]
md1 : active raid1 sda2[2] sdb2[0]
1052160 blocks [2/1] [U_]
[========>............]  recovery = 43.5% (458752/1052160) finish=0.1min speed=76458K/sec
md2 : active raid1 sdb3[0]
243039232 blocks [2/1] [U_]
unused devices: <none>
# mdadm /dev/md2 --add /dev/sda3
mdadm: added /dev/sda3
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[1] sdb1[0]
104320 blocks [2/2] [UU]
md1 : active raid1 sda2[1] sdb2[0]
1052160 blocks [2/2] [UU]
md2 : active raid1 sda3[2] sdb3[0]
243039232 blocks [2/1] [U_]
[>....................]  recovery =  0.1% (308480/243039232) finish=78.6min speed=51413K/sec
unused devices: <none>
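Instead of re-running cat /proc/mdstat by hand, the rebuild progress can be pulled out with a small function. This is a minimal sketch, and the name recovery_progress is made up for it. It assumes the progress line contains "recovery =" or "resync =" followed by a percentage, as in the output above.

```shell
# Hypothetical helper: print "<array> <percent>" for every array that is
# currently rebuilding. Reads $1, or /proc/mdstat when no argument is given.
recovery_progress() {
    awk '
        # Remember the array name from lines such as "md2 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # Progress lines carry "recovery = 43.5%" or "resync = 12.0%"
        /(recovery|resync) *=/ {
            if (match($0, /[0-9]+(\.[0-9]+)?%/))
                print name, substr($0, RSTART, RLENGTH)
        }
    ' "${1:-/proc/mdstat}"
}
```

The function prints nothing once every array has finished, so a loop such as while [ -n "$(recovery_progress)" ]; do sleep 60; done can be used to wait for the rebuild to complete.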
24. The last step is to update/refresh the grub configuration on both drives. These steps need to be run against the whole-disk device (e.g. /dev/sda, not /dev/sda1) of each member of the RAID array:
# /sbin/grub
grub> device (hd0) /dev/sda
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdX
grub> root (hd0,0)
grub> setup (hd0)
NOTE: the device name changes, but the grub values (hd0) do not. This ensures that the drives are detected properly in a failure situation.
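The grub commands above repeat the same three lines per drive, so they can be generated rather than typed. The sketch below only prints the commands; the function name grub_setup_commands is made up for this example. It assumes the GRUB legacy shell used above, which reads commands from standard input when started with --batch.

```shell
# Hypothetical helper: print the GRUB legacy commands from step 24 for every
# disk passed as an argument. Each disk is mapped to (hd0) in turn, as in
# the interactive session above.
grub_setup_commands() {
    for disk in "$@"; do
        printf 'device (hd0) %s\nroot (hd0,0)\nsetup (hd0)\n' "$disk"
    done
    printf 'quit\n'
}
```

After reviewing the output, it could be fed to grub with: grub_setup_commands /dev/sda /dev/sdb | /sbin/grub --batch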
And that is it. We are rebuilding as you can see. In this case it should take almost an hour and a half to rebuild the largest partition, with the smaller ones done almost as fast as you can type the commands.
