If you waded through my previous post about adding a larger drive to an lvm volume running on an mdadm RAID array, you'll have seen it's a fiddly, complicated process. I have since been experimenting with ZFS on Linux on my old desktop machine (a Core 2 Quad Q9300 with 8GB of RAM). I had a couple of old 1TB drives, and added the 2TB drive that I removed from the home theatre PC mentioned in that post. Actually, I had put the 3TB drive in this old box initially, intending it to replace the HTPC, but it's a bit power hungry, drawing around 90W at idle, and I'm having some niggling playback issues with it. That's another story, though. So the 3TB WD Red gets replaced with the old Samsung 2TB.
With the three drives, I created a ZFS raidz pool, which is basically equivalent to RAID 5, and it runs pretty well. Swapping the 3TB drive out for the 2TB gave me an opportunity to try the ZFS version of a drive swap.
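For reference, creating the pool in the first place is also a one-liner. This is only a sketch of the sort of command involved; the pool name tank and the device paths are placeholders and would need to match your own drives:
sudo zpool create tank raidz /dev/disk/by-id/scsi-SATA_drive1 /dev/disk/by-id/scsi-SATA_drive2 /dev/disk/by-id/scsi-SATA_drive3
Using the /dev/disk/by-id names rather than /dev/sdX means the pool doesn't care if the drives get detected in a different order on the next boot.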
It was a bit of an anticlimax, really. There is just one command; the syntax from the man page is below:
zpool replace [-f] pool device [new_device]
In my case it was something like
sudo zpool replace tank /dev/disk/by-id/scsi-SATA...(the drive removed) /dev/disk/by-id/scsi-SATA....(the 2TB drive)
That started off the resilvering process (resynchronising the data across the drives). If I were then to replace each of the other drives, one at a time, using the same process, the pool would grow to match the new drive sizes.
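To keep an eye on the resilver, the status command shows the progress and an estimated time to completion:
sudo zpool status tank
One caveat, if I remember rightly: for the pool to actually grow once all the drives have been replaced with larger ones, the autoexpand property needs to be turned on (zpool set autoexpand=on tank), or the pool exported and re-imported afterwards.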
The beauty of ZFS is that it simplifies things so much: it is the RAID management and the file system all in one.
I would have liked to run it on the HTPC, but ZFS is quite resource-hungry in terms of both processing power and memory. Since my HTPC has just a little Atom 330 chip and is maxed out at 4GB of memory (of which only 3GB is visible), I'll be staying with mdadm and lvm there; there is no noticeable performance penalty with that setup. Oh well, going through all those steps is not exactly a frequent task.
Monday, November 26, 2012
Adding a larger drive to a software RAID array - mdadm and lvm
The MythTV box in the lounge previously had two storage hard drives in a RAID 1 configuration, to prevent data loss in case of a drive failure. The drives were a 3TB Hitachi and a 2TB Samsung. I figured the Samsung was getting on a bit, and it was time to install a new drive. It might as well be a 3TB model too, to take advantage of the space on the other drive that was sitting unused.
A 3TB Western Digital Red drive was picked up. I chose this as it is designed for use in a NAS environment: always on. It also has low power consumption and a good warranty. I considered a Seagate Barracuda 3TB - they were cheap, performance would be better than the Red, but they are only designed for desktop use, around 8 hours a day powered on. Warranty was pretty short as well.
Removing and replacing the old drive
The drives were configured in a software RAID 1 array, using mdadm, with lvm on top of that. This makes the array portable, and not dependent on a particular hardware controller.
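For context only (this isn't something I ran as part of this job, just a sketch of how a stack like this is typically built in the first place, using the same device and volume names as on my system but with an example size for the logical volume):
# create the RAID 1 array from the two whole drives
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
# layer lvm on top: physical volume, volume group, then a logical volume for recordings
sudo pvcreate /dev/md0
sudo vgcreate raid1 /dev/md0
sudo lvcreate -l 80%VG -n tv raid1
sudo mkfs.xfs /dev/raid1/tv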
The commands here were adapted from the excellent instructions at howtoforge.com.
Fortunately I had enough space on another PC that I was able to back up the contents of the array before starting any of this.
To remove the old drive, which on this machine was /dev/sdc, the following command marks the drive as failed in the array /dev/md0:
sudo mdadm --manage /dev/md0 --fail /dev/sdc
The next step is to remove the drive from the array:
sudo mdadm --manage /dev/md0 --remove /dev/sdc
Then, the system could be shut down and the drive removed and replaced with the new one. After powering the system back up, the following command adds the new drive to the array:
sudo mdadm --manage /dev/md0 --add /dev/sdc
The array will then start synchronising the data, copying it to the new drive, which could take a few hours. Note that no partitioning was done on the disk, as I am just using the whole drive in the array.
While the sync is in progress, you can check how it is progressing via:
cat /proc/mdstat
It will show a percentage of completion as well as an estimated time remaining. Once it is done, the array is ready for use! I left the array like this for a day or so, just to make sure everything was working alright.
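Before moving on, a quick way to confirm the array is healthy (both drives active and the state clean) is to look at the detail output; here I'm just filtering out the interesting lines:
sudo mdadm --detail /dev/md0 | grep -E 'State|Devices'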
Expanding the array to fill the space available - the mdadm part
Once the synchronisation had completed, the array was still only 2TB, since that is as large as a RAID 1 array can be when it consists of a 3TB drive and a 2TB drive. We need to tell mdadm to expand the array to fill the available space. More information on this can be found here.
This is where things got complicated for me. It is to do with the superblock format version used in the array. More detail can be found at this page of the Linux RAID wiki.
To sum up, the array I had was created with the version 0.90 superblock. The version was found by entering
sudo mdadm --detail /dev/md0
The potential problem was that if I grew the array to larger than 2TB, it might not work. To quote the wiki link above:
The version-0.90 superblock limits the number of component devices within an array to 28, and limits each component device to a maximum size of 2TB on kernel version [earlier than] 3.1 and 4TB on kernel version 3.1 [or later].
Now, Mythbuntu 12.04 runs the 3.2 kernel, so according to that it should be fine up to 4TB. But I wasn't 100% sure of that, and couldn't find any other references about it, so I decided the safest approach was to convert the array to a later superblock version that doesn't have that size limitation. Besides, it will save time in the future if I ever repeat this with a drive larger than 4TB.
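For what it's worth, checking which kernel a machine is actually running is simple:
uname -r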
Following the suggestion of the wiki, I decided to update to a version 1.0 superblock, as it would store the superblock information in the same place as the 0.90.
Note: if you are trying this yourself and the array already has a version 1.0 or later superblock, then the command to grow it is simply the one below (you may not want to run it on a 0.90 superblock if the result would be larger than 2TB):
mdadm --grow /dev/md0 --size=max
Since I was going to change the superblock version, that meant stopping the array and recreating it with the newer version.
Once again, to check the details of the array at the moment:
sudo mdadm --detail /dev/md0
Now, since the array is in use by MythTV, I thought it safest to stop the program:
sudo service mythtv-backend stop
Also, I unmounted where the array was mounted:
sudo umount /var/lib/mythtv
Since the data is in LVM logical volumes on top of the array, I deactivated those as well (the volume group is named raid1 in this instance):
sudo lvchange -a n raid1
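To double-check that nothing in the volume group is still active before stopping the array, lvs can show the state of each logical volume; active volumes have an a in the attribute string:
sudo lvs -o lv_name,lv_attr raid1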
The array is now ready to be stopped:
sudo mdadm --stop /dev/md0
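Before recreating it, it doesn't hurt to record what the existing superblocks say, in case the array ever needs to be put back together with matching parameters (just a precaution on my part, on top of the backup):
sudo mdadm --examine /dev/sdb /dev/sdc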
Now it can be re-created, specifying the metadata (superblock) version, the RAID level, and the number and names of the drives used:
sudo mdadm --create /dev/md0 --metadata=1.0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
The array will now start resynchronising. This took a number of hours for me, as there were around 770GB of recordings there. The RAID wiki link included --assume-clean in the above command, which would have skipped the resync. I elected to leave it out, for safety's sake.
Progress can be monitored with:
cat /proc/mdstat
The lvm volumes can then be reactivated:
sudo lvchange -a y raid1
and the unmounted volumes can be re-mounted:
sudo mount -a
Check if they are all there with the
mount
command. The mythtv service can also be restarted:
sudo service mythtv-backend start
When the array is recreated, the UUID value of the array will be different. You can get the new value with:
sudo mdadm --detail /dev/md0
Edit the /etc/mdadm/mdadm.conf file and update the UUID in it to the new value, so that the array can be found on the next boot.
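Rather than editing the UUID by hand, you can have mdadm print the ARRAY line in the right format and paste that into the file:
sudo mdadm --detail --scan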
Another thing to do before rebooting is to run
sudo update-initramfs -u
I didn't do this at first, and after rebooting, the array showed up named /dev/md127 rather than /dev/md0. Running the above command and rebooting again fixed it for me.
Expanding the array to fill the space available - the lvm part
Quite a long-winded process, isn't it? Displaying the lvm physical volumes with:
sudo pvdisplay
showed the array was still only 1.82TiB (2TB), so it needed to be extended. The following command grows the physical volume to fill the available space:
sudo pvresize -v /dev/md0
To check the results, again run:
sudo pvdisplay
Now, running:
sudo vgdisplay
gave the following results for me:
--- Volume group ---
VG Name raid1
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 5
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 3
Open LV 2
Max PV 0
Cur PV 1
Act PV 1
VG Size 2.73 TiB
PE Size 4.00 MiB
Total PE 715397
Alloc PE / Size 466125 / 1.78 TiB
Free PE / Size 249272 / 973.72 GiB
VG UUID gvfheX-ifvl-yW9h-v4L2-eyzs-95fe-sng2oN
Running:
sudo lvdisplay
gives the following result:
--- Logical volume ---
LV Name /dev/raid1/tv
VG Name raid1
LV UUID Dokbch-ZJkg-QmRW-d9vR-wfM8-BFxb-3Z0krs
LV Write Access read/write
LV Status available
# open 1
LV Size 1.70 TiB
Current LE 445645
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 252:0
I have a couple of smaller logical volumes in this volume group as well, which I have not shown; that is why there is a difference between the Alloc PE value of the volume group and the Current LE value of this logical volume. As you can see from the output above, the volume group raid1 has 249272 physical extents (PE) free, and the logical volume /dev/raid1/tv is currently 445645 extents. To use all the space, I made the new size 249272 + 445645 = 694917 extents.
The command to resize a logical volume is lvresize. Logical.
sudo lvresize -l 694917 /dev/raid1/tv
Alternatively, if you want to avoid the maths, the same thing can be done with:
sudo lvresize -l +100%FREE /dev/raid1/tv
That just tells lvm to add 100% of the free space. I didn't try it myself, as I only found it after running the earlier command.
Now, after that has been run, to check the results, enter:
sudo lvdisplay
The results:
--- Logical volume ---
LV Name /dev/raid1/tv
VG Name raid1
LV UUID Dokbch-ZJkg-QmRW-d9vR-wfM8-BFxb-3Z0krs
LV Write Access read/write
LV Status available
# open 1
LV Size 2.65 TiB
Current LE 694917
Segments 2
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 252:0
and
sudo vgdisplay
gives:
--- Volume group ---
VG Name raid1
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 6
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 3
Open LV 2
Max PV 0
Cur PV 1
Act PV 1
VG Size 2.73 TiB
PE Size 4.00 MiB
Total PE 715397
Alloc PE / Size 715397 / 2.73 TiB
Free PE / Size 0 / 0
VG UUID gvfheX-ifvl-yW9h-v4L2-eyzs-95fe-sng2oN
No free space shown; the lvm volume group is using the whole mdadm array, which in turn is using the whole of the two disks.
The final step for me was to grow the filesystem on the logical volume. I had formatted it with XFS, as it is good with large video files. XFS allows growing a filesystem while it is mounted, so the command used was:
sudo xfs_growfs -d /var/lib/mythtv
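To confirm the mounted filesystem now sees the extra space, a quick check:
df -h /var/lib/mythtv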
Finally, it is complete!