We have a very large monolithic database that we recently migrated to Amazon EC2. The DB has been growing fast these past weeks, and we were about to hit the 1TB cap for EBS volumes on Amazon EC2. I’m not sure why they have this limit, given the size of Amazon’s infrastructure, or why Amazon doesn’t offer an official alternative (like RAID0 striping via the web interface) to overcome it. I’m quite sure this problem affects a lot of customers, so attacking it would seem a priority to me.
Anyway, we need a logical volume that can hold over 1TB of data. Our first approach was to create several 1TB EBS disks and use symlinks for some directories. But this does not scale, and we would have had to look for another solution in the short term: one of our directories has over 600,000 files, totals almost 900GB, and grows about 20GB a week, so the 1TB limit would have been reached very soon.
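For context, the symlink approach looked roughly like this (a sketch with hypothetical paths; /mnt/disk2 and /var/data/images stand in for a second mounted EBS disk and one of our large directories):

```shell
root@backend:/mnt# mount /dev/sdg /mnt/disk2                   # hypothetical second 1TB EBS disk
root@backend:/mnt# mv /var/data/images /mnt/disk2/images       # move the large directory over
root@backend:/mnt# ln -s /mnt/disk2/images /var/data/images    # symlink it back into place
```

This works until a single directory outgrows a single 1TB disk, which is exactly where we were headed.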
We had two options: RAID or LVM. Although RAID is the more standard way of approaching these cases, I read a lot of posts on Amazon’s forums about issues with snapshotting and recovering the data. Also, growing a RAID array on the fly is not a trivial task. So we decided to use LVM, and it’s been really easy so far.
The first step is to create the EBS disks. In our case, we created two 1TB disks and attached them to our instance as /dev/sdf and /dev/sdh. Then we used the pvcreate utility to initialize these disks for use by LVM:
root@backend:/mnt# pvcreate /dev/sdf /dev/sdh
Then we create a volume group called vgebs. This volume group includes our two disks:
root@backend:/mnt# vgcreate vgebs /dev/sdf /dev/sdh
We then create the logical volume, striping it across both disks (-i 2) with a 2MB stripe size (-I 2M), and tell the LVM subsystem to use all the free space (in our case we just want one volume that spans the whole array of disks):
root@backend:/mnt# lvcreate -i 2 -I 2M -l 100%FREE -n lvebs vgebs
This will create a new device located at /dev/mapper/vgebs-lvebs. This is our LVM volume, and we can now create a filesystem and mount it:
root@backend:/mnt# mkfs.ext3 /dev/mapper/vgebs-lvebs
root@backend:/mnt# mount /dev/mapper/vgebs-lvebs /mnt/disk3
And that’s it, we now have an LVM volume with a size of 2.0TB:
root@backend:/mnt# df -h /mnt/disk3
/dev/mapper/vgebs-lvebs  2.0T   48M  2.0T   0% /mnt/disk3
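Growing the volume later, the part that is painful with RAID, is just a few commands with LVM. A sketch, assuming we attach two new 1TB EBS disks as /dev/sdi and /dev/sdj (hypothetical device names) to match the two-way stripe:

```shell
root@backend:/mnt# pvcreate /dev/sdi /dev/sdj               # initialize the new disks for LVM
root@backend:/mnt# vgextend vgebs /dev/sdi /dev/sdj         # add them to the volume group
root@backend:/mnt# lvextend -l +100%FREE vgebs/lvebs        # grow the logical volume onto them
root@backend:/mnt# resize2fs /dev/mapper/vgebs-lvebs        # grow the filesystem (ext3 supports online resize)
```

We add disks in pairs here so the new extents can keep the same two-way striping as the original volume.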
There are a couple of things to note: when snapshotting the EBS volumes we must make sure that all members of the LVM array are in a consistent state, that is, there can be no writes in flight on any of the disks while we snapshot any of them. To accomplish this we must suspend all operations on the volume, snapshot, and resume operations:
root@backend:/mnt# dmsetup suspend vgebs-lvebs
root@backend:/mnt# python /opt/bigjocker/manage_snapshots.py vol-9e7a4886 3 'LVM disk 1'
root@backend:/mnt# python /opt/bigjocker/manage_snapshots.py vol-8454fc1c 3 'LVM disk 2'
root@backend:/mnt# dmsetup resume vgebs-lvebs
The snapshots will then be consistent with each other, and the LVM array can be rebuilt with lvscan and mounted on another instance (never the same instance! unless you’ve detached the original LVM array, and even then you should use another instance) using two EBS volumes created from the snapshots.
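To make that recovery concrete, here is a rough sketch, assuming the two volumes created from the snapshots have been attached to a fresh instance (the prompt and mount point are illustrative):

```shell
root@recovery:~# pvscan                                      # detect the physical volumes on the attached disks
root@recovery:~# vgscan                                      # find the vgebs volume group they belong to
root@recovery:~# vgchange -ay vgebs                          # activate its logical volumes
root@recovery:~# lvscan                                      # confirm lvebs is now ACTIVE
root@recovery:~# mount /dev/mapper/vgebs-lvebs /mnt/disk3    # mount the recovered volume
```

Because the snapshots were taken while the volume was suspended, the filesystem should mount cleanly, but running a read-only fsck first costs little.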