Recently I shot myself in the foot pretty badly. We have a ~12TB data array that was set up as a raw LVM2 device, with no partition table. There were some issues with one of the cluster members, so I booted into rescue mode to try to correct them. Red Hat’s rescue boot, in the name of spreading democracy, proactively offers to put a partition table on anything that doesn’t have one already, and I fat-fingered the key that made this happen right over the top of our array. Whoops.
This isn’t impossible to recover from. For the most part, the only thing that got overwritten was LVM metadata, and it so happens that the boundaries of the volumes are more or less known (25% each), so recovery should be possible just by rewriting correct (or near-correct) headers that set the boundaries in the right spots. The problem with this approach is that you get one shot, and it’s not like we have another 12TB sitting around that we can copy the data to. Here’s where device-mapper came in to save my ass.
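To make the "25% each" guess concrete, here is a rough sketch of the arithmetic, assuming the device size is known in 512-byte sectors. The device size and extent size below are made-up illustration values, not the real array's numbers, and `quarter_starts` is a hypothetical helper. Real LVM boundaries also depend on where the metadata area ends, which this ignores, so treat the output as a first guess only.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
DEVSIZE=25769803776      # device size in 512-byte sectors (12 TiB)
EXTENT=8192              # assumed extent size in sectors (4 MiB)

# Print the start sector of each of 4 equal volumes, with the
# per-volume size rounded down to a multiple of the extent size.
quarter_starts() {
    size=$1; extent=$2
    quarter=$(( size / 4 / extent * extent ))
    i=0
    while [ "$i" -lt 4 ]; do
        echo $(( i * quarter ))
        i=$(( i + 1 ))
    done
}

quarter_starts "$DEVSIZE" "$EXTENT"
```

Each guess like this is exactly the kind of thing you want to try against a throwaway copy rather than the real device.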
Using device-mapper’s snapshot target, you can create a working copy of a large device that you want to test a few smallish changes on, like, for example, rewriting the partition table and fscking.
The snapshot setup actually uses two separate device-mapper targets: snapshot-origin, which wraps the device that you want to take the snapshot of, and snapshot, which becomes the device that you’ll be able to make your ephemeral changes to. The snapshot device requires an origin device as well as a device to hold the written changes. For the latter, it’s easy to use a file-backed loopback device.
First, we set up the snapshot-origin device:
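# size of the damaged array in 512-byte sectors (--getsize is deprecated and 32-bit)
DEVSIZE=`sudo /sbin/blockdev --getsz /dev/sdb`
# plain linear mapping over the whole array (/dev/sdb assumed, matching the blockdev call)
sudo /sbin/dmsetup create rescue-base-real --table \
"0 $DEVSIZE linear /dev/sdb 0"
# snapshot-origin stacked on top of the linear mapping
sudo /sbin/dmsetup create rescue-base --table \
"0 $DEVSIZE snapshot-origin /dev/mapper/rescue-base-real"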
Next, we make a file as large as we expect the changes to be and create a loopback device for it:
dd if=/dev/zero of=/tmp/backing-store bs=1048576 count=1024 # 1GB
# get name of first available loopback
BACKINGDEV=`sudo /sbin/losetup -f`
sudo /sbin/losetup -f /tmp/backing-store
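If writing a gigabyte of zeros is slow, or space is tight, a sparse file is one alternative. This is a sketch, not what I actually used; the temporary path is a placeholder, `stat -c` assumes GNU coreutils, and it only helps if the filesystem holding the file supports holes.

```shell
#!/bin/sh
# Create a 1 GiB sparse file: seek past 1024 blocks of 1 MiB and write nothing.
BACKING=$(mktemp /tmp/backing-store.XXXXXX)   # placeholder path
dd if=/dev/zero of="$BACKING" bs=1048576 count=0 seek=1024 2>/dev/null

# Apparent size in bytes vs. blocks actually allocated on disk.
stat -c 'apparent=%s bytes, allocated=%b blocks' "$BACKING"
```

The file would then be attached with losetup the same way as the dd-created one. Keep in mind that it can still grow to its full size as changes pile up in the snapshot.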
Now, finally, create the snapshot device:
sudo /sbin/dmsetup create rescue-snap --table \
"0 $DEVSIZE snapshot /dev/mapper/rescue-base $BACKINGDEV n 8"
The last two arguments to the above command tell device-mapper that this is a non-persistent snapshot (the record of changes does not survive the device being torn down), and to use a chunk size of 8 sectors (4KB) for copy-on-write operations.
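Device-mapper table values are counted in 512-byte sectors, the chunk size included, so a quick conversion helps when reading or writing a table line. A small sketch of that arithmetic; both helper names are hypothetical:

```shell
#!/bin/sh
# Convert a count of 512-byte sectors to bytes.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# How many copy-on-write chunks fit in a backing store of a given size.
# Ignores the space the snapshot needs for its own bookkeeping,
# so treat the result as an upper bound.
chunks_in_store() {
    store_bytes=$1; chunk_sectors=$2
    echo $(( store_bytes / (chunk_sectors * 512) ))
}

sectors_to_bytes 8                 # the chunk size used above
chunks_in_store 1073741824 8       # the 1GB backing store
```

If the changes outgrow the backing store, the snapshot becomes invalid, so it pays to size the file generously for something like a full fsck.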
Now you’ve got yourself a device you can play with that won’t cause any permanent changes if you guess the extents incorrectly, or if a fsck does the wrong thing. As an example, it took me about three tries to get the extents set up correctly for maximum data recovery. If I’d been working with the bare device, this wouldn’t have ended as well.
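When a guess goes wrong and you want to start over, the stack has to come down in the reverse order it went up. A sketch of that teardown, assuming the device names used above; `DRY_RUN` and the `run` helper are my own additions here so the commands can be previewed before running them for real, and the `/dev/loop0` default is only a stand-in for whatever losetup reported.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default here) commands are only printed, not executed.
DRY_RUN=${DRY_RUN:-1}
BACKINGDEV=${BACKINGDEV:-/dev/loop0}   # assumed; use the value from losetup

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"
    else
        sudo "$@"
    fi
}

teardown() {
    run /sbin/dmsetup remove rescue-snap        # snapshot first
    run /sbin/dmsetup remove rescue-base        # then the snapshot-origin
    run /sbin/dmsetup remove rescue-base-real   # then the linear mapping
    run /sbin/losetup -d "$BACKINGDEV"          # detach the loop device
    run rm -f /tmp/backing-store                # discard the recorded changes
}

teardown
```

Running it with `DRY_RUN=0` executes the commands through sudo. After that, recreating the backing store and the snapshot gives you a clean slate for the next attempt.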