2009/07/07

Recover a disk failure for LVM (CentOS)

I've put together a download and file-sharing box at home from old hardware pieces. It has 4 old disks (30G, 250G, 500G, 80G) combined. Knowing that disks reach the end of their life sooner or later, I rsync the important stuff to the Mac next to it. A couple of weeks ago, shit finally happened.

The broken disk is the 30G one, so most of the data is still on the other disks. To recover, I put in a new disk and re-installed CentOS so the box is bootable again. Now I need to re-activate and mount the old disks.

Re-install CentOS
1. Put in the new disk, go into the BIOS, and make sure the CD-ROM has first boot priority and the new disk is the boot disk in the Hard Disk boot priority setting.
2. Put in the CentOS DVD and boot from it. During the install, make sure you don’t re-partition the old disks.
3. If you are not sure what to do, you can unplug the power cords of all the old disks and install on the new disk first. Once the installation is finished, plug the old disks back in, boot from the DVD again, and reset the boot loader.
4. Now you’ve got a bootable box with all the disks in place.

Red Hat family systems such as RHEL, CentOS, or Fedora partition the disks automatically at install time, and by default the installer puts the root device on LVM.
It sets up a volume group called VolGroup00 with two logical volumes: LogVol00 for the root directory and LogVol01 for swap.
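
LVM exposes each logical volume under two device paths, which is why the messages further down mention both /dev/VolGroup00/... and /dev/mapper/VolGroup00-... names. As a minimal sketch (lv_paths is a helper name I made up, not an LVM tool), this is how the two paths follow from the volume group and logical volume names; device-mapper doubles any hyphen inside a name so the single hyphen joining the two stays unambiguous:

```shell
#!/bin/sh
# lv_paths is a hypothetical helper, not part of LVM.
# Prints the two device paths under which a logical volume shows up.
lv_paths() {
    vg=$1
    lv=$2
    # Symlink path: /dev/<vg>/<lv>
    echo "/dev/$vg/$lv"
    # Device-mapper path: hyphens inside a name are doubled,
    # then the two names are joined with a single hyphen.
    dm_vg=$(printf '%s' "$vg" | sed 's/-/--/g')
    dm_lv=$(printf '%s' "$lv" | sed 's/-/--/g')
    echo "/dev/mapper/$dm_vg-$dm_lv"
}

lv_paths VolGroup00 LogVol00
lv_paths VolGroup00 LogVol01
```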

Since the new disk is big enough, I set it up with plain partitions rather than adding it to an LVM volume. You can see this in the partition layout of the physical disks below (/dev/sda is the new disk):
[root@xxx]# sfdisk -l

Disk /dev/hda: 19457 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/hda1 * 0+ 19456 19457- 156288321 8e Linux LVM
/dev/hda2 0 - 0 0 0 Empty
/dev/hda3 0 - 0 0 0 Empty
/dev/hda4 0 - 0 0 0 Empty

Disk /dev/hdc: 24792 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/hdc1 * 0+ 24791 24792- 199141708+ 8e Linux LVM
/dev/hdc2 0 - 0 0 0 Empty
/dev/hdc3 0 - 0 0 0 Empty
/dev/hdc4 0 - 0 0 0 Empty

Disk /dev/sda: 38913 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/sda1 * 0+ 12 13- 104391 83 Linux
/dev/sda2 267 38912 38646 310423995 83 Linux
/dev/sda3 13 266 254 2040255 82 Linux swap / Solaris
/dev/sda4 0 - 0 0 0 Empty

Disk /dev/sdb: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/sdb1 * 0+ 60800 60801- 488384001 8e Linux LVM
/dev/sdb2 0 - 0 0 0 Empty
/dev/sdb3 0 - 0 0 0 Empty
/dev/sdb4 0 - 0 0 0 Empty
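
To pick out which partitions belong to the old LVM setup, the listing above can be filtered for partition type 8e. This is only a sketch: lvm_partitions is a hypothetical helper of mine, and it assumes the column layout shown in this post, where each block is 1024 bytes.

```shell
#!/bin/sh
# lvm_partitions is a hypothetical helper: it reads `sfdisk -l` style text
# on stdin and prints every partition of type 8e (Linux LVM) with its size.
lvm_partitions() {
    awk '
        NF >= 4 && $(NF-2) == "8e" && $(NF-1) == "Linux" && $NF == "LVM" {
            # #blocks is the 4th field from the end; "+0" drops a trailing "+"
            blocks = $(NF-3) + 0
            # blocks are 1024 bytes each, so GiB = blocks / 1024^2
            printf "%s %.1f GiB\n", $1, blocks / 1048576
        }'
}

# Sample lines copied from the listing above; on the real box you could
# pipe in the output of `sfdisk -l` instead.
lvm_partitions <<'EOF'
/dev/hda1 * 0+ 19456 19457- 156288321 8e Linux LVM
/dev/sda2 267 38912 38646 310423995 83 Linux
/dev/sdb1 * 0+ 60800 60801- 488384001 8e Linux LVM
EOF
```

Adding up the LVM partitions and comparing the total with the sizes lvscan reports is one rough way to estimate how much of the volume group sat on the missing disk.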


To re-activate and re-mount
vgchange is the utility for changing volume group attributes. Since one of the disks in the old volume group is missing, I need to run vgchange -ay --partial to activate the volume group.

[root@xxx]# vgchange -ay --partial
Couldn't find device with uuid 'AiegxE-NfQy-AA5E-OUcn-bvpY-UG3o-5oT1gY'.
/dev/mapper/VolGroup00-LogVol00-missing_3_0: read failed after 0 of 4096 at 0: Input/output error
/dev/mapper/VolGroup00-LogVol01-missing_0_0: read failed after 0 of 4096 at 0: Input/output error
/dev/VolGroup00/LogVol01: read failed after 0 of 4096 at 0: Input/output error
Couldn't find device with uuid 'AiegxE-NfQy-AA5E-OUcn-bvpY-UG3o-5oT1gY'.
2 logical volume(s) in volume group "VolGroup00" now active
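
The same UUID shows up in every "Couldn't find device" line, which suggests only one physical volume is missing. A small sketch to confirm that from captured output (missing_pv_uuids is a hypothetical helper name, and I am assuming the message wording shown above):

```shell
#!/bin/sh
# missing_pv_uuids is a hypothetical helper: it reads LVM messages on stdin
# and prints each distinct UUID that is reported as missing.
missing_pv_uuids() {
    sed -n "s/.*Couldn't find device with uuid '\([^']*\)'.*/\1/p" | sort -u
}

# Sample lines copied from the output above. On the real box the messages
# could be captured with something like: vgchange -ay --partial 2>&1
missing_pv_uuids <<'EOF'
Couldn't find device with uuid 'AiegxE-NfQy-AA5E-OUcn-bvpY-UG3o-5oT1gY'.
/dev/VolGroup00/LogVol01: read failed after 0 of 4096 at 0: Input/output error
Couldn't find device with uuid 'AiegxE-NfQy-AA5E-OUcn-bvpY-UG3o-5oT1gY'.
EOF
```

If this prints more than one UUID, more than one disk is gone and less of the data is likely to be readable.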


Well, there are a lot of errors due to the missing disk, but the volume group is active now. You can verify with:

[root@xxx]# lvscan
/dev/VolGroup00/LogVol00: read failed after 0 of 4096 at 892413607936: Input/output error
/dev/VolGroup00/LogVol00: read failed after 0 of 4096 at 892413665280: Input/output error
/dev/mapper/VolGroup00-LogVol01-missing_0_0: read failed after 0 of 4096 at 2080309248: Input/output error
/dev/mapper/VolGroup00-LogVol01-missing_0_0: read failed after 0 of 4096 at 4096: Input/output error
/dev/mapper/VolGroup00-LogVol01-missing_0_0: read failed after 0 of 4096 at 0: Input/output error
/dev/VolGroup00/LogVol01: read failed after 0 of 4096 at 2080309248: Input/output error
/dev/VolGroup00/LogVol01: read failed after 0 of 4096 at 2080366592: Input/output error
Couldn't find device with uuid 'AiegxE-NfQy-AA5E-OUcn-bvpY-UG3o-5oT1gY'.
/dev/mapper/VolGroup00-LogVol00-missing_3_0: read failed after 0 of 4096 at 0: Input/output error
/dev/VolGroup00/LogVol01: read failed after 0 of 4096 at 0: Input/output error
Couldn't find device with uuid 'AiegxE-NfQy-AA5E-OUcn-bvpY-UG3o-5oT1gY'.
/dev/mapper/VolGroup00-LogVol00-missing_3_0: read failed after 0 of 4096 at 0: Input/output error
/dev/VolGroup00/LogVol01: read failed after 0 of 4096 at 0: Input/output error
Couldn't find device with uuid 'AiegxE-NfQy-AA5E-OUcn-bvpY-UG3o-5oT1gY'.
ACTIVE '/dev/VolGroup00/LogVol00' [831.12 GB] inherit
ACTIVE '/dev/VolGroup00/LogVol01' [1.94 GB] inherit


Now you can mount the active volume by:
[root@xxx]# mount /dev/VolGroup00/LogVol00 /mnt

The old volume is now accessible at /mnt
[root@xxx]# ls -al /mnt
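
Since part of the volume is missing, a more cautious variant is to mount it read-only and copy the data off right away. The sketch below is not what I ran at the time: run and recover_copy are hypothetical helpers, the paths are placeholders, and by default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical helpers, not from the original recovery.
# run prints the command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mount the logical volume read-only, then copy one directory off it.
recover_copy() {
    lv=$1      # e.g. /dev/VolGroup00/LogVol00
    mnt=$2     # mount point, e.g. /mnt
    src=$3     # directory inside the volume, relative to the mount point
    dest=$4    # destination directory on a healthy disk
    run mount -o ro "$lv" "$mnt" || return 1
    # -a keeps permissions and timestamps; the trailing slash copies the
    # directory contents rather than the directory itself.
    run rsync -a "$mnt/$src/" "$dest/"
}

# "home" and /recovered/home are placeholder paths.
recover_copy /dev/VolGroup00/LogVol00 /mnt home /recovered/home
```

Files whose blocks lived on the dead disk will most likely fail with I/O errors during the copy, so I would expect rsync to finish with a non-zero exit status even when most of the data comes across.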

What's Next
I copied the stuff I wanted to recover. What to do with the old drives? I need to think about it... Keeping an old box on all the time is actually costly (in terms of the electricity bill), but having a box that is always on comes in handy on many occasions.

Ref:
http://www.linuxjournal.com/article/8874
http://fedoraforum.org/forum/archive/index.php/t-64964.html
