Corrupted LVM in degraded RAID 5


Some of my readers may remember that I have had an HP DL380 G5 in production for a while. This is the post where you will see how the disk array and I came to the edge of destroying the whole business.

I need to warn readers before describing this accident: do your own research. Please take this post as research material, not as a copy-and-paste tutorial. Your problem might be different from mine, or it might be solved easily with vgcfgrestore or something similar. Therefore, I am not taking responsibility for any damage you do to your system. Again: do your own research.

The machine runs XCP-ng 8.2 with an HP Smart Array E200 RAID controller. There are two arrays in the system: RAID 1 with 2 disks for the root filesystem and RAID 5 with 6 disks for VG_XenStorage. The partitioning has been in use since Citrix XenServer 7 was installed on the machine.

It is also worth noting that the machine runs company-internal services on old operating systems such as Microsoft Windows Server 2003 R2, as well as brand new services such as vtiger CRM on Ubuntu 16.04.

Each virtual machine has its own backup solution with automated scripts, but there is no automated backup of the virtual machines themselves. So yes, you are thinking correctly: losing the system overnight would have catastrophic effects on our business if I needed to restore the host machine.

I will write this article as a best effort at remembering my thought process. I need to thank every single netizen [1] on Stack Overflow and the CentOS IRC channel. Funnily enough, I could not find an answer on the official XCP-ng forum. You can take a look at my post on the forum. However, I will summarize here how the issue started.

Outage in the morning

The host machine has 2 power supplies in N+1 mode. One of them is backed by a UPS and the other one is connected directly, without a UPS. This setup had been working without issue since the machine was installed. That morning the UPS did not kick in and the machine crashed at 5 AM. The system booted up at 7 AM thanks to the automated power-on in HP iLO. However, none of the monitored machines had come up 20 minutes later. I logged in remotely and realized that the local storage was not accessible and xsconsole would not let me reattach or detach the storage. I wanted to check the health of the LVM, and I kept getting corrupted-metadata errors. Oh dear…

DISK6 showed as degraded when I looked in the iLO interface, so the RAID 5 array was degraded as well. At the same time, the RAID battery was in warning mode; however, there had been no alarms in iLO about the RAID battery for the past year. So this was obviously a hardware issue, and especially one with the RAID controller. The controller is an entry-level HP E200 RAID controller with the 128 MB Battery Backed Write Cache (BBWC) add-on; without it, the E200 would not let me build a RAID 5 array. It is worth mentioning that the SAS disks are hot-swappable; later in this post you will see how that can be useful.

Before moving on to fixing the hardware, I needed to know what state the LVM was in. XCP-ng is based on CentOS 7.5 and uses Xen as the hypervisor, instead of KVM as Proxmox does. Xen offers great advantages for running old operating systems: I am still running Windows 2003 R2 with no changes on the driver side. An additional note: XCP-ng had started in emergency mode.

There is nothing fancy in the LVM structure. Whenever I ran any LVM command, I immediately got lots of checksum errors in the metadata. Keep in mind that the issue is about the checksum of the metadata.

[root@sysrescue ~/Desktop]# pvscan
  /dev/sda3: Checksum error at offset 257024
  WARNING: invalid metadata text from /dev/sda3 at 257024.
  WARNING: metadata on /dev/sda3 at 257024 has invalid summary for VG.
  WARNING: bad metadata text on /dev/sda3 in mda1
  WARNING: scanning /dev/sda3 mda1 failed to read metadata summary.
  WARNING: repair VG metadata on /dev/sda3 with vgck --updatemetadata.
  WARNING: scan failed to get metadata summary from /dev/sda3 PVID x7Na0EdSN2QyRL0psbBd3MHmXZBpEPiG
  WARNING: Metadata location on /dev/sdb at 268800 begins with invalid VG name.
  WARNING: bad metadata text on /dev/sdb in mda1
  WARNING: scanning /dev/sdb mda1 failed to read metadata summary.
  WARNING: repair VG metadata on /dev/sdb with vgck --updatemetadata.
  WARNING: scan failed to get metadata summary from /dev/sdb PVID 8ivdRed8EqB7bAiJAeWCK65wJ1Nzlyvp
  WARNING: PV /dev/sda3 is marked in use but no VG was found using it.
  WARNING: PV /dev/sda3 might need repairing.
  WARNING: PV /dev/sdb is marked in use but no VG was found using it.
  WARNING: PV /dev/sdb might need repairing.
  PV /dev/sda3                      lvm2 [<95.20 GiB]
  PV /dev/sdb                       lvm2 [<683.51 GiB]
  Total: 2 [778.70 GiB] / in use: 0 [0   ] / in no VG: 2 [778.70 GiB]

Of course, taking an operating systems class had its own benefits, but I never thought I would use one simple piece of information: the metadata resides at the beginning of the hard disk. Could I simply delete the first few megabytes (I did not know how many) and recreate the LVM on top of it?
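To make "the beginning of the disk" a bit more concrete: in the LVM2 on-disk format, the PV label carries the string LABELONE and sits in one of the first four 512-byte sectors, with the text metadata following after it. The sketch below is a read-only check for that label. The function name has_lvm_label is made up for illustration, and the device path in the usage note is the one from this post.

```sh
# Read-only check: does the device (or image file) still carry an LVM2 label?
# Assumption: the label string "LABELONE" lives in the first four 512-byte sectors.
# has_lvm_label is a hypothetical helper name, not an LVM tool.
has_lvm_label() {
    # $1 = block device or image file; nothing is written to it
    dd if="$1" bs=512 count=4 2>/dev/null | grep -qa 'LABELONE'
}
```

For example, `has_lvm_label /dev/sda3 && echo "label present"` would report whether LVM can still recognise the partition as a PV. Zeroing those first 2 KiB is enough to make the check fail, which is why a very small wipe already makes LVM treat the disk as unused.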

Wait a second. /dev/sda3 is pv0, and pv0 is on DISK0 and DISK1, but DISK6 is the nonfunctional one? So things were going from weird to too weird.

I also needed to make sure that I had a valid LVM backup; otherwise there was no point in investigating further and I should start restoring from backups. LVM backups are made automatically (this is distribution-specific) and stored under /etc/lvm/backup/. Well, the directory was empty… but there is also /etc/lvm/archive/, and there was a file in there whose name starts with VG_XenStorage. Inside that file was a beauty that might save my day:

# Generated by LVM2 version 2.02.180(2)-RHEL7 (2018-07-20): Fri Mar 19 10:26:38 2021

contents = "Text Format Volume Group"
version = 1

description = "Created *after* executing '/sbin/lvremove -f /dev/VG_XenStorage-654eedce-8bcc-1839-4824-1b969a770c1e/VHD-9993a196-c973-43ad-b9c1-da49e4004804'"

creation_host = "xenserver-wwirrdji"	# Linux xenserver-wwirrdji 4.19.0+1 #1 SMP Wed Feb 24 12:42:33 CET 2021 x86_64
creation_time = 1616149598	# Fri Mar 19 10:26:38 2021

VG_XenStorage-654eedce-8bcc-1839-4824-1b969a770c1e {
	id = "ExBnbF-l6de-U0jT-8ltY-yBzP-xxxx-xxxxxx"
	seqno = 154
	format = "lvm2"			# informational
	status = ["RESIZEABLE", "READ", "WRITE"]
	flags = []
	extent_size = 8192		# 4 Megabytes
	max_lv = 0
	max_pv = 0
	metadata_copies = 0

	physical_volumes {

		pv0 {
			id = "x7Na0E-dSN2-QyRL-0psb-Bd3M-HmXZ-BpEPiG"
			device = "[unknown]"	# Hint only

			status = ["ALLOCATABLE"]
			flags = ["MISSING"]
			dev_size = 199643231	# 95.1973 Gigabytes
			pe_start = 22528
			pe_count = 24367	# 95.1836 Gigabytes
		}

		pv1 {
			id = "8ivdRe-d8Eq-B7bA-iJAe-WCK6-5wJ1-Nzlyvp"
			device = "/dev/sdb"	# Hint only

			status = ["ALLOCATABLE"]
			flags = []
			dev_size = 1433416496	# 683.506 Gigabytes
			pe_start = 22528
			pe_count = 174974	# 683.492 Gigabytes
		}

	}

	logical_volumes {
		# omitted because we really don't need to know
	}
}

I skimmed through the metadata, and it also confirms that the physical volume pv0 is missing.
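The numbers in the archive are easier to reason about in MiB. LVM's text metadata gives extent_size, pe_start and dev_size in 512-byte sectors, and the file's own comment (8192 is 4 Megabytes) agrees with that. The sketch below converts them, and checks whether an offset such as the 257024 reported by pvscan falls before the first data extent. Treating that offset as a byte count is an assumption here, and both function names are hypothetical.

```sh
# Convert a count of 512-byte sectors to whole MiB (integer arithmetic).
sectors_to_mib() {
    echo $(( $1 * 512 / 1024 / 1024 ))
}

# Succeeds if a byte offset lies before the first physical extent.
# $1 = offset in bytes (assumed), $2 = pe_start in sectors
in_metadata_region() {
    [ "$1" -lt $(( $2 * 512 )) ]
}
```

With the values above, `sectors_to_mib 22528` gives 11 and `sectors_to_mib 8192` gives 4, so the data extents start 11 MiB into each PV and everything before that is label and metadata. `in_metadata_region 257024 22528` succeeds, which fits the picture of damaged metadata rather than damaged data.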

So I wrote what I had in the IRC channel, and another CentOS user also recommended deleting the first 2 megabytes. Web search results also suggest that deleting 2 MB from the beginning of the HDD can be the start of the recovery, since LVM will then think the HDD is not in use.

But before running a single command, I backed up the first 100 MB of each disk. I copied the VG_XenStorage file somewhere else, removed the "MISSING" flag by hand, and then added /dev/sda3. Then I ran the first destructive commands:

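# Wipe the first 2 KiB of the PV (as written: bs=1k count=2), which is where the LVM label lives
dd if=/dev/zero bs=1k count=2 of=/dev/sda3
# Recreate the PV with its original UUID, taken from the archived metadata
pvcreate --uuid x7Na0E-dSN2-QyRL-0psb-Bd3M-HmXZ-BpEPiG \
         --restorefile VG_XenStorage-654eedce-8bcc-1839-4824-1b969a770c1e \
         /dev/sda3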

I also applied these to /dev/sdb with its own UUID.
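For reference, the preparation done before these destructive commands (saving the first 100 MB of each disk and editing a copy of the archive file) can be sketched as follows. This is a reconstruction under assumptions, not the exact commands from the console history: both function names are hypothetical, and the sed expressions assume the flag and device lines look exactly as they do in the archive shown above.

```sh
# Save the first N MiB of a device to a file (default 100 MiB).
# $1 = source device, $2 = backup file, $3 = MiB to copy (optional)
backup_head() {
    dd if="$1" of="$2" bs=1M count="${3:-100}" 2>/dev/null
}

# Write an edited copy of the archive: clear the MISSING flag and replace
# the "[unknown]" device hint. Note that this touches every PV entry that
# matches; in the archive above only pv0 does.
# $1 = original archive, $2 = edited copy, $3 = device hint, e.g. /dev/sda3
clear_missing_flag() {
    sed -e 's/flags = \["MISSING"]/flags = []/' \
        -e "s|device = \"\\[unknown]\"|device = \"$3\"|" "$1" > "$2"
}
```

Typical use would be `backup_head /dev/sda3 sda3-head.img` and `backup_head /dev/sdb sdb-head.img` (the image file names are made up), followed by `clear_missing_flag` on a copy of the archive. The original archive stays untouched, so the edit can always be redone.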

vgcfgrestore --force \
             --file VG_XenStorage-654eedce-8bcc-1839-4824-1b969a770c1e \
             VG_XenStorage-654eedce-8bcc-1839-4824-1b969a770c1e

vgcfgrestore told me that the restoration had failed, but pvscan and vgscan showed me every single bit with no metadata checksum errors.
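Since vgcfgrestore and the scan commands disagreed here, it helps to check the scan output mechanically rather than by eye. The sketch below succeeds only if the scan output contains none of the checksum errors or warnings seen earlier. The function name scan_is_clean is hypothetical, and the two patterns are simply the strings that appear in the pvscan output above.

```sh
# Reads scan output on stdin; succeeds only if no checksum error
# or WARNING line is present.
scan_is_clean() {
    ! grep -Eq 'Checksum error|WARNING'
}
```

For example, `pvscan 2>&1 | scan_is_clean && echo "no metadata warnings"`. A clean scan only shows that the metadata parses again; it says nothing about the data inside the logical volumes.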

I then rebooted the system and opened up xsconsole. Every single server and service had already started.

I saved the console history, which I used to remember the steps I took. If you see something missing there, that may be because I jumped into the USB rescue system or ran some commands in Vi. I also have a list of actions taken, which I used as a reference today (even drinking coffee is noted).

Conclusion

Running multiple old disks in an array is not a great idea. Do not even try to keep an old disk in your system. LVM is amazing and complex. RAID also makes things harder by hiding information about the HDDs. Therefore, the next system I build will either use an advanced filesystem like btrfs or zfs, or take advantage of tools like mdadm.

RAID controllers are great but HBA controllers are better.

This document is mostly about my journey, and I will add more details if any netizen is willing to tell me what could have been the cause, or whether this was the only way to solve it.

I would like to chat about this; you can find details on the contact page.


  1. A person who is part of a community on the internet