LUKS superblock damaged by `mdadm --create` or user error?
From: Paul Menzel @ 2011-08-04 19:27 UTC
To: linux-raid
[-- Attachment #1: Type: text/plain, Size: 8046 bytes --]
Dear Linux RAID folks,
I hope I did not annoy you too much on #linux-raid and I am contacting
this list to reach a broader audience for help and for archival
purposes. My message to the list dm-crypt [1] was a little long and so
is this one. I am sorry.
After growing `/dev/sda2` using `fdisk` with mdadm not running, I
forgot to grow the RAID1. I probably overwrote the md metadata (0.90)
while growing the physical and logical LVM volumes and filesystems, or
made it unavailable because it was no longer at the end of the
partition.
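As far as I understand the 0.90 placement rule, the superblock lives
in the last 64 KiB-aligned 64 KiB block of the component device, so
after the partition was grown mdadm looked at the new end and found
nothing, while the old superblock was left somewhere in the middle. A
rough sketch of the arithmetic, assuming that rule:

    SECTORS=$(blockdev --getsz /dev/sda2)       # size in 512-byte sectors
    echo $(( ((SECTORS & ~127) - 128) * 512 ))  # superblock byte offset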
# blkid
/dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9"
TYPE="linux_raid_member"
/dev/sda2: UUID="cb6681b4-4dda-4548-8c59-3d9838de9c22"
TYPE="crypto_LUKS" # In `fdisk` I had set it to »Linux raid
autodetect« (0xfd) though.
I could not boot anymore because `/dev/md1` could not be assembled.
# mdadm --examine /dev/sda2
mdadm: No md superblock detected on /dev/sda2.
# mdadm --examine /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : fb7f3dc5:d183cab6:12123120:1a2207b9
Creation Time : Wed Mar 26 11:49:57 2008
Raid Level : raid1
Used Dev Size : 497856 (486.27 MiB 509.80 MB)
Array Size : 497856 (486.27 MiB 509.80 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Update Time : Wed Aug 3 21:11:43 2011
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 388e903a - correct
Events : 20332
Number Major Minor RaidDevice State
this 0 8 1 0 active sync /dev/sda1
0 0 8 1 0 active sync /dev/sda1
1 1 0 0 1 faulty removed
# mdadm --verbose --assemble /dev/md1 \
    --uuid=52ff2cf2:40981859:e58d8dd6:5faec42c
mdadm: looking for devices for /dev/md1
mdadm: no recogniseable superblock on /dev/dm-8
mdadm: /dev/dm-8 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-7
mdadm: /dev/dm-7 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-6
mdadm: /dev/dm-6 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-5
mdadm: /dev/dm-5 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-4
mdadm: /dev/dm-4 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-3
mdadm: /dev/dm-3 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-2
mdadm: /dev/dm-2 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-1
mdadm: /dev/dm-1 has wrong uuid.
mdadm: cannot open device /dev/dm-0: Device or resource busy
mdadm: /dev/dm-0 has wrong uuid.
mdadm: no recogniseable superblock on /dev/md0
mdadm: /dev/md0 has wrong uuid.
mdadm: cannot open device /dev/loop0: Device or resource busy
mdadm: /dev/loop0 has wrong uuid.
mdadm: cannot open device /dev/sdb4: Device or resource busy
mdadm: /dev/sdb4 has wrong uuid.
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has wrong uuid.
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: /dev/sda2 has wrong uuid.
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: /dev/sda1 has wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda has wrong uuid.
and `mdadm --examine /dev/sda2` could not find any metadata.
`/dev/sda2` could still be decrypted using `cryptsetup luksOpen
/dev/sda2 sda2_crypt`. Not knowing how 0.90 metadata is stored, I read
several Web resources, joined IRC channels, and came to the conclusion
that I should just create a new (degraded) RAID1 and everything would
be fine, since I had only one disk.
Booting from the live system Grml [3], which does *not* start `mdadm`
or `lvm` during boot, I tried to create a new RAID1 using the
following command (a).
# command (a)
mdadm --verbose --create /dev/md1 \
--assume-clean \
--level=1 \
--raid-devices=2 \
--uuid=52ff2cf2:40981859:e58d8dd6:5faec42c \
/dev/sda2 missing
I ignored the warning about overwriting metadata because it only
referred to booting. Unfortunately `cryptsetup luksOpen /dev/md1
md1_crypt` did not find any LUKS superblock. Therefore I stopped
`/dev/md1` and `cryptsetup luksOpen /dev/sda2 sda2_crypt` still
worked. Then I remembered that the metadata version was originally
0.90 and added `--metadata=0.90` and executed the following (b).
# command (b)
mdadm --verbose --create /dev/md1 \
--assume-clean \
--level=1 \
--raid-devices=2 \
--uuid=52ff2cf2:40981859:e58d8dd6:5faec42c \
--metadata=0.90 \
/dev/sda2 missing
Lucky me, I thought: `cryptsetup luksOpen /dev/md1 md1_crypt` asked me
for the passphrase, but I entered it three times and it would not
unlock. Instead of trying again – I do not know if it would have
worked – I tried `cryptsetup luksOpen /dev/sda2 sda2_crypt`, and it
asked for the passphrase too. The third time I seem to have entered it
correctly, but I got an error message that it could not be mapped.
--- dmesg ---
Aug 4 00:16:01 grml kernel: [ 7964.786362] device-mapper:
table: 253:0: crypt: Device lookup failed
Aug 4 00:16:01 grml kernel: [ 7964.786367] device-mapper:
ioctl: error adding target to table
Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6,
/dev/dm-0, 10) failed: No such file or directory
Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6,
/dev/dm-0, 10) failed: No such file or directory
Aug 4 00:17:14 grml kernel: [ 8038.196371] md1: detected
capacity change from 1999886286848 to 0
Aug 4 00:17:14 grml kernel: [ 8038.196395] md: md1 stopped.
Aug 4 00:17:14 grml kernel: [ 8038.196407] md: unbind<sda2>
Aug 4 00:17:14 grml kernel: [ 8038.212653] md: export_rdev(sda2)
--- dmesg ---
Then I realized that I had probably forgotten to stop `/dev/md1`.
After stopping it, `cryptsetup luksOpen /dev/sda2 sda2_crypt` did not
succeed anymore and I cannot access my data.
1. Does the `dmesg` output suggest that accessing `/dev/sda2` while
assembled caused any breakage?
2. On #lvm and #linux-raid the common explanation was that command (a)
had overwritten the LUKS superblock and damaged it. Is that possible?
I could not find the magic number 0xa92b4efc in the first megabyte of
`/dev/sda2`. Did `--assume-clean` prevent that?
3. Is command (b) to blame, or did it probably work and I had a typo
in the passphrase?
I am thankful for any hint to get my data back.
Thanks and sorry for the long message. Any hints on how to shorten it
next time are much appreciated.
Paul
PS: A month ago I had `dd`ed the contents of a 500 GB drive to this
one. That is why I wanted to resize the partitions. The old drive is
still functional, and I am attaching several outputs from commands run
on the current 2 TB drive and the old drive. The `luksDump` output is
from the current drive, but with the LUKS header from the 500 GB
drive. I know that I am publishing the key to access my drive, but if
it helps to get my data back I will encrypt from scratch again
afterward. I also have the dump of the first MB (in this case) of the
partition (`luksHeaderBackup`) from the old and the new drive, but
attaching them would exceed the message size limit.
[1] http://www.saout.de/pipermail/dm-crypt/2011-August/001857.html
[2] http://www.hermann-uwe.de/blog/resizing-a-dm-crypt-lvm-ext3-partition
[3] http://grml.org/
[-- Attachment #2: 20110804--Nachricht-dm-crypt.txt --]
[-- Type: text/plain, Size: 9081 bytes --]
Accessing dm-crypt volume after failed resizing with mdadm/RAID1, dm-crypt/LUKS, LVM
Dear dm-crypt folks,
as you might guess, I am another lost soul turning to you as a last
resort to rescue his data. I am sorry for the long text, but I am
trying to be as thorough as possible.
I have a RAID1 (mirroring) setup where only one drive is in the array,
though. It is set up as `/dev/md0` ← `/dev/sda1` and `/dev/md1` ←
`/dev/sda2`. `/dev/md1` is encrypted with LUKS and contains an LVM
setup with the logical volumes used by `/home/` and `/root/`.
A month ago the 500 GB drive was replaced by a 2 TB drive, and I
copied all the data from the old to the new drive with `dd_rescue`
without any errors. As a consequence only the old size of 500 GB is
usable, and the partitions have to be resized/grown to use the whole
2 TB. But to emphasize it again: I still have the old drive available.
I wanted to resize the partitions today, so I followed the guide from
Uwe Hermann [1], which I had also done some years ago, when it worked
without any problem.
So I booted from a USB medium with Grml 2011.05 (cryptsetup 1.3.0) and
followed the steps from the guide. Please note that Grml by default
does not assemble any RAIDs, which means `mdadm` was not run. (And
having only one drive, I did not consider that the RAID1 might need to
be taken care of too.)
1. `fdisk /dev/sda`
2. Remove the second partition.
3. Create a new partition with the same start (fdisk proposed cylinder
63 automatically, since there are only two partitions and this was the
second one).
4. Choose the proposed end sector, which was the maximum.
5. Choose the type »Linux raid autodetect« (0xfd).
6. Save with `w`.
Afterward I did `cryptsetup luksOpen /dev/sda2 foo`, `pvresize
/dev/mapper/foo`, `service lvm2 start`, `lvresize -L +300GB
/dev/speicher/home`, and `lvresize -L +20GB /dev/speicher/other`. Then
I ran `fsck -f /dev/speicher/home` and `xfs_check /mnt/other_mounted`,
and there were no errors at all. After `resize2fs /dev/speicher/home`
and `xfs_resize /mnt/other_mounted`(?), I rebooted – only to be
surprised that I was not asked for the LUKS passphrase when booting
into Debian. I only saw `evms_activate is not available`.
Then I booted with Grml again to recreate the initrd image with
`update-initramfs -u`, thinking it needed to be updated too. I was
happy to see that I could still access `/dev/sda2` just fine using
`cryptsetup luksOpen /dev/sda2 sda2_crypt`, mount everything in it
(`service lvm2 start` activated all volumes), and use `chroot` [3] to
rebuild the initrd image. But updating the initrd image was to no
avail, although the `evms_activate is not available` message
disappeared.
I should probably also mention that I have had `mdadm` on hold on the
Debian system for quite some time because of some problems, and I did
not dare to touch it.
Anyway, I found out that the system was not able to assemble
`/dev/md1` from `/dev/sda2`. This did not work under Grml either, and
`mdadm` could not find the md superblock on `/dev/sda2`.
# mdadm --examine /dev/sda2
mdadm: No md superblock detected on /dev/sda2.
# blkid
/dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9" TYPE="linux_raid_member"
/dev/sda2: UUID="cb6681b4-4dda-4548-8c59-3d9838de9c22" TYPE="crypto_LUKS" # different UUID than before and “wrong” type
# cryptsetup luksOpen /dev/sda2 sda2_crypt # still worked
On `#debian` somebody told me that the md superblock is stored at the
end of the partition, that it was probably overwritten when enlarging
the partition, and that I should have grown the RAID too.
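If the placement rule mentioned on `#debian` is right, the remains of
the old superblock should sit where the *old* end of the partition
was – unless the later LVM and filesystem growth overwrote them. A
hedged check, using the old partition size of 975772035 sectors from
the old drive's sfdisk dump:

    OLD_SECTORS=975772035
    dd if=/dev/sda2 skip=$(( (OLD_SECTORS & ~127) - 128 )) count=8 \
        2>/dev/null | od -A x -t x1 | head -n 1
    # an intact superblock would start with fc 4e 2b a9
    # (0xa92b4efc in x86 host byte order)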
Searching the Internet for help I found several suggestions and I tried to recreate the RAID with the following command.
# mdadm --create /dev/md1 --assume-clean --uuid=52ff2cf2-4098-1859-e58d-8dd65faec42c /dev/sda2 missing
I got a warning that the array would have metadata at the start of the
device and might not be suitable as a boot device, and that for
`/boot` I should use `--metadata=0.90` instead. Since it was not used
for `/boot`, I chose to go on. The RAID was created, but `cryptsetup
luksOpen /dev/md1 md1_crypt` said it was not a LUKS device. Therefore
I stopped the RAID, and `cryptsetup luksOpen /dev/sda2 sda2_crypt`
still worked.
Then I was told on IRC that with only one drive in a RAID1 it does not
matter whether you alter `/dev/sda2` or `/dev/md1`, and that I should
try to create the RAID again. Remembering that before the resizing the
RAID metadata (also on /dev/sda1) was `0.90`, I passed
`--metadata=0.90` to the `mdadm --create` command.
# mdadm --create /dev/md1 --assume-clean --metadata=0.90 --uuid=52ff2cf2-4098-1859-e58d-8dd65faec42c /dev/sda2 missing
I got a warning that the device was already part of a RAID; I ignored
it and went on. At first I was happy because
# cryptsetup luksOpen /dev/md1 md1_crypt
worked and asked me for the passphrase. But I typed the correct
passphrase several times and it was rejected. Then, probably having
forgotten to stop the RAID, I found that
# cryptsetup luksOpen /dev/sda2 sda2_crypt
showed the same behavior – though that was probably typos, as it
seemed to work once. But then I got the following error messages,
which made me realize the RAID was probably still running, and I
stopped it right away.
Aug 4 00:16:01 grml kernel: [ 7964.786362] device-mapper: table: 253:0: crypt: Device lookup failed
Aug 4 00:16:01 grml kernel: [ 7964.786367] device-mapper: ioctl: error adding target to table
Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6, /dev/dm-0, 10) failed: No such file or directory
Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6, /dev/dm-0, 10) failed: No such file or directory
Aug 4 00:17:14 grml kernel: [ 8038.196371] md1: detected capacity change from 1999886286848 to 0
Aug 4 00:17:14 grml kernel: [ 8038.196395] md: md1 stopped.
Aug 4 00:17:14 grml kernel: [ 8038.196407] md: unbind<sda2>
Aug 4 00:17:14 grml kernel: [ 8038.212653] md: export_rdev(sda2)
After that `cryptsetup luksOpen /dev/sda2 sda2_crypt` always failed.
Now wanting to be smart I saved the LUKS header
# cryptsetup luksHeaderBackup /dev/sda2 --header-backup-file /home/grml/20110804--031--luksHeaderBackup
shut the system down, connected the old drive, booted Grml, saved the
LUKS header from `/dev/sda2` of the 500 GB drive, switched the drives
again, and restored the old header from before the resizing to the new
drive.
# cryptsetup luksHeaderRestore /dev/sda2 --header-backup-file /home/grml/20110804--031--luksHeaderBackup
only to find out that this did not help either. I have some system
information from the late recovery attempts, and I still have the old
500 GB drive. Is there any way to recover the data?
The current situation is that `luksOpen` does not succeed on
`/dev/md1` or `/dev/sda2`; that is, each is detected as a LUKS device,
but the passphrase is not accepted (I even typed it in the clear on
the console and copied it into the prompt).
### New drive ###
# mdadm --examine /dev/sda2
/dev/sda2:
Magic : a92b4efc
Version : 0.90.00
UUID : 52ff2cf2:40981859:d8b78f65:99226e41 (local to host grml)
Creation Time : Thu Aug 4 00:05:57 2011
Raid Level : raid1
Used Dev Size : 1953013952 (1862.54 GiB 1999.89 GB)
Array Size : 1953013952 (1862.54 GiB 1999.89 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Update Time : Thu Aug 4 00:05:57 2011
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : bf78bfbf - correct
Events : 1
Number Major Minor RaidDevice State
this 0 8 2 0 active sync /dev/sda2
0 0 8 2 0 active sync /dev/sda2
1 0 0 0 0 spare
# blkid
/dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9" TYPE="linux_raid_member"
/dev/sda2: UUID="52ff2cf2-4098-1859-d8b7-8f6599226e41" TYPE="linux_raid_member"
### Old drive ###
/dev/sda2:
Magic : a92b4efc
Version : 0.90.00
UUID : 52ff2cf2:40981859:e58d8dd6:5faec42c
Creation Time : Wed Mar 26 11:50:04 2008
Raid Level : raid1
Used Dev Size : 487885952 (465.28 GiB 499.60 GB)
Array Size : 487885952 (465.28 GiB 499.60 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 1
Update Time : Sat Jun 18 14:25:10 2011
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 380692fc - correct
Events : 25570832
Number Major Minor RaidDevice State
this 0 8 2 0 active sync /dev/sda2
0 0 8 2 0 active sync /dev/sda2
1 1 0 0 1 faulty removed
Please tell me what other information you need.
Thanks in advance,
Paul
PS: Please excuse this long message and the mistakes probably in it.
It is almost four in the morning, and after 10 hours of debugging I am
quite lost.
[1] http://www.hermann-uwe.de/blog/resizing-a-dm-crypt-lvm-ext3-partition
[2] http://grml.org/
[3] http://wiki.debian.org/DebianInstaller/Rescue/Crypto
[-- Attachment #3: 20110804--new-drive--blkid --]
[-- Type: application/octet-stream, Size: 241 bytes --]
/dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9" TYPE="linux_raid_member"
/dev/sda2: UUID="52ff2cf2-4098-1859-d8b7-8f6599226e41" TYPE="linux_raid_member"
/dev/sdb4: LABEL="grml64 2011.05" TYPE="iso9660"
/dev/loop0: TYPE="squashfs"
[-- Attachment #4: 20110804--new-drive--fdisk-l-sda --]
[-- Type: application/octet-stream, Size: 609 bytes --]
Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0009bf9c
Device Boot Start End Blocks Id System
/dev/sda1 * 1 62 497983+ fd Linux raid autodetect
Partition 1 does not start on physical sector boundary.
/dev/sda2 63 243201 1953014017+ fd Linux raid autodetect
Partition 2 does not start on physical sector boundary.
[-- Attachment #5: 20110804--new-drive--fdisk-s-sda1 --]
[-- Type: application/octet-stream, Size: 7 bytes --]
497983
[-- Attachment #6: 20110804--new-drive--fdisk-s-sda2 --]
[-- Type: application/octet-stream, Size: 11 bytes --]
1953014017
[-- Attachment #7: 20110804--new-drive--mdadm--examine-sda1 --]
[-- Type: application/octet-stream, Size: 790 bytes --]
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : fb7f3dc5:d183cab6:12123120:1a2207b9
Creation Time : Wed Mar 26 11:49:57 2008
Raid Level : raid1
Used Dev Size : 497856 (486.27 MiB 509.80 MB)
Array Size : 497856 (486.27 MiB 509.80 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Update Time : Wed Aug 3 21:39:58 2011
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 388e96dd - correct
Events : 20334
Number Major Minor RaidDevice State
this 0 8 1 0 active sync /dev/sda1
0 0 8 1 0 active sync /dev/sda1
1 1 0 0 1 faulty removed
[-- Attachment #8: 20110804--new-drive--mdadm--examine-sda2 --]
[-- Type: application/octet-stream, Size: 810 bytes --]
/dev/sda2:
Magic : a92b4efc
Version : 0.90.00
UUID : 52ff2cf2:40981859:d8b78f65:99226e41 (local to host grml)
Creation Time : Thu Aug 4 00:05:57 2011
Raid Level : raid1
Used Dev Size : 1953013952 (1862.54 GiB 1999.89 GB)
Array Size : 1953013952 (1862.54 GiB 1999.89 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Update Time : Thu Aug 4 00:05:57 2011
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : bf78bfbf - correct
Events : 1
Number Major Minor RaidDevice State
this 0 8 2 0 active sync /dev/sda2
0 0 8 2 0 active sync /dev/sda2
1 0 0 0 0 spare
[-- Attachment #9: 20110804--new-drive--sfdisk-d-sda --]
[-- Type: application/octet-stream, Size: 260 bytes --]
# partition table of /dev/sda
unit: sectors
/dev/sda1 : start= 63, size= 995967, Id=fd, bootable
/dev/sda2 : start= 996030, size=3906028035, Id=fd
/dev/sda3 : start= 0, size= 0, Id= 0
/dev/sda4 : start= 0, size= 0, Id= 0
[-- Attachment #10: 20110804--new-drive--with-header-from-old-drive--cryptsetup-luksDump --]
[-- Type: application/octet-stream, Size: 1537 bytes --]
# cryptsetup --verbose --debug luksDump /dev/sda2
# cryptsetup 1.3.0 processing "cryptsetup --verbose --debug luksDump /dev/sda2"
# Running command luksDump.
# Locking memory.
# Allocating crypt device /dev/sda2 context.
# Trying to open and read device /dev/sda2.
# Initialising device-mapper backend, UDEV is enabled.
# Detected dm-crypt version 1.10.0, dm-ioctl version 4.19.1.
# Trying to load LUKS1 crypt type from device /dev/sda2.
# Initialising gcrypt crypto backend.
# Reading LUKS header of size 1024 from device /dev/sda2
LUKS header information for /dev/sda2
Version: 1
Cipher name: aes
Cipher mode: cbc-essiv:sha256
Hash spec: sha1
Payload offset: 2056
MK bits: 256
MK digest: e9 d2 56 b7 87 5f 89 92 fd 2b 96 ea f0 7b 05 66 55 f6 c5 a5
MK salt: 19 1d ee 74 09 5b db 95 57 68 4c 7c f1 ef 96 5e
a9 eb 1d 57 7a e1 b7 b8 22 a9 97 23 a1 c9 b6 3a
MK iterations: 10
UUID: cb6681b4-4dda-4548-8c59-3d9838de9c22
Key Slot 0: ENABLED
Iterations: 75892
Salt: 0f 9e a8 20 a2 68 36 09 6e 5e ed b8 dd 50 63 8c
9c 06 24 ca 61 63 64 8d f5 6d 74 7b 0f aa 75 d0
Key material offset: 8
AF stripes: 4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED
# Releasing crypt device /dev/sda2 context.
# Releasing device-mapper backend.
# Unlocking memory.
Command successful.
[-- Attachment #11: 20110804--old-drive--blkid --]
[-- Type: application/octet-stream, Size: 241 bytes --]
/dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9" TYPE="linux_raid_member"
/dev/sda2: UUID="52ff2cf2-4098-1859-e58d-8dd65faec42c" TYPE="linux_raid_member"
/dev/sdb4: LABEL="grml64 2011.05" TYPE="iso9660"
/dev/loop0: TYPE="squashfs"
[-- Attachment #12: 20110804--old-drive--fdisk-l --]
[-- Type: application/octet-stream, Size: 1276 bytes --]
Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009bf9c
Device Boot Start End Blocks Id System
/dev/sda1 * 1 62 497983+ fd Linux raid autodetect
/dev/sda2 63 60801 487886017+ fd Linux raid autodetect
Disk /dev/sdb: 2099 MB, 2099249152 bytes
16 heads, 32 sectors/track, 8008 cylinders
Units = cylinders of 512 * 512 = 262144 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc9a2be38
Device Boot Start End Blocks Id System
/dev/sdb4 * 1 2793 715008 96 Unknown
Disk /dev/sdb4: 732 MB, 732168192 bytes
16 heads, 32 sectors/track, 2793 cylinders
Units = cylinders of 512 * 512 = 262144 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc9a2be38
Device Boot Start End Blocks Id System
/dev/sdb4p4 * 1 2793 715008 96 Unknown
[-- Attachment #13: 20110804--old-drive--fdisk-s-sda1 --]
[-- Type: application/octet-stream, Size: 7 bytes --]
497983
[-- Attachment #14: 20110804--old-drive--fdisk-s-sda2 --]
[-- Type: application/octet-stream, Size: 10 bytes --]
487886017
[-- Attachment #16: 20110804--old-drive--mdadm--examine-sda1 --]
[-- Type: application/octet-stream, Size: 790 bytes --]
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : fb7f3dc5:d183cab6:12123120:1a2207b9
Creation Time : Wed Mar 26 11:49:57 2008
Raid Level : raid1
Used Dev Size : 497856 (486.27 MiB 509.80 MB)
Array Size : 497856 (486.27 MiB 509.80 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Update Time : Sat Jun 18 14:25:10 2011
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 38518335 - correct
Events : 19214
Number Major Minor RaidDevice State
this 0 8 1 0 active sync /dev/sda1
0 0 8 1 0 active sync /dev/sda1
1 1 0 0 1 faulty removed
[-- Attachment #17: 20110804--old-drive--mdadm--examine-sda2 --]
[-- Type: application/octet-stream, Size: 799 bytes --]
/dev/sda2:
Magic : a92b4efc
Version : 0.90.00
UUID : 52ff2cf2:40981859:e58d8dd6:5faec42c
Creation Time : Wed Mar 26 11:50:04 2008
Raid Level : raid1
Used Dev Size : 487885952 (465.28 GiB 499.60 GB)
Array Size : 487885952 (465.28 GiB 499.60 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 1
Update Time : Sat Jun 18 14:25:10 2011
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 380692fc - correct
Events : 25570832
Number Major Minor RaidDevice State
this 0 8 2 0 active sync /dev/sda2
0 0 8 2 0 active sync /dev/sda2
1 1 0 0 1 faulty removed
[-- Attachment #18: 20110804--old-drive--sfdisk-d-sda --]
[-- Type: application/octet-stream, Size: 259 bytes --]
# partition table of /dev/sda
unit: sectors
/dev/sda1 : start= 63, size= 995967, Id=fd, bootable
/dev/sda2 : start= 996030, size=975772035, Id=fd
/dev/sda3 : start= 0, size= 0, Id= 0
/dev/sda4 : start= 0, size= 0, Id= 0
Re: LUKS superblock damaged by `mdadm --create` or user error?
From: Paul Menzel @ 2011-08-04 22:25 UTC
To: linux-raid
2011/8/4 Paul Menzel <pm.debian@googlemail.com>:
[…]
> After growing `/dev/sda2` using `fdisk` with mdadm not running, I
> forgot to grow the RAID1. I probably overwrote the md metadata
> (0.90) while growing the physical and logical LVM volumes and
> filesystems, or made it unavailable because it was no longer at the
> end of the partition.
>
> # blkid
> /dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9"
> TYPE="linux_raid_member"
> /dev/sda2: UUID="cb6681b4-4dda-4548-8c59-3d9838de9c22"
> TYPE="crypto_LUKS" # In `fdisk` I had set it to »Linux raid
> autodetect« (0xfd) though.
It looks like `fdisk` did a bad/incomplete job. `sfdisk` shows a warning(?).
--- 8< --- sfdisk output --- >8 ---
% sudo sfdisk -l /dev/sda
Disk /dev/sda: 243201 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
Device Boot Start End #cyls #blocks Id System
/dev/sda1 * 0+ 61 62- 497983+ fd Linux raid autodetect
/dev/sda2 62 243200 243139 1953014017+ fd Linux raid autodetect
end: (c,h,s) expected (1023,254,63) found (512,254,63)
/dev/sda3 0 - 0 0 0 Empty
/dev/sda4 0 - 0 0 0 Empty
% sudo sfdisk -V /dev/sda
partition 2: end: (c,h,s) expected (1023,254,63) found (512,254,63)
/dev/sda: OK
--- 8< --- sfdisk output --- >8 ---
Could that be related? Tomorrow I will look into this.
[…]
Thanks,
Paul
> [1] http://www.saout.de/pipermail/dm-crypt/2011-August/001857.html
> [2] http://www.hermann-uwe.de/blog/resizing-a-dm-crypt-lvm-ext3-partition
> [3] http://grml.org/
Re: LUKS superblock damaged by `mdadm --create` or user error?
From: Phil Turmel @ 2011-08-06 2:24 UTC
To: Paul Menzel; +Cc: linux-raid
Hi Paul,
On 08/04/2011 03:27 PM, Paul Menzel wrote:
> Dear Linux RAID folks,
>
>
> I hope I did not annoy you too much on #linux-raid and I am contacting
> this list to reach a broader audience for help and for archival
> purposes. My message to the list dm-crypt [1] was a little long and so
> is this one. I am sorry.
Don't sweat it. More information is usually better than less when deciphering these sorts of problems.
> After growing `/dev/sda2` using `fdisk` with mdadm not running, I
> forgot to grow the RAID1. I probably overwrote the md metadata
> (0.90) while growing the physical and logical LVM volumes and
> filesystems, or made it unavailable because it was no longer at the
> end of the partition.
Yes. Enlarging your filesystem to fit the new size of /dev/sda2 certainly destroyed the metadata. Even without the FS resize, the repartitioning would have hidden the 0.90 metadata.
> # blkid
> /dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9"
> TYPE="linux_raid_member"
> /dev/sda2: UUID="cb6681b4-4dda-4548-8c59-3d9838de9c22"
> TYPE="crypto_LUKS" # In `fdisk` I had set it to »Linux raid
> autodetect« (0xfd) though.
>
> I could not boot anymore because `/dev/md1` could not be assembled.
>
> # mdadm --examine /dev/sda2
> mdadm: No md superblock detected on /dev/sda2.
>
> # mdadm --examine /dev/sda1
> /dev/sda1:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : fb7f3dc5:d183cab6:12123120:1a2207b9
> Creation Time : Wed Mar 26 11:49:57 2008
> Raid Level : raid1
> Used Dev Size : 497856 (486.27 MiB 509.80 MB)
> Array Size : 497856 (486.27 MiB 509.80 MB)
> Raid Devices : 2
> Total Devices : 1
> Preferred Minor : 0
>
> Update Time : Wed Aug 3 21:11:43 2011
> State : clean
> Active Devices : 1
> Working Devices : 1
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 388e903a - correct
> Events : 20332
>
>
> Number Major Minor RaidDevice State
> this 0 8 1 0 active sync /dev/sda1
>
> 0 0 8 1 0 active sync /dev/sda1
> 1 1 0 0 1 faulty removed
>
> # mdadm --verbose --assemble /dev/md1
> --uuid=52ff2cf2:40981859:e58d8dd6:5faec42c
> mdadm: looking for devices for /dev/md1
> mdadm: no recogniseable superblock on /dev/dm-8
> mdadm: /dev/dm-8 has wrong uuid.
> mdadm: no recogniseable superblock on /dev/dm-7
> mdadm: /dev/dm-7 has wrong uuid.
> mdadm: no recogniseable superblock on /dev/dm-6
> mdadm: /dev/dm-6 has wrong uuid.
> mdadm: no recogniseable superblock on /dev/dm-5
> mdadm: /dev/dm-5 has wrong uuid.
> mdadm: no recogniseable superblock on /dev/dm-4
> mdadm: /dev/dm-4 has wrong uuid.
> mdadm: no recogniseable superblock on /dev/dm-3
> mdadm: /dev/dm-3 has wrong uuid.
> mdadm: no recogniseable superblock on /dev/dm-2
> mdadm: /dev/dm-2 has wrong uuid.
> mdadm: no recogniseable superblock on /dev/dm-1
> mdadm: /dev/dm-1 has wrong uuid.
> mdadm: cannot open device /dev/dm-0: Device or resource busy
> mdadm: /dev/dm-0 has wrong uuid.
> mdadm: no recogniseable superblock on /dev/md0
> mdadm: /dev/md0 has wrong uuid.
> mdadm: cannot open device /dev/loop0: Device or resource busy
> mdadm: /dev/loop0 has wrong uuid.
> mdadm: cannot open device /dev/sdb4: Device or resource busy
> mdadm: /dev/sdb4 has wrong uuid.
> mdadm: cannot open device /dev/sdb: Device or resource busy
> mdadm: /dev/sdb has wrong uuid.
> mdadm: cannot open device /dev/sda2: Device or resource busy
> mdadm: /dev/sda2 has wrong uuid.
> mdadm: cannot open device /dev/sda1: Device or resource busy
> mdadm: /dev/sda1 has wrong uuid.
> mdadm: cannot open device /dev/sda: Device or resource busy
> mdadm: /dev/sda has wrong uuid.
>
> and `mdadm --examine /dev/sda2` could not find any metadata.
> `/dev/sda2` could still be decrypted using `cryptsetup luksOpen
> /dev/sda2 sda2_crypt`. Not knowing how 0.90 metadata is stored, I
> read several Web resources, joined IRC channels, and came to the
> conclusion that I should just create a new (degraded) RAID1 and
> everything would be fine, since I had only one disk.
Here's where you started going wrong. MD raid1 with end-of-device metadata has the handy property that its content appears to be equally accessible via direct access to the underlying device. This is reliably true only for *read*. /dev/md1 would have a size shorter than /dev/sda2, protecting the metadata from being overwritten. Using the partition directly with luksOpen, without specifying "--readonly", put you on the path to destruction.
In particular, you now have the problem that the enlarged LVM PV inside the luks encryption is too big, and its tail has been overwritten with the MD v0.90 metadata.
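A read-only mapping would have been the safe way to peek at the
contents without risking the raid layer – a minimal sketch, assuming
cryptsetup's -r/--readonly flag (present in your 1.3.0) and the volume
group name from your report:

    cryptsetup luksOpen --readonly /dev/sda2 sda2_crypt
    vgchange -ay speicher                # activate the LVs inside
    mount -o ro /dev/speicher/home /mnt  # mount them read-only too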
> Booting from the live system Grml [3], which does *not* start `mdadm`
> or `lvm` during boot, I tried to create a new RAID1 using the
> following command (a).
>
> # command (a)
> mdadm --verbose --create /dev/md1 \
> --assume-clean \
> --level=1 \
> --raid-devices=2 \
> --uuid=52ff2cf2:40981859:e58d8dd6:5faec42c \
> /dev/sda2 missing
>
> I ignored the warning about overwriting metadata because it only
> referred to booting. Unfortunately `cryptsetup luksOpen /dev/md1
> md1_crypt` did not find any LUKS superblock. Therefore I stopped
> `/dev/md1` and `cryptsetup luksOpen /dev/sda2 sda2_crypt` still
> worked. Then I remembered that the metadata version was originally
> 0.90 and added `--metadata=0.90` and executed the following (b).
Too late. The v1.2 header (modern default) was written at this point, destroying the luks header. This metadata is deliberately offset by 4k, so it didn't destroy the signature part of the luks header, but it destroyed all or part of your key slot.
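You can make the collision concrete from the offsets in your own
luksDump (my reading of the layout, not gospel):

    echo $(( 8    * 512 ))   # keyslot 0 key material starts at byte 4096
    echo $(( 2056 * 512 ))   # encrypted payload starts at byte 1052672
    # a v1.2 superblock is written 4 KiB into the device, i.e. at byte
    # 4096 – squarely on top of keyslot 0's anti-forensic key material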
> # command (b)
> mdadm --verbose --create /dev/md1 \
> --assume-clean \
> --level=1 \
> --raid-devices=2 \
> --uuid=52ff2cf2:40981859:e58d8dd6:5faec42c \
> --metadata=0.90 \
> /dev/sda2 missing
>
> Lucky me, I thought: `cryptsetup luksOpen /dev/md1 md1_crypt` asked
> me for the passphrase, but I entered it three times and it would not
> unlock. Instead of trying again – I do not know if it would have
> worked – I tried `cryptsetup luksOpen /dev/sda2 sda2_crypt`, and it
> asked for the passphrase too. The third time I seem to have entered
> it correctly, but I got an error message that it could not be
> mapped.
>
> --- dmesg ---
> Aug 4 00:16:01 grml kernel: [ 7964.786362] device-mapper:
> table: 253:0: crypt: Device lookup failed
> Aug 4 00:16:01 grml kernel: [ 7964.786367] device-mapper:
> ioctl: error adding target to table
> Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6,
> /dev/dm-0, 10) failed: No such file or directory
> Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6,
> /dev/dm-0, 10) failed: No such file or directory
>
> Aug 4 00:17:14 grml kernel: [ 8038.196371] md1: detected
> capacity change from 1999886286848 to 0
> Aug 4 00:17:14 grml kernel: [ 8038.196395] md: md1 stopped.
> Aug 4 00:17:14 grml kernel: [ 8038.196407] md: unbind<sda2>
> Aug 4 00:17:14 grml kernel: [ 8038.212653] md: export_rdev(sda2)
> --- dmesg ---
>
> Then I realized that I had probably forgotten to stop `/dev/md1`.
> After stopping it, `cryptsetup luksOpen /dev/sda2 sda2_crypt` did not
> succeed anymore and I cannot access my data.
You probably keyed it in correctly every time.
> 1. Does the `dmesg` output suggest that accessing `/dev/sda2` while
> assembled caused any breakage?
No.
> 2. On #lvm and #linux-raid the common explanation was that command (a)
> had overwritten the LUKS superblock and damaged it. Is that possible?
> I could not find the magic number 0xa92b4efc in the first megabyte of
> `/dev/sda2`. Did `--assume-clean` prevent that?
Command (a) destroyed one or more luks keyslots.
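One likely reason your search came up empty: on disk the magic is
stored little-endian, so the bytes to look for are fc 4e 2b a9, not
the string "a92b4efc". For example, at the 4 KiB offset where v1.2
metadata would sit:

    dd if=/dev/sda2 bs=512 skip=8 count=1 2>/dev/null \
        | od -A x -t x1 | head -n 1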
> 3. Is command (b) to blame, or did it probably work and I had a typo
> in the passphrase?
Command (b) worked, but the damage was already done.
> I am thankful for any hint to get my data back.
Restoring the keyslot, or the entire luks header, should do the trick.
> Thanks and sorry for the long message. Any hints on how to shorten it
> next time are much appreciated.
>
> Paul
>
>
> PS: A month ago I had `dd`ed the contents of a 500 GB drive to this
> one. That is why I wanted to resize the partitions. The old drive is
> still functional, and I am attaching several outputs from commands
> run on the current 2 TB drive and the old drive. The `luksDump`
> output is from the current drive, but with the LUKS header from the
> 500 GB drive. I know that I am publishing the key to access my
> drive, but if it helps to get my data back I will encrypt from
> scratch again afterward. I also have the dump of the first MB (in
> this case) of the partition (`luksHeaderBackup`) from the old and
> the new drive, but attaching them would exceed the message size
> limit.
I recommend you dd the first 16 sectors (8k) of your old /dev/sda2 to the new /dev/sda2. This should give you access to the encrypted contents again, via direct decryption of /dev/sda2. You can try assembling /dev/md1 and decrypting it, but I doubt LVM will tolerate the truncated PV.
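Something like this, with placeholder device names – the two disks
will enumerate differently depending on how they are connected, so
double-check with blkid first:

    # old 500 GB disk = /dev/sdOLD, new 2 TB disk = /dev/sdNEW (placeholders)
    dd if=/dev/sdOLD2 of=/dev/sdNEW2 bs=512 count=16 conv=notrunc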
Either way, take a backup.
You can try to shrink the LVM PV if you haven't already resized your LV(s) to use it all. Then assembling /dev/md1 and decrypting should work.
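A sketch of the shrink, with the sizes taken from your reports –
double-check them, and note that pvresize will refuse if allocated
extents live beyond the new size:

    # array size 1953013952 KiB (mdadm --examine) minus the LUKS payload
    # offset of 2056 sectors = 1028 KiB leaves 1953012924 KiB for the PV
    pvresize --setphysicalvolumesize 1953012924k /dev/mapper/md1_crypt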
However, I recommend you start over with this array, and use modern v1.2 metadata. Create a new luks device inside it, then the LVM pieces, and then restore from your fresh backup.
The v1.2 metadata will protect you from this sort of failure in the future. (luksOpen will no longer work on the bare partition.)
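That is, with v1.2 the LUKS header no longer sits at the start of the
bare partition, so nothing identifies /dev/sda2 itself as LUKS any
more – roughly:

    blkid /dev/sda2                  # reports only TYPE="linux_raid_member"
    cryptsetup luksOpen /dev/sda2 x  # refused: no LUKS header at offset 0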
> [1] http://www.saout.de/pipermail/dm-crypt/2011-August/001857.html
> [2] http://www.hermann-uwe.de/blog/resizing-a-dm-crypt-lvm-ext3-partition
> [3] http://grml.org/
Reference #2, with an assumption on your part, led you astray, as its example wasn't layered on top of MD raid. You assumed that luksOpen with /dev/sda2 was OK. It appears to work, and is *readable*, but it does not maintain the integrity of the raid layer.
HTH,
Phil