* [dm-crypt] How to recover partially overwritten LUKS volume?
@ 2012-08-25  6:14 András Korn
  2012-08-25 11:20 ` Arno Wagner
  0 siblings, 1 reply; 10+ messages in thread
From: András Korn @ 2012-08-25  6:14 UTC (permalink / raw)
  To: dm-crypt

Hi,

I had an mdraid5 array made from four partitions of four disks, and
had LUKS on top of it.

One of the drives developed a few bad sectors (but didn't fail
completely), so I replaced it. While the array was resyncing, one of
the other drives threw a few errors, so mdadm marked it as failed and
stopped the array.

At this point, I made a mistake. I re-created the degraded array with:

mdadm --create /dev/md2 --level=5 --raid-devices=4 --assume-clean
missing /dev/sda4 /dev/sdc4 /dev/sdb4

However, I forgot to specify --metadata=0.90 (which the original array
used). I immediately rectified this, but by then mdadm had written a
raid superblock somewhere where originally there was none, and now
trying to luksOpen the volume with a known good passphrase results in
"No key available with this passphrase".

I still have the drive I removed, intact.

I have some backups but they're older than I'd like; is there anything
sensible I might do that could help me recover the LUKS volume?

My first idea is to re-create the array with the removed drive
included (making sure to specify the metadata version). This entails
uploading ~1TB over a 10MB/s link, so I thought I'd ask first whether
this had any chance of succeeding at all, or whether there was
anything else to try.

Thanks.

Andras

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-25  6:14 [dm-crypt] How to recover partially overwritten LUKS volume? András Korn
@ 2012-08-25 11:20 ` Arno Wagner
  2012-08-25 15:07   ` András Korn
  0 siblings, 1 reply; 10+ messages in thread
From: Arno Wagner @ 2012-08-25 11:20 UTC (permalink / raw)
  To: dm-crypt

On Sat, Aug 25, 2012 at 08:14:34AM +0200, András Korn wrote:
> Hi,
> 
> I had an mdraid5 array made from four partitions of four disks, and
> had LUKS on top of it.
> 
> One of the drives developed a few bad sectors (but didn't fail
> completely), so I replaced it. While the array was resyncing, one of
> the other drives threw a few errors, so mdadm marked it as failed and
> stopped the array.
> 
> At this point, I made a mistake. I re-created the degraded array with:
> 
> mdadm --create /dev/md2 --level=5 --raid-devices=4 --assume-clean
> missing /dev/sda4 /dev/sdc4 /dev/sdb4
> 
> However, I forgot to specify --metadata=0.90 (which the original array

Not good. Never, ever, ever recreate RAID arrays, filesystems,
etc. without a full binary backup of the originals, unless you
are prepared to lose all data that was on the devices.

> used). I immediately rectified this, but by then mdadm had written a
> raid superblock somewhere where originally there was none, and now
> trying to luksOpen the volume with a known good passphrase results in
> "No key available with this passphrase".

The default is metadata 1.2 for current mdadm. That put the 
superblock at 4k from the start and right in the middle of the
first key-slot. 

> I still have the drive I removed, intact.

It is unlikely but possible that what you lost is on there.
To determine this you would need to find out where exactly the
mdadm superblock landed, extract the rest of the key-slot
and see whether you find that on the removed disk. If so,
you may have the data missing from the key-slot on the
removed disk.
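
A quick way to see whether a v1.2 md superblock sits on a component is to
look for the md magic number at byte offset 4096. The sketch below assumes
the magic 0xa92b4efc stored little-endian (bytes fc 4e 2b a9) and the 4 KiB
offset of the 1.2 format; has_md12_magic is a hypothetical helper name, and
it should only ever be pointed at an image copy. "mdadm --examine" reports
the same information in more detail.

```shell
# has_md12_magic IMAGE: succeed if the md v1.x magic is found at byte 4096.
# Assumption: 1.2 superblock offset of 4 KiB, magic a92b4efc in little-endian.
has_md12_magic() {
    magic=$(dd if="$1" bs=1 skip=4096 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "fc4e2ba9" ]
}

# Example (on a copy, never the original; the file name is hypothetical):
#   has_md12_magic sdb4-head.img && echo "md 1.2 superblock present"
```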

> I have some backups but they're older than I'd like; is there anything
> sensible I might do that could help me recover the LUKS volume?

Not really. The only faint hope is to have the missing data
on the removed disk. Nothing else that I can see. Chances are 
roughly 25% that the missing part is on the removed disk.

So unless you want to do some serious digging through raw
disk data on sector-level (and possibly writing some tools
for that yourself), no, nothing sensible.

> My first idea is to re-create the array with the removed drive
> included (making sure to specify the metadata version).

Don't do that! It will likely only destroy more data.

> This entails
> uploading ~1TB over a 10MB/s link, so I thought I'd ask first whether
> this had any chance of succeeding at all, or whether there was
> anything else to try.

See above. 

The single most important rule for data recovery, though, is:

==> Never ever ever write anything to the originals! <==

You can try to puzzle the header back together
on different media; you do not need a data area.
You can also use a detached header (newer cryptsetup)
and work in a file. As soon as you get an unlock, you
can then try to repair the old header with the recovered
one, but not before.

Of course all of this will require digging deep into
the RAID metadata on-disk formats and the LUKS header
format (FAQ Item 6.12 has the short version). You may
also have to experiment to recreate the RAID disk order,
stripe size, etc. by finding and recovering the 0.90
metadata block.

Arno
-- 
Arno Wagner,    Dr. sc. techn., Dipl. Inform.,   Email: arno@wagner.name 
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
----
One of the painful things about our time is that those who feel certainty 
are stupid, and those with any imagination and understanding are filled 
with doubt and indecision. -- Bertrand Russell 

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-25 11:20 ` Arno Wagner
@ 2012-08-25 15:07   ` András Korn
  2012-08-25 18:59     ` Heinz Diehl
  2012-08-25 20:21     ` Arno Wagner
  0 siblings, 2 replies; 10+ messages in thread
From: András Korn @ 2012-08-25 15:07 UTC (permalink / raw)
  To: dm-crypt

On Sat, Aug 25, 2012 at 1:20 PM, Arno Wagner <arno@wagner.name> wrote:
> > At this point, I made a mistake. I re-created the degraded array with:
> >
> > mdadm --create /dev/md2 --level=5 --raid-devices=4 --assume-clean
> > missing /dev/sda4 /dev/sdc4 /dev/sdb4
> >
> > However, I forgot to specify --metadata=0.90 (which the original array
>
> Not good. Never, ever, ever recreate RAID arrays, filesystems,
> etc. without a full binary backup of the originals, unless you
> are prepared to lose all data that was on the devices.

This is very good advice and I often give it too. :)

> > used). I immediately rectified this, but by then mdadm had written a
> > raid superblock somewhere where originally there was none, and now
> > trying to luksOpen the volume with a known good passphrase results in
> > "No key available with this passphrase".
>
> The default is metadata 1.2 for current mdadm. That put the
> superblock at 4k from the start and right in the middle of the
> first key-slot.
>
> > I still have the drive I removed, intact.
>
> It is unlikely but possible that what you lost is on there.

The original RAID5 array used a chunksize of 64k, which seems to
suggest that the first 64k of the 0th device (which is the one I had
removed) should still contain the overwritten LUKS data; however, the
header was considerably larger than 64k (see below), so it seems I'm
out of luck.

> To determine this you would need to find out where exactly the
> mdadm superblock landed, extract the rest of the key-slot
> and see whether you find that on the removed disk. If so,
> you may have the data missing from the key-slot on the
> removed disk.

The trouble is though that three of four disks were overwritten...

> > I have some backups but they're older than I'd like; is there anything
> > sensible I might do that could help me recover the LUKS volume?
>
> Not really. The only faint hope is to have the missing data
> on the removed disk. Nothing else that I can see. Chances are
> roughly 25% that the missing part is on the removed disk.

Even if it was RAID device #0 in the original array? Its first four
bytes do say LUKS, and cryptsetup appears to recognise it as a LUKS
device (if I try to luksOpen it separately).

> So unless you want to do some serious digging through raw
> disk data on sector-level (and possibly writing some tools
> for that yourself), no, nothing sensible.

I'd be up to some digging and tool-writing, but I don't know what it
is I should be doing. :)

I think the data area that got overwritten on disks #1, #2 and #3 was
intact on disk #0, but that didn't help (see below).

> > My first idea is to re-create the array with the removed drive
> > included (making sure to specify the metadata version).
>
> Don't do that! It will likely only destroy more data.

I meant using copies, of course.

That's what I did now: I copied the first and last 96MB of all four
partitions to equally sized partitions on four other disks and tried
to re-create the array with the correct parameters using these. The
parameters are known correct now (metadata version, disk order as well
as chunksize).

However, luksOpen still says "No key available with this passphrase."

Would it make sense to try a luksFormat with the same passphrase? I
suppose not, because a random key is likely involved...?

I also assume that using more than the first and last 96MB of each
partition won't do much good either, right?

Am I correct in surmising that I'm screwed?

> You can try to puzzle the header back together
> on different media, you do not need a data area.
> You can also use a detached header (newer cryptsetup)
> and work in a file. As soon as you get an unlock, you
> can then try to repair the old header with the recovered
> one, but not before.

How would I proceed with the detached header? Dump the header from the
corrupted (and reassembled) RAID array into a file and experiment with
that? How is that better than using a (partial) copy of the corrupted
array?

luksHeaderBackup produces a file that is 528384 bytes in size. This is
more than 8 RAID chunks, so it was certainly hit by the new RAID
superblock in 3 places (on disks #1, #2 and #3).

luksDump says:

Version:        1
Cipher name:    aes
Cipher mode:    cbc-essiv:sha256
Hash spec:      sha1
Payload offset: 1032
MK bits:        128
MK digest:      b9 68 70 a2 ac ca f7 f6 f6 8f b8 ba 33 59 3c 61 f3 e0 68 98
MK salt:        4a 42 a9 ab e0 74 0f ee 8a 98 5b f8 d7 80 f7 73
                da a4 dd 16 5f 2e 18 48 f9 28 c7 7e e9 07 5f bf
MK iterations:  10
UUID:           5852d626-0428-4382-bca6-c04350559ceb

Key Slot 0: ENABLED
        Iterations:             141780
        Salt:                   58 a9 bb e9 4d 31 03 54 1b b1 85 27 24 73 5f e0
                                63 52 18 cd 4f 3b ff fb 5f ed 26 b8 40 dd c7 b4
        Key material offset:    8
        AF stripes:             4000
Key Slot 1: ENABLED
        Iterations:             95596
        Salt:                   41 fc a7 02 38 4d ff 6d d1 39 fb 6f 8f 3a 0f 0a
                                16 e0 e9 a6 b6 b2 86 e8 ae 01 f7 fc 41 6b 2e b4
        Key material offset:    136
        AF stripes:             4000
Key Slot 2: ENABLED
        Iterations:             109766
        Salt:                   cd 00 34 39 60 d3 0b d3 d8 c5 b6 72 b3 a1 cd 01
                                77 a8 d4 84 0e bf 67 5c c2 73 b2 7e b7 ca de 75
        Key material offset:    264
        AF stripes:             4000
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED

FWIW, I know all three keyphrases but none of them work.
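
For orientation, the offsets in the dump can be turned into byte ranges.
This is only a sketch: it assumes that LUKS1 offsets are counted in
512-byte sectors and that each slot's key material is
(MK bits / 8) x AF stripes bytes long. slot_range is a made-up helper name.

```shell
# slot_range OFFSET_SECTORS KEY_BYTES AF_STRIPES
# Print the first byte and the end (exclusive) of a key slot's key material.
# Assumption: offsets are in 512-byte sectors.
slot_range() {
    start=$(( $1 * 512 ))
    end=$(( start + $2 * $3 ))
    echo "$start $end"
}

slot_range 8   16 4000   # key slot 0
slot_range 136 16 4000   # key slot 1
slot_range 264 16 4000   # key slot 2
echo $(( 1032 * 512 ))   # payload offset in bytes
```

The last value, 528384, is the same as the size of the luksHeaderBackup
file mentioned above.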

Andras

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-25 15:07   ` András Korn
@ 2012-08-25 18:59     ` Heinz Diehl
  2012-08-25 20:21     ` Arno Wagner
  1 sibling, 0 replies; 10+ messages in thread
From: Heinz Diehl @ 2012-08-25 18:59 UTC (permalink / raw)
  To: dm-crypt

On 25.08.2012, András Korn wrote: 

> Would it make sense to try a luksFormat with the same passphrase? I
> suppose not, because a random key is likely involved...?

The passphrase is used to unlock the master key, which is _not_
derived from the passphrase. So it makes no sense at all, but will
destroy your header.

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-25 15:07   ` András Korn
  2012-08-25 18:59     ` Heinz Diehl
@ 2012-08-25 20:21     ` Arno Wagner
  2012-08-26 14:33       ` András Korn
  1 sibling, 1 reply; 10+ messages in thread
From: Arno Wagner @ 2012-08-25 20:21 UTC (permalink / raw)
  To: dm-crypt

On Sat, Aug 25, 2012 at 05:07:14PM +0200, András Korn wrote:
> On Sat, Aug 25, 2012 at 1:20 PM, Arno Wagner <arno@wagner.name> wrote:
> > > At this point, I made a mistake. I re-created the degraded array with:
> > >
> > > mdadm --create /dev/md2 --level=5 --raid-devices=4 --assume-clean
> > > missing /dev/sda4 /dev/sdc4 /dev/sdb4
> > >
> > > However, I forgot to specify --metadata=0.90 (which the original array
> >
> > Not good. Never, ever, ever recreate RAID arrays, filesystems,
> > etc. without a full binary backup of the originals, unless you
> > are prepared to lose all data that was on the devices.
> 
> This is very good advice and I often give it too. :)

Ah, yes. Giving advice is easier than following it.
Happens to me too from time to time.

> > > used). I immediately rectified this, but by then mdadm had written a
> > > raid superblock somewhere where originally there was none, and now
> > > trying to luksOpen the volume with a known good passphrase results in
> > > "No key available with this passphrase".
> >
> > The default is metadata 1.2 for current mdadm. That put the
> > superblock at 4k from the start and right in the middle of the
> > first key-slot.
> >
> > > I still have the drive I removed, intact.
> >
> > It is unlikely but possible that what you lost is on there.
> 
> The original RAID5 array used a chunksize of 64k, which seems to
> suggest that the first 64k of the 0th device (which is the one I had
> removed) should still contain the overwritten LUKS data; however, the
> header was considerably larger than 64k (see below), so it seems I'm
> out of luck.

Not necessarily. You just need the header (~600 bytes) and one
intact key-slot. The keyslots are 128kiB with default
values (256kiB with XTS mode). The smaller one may have survived,
see below. An XTS keyslot will not have survived though.
 
> > To determine this you would need to find out where exactly the
> > mdadm superblock landed, extract the rest of the key-slot
> > and see whether you find that on the removed disk. If so,
> > you may have the data missing from the key-slot on the
> > removed disk.
> 
> The trouble is though that three of four disks were overwritten...

Thinking about this again, you are right. The resync will
have done additional damage. Now, a resync for RAID5 only writes
the parity of all other disks to one, rotating by stripe:
64k written to disk 0, the next 64k to disk 1, and so on.
With 4 disks, the pattern is that 192k stay intact on each disk
and then 64k of parity is overwritten. Repeat until the end.

LUKS keyslots are 128kiB in size. So you may still be in luck,
but this is going to take a lot of time to test out and sounds 
rather unlikely. 

> > > I have some backups but they're older than I'd like; is there anything
> > > sensible I might do that could help me recover the LUKS volume?
> >
> > Not really. The only faint hope is to have the missing data
> > on the removed disk. Nothing else that I can see. Chances are
> > roughly 25% that the missing part is on the removed disk.
> 
> Even if it was RAID device #0 in the original array? Its first four
> bytes do say LUKS, and cryptsetup appears to recognise it as a LUKS
> device (if I try to luksOpen it separately).

Then the first 64k to 192k (see above) will be valid header data.
Have you tried unlocking just the removed disk with the password?
 
> > So unless you want to do some serious digging through raw
> > disk data on sector-level (and possibly writing some tools
> > for that yourself), no, nothing sensible.
> 
> I'd be up to some digging and tool-writing, but I don't know what it
> is I should be doing. :)
> 
> I think the data area that got overwritten on disks #1, #2 and #3 was
> intact on disk #0, but that didn't help (see below).
> 
> > > My first idea is to re-create the array with the removed drive
> > > included (making sure to specify the metadata version).
> >
> > Don't do that! It will likely only destroy more data.
> 
> I meant using copies, of course.
> 
> That's what I did now: I copied the first and last 96MB of all four
> partitions to equally sized partitions on four other disks and tried
> to re-create the array with the correct parameters using these. The
> parameters are known correct now (metadata version, disk order as well
> as chunksize).

Ok, that is enough data.
 
> However, luksOpen still says "No key available with this passphrase."

Yes, because the MD 1.2 header killed data at 4kB offset
in the first key-slot.

> Would it make sense to try a luksFormat with the same passphrase? I
> suppose not, because a random key is likely involved...?

No. See FAQ Section 1.2 "Warnings". One of the very few
disadvantages (in this situation) of LUKS over plain dm-crypt. 

> I also assume that using more than the first and last 96MB of each
> partition won't do much good either, right?

You can actually reduce that to the first 3MiB or so if you are 
going to try to recover only the LUKS header and keyslot.
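
Taking such a working copy could look like the sketch below. The 3 MiB
figure is the one suggested above; copy_head and the file names are
hypothetical, and the source is only ever read.

```shell
# copy_head SRC DEST: copy the first 3 MiB of SRC into DEST (SRC is only read).
copy_head() {
    dd if="$1" of="$2" bs=1048576 count=3 2>/dev/null
}

# Example with hypothetical names (repeat for each component):
#   copy_head /dev/sdX4 sdX4-head.img
#   chmod a-w sdX4-head.img   # guard the pristine copy, experiment on further copies
```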
 
> Am I correct in surmising that I'm screwed?

No. Unclear at this time. But expect a lot of fiddling
and you may be screwed after all.

> > You can try to puzzle the header back together
> > on different media, you do not need a data area.
> > You can also use a detached header (newer cryptsetup)
> > and work in a file. As soon as you get an unlock, you
> > can then try to repair the old header with the recovered
> > one, but not before.
> 
> How would I proceed with the detached header? Dump the header from the
> corrupted (and reassembled) RAID array into a file and experiment with
> that? 

Yes. Or use a loop-mounted file (see FAQ item 2.3)

> How is that better than using a (partial) copy of the corrupted
> array?

It is easier to handle. Doing raw sector reads/writes on disks is
harder than just reading/writing in files with offsets. With
files you can also easily do things like using "head" and "tail"
to combine pieces. For example
 
  head -c 64K /dev/sdx 

gives you the first 64kiB of disk sdx, or

  head -c 128K /dev/sdx | tail -c 64K

gives you the second 64kiB. And you can combine with cat and
">>".
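
For chunks further into an image, counting by index with dd may be easier
than nesting head and tail. A minimal sketch, assuming 64 KiB chunks by
default; nth_chunk and the image names are hypothetical:

```shell
# nth_chunk FILE INDEX [CHUNK_BYTES]: write chunk number INDEX (counting
# from 0) of FILE to stdout. CHUNK_BYTES defaults to 65536 (64 KiB).
nth_chunk() {
    dd if="$1" bs="${3:-65536}" skip="$2" count=1 2>/dev/null
}

# Example: join chunk 0 of one image and chunk 1 of another into a work file.
#   nth_chunk disk0.img 0 >  candidate.img
#   nth_chunk disk1.img 1 >> candidate.img
```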

Come to think of it you could do all the analysis just with 
shell-scripts.

> luksHeaderBackup produces a file that is 528384 bytes in size. This is
> more than 8 RAID chunks, so it was certainly hit by the new RAID
> superblock in 3 places (on disks #1, #2 and #3).

Yes, but see above that up to 192k may be intact in a rotating fashion.
Depends on how the RAID code distributes the parity stripes. 
If you are lucky, one of the key-slots made it.

> luksDump says:
> 
> Version:        1
> Cipher name:    aes
> Cipher mode:    cbc-essiv:sha256
> Hash spec:      sha1
> Payload offset: 1032
> MK bits:        128
> MK digest:      b9 68 70 a2 ac ca f7 f6 f6 8f b8 ba 33 59 3c 61 f3 e0 68 98
> MK salt:        4a 42 a9 ab e0 74 0f ee 8a 98 5b f8 d7 80 f7 73
>                 da a4 dd 16 5f 2e 18 48 f9 28 c7 7e e9 07 5f bf
> MK iterations:  10
> UUID:           5852d626-0428-4382-bca6-c04350559ceb
> 
> Key Slot 0: ENABLED
>         Iterations:             141780
>         Salt:                   58 a9 bb e9 4d 31 03 54 1b b1 85 27 24 73 5f e0
>                                 63 52 18 cd 4f 3b ff fb 5f ed 26 b8 40 dd c7 b4
>         Key material offset:    8
>         AF stripes:             4000
> Key Slot 1: ENABLED
>         Iterations:             95596
>         Salt:                   41 fc a7 02 38 4d ff 6d d1 39 fb 6f 8f 3a 0f 0a
>                                 16 e0 e9 a6 b6 b2 86 e8 ae 01 f7 fc 41 6b 2e b4
>         Key material offset:    136
>         AF stripes:             4000
> Key Slot 2: ENABLED
>         Iterations:             109766
>         Salt:                   cd 00 34 39 60 d3 0b d3 d8 c5 b6 72 b3 a1 cd 01
>                                 77 a8 d4 84 0e bf 67 5c c2 73 b2 7e b7 ca de 75
>         Key material offset:    264
>         AF stripes:             4000
> Key Slot 3: DISABLED
> Key Slot 4: DISABLED
> Key Slot 5: DISABLED
> Key Slot 6: DISABLED
> Key Slot 7: DISABLED
> 
> FWIW, I know all three keyphrases but none of them work.

Now you have a chance of one of the keyslots being
intact on the removed disk or recoverable using all
disks. Try the thing above on the removed disk.

If this fails, you can start to try all 5 block combinations 
of the first 5 64kiB blocks from the removed and non-removed 
disks (LUKS header and 3 key-slots) and see whether any 
combination can be unlocked with one of the 3 passphrases.

This is about 10M combinations, so trying one of
the keyslots at a time and respecting ordering would be
a good idea.

BTW, if you manage to unlock any of the keyslots, the
next thing is to get and backup the master key, see FAQ 
item 6.10. 


Still, you may be screwed. My completely non-scientific guess is
that you have something like a 1/4 chance of recovering a working
keyslot (and hence most of the data). You already _have_ a working
header; keep that safe if you are going to invest the
effort.


Probably best value for effort: See whether any key-slot is 
intact on the removed disk, if not, cut your losses and use the 
backup.


Arno

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-25 20:21     ` Arno Wagner
@ 2012-08-26 14:33       ` András Korn
  2012-08-26 14:42         ` Heinz Diehl
  2012-08-26 15:21         ` Arno Wagner
  0 siblings, 2 replies; 10+ messages in thread
From: András Korn @ 2012-08-26 14:33 UTC (permalink / raw)
  To: dm-crypt

On Sat, Aug 25, 2012 at 10:21 PM, Arno Wagner <arno@wagner.name> wrote:

>> The original RAID5 array used a chunksize of 64k, which seems to
>> suggest that the first 64k of the 0th device (which is the one I had
>> removed) should still contain the overwritten LUKS data; however, the
>> header was considerably larger than 64k (see below), so it seems I'm
>> out of luck.
>
> Not necessarily. You just need the header (~600 bytes) and one
> intact key-slot. The keyslots are 128kiB with default
> values (256kiB with XTS mode). The smaller one may have survived,
> see below. An XTS keyslot will not have survived though.

I used the defaults.

>> > To determine this you would need to find out where exactly the
>> > mdadm superblock landed, extract the rest of the key-slot
>> > and see whether you find that on the removed disk. If so,
>> > you may have the data missing from the key-slot on the
>> > removed disk.
>>
>> The trouble is though that three of four disks were overwritten...
>
> Thinking about this again, you are right. The resync will
> have done additional damage.

I think the resync did no damage because it only wrote to the new
disk; the originals only had to be read.

I was thinking that maybe I could try to assemble the (copies of the)
original array with the 0th, 2nd and 3rd drive, leaving the 1st out
initially and then re-add it, allowing RAID5 to sync the data to it,
thereby hopefully regenerating the LUKS metadata. However, this won't
work either.

My array used the default left-symmetric layout, which afaik is:

D0 D1 D2 P0
D4 D5 P1 D3
D8 P2 D6 D7
...

D1, D2 and P0 are damaged. Everything else is intact.

So even omitting D1, the data that would be used to reconstruct it
would be incorrect.
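
The table above can be generated from a small rule. This is a sketch that
assumes the left-symmetric layout behaves exactly as drawn (parity starts
on the last disk and moves one disk to the left per stripe, data chunks
follow the parity disk and wrap around); locate_chunk is a hypothetical
helper, with disks and stripes counted from 0.

```shell
# locate_chunk NDISKS K: print "disk stripe" for array data chunk K
# under the left-symmetric layout shown above.
locate_chunk() {
    n=$1
    k=$2
    stripe=$(( k / (n - 1) ))                   # n-1 data chunks per stripe
    parity=$(( (n - 1) - stripe % n ))          # parity moves one disk left per stripe
    disk=$(( (parity + 1 + k % (n - 1)) % n ))  # data follows the parity disk, wrapping
    echo "$disk $stripe"
}

for k in 0 1 2 3 4 5 6 7 8; do
    echo "D$k -> $(locate_chunk 4 "$k")"
done
```

The stripe number is also the chunk index on that disk, so with 64k chunks
the byte offset within the component is stripe x 65536.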

> Now resync for RAID5 only writes
> the parity of all other disks to one.

Not quite. If you add a missing disk, it will contain data as well as
parity after the resync. The parity will be computed based on the data
on the other disks as you said, while the data will be computed as a
function of the data and the parity on the other disks.

> LUKS keyslots are 128kiB in size. So you may still be in luck,
> but this is going to take a lot of time to test out and sounds
> rather unlikely.

All chunks are 64k long. The first keyslot started somewhere in the
first part of D0, covered all of D1 and ended in the first part of D2.
It's gone for sure, because part of D1 got overwritten. If the next
keyslot followed immediately, it covered D3 and ended in D4. It was
likely also overwritten because a raid superblock was written to
D2+4k. However, the third keyslot had to start in D4 (or even D5,
depending on the specific layout; D0+D1+D2+D3 together only have room
for two keyslots).

Therefore, the third keyslot should still be intact.
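
This kind of placement argument can be cross-checked against the numbers in
the luksDump output. The sketch below assumes 512-byte sectors, key material
of (MK bits / 8) x AF stripes bytes, and that array chunk k corresponds to
Dk in the layout table; chunk_span is a hypothetical helper. Because it uses
the dumped offsets rather than the 128 KiB-per-slot figure, its answer may
not agree with the estimate above.

```shell
# chunk_span START END [CHUNK_BYTES]: print the first and last chunk index
# touched by the byte range [START, END). CHUNK_BYTES defaults to 65536.
chunk_span() {
    c=${3:-65536}
    echo "$(( $1 / c )) $(( ($2 - 1) / c ))"
}

# Key slot 2 from the dump: offset 264 sectors, 16-byte key, 4000 stripes.
start=$(( 264 * 512 ))
chunk_span "$start" $(( start + 16 * 4000 ))
```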

Now, how do I get to it?

>> > > I have some backups but they're older than I'd like; is there anything
>> > > sensible I might do that could help me recover the LUKS volume?
>> >
>> > Not really. The only faint hope is to have the missing data
>> > on the removed disk. Nothing else that I can see. Chances are
>> > roughly 25% that the missing part is on the removed disk.
>>
>> Even if it was RAID device #0 in the original array? Its first four
>> bytes do say LUKS, and cryptsetup appears to recognise it as a LUKS
>> device (if I try to luksOpen it separately).
>
> Then the first 64k to 192k (see above) will be valid header data.
> Have you tried unlocking just the removed disk with the password?

I have, and it didn't work; but it can't be expected to work, can it?
That disk contains D0, D4, D8 etc. in this order.

>> However, luksOpen still says "No key available with this passphrase."
>
> Yes, because the MD 1.2 header killed data at 4kB offset
> in the first key-slot.

But I also tried with the other keys.

>> How would I proceed with the detached header? Dump the header from the
>> corrupted (and reassembled) RAID array into a file and experiment with
>> that?
>
> Yes. Or use a loop-mounted file (see FAQ item 2.3)
>
>> How is that better than using a (partial) copy of the corrupted
>> array?
>
> It is easier to handle. Doing raw sector reads/writes on disks is
> harder than just reading/writing in files with offsets. With
> files you can also easily do things like using "head" and "tail"
> to combine pieces. For example
>
>   head -c 64K /dev/sdx
>
> gives you the first 64kiB of disk sdx, or
>
>   head -c 128K /dev/sdx | tail -c 64K
>
> gives you the second 64kiB. And you can combine with cat and
> ">>".

Fwiw, this also works with disks (using dd), but I see what you mean.

> Yes, but see above that up to 192k may be intact in a rotating fashion.
> Depends on how the RAID code distributes the parity stripes.
> If you are lucky, one of the key-slots made it.

It would appear that the third keyslot (#2) must have made it.

>> luksDump says:
>>
>> Version:        1
>> Cipher name:    aes
>> Cipher mode:    cbc-essiv:sha256
>> Hash spec:      sha1
>> Payload offset: 1032
>> MK bits:        128
>> MK digest:      b9 68 70 a2 ac ca f7 f6 f6 8f b8 ba 33 59 3c 61 f3 e0 68 98
>> MK salt:        4a 42 a9 ab e0 74 0f ee 8a 98 5b f8 d7 80 f7 73
>>                 da a4 dd 16 5f 2e 18 48 f9 28 c7 7e e9 07 5f bf
>> MK iterations:  10
>> UUID:           5852d626-0428-4382-bca6-c04350559ceb
>>
>> Key Slot 0: ENABLED
>>         Iterations:             141780
>>         Salt:                   58 a9 bb e9 4d 31 03 54 1b b1 85 27 24 73 5f e0
>>                                 63 52 18 cd 4f 3b ff fb 5f ed 26 b8 40 dd c7 b4
>>         Key material offset:    8
>>         AF stripes:             4000
>> Key Slot 1: ENABLED
>>         Iterations:             95596
>>         Salt:                   41 fc a7 02 38 4d ff 6d d1 39 fb 6f 8f 3a 0f 0a
>>                                 16 e0 e9 a6 b6 b2 86 e8 ae 01 f7 fc 41 6b 2e b4
>>         Key material offset:    136
>>         AF stripes:             4000
>> Key Slot 2: ENABLED
>>         Iterations:             109766
>>         Salt:                   cd 00 34 39 60 d3 0b d3 d8 c5 b6 72 b3 a1 cd 01
>>                                 77 a8 d4 84 0e bf 67 5c c2 73 b2 7e b7 ca de 75
>>         Key material offset:    264
>>         AF stripes:             4000
>> Key Slot 3: DISABLED
>> Key Slot 4: DISABLED
>> Key Slot 5: DISABLED
>> Key Slot 6: DISABLED
>> Key Slot 7: DISABLED
>>
>
> Now you have a chance of one of the keyslots being
> intact on the removed disk or recoverable using all
> disks. Try the thing above on the removed disk.
>
> If this fails, you can start to try all 5 block combinations
> of the first 5 64kiB blocks from the removed and non-removed
> disks (LUKS header and 3 key-slots) and see whether any
> combination can be unlocked with one of the 3 passphrases.

The way I see it, my only chance is with the 3rd slot.

However, I don't understand the above paragraph. What 5 block
combinations do you mean?

> Probably best value for effort: See whether any key-slot is
> intact on the removed disk, if not, cut your losses and use the
> backup.

I now believe that keyslot #2 is intact (provided keyslots are really
at least 128k in size, and that the left-symmetric raid5 layout is
what I think it is). I'm also fairly certain the passphrases for
keyslots #1 and #2 (the 2nd and 3rd keyslot) are identical (I'm
certain of the 3rd passphrase because I added it just a few days ago,
and I'm almost certain of the 2nd one).

Could it be that cryptsetup tries to use keyslot #1 based on the
passphrase I enter, realizes that it's corrupt and throws an error
without ever trying keyslot #2? But apparently no, because specifying
an explicit --key-slot also fails.

Any suggestions?

Andras

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-26 14:33       ` András Korn
@ 2012-08-26 14:42         ` Heinz Diehl
  2012-08-26 15:21         ` Arno Wagner
  1 sibling, 0 replies; 10+ messages in thread
From: Heinz Diehl @ 2012-08-26 14:42 UTC (permalink / raw)
  To: dm-crypt

On 26.08.2012, András Korn wrote: 

> Could it be that cryptsetup tries to use keyslot #1 based on the
> passphrase I enter, realizes that it's corrupt and throws an error
> without ever trying keyslot #2? But apparently no, because specifying
> an explicit --key-slot also fails.

As far as I know, the passphrase is probed against all keyslots, and
the one which matches is taken.

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-26 14:33       ` András Korn
  2012-08-26 14:42         ` Heinz Diehl
@ 2012-08-26 15:21         ` Arno Wagner
  2012-08-26 15:45           ` András Korn
  1 sibling, 1 reply; 10+ messages in thread
From: Arno Wagner @ 2012-08-26 15:21 UTC (permalink / raw)
  To: dm-crypt

On Sun, Aug 26, 2012 at 04:33:58PM +0200, András Korn wrote:
> On Sat, Aug 25, 2012 at 10:21 PM, Arno Wagner <arno@wagner.name> wrote:
> 
> >> The original RAID5 array used a chunksize of 64k, which seems to
> >> suggest that the first 64k of the 0th device (which is the one I had
> >> removed) should still contain the overwritten LUKS data; however, the
> >> header was considerably larger than 64k (see below), so it seems I'm
> >> out of luck.
> >
> > Not necessarily. You just need the header (~600 bytes) and one
> > intact key-slot. The keyslots are 128kiB with default
> > values (256kiB with XTS mode). The smaller one may have survived,
> > see below. An XTS keyslot will not have survived though.
> 
> I used the defaults.
> 
> >> > To determine this you would need to find out where exactly the
> >> > mdadm superbloick landed, extract the rest of the key-slot
> >> > and see whether you dinf that on the removed disk. If so,
> >> > you may have the data missing from the key-slot on the
> >> > removed disk.
> >>
> >> The trouble is though that three of four disks were overwritten...
> >
> > Thinking about this again, you are right. The resync will
> > have done additional damage.
> 
> I think the resync did no damage because it only wrote to the new
> disk; the originals only had to be read.

I am not sure. Only recreating parity (my original thought)
would also be a valid approach for a new array, and that is
what you were creating. On adding a disk to a degraded
array, you would be perfectly correct. 
 
> I was thinking that maybe I could try to assemble the (copies of the)
> original array with the 0th, 2nd and 3rd drive, leaving the 1st out
> initially and then re-add it, allowing RAID5 to sync the data to it,
> thereby hopefully regenerating the LUKS metadata. However, this won't
> work either.
> 
> My array used the default left-symmetric layout, which afaik is:
> 
> D0 D1 D2 P0
> D4 D5 P1 D3
> D8 P2 D6 D7
> ...
> 
> D1, D2 and P0 are damaged. Everything else is intact.
> 
> So even omitting D1, the data that would be used to reconstruct it
> would be incorrect.

No idea.

> > Now resync for RAID5 only writes
> > the parity of all other disks to one.
> 
> Not quite. If you add a missing disk, it will contain data as well as
> parity after the resync. The parity will be computed based on the data
> on the other disks as you said, while the data will be computed as a
> function of the data and the parity on the other disks.
> 
> > LUKS keyslots are 128kiB in size. So you may still be in luck,
> > but this is going to take a lot of time to test out and sounds
> > rather unlikely.
> 
> All chunks are 64k long. The first keyslot started somewhere in the
> first part of D0, covered all of D1 and ended in the first part of D2.
> It's gone for sure, because part of D1 got overwritten. If the next
> keyslot followed immediately, it covered D3 and ended in D4. It was
> likely also overwritten because a raid superblock was written to
> D2+4k. However, the third keyslot had to start in D4 (or even D5,
> depending on the specific layout; D0+D1+D2+D3 together only have room
> for two keyslots).
> 
> Therefore, the third keyslot should still be intact.
> 
> Now, how do I get to it?

Ah. Use the info in the FAQ and RAID geometry to extract it, and 
place it in a file (starting with the intact header) at the correct 
offset.
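
A rough sketch of that step, assuming the member partitions have been
imaged to plain files and that chunks are 64k as stated above. The
helper name copy_chunk and the image file names are made up for
illustration:

```shell
#!/bin/sh
# Hypothetical helper: copy one 64 KiB chunk from a member image into a
# working header file, leaving the rest of the target file untouched.
CHUNK=65536

# copy_chunk SRC SRC_CHUNK_INDEX DST DST_CHUNK_INDEX
copy_chunk() {
    # skip/seek count in units of bs; conv=notrunc keeps dd from
    # truncating DST after the copied chunk.
    dd if="$1" of="$3" bs="$CHUNK" skip="$2" seek="$4" count=1 \
       conv=notrunc 2>/dev/null
}
```

For example, "copy_chunk disk0.img 1 header.img 4" would copy the
second 64k chunk of a (hypothetical) image of member 0 to offset
0x40000 of header.img. Working on copies keeps the original disks
untouched.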


> >> > > I have some backups but they're older than I'd like; is there anything
> >> > > sensible I might do that could help me recover the LUKS volume?
> >> >
> >> > Not really. The only faint hope is to have the missing data
> >> > on the removed disk. Nothing else that I can see. Chances are
> >> > roughly 25% that the missing part is on the removed disk.
> >>
> >> Even if it was RAID device #0 in the original array? Its first four
> >> bytes do say LUKS, and cryptsetup appears to recognise it as a LUKS
> >> device (if I try to luksOpen it separately).
> >
> > Then the first 64k to 192k (see above) will be valid header data.
> > Have you tried unlocking just the removed disk with the password?
> 
> I have, and it didn't work; but it can't be expected to work, can it?
> That disk contains D0, D4, D8 etc. in this order.

Indeed. Overlooked that. 
 
> >> However, luksOpen still says "No key available with this passphrase."
> >
> > Yes, because the MD header 1.2 killed data at 4kB offset
> > in the first key-slot.
> 
> But I also tried with the other keys.
>
> >> How would I proceed with the detached header? Dump the header from the
> >> corrupted (and reassembled) RAID array into a file and experiment with
> >> that?
> >
> > Yes. Or use a loop-mounted file (see FAQ item 2.3)
> >
> >> How is that better than using a (partial) copy of the corrupted
> >> array?
> >
> > It is easier to handle. Doing raw sector reads/writes on disks is
> > harder than just reading/writing in files with offsets. With
> > files you can also easily do things like using "head" and "tail"
> > to combine pieces. For example
> >
> >   head -c 64K /dev/sdx
> >
> > gives you the first 64kiB of disk sdx, or
> >
> >   head -c 128K /dev/sdx | tail -c 64K
> >
> > gives you the second 64kiB. And you can combine with cat and
> > ">>".
> 
> Fwiw, this also works with disks (using dd), but I see what you mean.
> 
> > Yes, but see above that up to 192k may be intact in a rotating fashion.
> > Depends on how the RAID code distributes the parity stripes.
> > If you are lucky, one of the key-slots made it.
> 
> It would appear that the third keyslot (#2) must have made it.
> 
> >> luksDump says:
> >>
> >> Version:        1
> >> Cipher name:    aes
> >> Cipher mode:    cbc-essiv:sha256
> >> Hash spec:      sha1
> >> Payload offset: 1032
> >> MK bits:        128
> >> MK digest:      b9 68 70 a2 ac ca f7 f6 f6 8f b8 ba 33 59 3c 61 f3 e0 68 98
> >> MK salt:        4a 42 a9 ab e0 74 0f ee 8a 98 5b f8 d7 80 f7 73
> >>                 da a4 dd 16 5f 2e 18 48 f9 28 c7 7e e9 07 5f bf
> >> MK iterations:  10
> >> UUID:           5852d626-0428-4382-bca6-c04350559ceb
> >>
> >> Key Slot 0: ENABLED
> >>         Iterations:             141780
> >>         Salt:                   58 a9 bb e9 4d 31 03 54 1b b1 85 27 24 73 5f e0
> >>                                 63 52 18 cd 4f 3b ff fb 5f ed 26 b8 40 dd c7 b4
> >>         Key material offset:    8
> >>         AF stripes:             4000
> >> Key Slot 1: ENABLED
> >>         Iterations:             95596
> >>         Salt:                   41 fc a7 02 38 4d ff 6d d1 39 fb 6f 8f 3a 0f 0a
> >>                                 16 e0 e9 a6 b6 b2 86 e8 ae 01 f7 fc 41 6b 2e b4
> >>         Key material offset:    136
> >>         AF stripes:             4000
> >> Key Slot 2: ENABLED
> >>         Iterations:             109766
> >>         Salt:                   cd 00 34 39 60 d3 0b d3 d8 c5 b6 72 b3 a1 cd 01
> >>                                 77 a8 d4 84 0e bf 67 5c c2 73 b2 7e b7 ca de 75
> >>         Key material offset:    264
> >>         AF stripes:             4000
> >> Key Slot 3: DISABLED
> >> Key Slot 4: DISABLED
> >> Key Slot 5: DISABLED
> >> Key Slot 6: DISABLED
> >> Key Slot 7: DISABLED
> >>
> >
> > Now you have a chance of one of the keyslots being
> > intact on the removed disk or recoverable using all
> > disks. Try the thing above on the removed disk.
> >
> > If this fails, you can start to try all 5 block combinations
> > of the first 5 64kiB blocks from the removed and non-removed
> > disks (LUKS header and 3 key-slots) and see whether any
> > combination can be unlocked with one of the 3 passphrases.
> 
> The way I see it, my only chance is with the 3rd slot.
> 
> However, I don't understand the above paragraph. What 5 block
> combinations do you mean?

Forget about that. It was a brute-force idea, but it
is too complex anyway.

> > Probably best value for effort: See whether any key-slot is
> > intact on the removed disk, if not, cut your losses and use the
> > backup.
> 
> I now believe that keyslot #2 is intact (provided keyslots are really
> at least 128k in size, and that the left-symmetric raid5 layout is
> what I think it is). I'm also fairly certain the passphrases for
> keyslots #1 and #2 (the 2nd and 3rd keyslot) are identical (I'm
> certain of the 3rd passphrase because I added it just a few days ago,
> and I'm almost certain of the 2nd one).
> 
> Could it be that cryptsetup tries to use keyslot #1 based on the
> passphrase I enter, realizes that it's corrupt and throws an error
> without ever trying keyslot #2? But apparently no, because specifying
> an explicit --key-slot also fails.

No. A corrupt keyslot just makes cryptsetup skip to the next one.

> Any suggestions?

Put the pieces from the removed disk into a file at the correct
offsets (where they were when the array was assembled) and try
with that.

Arno
-- 
Arno Wagner,    Dr. sc. techn., Dipl. Inform.,   Email: arno@wagner.name 
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
----
One of the painful things about our time is that those who feel certainty 
are stupid, and those with any imagination and understanding are filled 
with doubt and indecision. -- Bertrand Russell 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-26 15:21         ` Arno Wagner
@ 2012-08-26 15:45           ` András Korn
  2012-08-26 20:51             ` András Korn
  0 siblings, 1 reply; 10+ messages in thread
From: András Korn @ 2012-08-26 15:45 UTC (permalink / raw)
  To: dm-crypt

On Sun, Aug 26, 2012 at 5:21 PM, Arno Wagner <arno@wagner.name> wrote:
>> I think the resync did no damage because it only wrote to the new
>> disk; the originals only had to be read.
>
> I am not sure. Only recreating parity (my original thought)
> would also be a valid approach for a new array, and that is
> what you were creating.

But I specified --assume-clean and was creating a degraded array (so a
resync wouldn't even have been possible at that point).

I then added a new drive to a supposedly clean, but degraded array, so
the resync should only have written to that.

>> I was thinking that maybe I could try to assemble the (copies of the)
>> original array with the 0th, 2nd and 3rd drive, leaving the 1st out
>> initially and then re-add it, allowing RAID5 to sync the data to it,
>> thereby hopefully regenerating the LUKS metadata. However, this won't
>> work either.
>>
>> My array used the default left-symmetric layout, which afaik is:
>>
>> D0 D1 D2 P0
>> D4 D5 P1 D3
>> D8 P2 D6 D7
>> ...
>>
>> D1, D2 and P0 are damaged. Everything else is intact.
>>
>> So even omitting D1, the data that would be used to reconstruct it
>> would be incorrect.
>
> No idea.

Well, P0 is the parity block computed from D0, D1 and D2. Since D1, D2
and P0 are damaged, there is no way to use the parity to reconstruct
the contents of any of these blocks.
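
To make that concrete, here is a toy sketch with single bytes standing
in for the 64k chunks (the values are arbitrary, chosen only for
illustration). RAID5 parity is the XOR of the data chunks in a stripe,
so one missing chunk can be recomputed from the others, but not two or
three at once:

```shell
#!/bin/sh
# Toy stripe: one byte per chunk; the values are arbitrary examples.
d0=$((0x4c)); d1=$((0x55)); d2=$((0x4b))

# Parity is the XOR of all data chunks in the stripe.
p0=$(( d0 ^ d1 ^ d2 ))

# With only D1 lost, XOR of the three survivors gives it back.
d1_rebuilt=$(( d0 ^ d2 ^ p0 ))
printf 'P0=0x%02x  D1 rebuilt=0x%02x\n' "$p0" "$d1_rebuilt"

# With D1, D2 and P0 all overwritten, only D0 is trustworthy, and one
# known value cannot determine the other three.
```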

>> > Now resync for RAID5 only writes
>> > the parity of all other disks to one.
>>
>> Not quite. If you add a missing disk, it will contain data as well as
>> parity after the resync. The parity will be computed based on the data
>> on the other disks as you said, while the data will be computed as a
>> function of the data and the parity on the other disks.
>>
>> > LUKS keyslots are 128kiB in size. So you may still be in luck,
>> > but this is going to take a lot of time to test out and sounds
>> > rather unlikely.
>>
>> All chunks are 64k long. The first keyslot started somewhere in the
>> first part of D0, covered all of D1 and ended in the first part of D2.
>> It's gone for sure, because part of D1 got overwritten. If the next
>> keyslot followed immediately, it covered D3 and ended in D4. It was
>> likely also overwritten because a raid superblock was written to
>> D2+4k. However, the third keyslot had to start in D4 (or even D5,
>> depending on the specific layout; D0+D1+D2+D3 together only have room
>> for two keyslots).
>>
>> Therefore, the third keyslot should still be intact.
>>
>> Now, how do I get to it?
>
> Ah. Use the info in the FAQ and RAID geometry to extract it, and
> place it in a file (starting with the intact header) at the correct
> offset.

O-kaaaaay... I'll try that and come back if I have specific questions.

>> Any suggestions?
>
> Put the pieces from the removed disk into a file at the correct
> offsets (where they were when the array was assembled) and try
> with that.

So, to reiterate: I use luksHeaderBackup to dump the entire header of
the reassembled, but corrupted array to a file. I then extract keyslot
#2 from the pristine disk and write it to the offset in the header
file where keyslot #0 should be. Then I try to luksOpen my array with
the modified header file. Is this what you're suggesting?

Thanks a lot for the help, btw!

Andras

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dm-crypt] How to recover partially overwritten LUKS volume?
  2012-08-26 15:45           ` András Korn
@ 2012-08-26 20:51             ` András Korn
  0 siblings, 0 replies; 10+ messages in thread
From: András Korn @ 2012-08-26 20:51 UTC (permalink / raw)
  To: dm-crypt

On Sun, Aug 26, 2012 at 5:45 PM, András Korn <korn.andras@gmail.com> wrote:
>>> Therefore, the third keyslot should still be intact.
>>>
>>> Now, how do I get to it?
>>
>> Ah. Use the info in the FAQ and RAID geometry to extract it, and
>> place it in a file (starting with the intact header) at the correct
>> offset.
>
> O-kaaaaay... I'll try that and come back if I have specific questions.

Well. Based on the FAQ, the first keyslot should be at offset 0x1000,
and sure enough, there is some data there:

000000 4c 55 4b 53 ba be 00 01 61 65 73 00 00 00 00 00  >LUKS....aes.....<
000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  >................<
000020 00 00 00 00 00 00 00 00 63 62 63 2d 65 73 73 69  >........cbc-essi<
000030 76 3a 73 68 61 32 35 36 00 00 00 00 00 00 00 00  >v:sha256........<
000040 00 00 00 00 00 00 00 00 73 68 61 31 00 00 00 00  >........sha1....<
[...]
000250 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  >................<
*
001000 e7 82 83 cd a1 51 35 e6 60 bc bb aa d8 74 a6 31  >.....Q5.`....t.1<
001010 17 e7 f0 a8 f9 f2 b6 5e 19 db 22 69 69 ef 23 b6  >.......^.."ii.#.<

However, the 2nd key block should be at 0x21000, but there is nothing there:

020ff0 0c 05 02 02 01 01 00 00 fe fc e6 eb 00 0d 01 01  >................<
021000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  >................<
*
021200 69 3b 84 eb 78 96 66 31 29 02 99 1b 9c e0 b3 2e  >i;..x.f1).......<

This is plausible as this is where the RAID superblock got written to.
To recap, this is what the RAID5 layout looks like:

D0 D1 D2 P0
D4 D5 P1 D3
D8 P2 D6 D7
P3 D9 D10 D11

0x21000 is 2x64k+4k, so we're looking at the RAID5 superblock in D2 (I
suppose it's all zeroes because mdadm zeroed it when I re-created the
array with 0.90 metadata, to avoid confusing itself with the 1.2
metadata that would've been in this superblock).

Looking further, where key slot #2 (the third one) should be, there
are also only zeroes:

040000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  >................<
*
050000 00 1d ff 01 00 00 02 06 01 00 ff fe fc fd fe fd  >................<

Now, 0x40000 is 4x64k, so this should be in D4, which should be intact.
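
The mapping I'm using can be sketched as follows, assuming the layout
drawn above (4 members, 64k chunks, left-symmetric) and data starting
at offset 0 of each member, as with 0.90 metadata. The function name
locate is made up for illustration:

```shell
#!/bin/sh
# Assumed geometry from the thread: 4 members, 64 KiB chunks,
# left-symmetric layout, data starting at offset 0 (0.90 metadata).
NDISKS=4
CHUNK=65536

# locate ARRAY_BYTE_OFFSET -> prints "member_index byte_offset_on_member"
locate() {
    off=$(( $1 ))
    c=$(( off / CHUNK ))                 # data chunk number (D0, D1, ...)
    s=$(( c / (NDISKS - 1) ))            # stripe (row) number
    d=$(( c % (NDISKS - 1) ))            # position among the row's data chunks
    p=$(( NDISKS - 1 - s % NDISKS ))     # member holding parity for this row
    m=$(( (p + 1 + d) % NDISKS ))        # data continues after the parity member
    echo "$m $(( s * CHUNK + off % CHUNK ))"
}

locate 0x21000   # D2 + 4k
locate 0x40000   # start of D4
```

Under these assumptions 0x21000 comes out as member 2 at offset 4096,
and 0x40000 as member 0 at offset 65536, i.e. the second 64k block of
drive #0.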

This is the data I read both from the reconstructed RAID5 array as
well as from a file created by concatenating the 64k chunks from the
RAID5 disks in the correct order. I also explicitly checked the 2nd
64k block of original RAID5 drive #0, which is D4. It's all zeroes.
luksDump says keyslot #2 (the 3rd one) is in use, which is correct to the best of my
knowledge. Where is the key? This drive, the one I'm reading the
all-zero 64k chunk from, was unaffected by my mdadm --create (in all
experiments, I used a copy).
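
For reference, a check of this kind can be sketched as below; the
helper name chunk_is_zero and the image file name are made up, and the
64k chunk size is the one from the array:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given 64 KiB chunk of a file
# (or device copy) contains nothing but zero bytes.
# Note: a chunk index past the end of the file reads as empty and
# therefore also counts as "all zero".
CHUNK=65536

# chunk_is_zero FILE CHUNK_INDEX
chunk_is_zero() {
    # Delete every NUL byte and count what is left over.
    nonzero=$(( $(dd if="$1" bs="$CHUNK" skip="$2" count=1 2>/dev/null \
                  | tr -d '\0' | wc -c) ))
    [ "$nonzero" -eq 0 ]
}
```

E.g. "chunk_is_zero disk0.img 1 && echo empty" tests the second 64k
block of a copy of drive #0.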

According to the FAQ, the 2nd 64k block on this drive shouldn't be all zeroes.

This is an old LUKS device (version 1). Could the offsets be different?
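
One way to cross-check is to convert the numbers from the luksDump
output quoted earlier into byte offsets. This is only a sketch under
two assumptions: that the "Key material offset" and "Payload offset"
values are counted in 512-byte sectors, and that each slot's key
material occupies (MK bytes) x (AF stripes) bytes:

```shell
#!/bin/sh
# Values copied from the luksDump output quoted earlier in the thread.
SECTOR=512              # assumption: offsets are in 512-byte sectors
MK_BYTES=$((128 / 8))   # "MK bits: 128"
AF_STRIPES=4000         # "AF stripes: 4000"

# Assumption: key material occupies MK bytes x AF stripes per slot.
echo "key material per slot: $(( MK_BYTES * AF_STRIPES )) bytes"

# Key material offsets of slots 0-2, then the payload offset.
for sectors in 8 136 264 1032; do
    printf '%5d sectors -> 0x%06x\n' "$sectors" "$(( sectors * SECTOR ))"
done
```

If those assumptions hold, the three enabled slots would start at
0x1000, 0x11000 and 0x21000 (64k apart, with 64000 bytes of key
material each) and the payload at 0x81000, so the third slot would
begin at 0x21000 rather than 0x40000.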

Andras

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-08-26 20:52 UTC | newest]

Thread overview: 10+ messages
2012-08-25  6:14 [dm-crypt] How to recover partially overwritten LUKS volume? András Korn
2012-08-25 11:20 ` Arno Wagner
2012-08-25 15:07   ` András Korn
2012-08-25 18:59     ` Heinz Diehl
2012-08-25 20:21     ` Arno Wagner
2012-08-26 14:33       ` András Korn
2012-08-26 14:42         ` Heinz Diehl
2012-08-26 15:21         ` Arno Wagner
2012-08-26 15:45           ` András Korn
2012-08-26 20:51             ` András Korn
