* Suggested use of --invalid-backup?
@ 2013-04-02 20:20 Barrett Lewis
2013-04-08 19:13 ` Barrett Lewis
2013-04-09 2:49 ` Sam Bingner
0 siblings, 2 replies; 4+ messages in thread
From: Barrett Lewis @ 2013-04-02 20:20 UTC
To: linux-raid
I was reshaping a 5x2tb raid5 to a 6x2tb raid6. Not knowing that
ubuntu deletes the /tmp/ folder each reboot, I specified my
--backup-file as /tmp/raid-backup.bak (this is not part of the array).
At 15.1% the system hung sufficiently that REISUB and the reset
button were ignored and I had to hold the power button down to reset
the server. After booting back from the crash, the array would not
start, and ubuntu had deleted the backup file (and everything else in
/tmp).
The superblock already says it's raid6, all members are present and
the event counters are the same on all disks. I tried
ubuntu@ubuntu:~$ sudo mdadm --assemble --force --run --verbose /dev/md0 /dev/sd[abcdef]
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 1.
mdadm:/dev/md0 has an active reshape - checking if critical section needs to be restored
mdadm: Failed to find backup of critical section
mdadm: Failed to restore critical section for reshape, sorry.
Possibly you needed to specify the --backup-file
My understanding is that the backup file is only for some early
critical part of the reshape and that it isn’t even used after that.
15% into 8tb is well over a terabyte so wouldn’t that be far past any
filesystem metadata? So what exactly is implied (about the state of
the reshape) by the fact that programmatically it is still requiring
the backup file?
I have read the manpage on the --invalid-backup option but I didn't
clearly get "use it here, not here" type of information. I have the
OS drive (with deleted /tmp/raid-backup.bak) in a data recovery
process. If I actually get the backup file recovered, it could
potentially have corrupted bits. Is the best course of action to:
- Supply the (potentially corrupted, but maybe some percent ok)
  recovered backup file as the legitimate backup file (without
  --invalid-backup)? (Could this be worse than --invalid-backup and a
  blank file?)
- Supply the (potentially corrupted) recovered backup file WITH
  --invalid-backup?
- Supply --invalid-backup and an empty file?
Or if I am on the wrong path, let me know of any other thoughts or
suggestions you might have.
If I get nothing useful back from data recovery, and I have to supply
--invalid-backup with a blank file, considering the reshape made it to
15%, how much chance is there that the array could assemble and resume
reshape? I would gladly accept the corruption of some files vs losing
the whole file system (obviously).
Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Suggested use of --invalid-backup?
2013-04-02 20:20 Suggested use of --invalid-backup? Barrett Lewis
@ 2013-04-08 19:13 ` Barrett Lewis
2013-04-09 4:28 ` NeilBrown
2013-04-09 2:49 ` Sam Bingner
1 sibling, 1 reply; 4+ messages in thread
From: Barrett Lewis @ 2013-04-08 19:13 UTC
To: linux-raid
As much as I hate to bump, are there no thoughts on this?
The most important question is: if I have a possibly corrupted version
of a backup file, should I supply it with the --invalid-backup flag?
Or does that expect a blank file only?
* Re: Suggested use of --invalid-backup?
2013-04-08 19:13 ` Barrett Lewis
@ 2013-04-09 4:28 ` NeilBrown
0 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2013-04-09 4:28 UTC
To: Barrett Lewis; +Cc: linux-raid
On Mon, 8 Apr 2013 14:13:31 -0500 Barrett Lewis
<barrett.lewis.mitsi@gmail.com> wrote:
> As much as I hate to bump, are there no thoughts on this?
>
> The most important question is if I have a possibly corrupted version
> of a backup file, should I supply it with the --invalid-backup flag?
> Or does that expect a blank file only?
There is no risk in providing a backup file - if it doesn't look good it
will be ignored.
When md does an in-place reshape like this it:
- reads several stripes
- writes them to the backup file
- writes them back to the devices
- updates the metadata
If your crash was during the "writes them back" section, then you will have
some corruption that you cannot avoid without having exactly the right backup
file.
With luck the corruption should be fairly limited.
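A toy model of the cycle listed above may help show why the damage stays limited to the batch being written back. It is only a sketch and not md's actual code: blocks are plain strings, "reshaping" a block means rewriting it in place, and the names BATCH, TOTAL and reshape are made up for the illustration.

```bash
#!/usr/bin/env bash
# Toy model only: blocks are strings and "reshaping" a block rewrites it in
# place with a "new:" prefix. Nothing here touches a real device.
BATCH=4     # blocks handled per cycle (hypothetical)
TOTAL=20    # blocks in the toy array; a multiple of BATCH

# usage: reshape <crash-at-block | -1 for no crash> <yes|no: backup survived>
# Prints the number of blocks that end up corrupted.
reshape() {
    local crash_at=$1 have_backup=$2
    local -a disk=() backup=()
    local pos=0 crashed=no bad=0 i
    for ((i = 0; i < TOTAL; i++)); do disk[i]="d$i"; done

    while ((pos < TOTAL)); do
        backup=("${disk[@]:pos:BATCH}")             # read stripes, save a copy
        for ((i = pos; i < pos + BATCH; i++)); do   # write back in place
            if ((i == crash_at)); then crashed=yes; break 2; fi
            disk[i]="new:${disk[i]}"
        done
        pos=$((pos + BATCH))                        # update the metadata
    done

    # Restart after the crash: metadata still points at the unfinished batch.
    if [[ $crashed == yes && $have_backup == yes ]]; then
        for ((i = 0; i < ${#backup[@]}; i++)); do disk[pos+i]=${backup[i]}; done
    fi
    while ((pos < TOTAL)); do
        for ((i = pos; i < pos + BATCH; i++)); do disk[i]="new:${disk[i]}"; done
        pos=$((pos + BATCH))
    done

    for ((i = 0; i < TOTAL; i++)); do
        [[ ${disk[i]} == "new:d$i" ]] || bad=$((bad + 1))
    done
    echo "$bad"
}

reshape 10 yes   # crash at block 10, backup intact: prints 0
reshape 10 no    # crash at block 10, backup lost: prints 2
```

In this model only the blocks already rewritten in the interrupted batch are damaged when the backup is gone; everything before and after that batch comes through intact.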
There is nothing better that you can do than reassemble the array with the
best backup file you can find, and with --invalid-backup. Then 'fsck' and do
whatever you can to validate your data.
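As a minimal sketch of that reassembly, the helper below only prints the mdadm command unless DO_IT=1 is set. The function name, the DO_IT switch and the path /root/raid-backup.bak are assumptions for illustration; the device names are the ones from the original report, and --backup-file and --invalid-backup are the options discussed in this thread.

```bash
#!/usr/bin/env bash
# Prints the command by default; set DO_IT=1 to actually run it (needs root).
assemble_with_invalid_backup() {   # usage: <md-device> <backup-file> <member>...
    local md=$1 backup=$2
    shift 2
    local -a cmd=(mdadm --assemble --verbose "$md"
                  --backup-file="$backup" --invalid-backup "$@")
    if [[ ${DO_IT:-0} == 1 ]]; then
        "${cmd[@]}"
    else
        printf '%s\n' "${cmd[*]}"
    fi
}

# /root/raid-backup.bak is a hypothetical location for the best backup
# file available (recovered, or empty if nothing could be recovered).
assemble_with_invalid_backup /dev/md0 /root/raid-backup.bak /dev/sd{a..f}
```

Once the array is up, a read-only filesystem check is the cautious first step; for an ext-family filesystem (an assumption, since the filesystem type is not stated here) that would be something like `fsck -n /dev/md0`.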
NeilBrown
* Re: Suggested use of --invalid-backup?
2013-04-02 20:20 Suggested use of --invalid-backup? Barrett Lewis
2013-04-08 19:13 ` Barrett Lewis
@ 2013-04-09 2:49 ` Sam Bingner
1 sibling, 0 replies; 4+ messages in thread
From: Sam Bingner @ 2013-04-09 2:49 UTC
To: Barrett Lewis; +Cc: <linux-raid@vger.kernel.org>
I'm not sure exactly how a reshape from RAID5 to RAID6 with no capacity change works. It is possible that it works like a shrink, in that the critical section is at the end. Did you try giving it a backup path with nothing there?
Otherwise, I would suggest making copies of all your devices and working with the copies. When I had a similar problem, I copied my member disks into LVM volumes and then worked with snapshots so that I could undo any changes that caused problems. Once I knew exactly what had happened, I went back to the original disks and corrected the problem while retaining my backup in case of further problems.
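A rough sketch of that copy-and-snapshot approach for one member disk follows. It only prints the commands. The volume group name vg_rescue, the 20G snapshot size, the byte count and the function name are all assumptions for illustration, and the volume group would need room for a full copy of every member.

```bash
#!/usr/bin/env bash
# Prints a copy-and-snapshot plan for one member disk; nothing is executed.
plan_member_copy() {   # usage: <member> <size-bytes> <volume-group> <snapshot-size>
    local dev=$1 bytes=$2 vg=$3 snap_size=$4
    local name="copy_${dev##*/}"                      # e.g. copy_sda
    # 1. Create a logical volume at least as large as the member disk.
    echo "lvcreate --size ${bytes}b --name $name $vg"
    # 2. Copy the raw member into it.
    echo "dd if=$dev of=/dev/$vg/$name bs=1M conv=fsync"
    # 3. Snapshot the copy; experiments write to the snapshot only.
    echo "lvcreate --snapshot --size $snap_size --name ${name}_snap /dev/$vg/$name"
}

# The size should come from e.g. `blockdev --getsize64 /dev/sda`;
# the number below is only an example value.
plan_member_copy /dev/sda 2000398934016 vg_rescue 20G
```

Assembly attempts would then use the `*_snap` volumes instead of the physical disks. If an attempt goes wrong, removing the snapshot with `lvremove` and creating a fresh one returns to the untouched copy.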
Also look in your recovered backup - is there actually data in it?
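One quick way to answer that question is to test whether the recovered file holds anything other than zero bytes. This is just a sketch: the function name, the CANDIDATE variable and the default path are hypothetical, and a non-zero file is not necessarily a valid backup.

```bash
#!/usr/bin/env bash
# Succeeds (exit status 0) only if the file exists, is non-empty, and
# contains at least one byte that is not NUL.
has_nonzero_data() {   # usage: has_nonzero_data <file>
    local file=$1 found
    [[ -s $file ]] || return 1
    # Strip NUL bytes and see whether anything is left.
    found=$(tr -d '\000' < "$file" | head -c 1 | wc -c)
    (( found > 0 ))
}

# Hypothetical path to the file returned by the data-recovery process.
candidate=${CANDIDATE:-/root/recovered-raid-backup.bak}
if has_nonzero_data "$candidate"; then
    echo "$candidate: contains non-zero data"
else
    echo "$candidate: missing, empty, or all zeros"
fi
```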
Sam