* Failed during rebuild (raid5)
@ 2013-05-03 11:23 Andreas Boman
2013-05-03 11:38 ` Benjamin ESTRABAUD
2013-05-03 12:26 ` Ole Tange
0 siblings, 2 replies; 24+ messages in thread
From: Andreas Boman @ 2013-05-03 11:23 UTC (permalink / raw)
To: linux-raid
I have (had?) a raid 5 array with 5 disks (1.5TB each); smartd warned one
was getting bad, so I replaced it with an identical disk.
I issued mdadm --manage --add /dev/md127 /dev/sdX
The array seemed to be rebuilding, was at around 15% when I went to bed.
This morning I came up to see the array degraded with two missing
drives, another failed during the rebuild.
I powered the system down, and since I still have the disk smartd flagged
as bad, I tried to just plug that back in and power up, hoping to see the
array come back up - no such luck (not enough disks).
I powered the system down again, and now I'm trying to evaluate my best
options to recover. Hoping to have some good advice in my inbox when I
get back from the office. I'll be able to boot the thing up and get log
info this afternoon.
Thanks!
Andreas
(please cc me, not subscribed)
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 11:23 Failed during rebuild (raid5) Andreas Boman
@ 2013-05-03 11:38 ` Benjamin ESTRABAUD
2013-05-03 12:40 ` Robin Hill
2013-05-03 12:26 ` Ole Tange
1 sibling, 1 reply; 24+ messages in thread
From: Benjamin ESTRABAUD @ 2013-05-03 11:38 UTC (permalink / raw)
To: linux-raid; +Cc: Andreas Boman
On 03/05/13 12:23, Andreas Boman wrote:
> I have (had?) a raid 5 array, with 5 disks (1.5TB/EA), smartd warned
> one was getting bad, so I replaced it with an identical disk.
> I issued mdadm --manage --add /dev/md127 /dev/sdX
>
> The array seemed to be rebuilding, was at around 15% when I went to bed.
>
> This morning I came up to see the array degraded with two missing
> drives, another failed during the rebuild.
>
> I powered the system down, and since I still have the disk smartd flagged
> as bad, I tried to just plug that back in and power up, hoping to see the
> array come back up - no such luck (not enough disks).
>
Unfortunately this happens way too often: your RAID members silently
fail over time. They will get some bad blocks, and you won't know about
it until you try to read or write one of the bad blocks. When that
happens, a disk will get kicked out. At this stage you'll replace the
disk, not knowing that other areas of the other RAID members have also
failed. The only sensible options are to run a RAID 6, which dramatically
reduces the potential for a double failure, or to run a RAID 5 but run a
weekly (at least) check of the entire array for bad blocks, carefully
monitoring the SMART-reported changes after running the check (trying to
read the entire array will cause bad blocks to be detected and
reallocated, if there are any).
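For reference, a minimal sketch of kicking such a check off by hand
through MD's sysfs interface (md127 is taken from your post; adjust the
array name to your own):

  echo check > /sys/block/md127/md/sync_action   # start a full read/parity scrub
  cat /proc/mdstat                               # watch its progress
  cat /sys/block/md127/md/mismatch_cnt           # mismatch count from the last check

Some distros (Debian's mdadm checkarray cron job, for example) already
schedule something like this periodically.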
> I powered the system down again, and now I'm trying to evaluate my
> best options to recover. Hoping to have some good advice in my inbox
> when I get back from the office. I'll be able to boot the thing up and
> get log info this afternoon.
>
I had to recover an array like that twice. The most important thing is
probably to mitigate the data loss on the second drive that is failing
right now by "ddrescue-ing" all of its data onto another drive before it
gets more damaged (the longer the failing drive stays online, the less
chance you have). Use GNU ddrescue for that purpose.
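A rough sketch of such a rescue, with placeholder device names and log
path (double-check which disk is which before copying anything):

  ddrescue -f -n /dev/sdOLD /dev/sdNEW /root/rescue.log     # quick first pass, skip the slow retry phase
  ddrescue -f -d -r3 /dev/sdOLD /dev/sdNEW /root/rescue.log # revisit the bad areas with direct I/O, 3 retries

The log file lets you interrupt and resume without losing progress.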
Once you have rescued the failing drive onto a new one, you could then
try to add that new recovered drive in place of the failing one and
start the resync as you did before.
Note that it would probably be worthwhile to ddrescue the initial drive
that you took out (if it is still good enough to do so) in case the
second drive cannot be recovered correctly or is missing some data.
Regards,
Ben.
> Thanks!
> Andreas
>
> (please cc me, not subscribed)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 11:23 Failed during rebuild (raid5) Andreas Boman
2013-05-03 11:38 ` Benjamin ESTRABAUD
@ 2013-05-03 12:26 ` Ole Tange
2013-05-04 11:29 ` Andreas Boman
2013-05-05 14:00 ` Andreas Boman
1 sibling, 2 replies; 24+ messages in thread
From: Ole Tange @ 2013-05-03 12:26 UTC (permalink / raw)
To: Andreas Boman; +Cc: linux-raid
On Fri, May 3, 2013 at 1:23 PM, Andreas Boman <aboman@midgaard.us> wrote:
> This morning I came up to see the array degraded with two missing drives,
> another failed during the rebuild.
I just started this page for dealing with situations like yours:
https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID
/Ole
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 11:38 ` Benjamin ESTRABAUD
@ 2013-05-03 12:40 ` Robin Hill
2013-05-03 13:52 ` John Stoffel
0 siblings, 1 reply; 24+ messages in thread
From: Robin Hill @ 2013-05-03 12:40 UTC (permalink / raw)
To: Andreas Boman; +Cc: linux-raid, Benjamin ESTRABAUD
[-- Attachment #1: Type: text/plain, Size: 3092 bytes --]
On Fri May 03, 2013 at 12:38:47PM +0100, Benjamin ESTRABAUD wrote:
> On 03/05/13 12:23, Andreas Boman wrote:
> > I have (had?) a raid 5 array, with 5 disks (1.5TB/EA), smartd warned
> > one was getting bad, so I replaced it with an identical disk.
> > I issued mdadm --manage --add /dev/md127 /dev/sdX
> >
> > The array seemed to be rebuilding, was at around 15% when I went to bed.
> >
> > This morning I came up to see the array degraded with two missing
> > drives, another failed during the rebuild.
> >
> > I powered the system down, and since I still have the disk smartd flagged
> > as bad, I tried to just plug that back in and power up, hoping to see the
> > array come back up -no such luck (not enough disks).
> >
> Unfortunately this happens way too often: your RAID members silently
> fail over time. They will get some bad blocks, and you won't know about
> it until you try to read or write one of the bad blocks. When that
> happens, a disk will get kicked out. At this stage you'll replace the
> disk, not knowing that other areas of the other RAID members have also
> failed. The only sensible options are to run a RAID 6, which dramatically
> reduces the potential for a double failure, or to run a RAID 5 but run a
> weekly (at least) check of the entire array for bad blocks, carefully
> monitoring the SMART-reported changes after running the check (trying to
> read the entire array will cause bad blocks to be detected and
> reallocated, if there are any).
>
I'd recommend running a regular check anyway, whatever RAID level you're
running. I wouldn't choose to run RAID5 with more than 5 disks, or
larger than 1TB members either, but that really comes down to an
individual's cost/benefit view on the situation.
> > I powered the system down again, and now I'm trying to evaluate my
> > best options to recover. Hoping to have some good advice in my inbox
> > when I get back from the office. I'll be able to boot the thing up and
> > get log info this afternoon.
> >
> I had to recover an array like that twice. The most important is
> probably to mitigate the data loss on the second drive that is failing
> right now by "ddrescueing" all of its data on another drive before it
> gets more damaged (the longer the failing drive is online the less
> chance you have). Use GNU ddrescue for that purpose.
>
> Once you have rescued the failing drive onto a new one, you could then
> try to add that new recovered drive in place of the failing one and
> start the resync as you did before.
>
Not "add", but "force assemble using". Whenever you end up in a
situation with too many failed disks to start your array, ddrescue and
force assemble (mdadm -Af) is your safest option. If that fails you can
look at more extreme measures, but that should always be the first
approach.
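Something along these lines, with placeholder device names (check each
partition with --examine first to confirm it belongs to the array):

  mdadm --stop /dev/md127      # in case a partial assembly is already holding the members
  mdadm --assemble --force /dev/md127 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

--force lets mdadm bring in a member whose event count is slightly
behind instead of refusing to start the array.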
Cheers,
Robin
--
___
( ' } | Robin Hill <robin@robinhill.me.uk> |
/ / ) | Little Jim says .... |
// !! | "He fallen in de water !!" |
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 12:40 ` Robin Hill
@ 2013-05-03 13:52 ` John Stoffel
2013-05-03 14:51 ` Phil Turmel
2013-05-03 16:29 ` Mikael Abrahamsson
0 siblings, 2 replies; 24+ messages in thread
From: John Stoffel @ 2013-05-03 13:52 UTC (permalink / raw)
To: Robin Hill; +Cc: Andreas Boman, linux-raid, Benjamin ESTRABAUD
After watching endless threads about RAID5 arrays losing a disk, and
then losing a second during the rebuild, I wonder if it would make
sense to:
- have MD automatically increase all disk timeouts when doing a
  rebuild, the idea being that we should be more tolerant of a bad
  sector when rebuilding, and NOT just evict disks in potentially bad
  situations without trying really hard.
- Automatically set up a scrub of the array that happens weekly unless
  you explicitly turn it off. This would possibly require changes from
  the distros, but if it could be made a core part of MD so that all
  the blocks in the array get read each week, that would help with
  silent failures.
We've got all these compute cycles kicking around that could be used
to make things even more reliable; we should be using them in some
smart way.
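Both of these can at least be approximated from userspace today; a
rough sketch, with device and array names as placeholders:

  echo 180 > /sys/block/sdX/device/timeout   # give a consumer drive time to give up on a bad sector
  # plus a weekly scrub from cron, e.g. an /etc/cron.d entry like:
  # 0 4 * * 0  root  echo check > /sys/block/md127/md/sync_action

But making that the default is the part that needs MD/distro buy-in.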
John
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 13:52 ` John Stoffel
@ 2013-05-03 14:51 ` Phil Turmel
2013-05-03 16:23 ` John Stoffel
2013-05-03 16:29 ` Mikael Abrahamsson
1 sibling, 1 reply; 24+ messages in thread
From: Phil Turmel @ 2013-05-03 14:51 UTC (permalink / raw)
To: John Stoffel; +Cc: Robin Hill, Andreas Boman, linux-raid, Benjamin ESTRABAUD
On 05/03/2013 09:52 AM, John Stoffel wrote:
>
> After watching endless threads about RAID5 arrays losing a disk, and
> then losing a second during the rebuild, I wonder if it would make
> sense to:
>
> - have MD automatically increase all disk timeouts when doing a
> rebuild. The idea being that we are more tolerant of a bad sector
> when rebuilding? The idea would be to NOT just evict disks when in
> potentially bad situations without trying really hard.
This would be counterproductive for those users who actually follow
manufacturer guidelines when selecting drives for their arrays.
Anyways, it's a policy issue that belongs in userspace. Distros can do
this today if they want. There's no lack of scripts in this list's
archives.
> - Automatically setup an automatic scrub of the array that happens
> weekly unless you explicitly turn it off. This would possibly
> require changes from the distros, but if it could be made a core
> part of MD so that all the blocks in the array get read each week,
> that would help with silent failures.
I understand some distros already do this.
> We've got all these compute cycles kicking around that could be used
> to make things even more reliable, we should be using them in some
> smart way.
But the "smart way" varies with the hardware at hand. There's no "one
size fits all" solution here.
Phil
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 14:51 ` Phil Turmel
@ 2013-05-03 16:23 ` John Stoffel
2013-05-03 16:32 ` Roman Mamedov
0 siblings, 1 reply; 24+ messages in thread
From: John Stoffel @ 2013-05-03 16:23 UTC (permalink / raw)
To: Phil Turmel
Cc: John Stoffel, Robin Hill, Andreas Boman, linux-raid,
Benjamin ESTRABAUD
>>>>> "Phil" == Phil Turmel <philip@turmel.org> writes:
Phil> On 05/03/2013 09:52 AM, John Stoffel wrote:
>>
>> After watching endless threads about RAID5 arrays losing a disk, and
>> then losing a second during the rebuild, I wonder if it would make
>> sense to:
>>
>> - have MD automatically increase all disk timeouts when doing a
>> rebuild. The idea being that we are more tolerant of a bad sector
>> when rebuilding? The idea would be to NOT just evict disks when in
>> potentially bad situations without trying really hard.
Phil> This would be counterproductive for those users who actually
Phil> follow manufacturer guidelines when selecting drives for their
Phil> arrays.
Well, for them (that is, drives supporting SCT ERC, etc.) you'd skip that
step. But for those using consumer drives, it might make sense. And
I didn't say to make this change for all arrays, just for those in a
rebuilding state where losing another disk would be potentially
fatal.
Phil> Anyways, it's a policy issue that belongs in userspace. Distros
Phil> can do this today if they want. There's no lack of scripts in
Phil> this list's archives.
Sure, but I'm saying that MD should push the policy to default to
doing this. You can turn it off if you like and if you know enough.
>> - Automatically setup an automatic scrub of the array that happens
>> weekly unless you explicitly turn it off. This would possibly
>> require changes from the distros, but if it could be made a core
>> part of MD so that all the blocks in the array get read each week,
>> that would help with silent failures.
Phil> I understand some distros already do this.
>> We've got all these compute cycles kicking around that could be used
>> to make things even more reliable, we should be using them in some
>> smart way.
Phil> But the "smart way" varies with the hardware at hand. There's
Phil> no "one size fits all" solution here.
What's the common thread? A RAID5 loses a disk. While rebuilding,
another disk goes south. Poof! The entire array is toast until you
go through a lot of manual steps to re-create it. All I'm suggesting is
that when in a degraded state, MD automatically becomes more tolerant
of timeouts and errors, and tries harder to keep going.
John
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 13:52 ` John Stoffel
2013-05-03 14:51 ` Phil Turmel
@ 2013-05-03 16:29 ` Mikael Abrahamsson
2013-05-03 19:29 ` John Stoffel
1 sibling, 1 reply; 24+ messages in thread
From: Mikael Abrahamsson @ 2013-05-03 16:29 UTC (permalink / raw)
To: John Stoffel; +Cc: linux-raid
On Fri, 3 May 2013, John Stoffel wrote:
> We've got all these compute cycles kicking around that could be used to
> make things even more reliable, we should be using them in some smart
> way.
I would like to see mdadm tell the user that RAID5 is not a recommended
raid level for component drives over 200 TB in size and recommend people
to use RAID6, and ask for confirmation from the operator to go ahead and
create RAID5 anyway if operator really wants that.
--
Mikael Abrahamsson email: swmike@swm.pp.se
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 16:23 ` John Stoffel
@ 2013-05-03 16:32 ` Roman Mamedov
2013-05-04 14:48 ` maurice
0 siblings, 1 reply; 24+ messages in thread
From: Roman Mamedov @ 2013-05-03 16:32 UTC (permalink / raw)
To: John Stoffel
Cc: Phil Turmel, Robin Hill, Andreas Boman, linux-raid,
Benjamin ESTRABAUD
[-- Attachment #1: Type: text/plain, Size: 660 bytes --]
On Fri, 3 May 2013 12:23:24 -0400
"John Stoffel" <john@stoffel.org> wrote:
> Sure, but I'm saying that MD should push the policy to default to
> doing this. You can turn it off if you like and if you know enough.
I wouldn't want mdadm to suddenly change its behavior on my systems to
activate some automated waste of resources, leaving me scrambling to
figure out how to disable it, with the point of this whole change being
to pander to stupid and greedy
people (because using RAID5 and not RAID6 with a set-up of six 4 TB drives
is nothing but utter stupidity and greed). Let them lose their arrays once or
twice, and learn.
--
With respect,
Roman
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 16:29 ` Mikael Abrahamsson
@ 2013-05-03 19:29 ` John Stoffel
2013-05-04 4:14 ` Mikael Abrahamsson
0 siblings, 1 reply; 24+ messages in thread
From: John Stoffel @ 2013-05-03 19:29 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: John Stoffel, linux-raid
>>>>> "Mikael" == Mikael Abrahamsson <swmike@swm.pp.se> writes:
Mikael> On Fri, 3 May 2013, John Stoffel wrote:
>> We've got all these compute cycles kicking around that could be used to
>> make things even more reliable, we should be using them in some smart
>> way.
Mikael> I would like to see mdadm tell the user that RAID5 is not a recommended
Mikael> raid level for component drives over 200 TB in size and recommend people
Mikael> to use RAID6, and ask for confirmation from the operator to go ahead and
Mikael> create RAID5 anyway if operator really wants that.
200T? Isn't that a little big? *grin* Actually I think the warning
should be based on the number of devices in the array, the size of the
members, and the speed of the individual devices.
Making a RAID5 with 10 RAM disks that are 10GB each in size and can
write at 500MB/s might not be that bad a thing to do. I'm being a little
silly here, but I think the idea is right. We need to account for all
three factors of a RAID5 array: number of devices, size, and speed of
each device.
John
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 19:29 ` John Stoffel
@ 2013-05-04 4:14 ` Mikael Abrahamsson
0 siblings, 0 replies; 24+ messages in thread
From: Mikael Abrahamsson @ 2013-05-04 4:14 UTC (permalink / raw)
To: John Stoffel; +Cc: linux-raid
On Fri, 3 May 2013, John Stoffel wrote:
> 200T? Isn't that a little big? *grin* Actually I think the warning
> should be based on both number of devices in the array and the size of
> members and the speed of the individual devices.
Yeah, I meant GB.
> Making a RAID5 with 10 RAM disks that are 10GB each in size and can
> write at 500MB/s might not be that bad a thing to do. I'm being a little
> silly here, but I think the idea is right. We need to account for all
> three factors of a RAID5 array: number of devices, size, and speed of
> each device.
Well, yes, 3 devices might be ok for RAID5, but for 4 devices or more I
would like the operator to read a page warning them about the
consequences of RAID5 on large drives and make a really informed
decision. This page could also inform users about the mismatched SATA
timeout (or until we actually get the SATA layer default changed from 30
to 180 seconds), because right now I'd say 30 seconds isn't good for
anything. RAID drives default to 7 seconds before reporting an error,
making 30 seconds too long, and consumer drives can take 120 seconds (?),
making 30 seconds way too short.
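For what it's worth, both sides of that mismatch can be inspected and,
on drives that support it, adjusted; a small sketch with a placeholder
device name:

  smartctl -l scterc /dev/sdX                # show the drive's error recovery timeout, if any
  smartctl -l scterc,70,70 /dev/sdX          # set read/write ERC to 7.0 seconds (units are 0.1 s)
  cat /sys/block/sdX/device/timeout          # kernel-side SCSI command timeout, 30 s by default
  echo 180 > /sys/block/sdX/device/timeout   # raise it for drives where ERC cannot be enabled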
--
Mikael Abrahamsson email: swmike@swm.pp.se
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 12:26 ` Ole Tange
@ 2013-05-04 11:29 ` Andreas Boman
2013-05-05 14:00 ` Andreas Boman
1 sibling, 0 replies; 24+ messages in thread
From: Andreas Boman @ 2013-05-04 11:29 UTC (permalink / raw)
To: Ole Tange; +Cc: linux-raid
On 05/03/2013 08:26 AM, Ole Tange wrote:
> On Fri, May 3, 2013 at 1:23 PM, Andreas Boman<aboman@midgaard.us> wrote:
>
>> This morning I came up to see the array degraded with two missing drives,
>> another failed during the rebuild.
> I just started this page for dealing with situations like yours:
> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID
>
>
> /Ole
Thanks Ole, that is a great resource. Thanks to everybody else who
offered advice as well. I have new disks coming in today and will try to
recover.
I didn't mean to set off a contentious debate about the default behavior
of mdadm. However, having an option where mdadm can 'try harder' to
complete would be a good thing. It should probably remain an option, to
prevent further beating on the disk (and thus further data loss) for
those who would prefer to remove the disk and clone it before trying to
restore.
/Andreas
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 16:32 ` Roman Mamedov
@ 2013-05-04 14:48 ` maurice
0 siblings, 0 replies; 24+ messages in thread
From: maurice @ 2013-05-04 14:48 UTC (permalink / raw)
To: Roman Mamedov; +Cc: linux-raid
On 5/3/2013 10:32 AM, Roman Mamedov wrote:
> ..
> with the point of this whole change being to pander to stupid and greedy
> people (because using RAID5 and not RAID6 with a set-up of six 4 TB drives
> is nothing but utter stupidity and greed).
More likely it is the "cheapness" of using consumer desktop drives, and
then acting outraged when they do not behave as reliably as enterprise
drives.
Of course the drive manufacturers only make those expensive enterprise
drives to extract more money from us; they are really all the same, and
it is all one big conspiracy!
--
Cheers,
Maurice Hilarius
eMail: mhilarius@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-03 12:26 ` Ole Tange
2013-05-04 11:29 ` Andreas Boman
@ 2013-05-05 14:00 ` Andreas Boman
2013-05-05 17:16 ` Andreas Boman
1 sibling, 1 reply; 24+ messages in thread
From: Andreas Boman @ 2013-05-05 14:00 UTC (permalink / raw)
To: Ole Tange; +Cc: linux-raid
On 05/03/2013 08:26 AM, Ole Tange wrote:
> On Fri, May 3, 2013 at 1:23 PM, Andreas Boman<aboman@midgaard.us> wrote:
>
>> This morning I came up to see the array degraded with two missing drives,
>> another failed during the rebuild.
> I just started this page for dealing with situations like yours:
> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID
>
>
> /Ole
After having ddrescue running all night, it dropped to copying at a rate
of 512 B/s. I interrupted it and restarted, but it stays at that speed and
shows no errors:
Press Ctrl-C to interrupt
Initial status (read from logfile)
rescued: 557052 MB, errsize: 0 B, errors: 0
Current status
rescued: 1493 GB, errsize: 0 B, current rate: 512 B/s
ipos: 937316 MB, errors: 0, average rate: 16431 kB/s
opos: 937316 MB, time from last successful read: 0 s
Copying non-tried blocks...
However that is much too slow...
Then, I decided to take a look at the superblocks and to my horror
discovered this:
# mdadm --examine /dev/sd[b-g] >>raid.status
mdadm: No md superblock detected on /dev/sdb.
mdadm: No md superblock detected on /dev/sdc.
mdadm: No md superblock detected on /dev/sdd.
mdadm: No md superblock detected on /dev/sde.
mdadm: No md superblock detected on /dev/sdf.
mdadm: No md superblock detected on /dev/sdg.
Can I recover still? What is going on here?
Thanks,
Andreas
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-05 14:00 ` Andreas Boman
@ 2013-05-05 17:16 ` Andreas Boman
2013-05-06 1:10 ` Sam Bingner
2013-05-06 3:21 ` Phil Turmel
0 siblings, 2 replies; 24+ messages in thread
From: Andreas Boman @ 2013-05-05 17:16 UTC (permalink / raw)
To: Ole Tange; +Cc: linux-raid
On 05/05/2013 10:00 AM, Andreas Boman wrote:
> On 05/03/2013 08:26 AM, Ole Tange wrote:
>> On Fri, May 3, 2013 at 1:23 PM, Andreas Boman<aboman@midgaard.us>
>> wrote:
>>
>>> This morning I came up to see the array degraded with two missing
>>> drives,
>>> another failed during the rebuild.
>> I just started this page for dealing with situations like yours:
>> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID
>>
>>
>> /Ole
>
> After having ddrescue running all night, it dropped to copying at a
> rate of 512 B/s. I interrupted it and restarted, but it stays at that
> speed and shows no errors:
>
> Press Ctrl-C to interrupt
> Initial status (read from logfile)
> rescued: 557052 MB, errsize: 0 B, errors: 0
> Current status
> rescued: 1493 GB, errsize: 0 B, current rate: 512 B/s
> ipos: 937316 MB, errors: 0, average rate: 16431 kB/s
> opos: 937316 MB, time from last successful read: 0 s
> Copying non-tried blocks...
>
>
> However that is much too slow...
>
> Then, I decided to take a look at the superblocks and to my horror
> discovered this:
>
> # mdadm --examine /dev/sd[b-g] >>raid.status
> mdadm: No md superblock detected on /dev/sdb.
> mdadm: No md superblock detected on /dev/sdc.
> mdadm: No md superblock detected on /dev/sdd.
> mdadm: No md superblock detected on /dev/sde.
> mdadm: No md superblock detected on /dev/sdf.
> mdadm: No md superblock detected on /dev/sdg.
>
> Can I recover still? What is going on here?
>
> Thanks,
> Andreas
>
Turns out the superblocks are there. I ran --examine on the disk instead
of the partition. Oops.
I still have the problem with ddrescue being very slow; it is running at
512 B/s pretty much no matter what options I use. The ddrescued disk
does NOT have an md superblock. I tried ddrescue -i to skip ahead and grab
the last 3 MB or so of the disk; that seemed to work, but I still don't
have the superblock.
How do I find/recover the superblock from the original disk?
After that is done I'll try to get the array up with 4 disks, then add
the spare and have it rebuild. After that I'll add a disk to go to RAID 6.
Thanks,
Andreas
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-05 17:16 ` Andreas Boman
@ 2013-05-06 1:10 ` Sam Bingner
2013-05-06 3:21 ` Phil Turmel
1 sibling, 0 replies; 24+ messages in thread
From: Sam Bingner @ 2013-05-06 1:10 UTC (permalink / raw)
To: Andreas Boman; +Cc: Ole Tange, linux-raid@vger.kernel.org
On May 5, 2013, at 7:17 AM, "Andreas Boman" <aboman@midgaard.us> wrote:
> On 05/05/2013 10:00 AM, Andreas Boman wrote:
>> On 05/03/2013 08:26 AM, Ole Tange wrote:
>>> On Fri, May 3, 2013 at 1:23 PM, Andreas Boman<aboman@midgaard.us> wrote:
>>>
>>>> This morning I came up to see the array degraded with two missing drives,
>>>> another failed during the rebuild.
>>> I just started this page for dealing with situations like yours:
>>> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID
>>>
>>>
>>> /Ole
>>
>> After having ddrescue running all night, it dropped to copying at a rate of 512B/s. I interrupted it and restarted, it stays at that speed. shows no errors:
>>
>> Press Ctrl-C to interrupt
>> Initial status (read from logfile)
>> rescued: 557052 MB, errsize: 0 B, errors: 0
>> Current status
>> rescued: 1493 GB, errsize: 0 B, current rate: 512 B/s
>> ipos: 937316 MB, errors: 0, average rate: 16431 kB/s
>> opos: 937316 MB, time from last successful read: 0 s
>> Copying non-tried blocks...
>>
>>
>> However that is much too slow...
>>
>> Then, I decided to take a look at the superblocks and to my horror discovered this:
>>
>> # mdadm --examine /dev/sd[b-g] >>raid.status
>> mdadm: No md superblock detected on /dev/sdb.
>> mdadm: No md superblock detected on /dev/sdc.
>> mdadm: No md superblock detected on /dev/sdd.
>> mdadm: No md superblock detected on /dev/sde.
>> mdadm: No md superblock detected on /dev/sdf.
>> mdadm: No md superblock detected on /dev/sdg.
>>
>> Can I recover still? What is going on here?
>>
>> Thanks,
>> Andreas
> Turns out the superblocks are there. I ran --examine on the disk instead of partition. OOps.
>
> I still have the problem with ddrescue being very slow, it is running at 512 B/s pretty much no matter what options I use. The ddrescued disk does NOT have a md superblock. I tried to ddrescue -i to skip and grab the last 3MB or so of the disk, that seemed to work, but I still don't have the superblock.
>
> How do I find/recover the superblock from the original disk?
>
> After that is done I'll try to get the array up with 4 disks, then add the spare and have it rebuild. After that I'll add a disk to go to raid 6.
>
> Thanks,
> Andreas
>
You need to just let ddrescue run - it is probably on the area of the disk with problems. When it gets past that it should speed up again.
If you just want to get the rest of the disk first, you could add an entry to the log file marking some area around the current position as failed, so that ddrescue skips it and goes back to try it later, but I would not do that if I were you.
Sam
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-05 17:16 ` Andreas Boman
2013-05-06 1:10 ` Sam Bingner
@ 2013-05-06 3:21 ` Phil Turmel
[not found] ` <51878BD0.9010809@midgaard.us>
1 sibling, 1 reply; 24+ messages in thread
From: Phil Turmel @ 2013-05-06 3:21 UTC (permalink / raw)
To: Andreas Boman; +Cc: Ole Tange, linux-raid
Hi Andreas,
On 05/05/2013 01:16 PM, Andreas Boman wrote:
[trim /]
> Turns out the superblocks are there. I ran --examine on the disk instead
> of partition. OOps.
Please share the "--examine" reports for your array, and "smartctl -x"
for each disk, and anything from dmesg/syslog that relates to your array
or errors on its members. (Your original post did say you would be able
to get log info.)
> I still have the problem with ddrescue being very slow, it is running at
> 512 B/s pretty much no matter what options I use. The ddrescued disk
> does NOT have a md superblock. I tried to ddrescue -i to skip and grab
> the last 3MB or so of the disk, that seemed to work, but I still don't
> have the superblock.
>
> How do I find/recover the superblock from the original disk?
Superblocks are either at/near the beginning of the block device (v1.1 &
v1.2) or near the end (v0.90 and v1.0). If you've already recovered
beginning and end, and it's still not there, then you won't find it.
It may have to be reconstructed as part of "--create --assume-clean",
but that is a dangerous operation. You haven't yet shared enough
information to get good advice.
> After that is done I'll try to get the array up with 4 disks, then add
> the spare and have it rebuild. After that I'll add a disk to go to raid 6.
It may be wiser to get it running degraded and take a backup, but that
remains to be seen. You haven't shown that you know why the first
rebuild failed. Until that is understood and addressed, you probably
won't succeed in rebuilding onto a spare.
Phil
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
[not found] ` <51878BD0.9010809@midgaard.us>
@ 2013-05-06 12:36 ` Phil Turmel
[not found] ` <5188189D.1060806@midgaard.us>
0 siblings, 1 reply; 24+ messages in thread
From: Phil Turmel @ 2013-05-06 12:36 UTC (permalink / raw)
To: Andreas Boman; +Cc: linux-raid
Hi Andreas,
You dropped the list. Please don't do that. I added it back, and left
the end of the mail untrimmed so the list can see it.
On 05/06/2013 06:54 AM, Andreas Boman wrote:
> On 05/05/2013 11:21 PM, Phil Turmel wrote:
>> Hi Andreas,
>>
>> On 05/05/2013 01:16 PM, Andreas Boman wrote:
>>
>> [trim /]
>>
>>> Turns out the superblocks are there. I ran --examine on the disk instead
>>> of partition. OOps.
>>
>> Please share the "--examine" reports for your array, and "smartctl -x"
>> for each disk, and anything from dmesg/syslog that relates to your array
>> or errors on its members. (Your original post did say you would be able
>> to get log info.)
>
> The --examine for the array (as it is now) and smartctl -x for the
> failed disk are at the end of this mail.
>
> I pasted some log snippets here: http://pastebin.com/iqnYje1W
> This should be the interesting part:
>
> May 2 15:50:14 yggdrasil kernel: [ 7.247383] md: md127 stopped.
> May 2 15:50:14 yggdrasil kernel: [ 7.794697] raid5: allocated 5334kB for md127
> May 2 15:50:14 yggdrasil kernel: [ 7.794843] md127: detected capacity change from 0 to 6001196793856
> May 2 15:50:14 yggdrasil kernel: [ 7.796294] md127: unknown partition table
> May 2 15:54:36 yggdrasil kernel: [ 287.180692] md: recovery of RAID array md127
> May 2 22:40:26 yggdrasil kernel: [24637.888695] raid5:md127: read error not correctable (sector 884472576 on sda1).
Current versions of MD raid in the kernel allow multiple read errors per
hour before kicking out a drive. What kernel and mdadm versions are
involved here?
> Disk was sda at the time, sdb now; don't ask why it reorders at times, I
> don't know. Sometimes the on-board boot disk is sda, sometimes it is the
> last disk, it seems.
You need to document the device names vs. drive S/Ns so you don't mess
up any "--create" operations. This is one of the reasons "--create
--assume-clean" is so dangerous.
I recommend my own "lsdrv" @ github.com/pturmel. But an excerpt from
"ls -l /dev/disk/by-id/" will do.
Use of LABEL= and UUID= syntax in fstab and during boot is intended to
mitigate the fact that the kernel cannot guarantee the order it finds
devices during boot.
> I tried to jump ddrescue to the end of the drive to ensure I get the md
> superblock and then live with some lost data after file system repair.
>
> ./ddrescue -f -d -n -i1499889899520 /dev/sdb /dev/sdf /root/rescue.log
>
> That is what I did (I tried to go further and further). These runs completed
> every time with no error, but still no superblock was copied.
You have a v0.90 array. The superblock is within 128k of the end of the
partition.
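If you want to check whether your copy actually picked it up, a rough
sketch (partition name is a placeholder): the 0.90 superblock lives in
the last 64k-aligned 64k block of the member, and begins with the magic
a92b4efc (stored little-endian, so the bytes read fc 4e 2b a9).

  SZ=$(blockdev --getsize64 /dev/sdX1)
  OFF=$(( SZ / 65536 * 65536 - 65536 ))
  dd if=/dev/sdX1 bs=65536 skip=$((OFF / 65536)) count=1 2>/dev/null | hexdump -C | head -n 4

Run it against both the original and the copy and compare.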
> I'm guessing it's some kind of user error that prevents me from copying
> that superblock.
Yes, something destroyed it.
> I'm still trying to determine whether bringing the array up (--assemble
> --force) using this disk with the missing data will be just bad or very
> bad. I've been told that mdadm doesn't care, but what will it do when
> data is missing in a chunk on this disk?
Presuming you mean while using the ddrescued copy, then any bad data
will show up in the array's files. There's no help for that.
>>> After that is done I'll try to get the array up with 4 disks, then add
>>> the spare and have it rebuild. After that I'll add a disk to go to
>>> raid 6.
>>
>> It may be wiser to get it running degraded and take a backup, but that
>> remains to be seen. You haven't shown that you know why the first
>> rebuild failed. Until that is understood and addressed, you probably
>> won't succeed in rebuilding onto a spare.
You only shared one "smartctl -x" report. Please show the others. If
the others show pending sectors, you will have more difficulty after
rescuing sdb. (You will need to use ddrescue on the other drives that
show pending sectors.)
/dev/sdb has six pending sectors--unrecoverable read errors that won't
be resolved until those sectors are rewritten. They might be normal
transient errors that'll be fine after rewrite. Or they might be
unwritable, and the drive will have to re-allocate them. You need
regular "check" scrubs in a non-degraded array to catch these early and
fix them.
Since ddrescue is struggling with this disk starting at 884471083, close
to the point where MD kicked it, you might have a large damage area that
can't be rewritten.
> I have been wondering about that; it would be difficult to do (not to
> mention I'd have to buy a bunch of large disks to back up to), but I
> have considered (and am considering) it.
Be careful selecting drives. The Samsung drive has ERC--you really want
to pick drives that have it.
If I understand correctly, your current plan is to ddrescue sdb, then
assemble degraded (with --force). I agree with this plan, and I think
you should not need to use "--create --assume-clean". You will need to
fsck the filesystem before you mount, and accept that some data will be
lost. Be sure to remove sdb from the system after you've duplicated it,
as two drives with identical metadata will cause problems for MD.
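In concrete terms that sequence might look roughly like this, with
placeholder device names (substitute your real members, minus the
failed ones):

  mdadm --stop /dev/md127            # make sure nothing half-assembled holds the members
  mdadm --assemble --force /dev/md127 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
  xfs_repair -n /dev/md127           # dry run first; rerun without -n to actually repair
  mount /dev/md127 /mnt              # then copy off anything important before adding a spare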
Phil
>
> Thanks,
> Andreas
>
>
> ---------------------metadata
> /dev/sdb1:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : 60b8d5d0:00c342d3:59cb281a:834c72d9
> Creation Time : Sun Oct 3 06:23:33 2010
> Raid Level : raid5
> Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
> Array Size : 5860543744 (5589.05 GiB 6001.20 GB)
> Raid Devices : 5
> Total Devices : 5
> Preferred Minor : 127
>
> Update Time : Thu May 2 22:15:22 2013
> State : clean
> Active Devices : 4
> Working Devices : 5
> Failed Devices : 1
> Spare Devices : 1
> Checksum : dd5e9120 - correct
> Events : 1011948
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 2 8 1 2 active sync /dev/sda1
>
> 0 0 8 33 0 active sync /dev/sdc1
> 1 1 0 0 1 faulty removed
> 2 2 8 1 2 active sync /dev/sda1
> 3 3 8 49 3 active sync /dev/sdd1
> 4 4 8 65 4 active sync /dev/sde1
> 5 5 8 17 5 spare /dev/sdb1
> /dev/sdc1:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : 60b8d5d0:00c342d3:59cb281a:834c72d9
> Creation Time : Sun Oct 3 06:23:33 2010
> Raid Level : raid5
> Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
> Array Size : 5860543744 (5589.05 GiB 6001.20 GB)
> Raid Devices : 5
> Total Devices : 5
> Preferred Minor : 127
>
> Update Time : Fri May 3 05:31:49 2013
> State : clean
> Active Devices : 3
> Working Devices : 4
> Failed Devices : 2
> Spare Devices : 1
> Checksum : dd5ef826 - correct
> Events : 1012026
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 5 8 17 5 spare /dev/sdb1
>
> 0 0 8 33 0 active sync /dev/sdc1
> 1 1 0 0 1 faulty removed
> 2 2 0 0 2 faulty removed
> 3 3 8 49 3 active sync /dev/sdd1
> 4 4 8 65 4 active sync /dev/sde1
> 5 5 8 17 5 spare /dev/sdb1
> /dev/sdd1:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : 60b8d5d0:00c342d3:59cb281a:834c72d9
> Creation Time : Sun Oct 3 06:23:33 2010
> Raid Level : raid5
> Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
> Array Size : 5860543744 (5589.05 GiB 6001.20 GB)
> Raid Devices : 5
> Total Devices : 5
> Preferred Minor : 127
>
> Update Time : Fri May 3 05:31:49 2013
> State : clean
> Active Devices : 3
> Working Devices : 4
> Failed Devices : 2
> Spare Devices : 1
> Checksum : dd5ef832 - correct
> Events : 1012026
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 0 8 33 0 active sync /dev/sdc1
>
> 0 0 8 33 0 active sync /dev/sdc1
> 1 1 0 0 1 faulty removed
> 2 2 0 0 2 faulty removed
> 3 3 8 49 3 active sync /dev/sdd1
> 4 4 8 65 4 active sync /dev/sde1
> 5 5 8 17 5 spare /dev/sdb1
> /dev/sde1:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : 60b8d5d0:00c342d3:59cb281a:834c72d9
> Creation Time : Sun Oct 3 06:23:33 2010
> Raid Level : raid5
> Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
> Array Size : 5860543744 (5589.05 GiB 6001.20 GB)
> Raid Devices : 5
> Total Devices : 5
> Preferred Minor : 127
>
> Update Time : Fri May 3 05:31:49 2013
> State : clean
> Active Devices : 3
> Working Devices : 4
> Failed Devices : 2
> Spare Devices : 1
> Checksum : dd5ef848 - correct
> Events : 1012026
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 3 8 49 3 active sync /dev/sdd1
>
> 0 0 8 33 0 active sync /dev/sdc1
> 1 1 0 0 1 faulty removed
> 2 2 0 0 2 faulty removed
> 3 3 8 49 3 active sync /dev/sdd1
> 4 4 8 65 4 active sync /dev/sde1
> 5 5 8 17 5 spare /dev/sdb1
> /dev/sdg1:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : 60b8d5d0:00c342d3:59cb281a:834c72d9
> Creation Time : Sun Oct 3 06:23:33 2010
> Raid Level : raid5
> Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
> Array Size : 5860543744 (5589.05 GiB 6001.20 GB)
> Raid Devices : 5
> Total Devices : 5
> Preferred Minor : 127
>
> Update Time : Fri May 3 05:31:49 2013
> State : clean
> Active Devices : 3
> Working Devices : 4
> Failed Devices : 2
> Spare Devices : 1
> Checksum : dd5ef85a - correct
> Events : 1012026
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 4 8 65 4 active sync /dev/sde1
>
> 0 0 8 33 0 active sync /dev/sdc1
> 1 1 0 0 1 faulty removed
> 2 2 0 0 2 faulty removed
> 3 3 8 49 3 active sync /dev/sdd1
> 4 4 8 65 4 active sync /dev/sde1
> 5 5 8 17 5 spare /dev/sdb1
>
>
>
> ---------------------smartctl -x
>
>
> smartctl -x /dev/sdb
> smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
>
> === START OF INFORMATION SECTION ===
> Model Family: SAMSUNG SpinPoint F2 EG series
> Device Model: SAMSUNG HD154UI
> Serial Number: S1Y6J1LZ100168
> Firmware Version: 1AG01118
> User Capacity: 1,500,301,910,016 bytes
> Device is: In smartctl database [for details use: -P show]
> ATA Version is: 8
> ATA Standard is: ATA-8-ACS revision 3b
> Local Time is: Mon May 6 05:41:26 2013 EDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status: (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status: ( 114) The previous self-test
> completed having
> the read element of the test failed.
> Total time to complete Offline
> data collection: (19591) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities: (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability: (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: ( 2) minutes.
> Extended self-test routine
> recommended polling time: ( 255) minutes.
> Conveyance self-test routine
> recommended polling time: ( 34) minutes.
> SCT capabilities: (0x003f) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       17
>   3 Spin_Up_Time            0x0007   063   063   011    Pre-fail  Always       -       11770
>   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       155
>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
>   8 Seek_Time_Performance   0x0025   100   097   015    Pre-fail  Offline      -       14926
>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       378
>  10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0
>  11 Calibration_Retry_Count 0x0012   100   100   000    Old_age   Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       67
>  13 Read_Soft_Error_Rate    0x000e   100   100   000    Old_age   Always       -       17
> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       1
> 184 End-to-End_Error        0x0033   100   100   000    Pre-fail  Always       -       0
> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       18
> 188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
> 190 Airflow_Temperature_Cel 0x0022   075   068   000    Old_age   Always       -       25 (Lifetime Min/Max 17/32)
> 194 Temperature_Celsius     0x0022   075   066   000    Old_age   Always       -       25 (Lifetime Min/Max 17/34)
> 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       463200379
> 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       6
> 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
> 199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
> 201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0
>
> General Purpose Logging (GPL) feature set supported
> General Purpose Log Directory Version 1
> SMART Log Directory Version 1 [multi-sector log support]
> GP/S Log at address 0x00 has 1 sectors [Log Directory]
> SMART Log at address 0x01 has 1 sectors [Summary SMART error log]
> SMART Log at address 0x02 has 2 sectors [Comprehensive SMART error log]
> GP Log at address 0x03 has 2 sectors [Ext. Comprehensive SMART
> error log]
> GP Log at address 0x04 has 2 sectors [Device Statistics]
> SMART Log at address 0x06 has 1 sectors [SMART self-test log]
> GP Log at address 0x07 has 2 sectors [Extended self-test log]
> SMART Log at address 0x09 has 1 sectors [Selective self-test log]
> GP Log at address 0x10 has 1 sectors [NCQ Command Error]
> GP Log at address 0x11 has 1 sectors [SATA Phy Event Counters]
> GP Log at address 0x20 has 2 sectors [Streaming performance log]
> GP Log at address 0x21 has 1 sectors [Write stream error log]
> GP Log at address 0x22 has 1 sectors [Read stream error log]
> GP/S Log at address 0x80 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x81 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x82 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x83 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x84 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x85 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x86 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x87 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x88 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x89 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x8a has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x8b has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x8c has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x8d has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x8e has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x8f has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x90 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x91 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x92 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x93 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x94 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x95 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x96 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x97 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x98 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x99 has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x9a has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x9b has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x9c has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x9d has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x9e has 16 sectors [Host vendor specific log]
> GP/S Log at address 0x9f has 16 sectors [Host vendor specific log]
> GP/S Log at address 0xe0 has 1 sectors [SCT Command/Status]
> GP/S Log at address 0xe1 has 1 sectors [SCT Data Transfer]
>
> SMART Extended Comprehensive Error Log Version: 1 (2 sectors)
> Device Error Count: 6
> CR = Command Register
> FEATR = Features Register
> COUNT = Count (was: Sector Count) Register
> LBA_48 = Upper bytes of LBA High/Mid/Low Registers ] ATA-8
> LH = LBA High (was: Cylinder High) Register ] LBA
> LM = LBA Mid (was: Cylinder Low) Register ] Register
> LL = LBA Low (was: Sector Number) Register ]
> DV = Device (was: Device/Head) Register
> DC = Device Control Register
> ER = Error register
> ST = Status register
> Powered_Up_Time is measured from power on, and printed as
> DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
> SS=sec, and sss=millisec. It "wraps" after 49.710 days.
>
> Error 6 [5] occurred at disk power-on lifetime: 329 hours (13 days + 17
> hours)
> When the command that caused the error occurred, the device was active
> or idle.
>
> After command completion occurred, registers were:
> ER -- ST COUNT LBA_48 LH LM LL DV DC
> -- -- -- == -- == == == -- -- -- -- --
> 00 -- 42 00 00 00 00 34 b7 f5 34 40 00
>
> Commands leading to the command that caused the error were:
> CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time
> Command/Feature_Name
> -- == -- == -- == == == -- -- -- -- -- ---------------
> --------------------
> 60 00 08 01 00 00 00 00 b7 f4 3f 40 00 06:51:19.350 READ FPDMA
> QUEUED
> 60 00 00 01 00 00 00 00 b7 f5 3f 40 00 06:51:19.350 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f7 3f 40 00 06:51:19.350 READ FPDMA
> QUEUED
> 60 00 70 01 00 00 00 00 b7 f6 3f 40 00 06:51:19.350 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f8 3f 40 00 06:51:19.350 READ FPDMA
> QUEUED
>
> Error 5 [4] occurred at disk power-on lifetime: 329 hours (13 days + 17
> hours)
> When the command that caused the error occurred, the device was active
> or idle.
>
> After command completion occurred, registers were:
> ER -- ST COUNT LBA_48 LH LM LL DV DC
> -- -- -- == -- == == == -- -- -- -- --
> 00 -- 42 00 00 00 00 34 b7 f5 35 40 00
>
> Commands leading to the command that caused the error were:
> CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time
> Command/Feature_Name
> -- == -- == -- == == == -- -- -- -- -- ---------------
> --------------------
> 60 00 08 01 00 00 00 00 b7 fb 3f 40 00 06:51:13.030 READ FPDMA
> QUEUED
> 60 00 00 01 00 00 00 00 b7 fa 3f 40 00 06:51:13.030 READ FPDMA
> QUEUED
> 60 00 08 00 e8 00 00 00 b7 f9 57 40 00 06:51:13.030 READ FPDMA
> QUEUED
> 60 00 70 00 18 00 00 00 b7 f9 3f 40 00 06:51:13.030 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f8 3f 40 00 06:51:13.030 READ FPDMA
> QUEUED
>
> Error 4 [3] occurred at disk power-on lifetime: 329 hours (13 days + 17
> hours)
> When the command that caused the error occurred, the device was active
> or idle.
>
> After command completion occurred, registers were:
> ER -- ST COUNT LBA_48 LH LM LL DV DC
> -- -- -- == -- == == == -- -- -- -- --
> 00 -- 42 00 00 00 00 34 b7 f5 33 40 00
>
> Commands leading to the command that caused the error were:
> CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time
> Command/Feature_Name
> -- == -- == -- == == == -- -- -- -- -- ---------------
> --------------------
> 60 00 08 01 00 00 00 00 b7 f4 3f 40 00 06:51:08.060 READ FPDMA
> QUEUED
> 60 00 00 01 00 00 00 00 b7 f5 3f 40 00 06:51:08.060 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f7 3f 40 00 06:51:08.060 READ FPDMA
> QUEUED
> 60 00 70 01 00 00 00 00 b7 f6 3f 40 00 06:51:08.060 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f8 3f 40 00 06:51:08.060 READ FPDMA
> QUEUED
>
> Error 3 [2] occurred at disk power-on lifetime: 329 hours (13 days + 17
> hours)
> When the command that caused the error occurred, the device was active
> or idle.
>
> After command completion occurred, registers were:
> ER -- ST COUNT LBA_48 LH LM LL DV DC
> -- -- -- == -- == == == -- -- -- -- --
> 00 -- 42 00 00 00 00 34 b7 f5 31 40 00
>
> Commands leading to the command that caused the error were:
> CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time
> Command/Feature_Name
> -- == -- == -- == == == -- -- -- -- -- ---------------
> --------------------
> 60 00 08 01 00 00 00 00 b7 fb 3f 40 00 06:51:03.650 READ FPDMA
> QUEUED
> 60 00 00 01 00 00 00 00 b7 fa 3f 40 00 06:51:03.650 READ FPDMA
> QUEUED
> 60 00 08 00 e8 00 00 00 b7 f9 57 40 00 06:51:03.650 READ FPDMA
> QUEUED
> 60 00 70 00 18 00 00 00 b7 f9 3f 40 00 06:51:03.650 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f8 3f 40 00 06:51:03.650 READ FPDMA
> QUEUED
>
> Error 2 [1] occurred at disk power-on lifetime: 329 hours (13 days + 17
> hours)
> When the command that caused the error occurred, the device was active
> or idle.
>
> After command completion occurred, registers were:
> ER -- ST COUNT LBA_48 LH LM LL DV DC
> -- -- -- == -- == == == -- -- -- -- --
> 00 -- 42 00 00 00 00 34 b7 f5 34 40 00
>
> Commands leading to the command that caused the error were:
> CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time
> Command/Feature_Name
> -- == -- == -- == == == -- -- -- -- -- ---------------
> --------------------
> 60 00 08 01 00 00 00 00 b7 f4 3f 40 00 06:50:58.240 READ FPDMA
> QUEUED
> 60 00 00 01 00 00 00 00 b7 f5 3f 40 00 06:50:58.240 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f7 3f 40 00 06:50:58.240 READ FPDMA
> QUEUED
> 60 00 70 01 00 00 00 00 b7 f6 3f 40 00 06:50:58.240 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f8 3f 40 00 06:50:58.240 READ FPDMA
> QUEUED
>
> Error 1 [0] occurred at disk power-on lifetime: 329 hours (13 days + 17
> hours)
> When the command that caused the error occurred, the device was active
> or idle.
>
> After command completion occurred, registers were:
> ER -- ST COUNT LBA_48 LH LM LL DV DC
> -- -- -- == -- == == == -- -- -- -- --
> 00 -- 42 00 00 00 00 34 b7 f5 2f 40 00
>
> Commands leading to the command that caused the error were:
> CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time
> Command/Feature_Name
> -- == -- == -- == == == -- -- -- -- -- ---------------
> --------------------
> 60 00 00 01 00 00 00 00 b7 f4 3f 40 00 06:50:53.300 READ FPDMA
> QUEUED
> 60 00 08 01 00 00 00 00 b7 f3 3f 40 00 06:50:53.280 READ FPDMA
> QUEUED
> 60 00 00 01 00 00 00 00 b7 f2 3f 40 00 06:50:53.280 READ FPDMA
> QUEUED
> 60 00 08 00 e8 00 00 00 b7 f1 57 40 00 06:50:53.280 READ FPDMA
> QUEUED
> 60 00 00 00 18 00 00 00 b7 f1 3f 40 00 06:50:53.270 READ FPDMA
> QUEUED
>
> SMART Extended Self-test Log Version: 1 (2 sectors)
> Num  Test_Description    Status                    Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed: read failure   20%        375              884471093
> # 2  Short offline       Completed: read failure   20%        351              884471083
> # 3  Extended offline    Completed: read failure   90%        337              884471094
> # 4  Short offline       Completed: read failure   20%        333              884471093
> # 5  Short offline       Completed without error   00%        309              -
> # 6  Short offline       Completed without error   00%        285              -
> # 7  Short offline       Completed without error   00%        261              -
> # 8  Short offline       Completed without error   00%        237              -
> # 9  Short offline       Completed without error   00%        213              -
> #10  Extended offline    Completed: read failure   60%        203              884471093
> #11  Short offline       Completed without error   00%        190              -
> #12  Short offline       Completed without error   00%        166              -
> #13  Short offline       Completed without error   00%        142              -
> #14  Short offline       Completed without error   00%        118              -
> #15  Short offline       Completed without error   00%        94               -
> #16  Short offline       Completed without error   00%        70               -
>
> SMART Selective self-test log data structure revision number 1
> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
> 1 0 0 Not_testing
> 2 0 0 Not_testing
> 3 0 0 Not_testing
> 4 0 0 Not_testing
> 5 0 0 Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
>
> SCT Status Version: 2
> SCT Version (vendor specific): 256 (0x0100)
> SCT Support Level: 1
> Device State: Active (0)
> Current Temperature: 25 Celsius
> Power Cycle Max Temperature: 34 Celsius
> Lifetime Max Temperature: 40 Celsius
> SCT Temperature History Version: 2
> Temperature Sampling Period: 1 minute
> Temperature Logging Interval: 1 minute
> Min/Max recommended Temperature: -4/72 Celsius
> Min/Max Temperature Limit: -9/77 Celsius
> Temperature History Size (Index): 128 (113)
>
> Index Estimated Time Temperature Celsius
> 114 2013-05-06 03:34 26 *******
> 115 2013-05-06 03:35 25 ******
> 116 2013-05-06 03:36 26 *******
> ... ..( 3 skipped). .. *******
> 120 2013-05-06 03:40 26 *******
> 121 2013-05-06 03:41 25 ******
> 122 2013-05-06 03:42 25 ******
> 123 2013-05-06 03:43 26 *******
> 124 2013-05-06 03:44 25 ******
> ... ..( 4 skipped). .. ******
> 1 2013-05-06 03:49 25 ******
> 2 2013-05-06 03:50 26 *******
> 3 2013-05-06 03:51 26 *******
> 4 2013-05-06 03:52 25 ******
> 5 2013-05-06 03:53 25 ******
> 6 2013-05-06 03:54 25 ******
> 7 2013-05-06 03:55 26 *******
> ... ..( 2 skipped). .. *******
> 10 2013-05-06 03:58 26 *******
> 11 2013-05-06 03:59 25 ******
> 12 2013-05-06 04:00 26 *******
> 13 2013-05-06 04:01 25 ******
> 14 2013-05-06 04:02 25 ******
> 15 2013-05-06 04:03 26 *******
> 16 2013-05-06 04:04 25 ******
> ... ..( 4 skipped). .. ******
> 21 2013-05-06 04:09 25 ******
> 22 2013-05-06 04:10 26 *******
> 23 2013-05-06 04:11 25 ******
> ... ..( 89 skipped). .. ******
> 113 2013-05-06 05:41 25 ******
>
> SCT Error Recovery Control:
> Read: 70 (7.0 seconds)
> Write: 70 (7.0 seconds)
>
> SATA Phy Event Counters (GP Log 0x11)
> ID Size Value Description
> 0x000a 2 7 Device-to-host register FISes sent due to a
> COMRESET
> 0x0001 2 0 Command failed due to ICRC error
> 0x0002 2 0 R_ERR response for data FIS
> 0x0003 2 0 R_ERR response for device-to-host data FIS
> 0x0004 2 0 R_ERR response for host-to-device data FIS
> 0x0005 2 0 R_ERR response for non-data FIS
> 0x0006 2 0 R_ERR response for device-to-host non-data FIS
> 0x0007 2 0 R_ERR response for host-to-device non-data FIS
> 0x0008 2 0 Device-to-host non-data FIS retries
> 0x0009 2 7 Transition from drive PhyRdy to drive PhyNRdy
> 0x000b 2 0 CRC errors within host-to-device FIS
> 0x000d 2 0 Non-CRC errors within host-to-device FIS
> 0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
> 0x0010 2 0 R_ERR response for host-to-device data FIS, non-CRC
> 0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
> 0x0013 2 0 R_ERR response for host-to-device non-data FIS, non-CRC
>
>
>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
[not found] ` <5188189D.1060806@midgaard.us>
@ 2013-05-07 0:39 ` Phil Turmel
2013-05-07 1:14 ` Andreas Boman
0 siblings, 1 reply; 24+ messages in thread
From: Phil Turmel @ 2013-05-07 0:39 UTC (permalink / raw)
To: Andreas Boman; +Cc: linux-raid
On 05/06/2013 04:54 PM, Andreas Boman wrote:
> On 05/06/2013 08:36 AM, Phil Turmel wrote:
[trim /]
>> Current versions of MD raid in the kernel allow multiple read errors per
>> hour before kicking out a drive. What kernel and mdadm versions are
>> involved here?
> kernel 2.6.32-5-amd64, mdadm 3.1.4 (debian 6.0.7)
Ok. Missing some neat features, but not a crisis.
>>> Disk was sda at the time, sdb now; don't ask why it reorders at times, I
>>> don't know. Sometimes the on-board boot disk is sda, sometimes it is the
>>> last disk, it seems.
>>
>> You need to document the device names vs. drive S/Ns so you don't mess
>> up any "--create" operations. This is one of the reasons "--create
>> --assume-clean" is so dangerous. I recommend my own "lsdrv" @
>> github.com/pturmel. But an excerpt from
>> "ls -l /dev/disk/by-id/" will do.
>>
>> Use of LABEL= and UUID= syntax in fstab and during boot is intended to
>> mitigate the fact that the kernel cannot guarantee the order it finds
>> devices during boot.
>>
> Noted, I'll look into this.
Thanks to smartctl, we now have an index of drive names to serial
numbers. Whenever you create an array, document which drive holds which
role, just in case.
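A quick way to capture that mapping and stash it somewhere safe -- just a
sketch, the /dev/sd[b-g] glob and the output file name are examples, adjust
them to your box:
# for d in /dev/sd[b-g]; do printf '%s ' "$d"; smartctl -i "$d" | grep -i serial; done | tee /root/drive-serials.txt
Re-run it (or "ls -l /dev/disk/by-id/") any time the kernel shuffles the names.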
[trim /]
>>> I'm guessing it's some kind of user error that prevents me from copying
>>> that superblock.
>>
>> Yes, something destroyed it.
> The superblock is available on the original drive, I can do mdadm -E
> /dev/sdb all day long. It just hasn't transferred to the new disk.
Hmmm. The v0.90 superblock lives at the end of the member device. Does
your partition go all the way to the end? Please show your partition tables:
fdisk -lu /dev/sd[bcdefg]
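(If you want to check for the superblock by hand while you're at it: a v0.90
superblock sits in the last 64 KiB-aligned 64 KiB block of the member. A
sketch, assuming sdf1 is the ddrescued copy:
# size=$(blockdev --getsize64 /dev/sdf1)
# off=$(( (size / 65536) * 65536 - 65536 ))
# dd if=/dev/sdf1 bs=1 skip=$off count=4 2>/dev/null | od -An -tx1
If a superblock is there, those four bytes come back as fc 4e 2b a9, the
little-endian md magic 0xa92b4efc.)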
>>> I'm still trying to determine if bringing the array up (--assemble
>>> --force) using this disk with the missing data will be just bad or very
>>> bad? I've been told that mdadm doesn't care, but what will it do when
>>> data is missing in a chunk on this disk?
>>
>> Presuming you mean while using the ddrescued copy, then any bad data
>> will show up in the array's files. There's no help for that.
> Right, but the array will come up and fsck (xfs_repair) should be able
> to get it going again with most data available? mdadm won't get wonky
> when expected parity data isn't there, for example?
Yes, xfs_repair will fix what's fixable. It might not notice file
contents that are no longer correct.
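When you get to that point, do a read-only pass first. A sketch; the LV path
is a placeholder since I don't know your volume names, and the filesystem
must be unmounted:
# xfs_repair -n /dev/vg0/data     # dry run: report problems, change nothing
# xfs_repair /dev/vg0/data        # real repair, once the dry run looks sane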
[trim /]
>> /dev/sdb has six pending sectors--unrecoverable read errors that won't
>> be resolved until those sectors are rewritten. They might be normal
>> transient errors that'll be fine after rewrite. Or they might be
>> unwritable, and the drive will have to re-allocate them. You need
>> regular "check" scrubs in a non-degraded array to catch these early and
>> fix them.
> I have smartd run a daily 'short self test' and a weekly 'long self
> test'; I guess that wasn't enough.
No. Each drive by itself cannot fix its own errors. It needs its bad
data *rewritten* by an upper layer. MD will do this when it encounters
read errors in a non-degraded array. And it will test-read everything
to trigger these corrections during a "check" scrub. See the "Scrubbing
and Mismatches" section of the "md" man-page.
As long as these errors aren't bunched together so much that MD exceeds
its internal read error limits, the drives with these errors are fixed
and stay online. More on this below, though.
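Kicking off a scrub is just a sysfs write. For example, using md127 (your
array name from earlier in the thread):
# echo check >/sys/block/md127/md/sync_action
# cat /proc/mdstat                         # watch progress
# cat /sys/block/md127/md/mismatch_cnt     # read after it finishes
Debian's mdadm package also ships a checkarray cron job that does this
monthly, if I remember right; make sure it hasn't been disabled.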
>> Since ddrescue is struggling with this disk starting at 884471083, close
>> to the point where MD kicked it, you might have a large damage area that
>> can't be rewritten.
>>
>>> I have been wondering about that; it would be difficult to do (not to
>>> mention I'd have to buy a bunch of large disks to back up to), but I
>>> have been (and am) considering it.
>>
>> Be careful selecting drives. The Samsung drive has ERC--you really want
>> to pick drives that have it.
> Noted, I'll look into what that is and hope that my new disks have it as
> well.
Your /dev/sdb {SAMSUNG HD154UI S1Y6J1LZ100168} has ERC, and it is set to
the typical 7.0 seconds for RAID duty:
> SCT Error Recovery Control:
> Read: 70 (7.0 seconds)
> Write: 70 (7.0 seconds)
Your /dev/sdc {SAMSUNG HD154UI S1XWJX0D300206} also has ERC, but it is
disabled:
> SCT Error Recovery Control:
> Read: Disabled
> Write: Disabled
Fortunately, it has no pending sectors (yet).
Your /dev/sdd {SAMSUNG HD154UI S1XWJX0B900500} and /dev/sde {SAMSUNG
HD154UI S1XWJ1KS813588} also have ERC, and are also disabled.
If ERC is available but disabled, it can be enabled by a suitable
script in /etc/local.d/ or in /etc/rc.local (enterprise drives enable it
by default; desktop disks do not), like so:
# smartctl -l scterc,70,70 /dev/sdc
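To cover all the ERC-capable drives at boot, a small script along these
lines would do. A sketch only; the /dev/sd[b-e] glob is an example, so match
it to your actual Samsungs:
#!/bin/sh
# /etc/local.d/scterc.start (or paste the loop into /etc/rc.local)
# set 7-second read/write error recovery on drives that support SCT ERC
for dev in /dev/sd[b-e]; do
    smartctl -q errorsonly -l scterc,70,70 "$dev"
done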
Now for the bad news:
Your /dev/sdf {ST3000DM001-1CH166 W1F1LTQY} and /dev/sdg
{ST3000DM001-1CH166 W1F1LTQY} do not have ERC at all. Modern "green"
drives generally don't:
> Warning: device does not support SCT Error Recovery Control command
Since these cannot be set to a short error timeout, the linux driver's
timeout must be changed to tolerate 2+ minutes of error recovery. I
recommend 180 seconds. This must be put in /etc/local.d/ or
/etc/rc.local like so:
# echo 180 >/sys/block/sdf/device/timeout
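Again as a boot script, roughly like this; sdf and sdg are today's names, so
double-check against serial numbers before relying on it:
#!/bin/sh
# /etc/local.d/timeout.start (or paste the loop into /etc/rc.local)
# raise the kernel's SCSI command timeout for drives without usable ERC
for dev in sdf sdg; do
    echo 180 > "/sys/block/$dev/device/timeout"
done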
If you don't do this, "check" scrubbing will fail. And by fail, I mean
any ordinary URE will kick drives out instead of fixing them. Search
the archives for "scterc" and you'll find more detailed explanations
(attached to horror stories).
>> If I understand correctly, your current plan is to ddrescue sdb, then
>> assemble degraded (with --force). I agree with this plan, and I think
>> you should not need to use "--create --assume-clean". You will need to
>> fsck the filesystem before you mount, and accept that some data will be
>> lost. Be sure to remove sdb from the system after you've duplicated it,
>> as two drives with identical metadata will cause problems for MD.
> Correct, that is the plan: Assemble degraded with the 3 'good' disks and
> the ddrescued copy, run xfs_repair, and add the 5th disk back. Allow it
> to resync the array. Then reshape to raid 6. Allow that to finish, then
> add another disk and grow the array/lvm/filesystem. That's a lot of
> beating on the disks with so much reshaping, but after that I should be
> fine for a while. I'll probably add a hot spare as well.
I would encourage you to take your backups of critical files as soon as
the array is running, before you add a fifth disk. Then you can add two
disks and recover/reshape simultaneously.
Phil
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-07 0:39 ` Phil Turmel
@ 2013-05-07 1:14 ` Andreas Boman
2013-05-07 1:46 ` Phil Turmel
0 siblings, 1 reply; 24+ messages in thread
From: Andreas Boman @ 2013-05-07 1:14 UTC (permalink / raw)
To: Phil Turmel; +Cc: linux-raid
On 05/06/2013 08:39 PM, Phil Turmel wrote:
> On 05/06/2013 04:54 PM, Andreas Boman wrote:
>> On 05/06/2013 08:36 AM, Phil Turmel wrote:
>
> [trim /]
<snip>
>
>
> Hmmm. The v0.90 superblock lives at the end of the member device. Does
> your partition go all the way to the end? Please show your partition tables:
>
> fdisk -lu /dev/sd[bcdefg]
fdisk -lu /dev/sd[bcdefg]
Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders, total 2930277168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x3d1e17f0
Device Boot Start End Blocks Id System
/dev/sdb1 63 2930272064 1465136001 fd Linux raid autodetect
Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders, total 2930277168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdc1 63 2930272064 1465136001 fd Linux raid autodetect
Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders, total 2930277168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdd1 63 2930272064 1465136001 fd Linux raid autodetect
Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders, total 2930277168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x36cc19da
Device Boot Start End Blocks Id System
/dev/sde1 63 2930272064 1465136001 fd Linux raid autodetect
Disk /dev/sdf: 3000.6 GB, 3000592982016 bytes
255 heads, 63 sectors/track, 364801 cylinders, total 5860533168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x3d1e17f0
Device Boot Start End Blocks Id System
/dev/sdf1 63 2930272064 1465136001 fd Linux raid autodetect
Partition 1 does not start on physical sector boundary.
Disk /dev/sdg: 3000.6 GB, 3000592982016 bytes
255 heads, 63 sectors/track, 364801 cylinders, total 5860533168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdg1 63 2930272064 1465136001 fd Linux raid
autodetect
Partition 1 does not start on physical sector boundary.
>> Warning: device does not support SCT Error Recovery Control command
>
> Since these cannot be set to a short error timeout, the linux driver's
> timeout must be changed to tolerate 2+ minutes of error recovery. I
> recommend 180 seconds. This must be put in /etc/local.d/ or
> /etc/rc.local like so:
>
> # echo 180>/sys/block/sdf/device/timeout
>
> If you don't do this, "check" scrubbing will fail. And by fail, I mean
> any ordinary URE will kick drives out instead of fixing them. Search
> the archives for "scterc" and you'll find more detailed explanations
> (attached to horror stories).
Thank you! I had no idea about that or I obviously would not have bought
those disks...
<snip>
>
> I would encourage you to take your backups of critical files as soon as
> the array is running, before you add a fifth disk. Then you can add two
> disks and recover/reshape simultaneously.
Hmm.. any hints as to how to do that at the same time? That does sound
better.
Thank you for all your help/advice, Phil.
Andreas
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-07 1:14 ` Andreas Boman
@ 2013-05-07 1:46 ` Phil Turmel
2013-05-07 2:08 ` Andreas Boman
0 siblings, 1 reply; 24+ messages in thread
From: Phil Turmel @ 2013-05-07 1:46 UTC (permalink / raw)
To: Andreas Boman; +Cc: linux-raid
On 05/06/2013 09:14 PM, Andreas Boman wrote:
> fdisk -lu /dev/sd[bcdefg]
>
> Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
> 255 heads, 63 sectors/track, 182401 cylinders, total 2930277168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x3d1e17f0
>
> Device Boot Start End Blocks Id System
> /dev/sdb1 63 2930272064 1465136001 fd Linux raid autodetect
Oooo! That's not good. Your partitions are not on 4k boundaries, so
they won't be properly aligned on modern 4K-sector drives. Modern fdisk
puts the first partition at sector 2048 by default. (Highly recommended.)
You're stuck with this on the old drives until you can rebuild the
entire array.
[trim /]
> Disk /dev/sdf: 3000.6 GB, 3000592982016 bytes
> 255 heads, 63 sectors/track, 364801 cylinders, total 5860533168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disk identifier: 0x3d1e17f0
>
> Device Boot Start End Blocks Id System
> /dev/sdf1 63 2930272064 1465136001 fd Linux raid autodetect
> Partition 1 does not start on physical sector boundary.
This is serious. The drives will run, but every block written to them
will create at least two read-modify-write cycles on 4k sectors. In
addition to crushing your array's performance, it will prevent scrub
actions from fixing UREs (the read part of the R-M-W will fail).
Fortunately, these new drives are bigger than the originals, so you can
put the partition at sector 2048 and still have it the same size as the
originals. Warning: v0.90 has problems with partitions greater than
2TB in some kernel versions. When you are ready to fix your overall
partition alignment issues, you probably want to switch to v1.1 or v1.2
metadata as well.
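When you do repartition, something like this would lay out an aligned
partition of the same size as the old ones (old sfdisk syntax; the
2930272002-sector size matches your existing 63..2930272064 partitions, so
double-check it before pointing this at a real disk):
# echo '2048,2930272002,fd' | sfdisk -uS --force /dev/sdf
--force is there because the partition won't end on a legacy cylinder
boundary, which old sfdisk otherwise complains about.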
[trim /]
>> I would encourage you to take your backups of critical files as soon as
>> the array is running, before you add a fifth disk. Then you can add two
>> disks and recover/reshape simultaneously.
>
> Hmm.. any hints as to how to do that at the same time? That does sound
> better.
I believe you would set "sync_max" to "0" before adding the spares, then
issue the "--grow" command to reshape, then set "sync_max" to "max".
Others may want to chime in here.
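Roughly like this, untested, with the device names and the final
raid-devices count as examples; treat it as a sketch to verify, not a recipe:
# echo 0 >/sys/block/md127/md/sync_max
# mdadm --manage /dev/md127 --add /dev/sdh1 /dev/sdi1
# mdadm --grow /dev/md127 --level=6 --raid-devices=6 \
      --backup-file=/root/md127-grow.backup
# echo max >/sys/block/md127/md/sync_max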
Phil
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-07 1:46 ` Phil Turmel
@ 2013-05-07 2:08 ` Andreas Boman
2013-05-07 2:16 ` Phil Turmel
0 siblings, 1 reply; 24+ messages in thread
From: Andreas Boman @ 2013-05-07 2:08 UTC (permalink / raw)
To: Phil Turmel; +Cc: linux-raid
On 05/06/2013 09:46 PM, Phil Turmel wrote:
> Fortunately, these new drives are bigger than the originals, so you
> can put the partition at sector 2048 and still have it the same size
> as the originals. Warning: v0.90 has problems with partitions greater
> than 2TB in some kernel versions. When you are ready to fix your
> overall partition alignment issues, you probably want to switch to
> v1.1 or v1.2 metadata as well.
[trim /]
Ok, this is not great news... I'll have to fix that. Later.
My ddrescue problem remains, however: I'm still unable to get my degraded
array online with the ddrescued disk, since I'm missing the md superblock
on that disk.
/Andreas
mdadm -E /dev/sdf1
mdadm: No md superblock detected on /dev/sdf1.
./ddrescue -f -d /dev/sdb /dev/sdf /root/rescue.log
GNU ddrescue 1.17-rc3
Press Ctrl-C to interrupt
Initial status (read from logfile)
rescued: 1493 GB, errsize: 0 B, errors: 0
Current status
rescued: 1493 GB, errsize: 0 B, current rate: 512 B/s
ipos: 942774 MB, errors: 0, average rate: 579 B/s
opos: 942774 MB, time since last successful read: 0 s
Copying non-tried blocks..
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-07 2:08 ` Andreas Boman
@ 2013-05-07 2:16 ` Phil Turmel
2013-05-07 2:21 ` Andreas Boman
0 siblings, 1 reply; 24+ messages in thread
From: Phil Turmel @ 2013-05-07 2:16 UTC (permalink / raw)
To: Andreas Boman; +Cc: linux-raid
On 05/06/2013 10:08 PM, Andreas Boman wrote:
> My ddrescue problem remains however, I'm still unable to get my degraded
> array online with the ddrescued disk since I'm missing the md superblock
> on that disk.
>
> /Andreas
>
> mdadm -E /dev/sdf1
> mdadm: No md superblock detected on /dev/sdf1.
Try using "blockdev --rereadpt /dev/sdf". Then check again.
Phil
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Failed during rebuild (raid5)
2013-05-07 2:16 ` Phil Turmel
@ 2013-05-07 2:21 ` Andreas Boman
0 siblings, 0 replies; 24+ messages in thread
From: Andreas Boman @ 2013-05-07 2:21 UTC (permalink / raw)
To: Phil Turmel; +Cc: linux-raid
On 05/06/2013 10:16 PM, Phil Turmel wrote:
> On 05/06/2013 10:08 PM, Andreas Boman wrote:
>> My ddrescue problem remains however, I'm still unable to get my degraded
>> array online with the ddrescued disk since I'm missing the md superblock
>> on that disk.
>>
>> /Andreas
>>
>> mdadm -E /dev/sdf1
>> mdadm: No md superblock detected on /dev/sdf1.
> Try using "blockdev --rereadpt /dev/sdf". Then check again.
>
> Phil
>
No change.
Thanks,
Andreas
^ permalink raw reply [flat|nested] 24+ messages in thread
Thread overview: 24+ messages
2013-05-03 11:23 Failed during rebuild (raid5) Andreas Boman
2013-05-03 11:38 ` Benjamin ESTRABAUD
2013-05-03 12:40 ` Robin Hill
2013-05-03 13:52 ` John Stoffel
2013-05-03 14:51 ` Phil Turmel
2013-05-03 16:23 ` John Stoffel
2013-05-03 16:32 ` Roman Mamedov
2013-05-04 14:48 ` maurice
2013-05-03 16:29 ` Mikael Abrahamsson
2013-05-03 19:29 ` John Stoffel
2013-05-04 4:14 ` Mikael Abrahamsson
2013-05-03 12:26 ` Ole Tange
2013-05-04 11:29 ` Andreas Boman
2013-05-05 14:00 ` Andreas Boman
2013-05-05 17:16 ` Andreas Boman
2013-05-06 1:10 ` Sam Bingner
2013-05-06 3:21 ` Phil Turmel
[not found] ` <51878BD0.9010809@midgaard.us>
2013-05-06 12:36 ` Phil Turmel
[not found] ` <5188189D.1060806@midgaard.us>
2013-05-07 0:39 ` Phil Turmel
2013-05-07 1:14 ` Andreas Boman
2013-05-07 1:46 ` Phil Turmel
2013-05-07 2:08 ` Andreas Boman
2013-05-07 2:16 ` Phil Turmel
2013-05-07 2:21 ` Andreas Boman