unable to recover RAID 5 due to bad block

linux-raid.vger.kernel.org archive mirror

* unable to recover RAID 5 due to bad block
@ 2011-11-10 16:04 Chris Purves
  2011-11-10 16:20 ` Alexander Kühn
       [not found] ` <CALJXSJrdBX8xqkwamauRsz27LUQYg-gV-G4K+RrNkdhB5ki31w@mail.gmail.com>
  0 siblings, 2 replies; 11+ messages in thread
From: Chris Purves @ 2011-11-10 16:04 UTC (permalink / raw)
  To: linux-raid@vger.kernel.org

Hello,

I have a 5 disk RAID 5 where one of the disks died completely.  I replaced the failed disk and added a new disk to the array.  While it was rebuilding the new disk there was a read error on one of the "good" disks that caused the disk to be reset by the controller and the array was brought down.

I can bring the array back up in degraded mode by running mdadm --assemble --force /dev/md1; however, I have been unable to build the fifth disk.  I would appreciate any help.
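
For reference, a minimal sketch of bringing the array up degraded and checking its state. It only prints the commands rather than running them. /dev/md1 is the array named above; the function name is a hypothetical helper, and the [UUUU_] pattern is only an example of what a degraded five-member array might show.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands rather than running them.
MD=/dev/md1   # array name taken from the message above

degraded_status_plan() {
    # Force assembly from the members that are still readable.
    echo "mdadm --assemble --force $MD"
    # Kernel view: a pattern such as [UUUU_] marks a missing member.
    echo "cat /proc/mdstat"
    # Per-device state as mdadm sees it.
    echo "mdadm --detail $MD"
}

degraded_status_plan
```

Run the printed commands by hand as root once they look right for your system.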

-- 
Chris Purves

* Re: unable to recover RAID 5 due to bad block
  2011-11-10 16:04 unable to recover RAID 5 due to bad block Chris Purves
@ 2011-11-10 16:20 ` Alexander Kühn
  2011-11-10 16:50   ` Robin Hill
  2011-11-10 16:50   ` Chris Purves
       [not found] ` <CALJXSJrdBX8xqkwamauRsz27LUQYg-gV-G4K+RrNkdhB5ki31w@mail.gmail.com>
  1 sibling, 2 replies; 11+ messages in thread
From: Alexander Kühn @ 2011-11-10 16:20 UTC (permalink / raw)
  To: Chris Purves; +Cc: linux-raid@vger.kernel.org

ddrescue to the rescue!
Get another new disk, then ddrescue the one with the read error to
the new disk.
Assemble the array using the new disk instead of the one with the read error.
You will lose the blocks that can't be read, of course.
And in the future do run raid check/scrubbing at regular intervals. ;)
Alex.
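
A minimal sketch of the cloning and scrubbing steps above, assuming GNU ddrescue. It prints the commands instead of running them. /dev/sdX, /dev/sdY and rescue.map are hypothetical names, not taken from this thread; md1 is the array from the original post. The two-pass split (fast copy first, retries second) is one common approach, not the only one.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
SRC=/dev/sdX     # disk with the read error (hypothetical name)
DST=/dev/sdY     # new disk, at least as large as SRC (hypothetical name)
MAP=rescue.map   # ddrescue mapfile, lets an interrupted copy resume
MD_NAME=md1      # array name from the original post

clone_plan() {
    # Pass 1: copy everything that reads cleanly, skipping the scraping phase.
    echo "ddrescue -f -n $SRC $DST $MAP"
    # Pass 2: revisit the bad areas with direct disc access and 3 retry passes.
    echo "ddrescue -f -d -r3 $SRC $DST $MAP"
}

scrub_plan() {
    # Ask md to read every stripe and verify parity (a "check" scrub).
    echo "echo check > /sys/block/$MD_NAME/md/sync_action"
}

clone_plan
scrub_plan
```

Double-check which device is the source and which is the destination before running the printed ddrescue commands; -f lets ddrescue overwrite the destination device.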

Quoting Chris Purves <chris@northfolk.ca>:

> Hello,
>
> I have a 5 disk RAID 5 where one of the disks died completely.  I  
> replaced the failed disk and added a new disk to the array.  While  
> it was rebuilding the new disk there was a read error on one of the  
> "good" disks that caused the disk to be reset by the controller and  
> the array was brought down.
>
> I can bring the array back up in degraded mode by running mdadm  
> --assemble --force /dev/md1; however, I have been unable to build  
> the fifth disk.  I would appreciate any help.
>
> -- 
> Chris Purves
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: unable to recover RAID 5 due to bad block
       [not found] ` <CALJXSJrdBX8xqkwamauRsz27LUQYg-gV-G4K+RrNkdhB5ki31w@mail.gmail.com>
@ 2011-11-10 16:42   ` Jérôme Poulin
  0 siblings, 0 replies; 11+ messages in thread
From: Jérôme Poulin @ 2011-11-10 16:42 UTC (permalink / raw)
  To: linux-raid

On Thu, Nov 10, 2011 at 11:04 AM, Chris Purves <chris@northfolk.ca> wrote:
> I can bring the array back up in degraded mode by running mdadm --assemble --force /dev/md1; however, I have been unable to build the fifth disk.  I would appreciate any help.

The best way to proceed is with the array down: ddrescue the
semi-good disk to a new good disk, then start the array and
recover by adding a new disk to the array.

* Re: unable to recover RAID 5 due to bad block
  2011-11-10 16:20 ` Alexander Kühn
@ 2011-11-10 16:50   ` Robin Hill
  2011-11-10 17:04     ` Chris Purves
  2011-11-10 16:50   ` Chris Purves
  1 sibling, 1 reply; 11+ messages in thread
From: Robin Hill @ 2011-11-10 16:50 UTC (permalink / raw)
  To: Chris Purves; +Cc: linux-raid@vger.kernel.org

On Thu Nov 10, 2011 at 05:20:03PM +0100, Alexander Kühn wrote:

> ddrescue to the rescue!
> Get another new disk, then ddrescue the one with the read error to
> the new disk.
> Assemble the array using the new disk instead of the one with the read error.
> You will lose the blocks that can't be read, of course.
> And in the future do run raid check/scrubbing at regular intervals. ;)
> Alex.
> 
You may be better cloning the original failed disk - that way it will
have a chance of actually recovering the read error (rather than having
to lose the blocks). It depends on why/how/when the original disk failed
though.

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

* Re: unable to recover RAID 5 due to bad block
  2011-11-10 16:20 ` Alexander Kühn
  2011-11-10 16:50   ` Robin Hill
@ 2011-11-10 16:50   ` Chris Purves
  2011-11-10 17:02     ` Alexander Kühn
  2011-11-10 20:11     ` NeilBrown
  1 sibling, 2 replies; 11+ messages in thread
From: Chris Purves @ 2011-11-10 16:50 UTC (permalink / raw)
  To: linux-raid@vger.kernel.org; +Cc: Alexander Kühn, jeromepoulin

On 2011-11-10 12:20, Alexander Kühn wrote:
> ddrescue to the rescue!
> Get another new disk, then ddrescue the one with the read error to the new disk.
> Assemble the array using the new disk instead of the one with the read error.
> You will lose the blocks that can't be read, of course.
> And in the future do run raid check/scrubbing at regular intervals. ;)

I have tried this already.  After cloning the disk with errors, I replaced it with the clone and tried to re-start the array using

mdadm --assemble --force /dev/md1

mdadm assigned the new disk as a spare and said there were only three disks to start the array and so couldn't start it.

After I clone the disk with the error, how precisely should I re-start the array?


-- 
Chris Purves

* Re: unable to recover RAID 5 due to bad block
  2011-11-10 16:50   ` Chris Purves
@ 2011-11-10 17:02     ` Alexander Kühn
  2011-11-10 17:07       ` Chris Purves
  2011-11-10 20:09       ` NeilBrown
  2011-11-10 20:11     ` NeilBrown
  1 sibling, 2 replies; 11+ messages in thread
From: Alexander Kühn @ 2011-11-10 17:02 UTC (permalink / raw)
  To: Chris Purves; +Cc: linux-raid@vger.kernel.org, jeromepoulin

Try --re-add.


Quoting Chris Purves <chris@northfolk.ca>:

> On 2011-11-10 12:20, Alexander Kühn wrote:
>> ddrescue to the rescue!
>> Get another new disk, then ddrescue the one with the read error
>> to the new disk.
>> Assemble the array using the new disk instead of the one with the
>> read error.
>> You will lose the blocks that can't be read, of course.
>> And in the future do run raid check/scrubbing at regular intervals. ;)
>
> I have tried this already.  After cloning the disk with errors, I  
> replaced it with the clone and tried to re-start the array using
>
> mdadm --assemble --force /dev/md1
>
> mdadm assigned the new disk as a spare and said there were only  
> three disks to start the array and so couldn't start it.
>
> After I clone the disk with the error, how precisely should I  
> re-start the array?
>
>
> -- 
> Chris Purves

* Re: unable to recover RAID 5 due to bad block
  2011-11-10 16:50   ` Robin Hill
@ 2011-11-10 17:04     ` Chris Purves
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Purves @ 2011-11-10 17:04 UTC (permalink / raw)
  To: linux-raid@vger.kernel.org

On 2011-11-10 12:50, Robin Hill wrote:
> On Thu Nov 10, 2011 at 05:20:03PM +0100, Alexander Kühn wrote:
>
>> ddrescue to the rescue!
>> Get another new disk, then ddrescue the one with the read error to
>> the new disk.
>> Assemble the array using the new disk instead of the one with the read error.
>> You will lose the blocks that can't be read, of course.
>> And in the future do run raid check/scrubbing at regular intervals. ;)
>> Alex.
>>
> You may be better cloning the original failed disk - that way it will
> have a chance of actually recovering the read error (rather than having
> to lose the blocks). It depends on why/how/when the original disk failed
> though.
>
> Cheers,
>      Robin

The original failed disk can't be recognized by the controller.  I tried to read the disk on another machine with no luck.  It appears to be completely dead.

-- 
Chris Purves

* Re: unable to recover RAID 5 due to bad block
  2011-11-10 17:02     ` Alexander Kühn
@ 2011-11-10 17:07       ` Chris Purves
  2011-11-10 20:09       ` NeilBrown
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Purves @ 2011-11-10 17:07 UTC (permalink / raw)
  To: linux-raid@vger.kernel.org

On 2011-11-10 13:02, Alexander Kühn wrote:
> Try --re-add.
>


I'll give it a try.  It'll take several hours to clone the disk again (I had tried other things in the meantime) and then I'll report back.

Thanks.


-- 
Chris Purves

* Re: unable to recover RAID 5 due to bad block
  2011-11-10 17:02     ` Alexander Kühn
  2011-11-10 17:07       ` Chris Purves
@ 2011-11-10 20:09       ` NeilBrown
  1 sibling, 0 replies; 11+ messages in thread
From: NeilBrown @ 2011-11-10 20:09 UTC (permalink / raw)
  To: Alexander Kühn
  Cc: Chris Purves, linux-raid@vger.kernel.org, jeromepoulin

On Thu, 10 Nov 2011 18:02:34 +0100 Alexander Kühn
<alexander.kuehn@nagilum.de> wrote:

> Try --re-add.

Nope.  That doesn't make any sense at all.

NeilBrown


> 
> 
> Quoting Chris Purves <chris@northfolk.ca>:
> 
> > On 2011-11-10 12:20, Alexander Kühn wrote:
> >> ddrescue to the rescue!
> >> Get another new disk, then ddrescue the one with the read error
> >> to the new disk.
> >> Assemble the array using the new disk instead of the one with the
> >> read error.
> >> You will lose the blocks that can't be read, of course.
> >> And in the future do run raid check/scrubbing at regular intervals. ;)
> >
> > I have tried this already.  After cloning the disk with errors, I  
> > replaced it with the clone and tried to re-start the array using
> >
> > mdadm --assemble --force /dev/md1
> >
> > mdadm assigned the new disk as a spare and said there were only  
> > three disks to start the array and so couldn't start it.
> >
> > After I clone the disk with the error, how precisely should I  
> > re-start the array?
> >
> >
> > -- 
> > Chris Purves

* Re: unable to recover RAID 5 due to bad block
  2011-11-10 16:50   ` Chris Purves
  2011-11-10 17:02     ` Alexander Kühn
@ 2011-11-10 20:11     ` NeilBrown
  2011-11-11  0:15       ` unable to recover RAID 5 due to bad block [solved] Chris Purves
  1 sibling, 1 reply; 11+ messages in thread
From: NeilBrown @ 2011-11-10 20:11 UTC (permalink / raw)
  To: Chris Purves
  Cc: linux-raid@vger.kernel.org, Alexander Kühn, jeromepoulin

On Thu, 10 Nov 2011 12:50:46 -0400 Chris Purves <chris@northfolk.ca> wrote:

> On 2011-11-10 12:20, Alexander Kühn wrote:
> > ddrescue to the rescue!
> > Get another new disk, then ddrescue the one with the read error to the new disk.
> > Assemble the array using the new disk instead of the one with the read error.
> > You will lose the blocks that can't be read, of course.
> > And in the future do run raid check/scrubbing at regular intervals. ;)
> 
> I have tried this already.  After cloning the disk with errors, I replaced it with the clone and tried to re-start the array using
> 
> mdadm --assemble --force /dev/md1
> 
> mdadm assigned the new disk as a spare and said there were only three disks to start the array and so couldn't start it.
> 
> After I clone the disk with the error, how precisely should I re-start the array?
> 
> 

This should have worked.  So presumably some unstated requirement wasn't met.

Give details.  Names of devices, sizes of devices, "mdadm --examine" of
device.  Are you using partitions or whole disk ....

The best assemble command would be:
  mdadm --assemble /dev/md1 -vv ...list...of.devices..you...want.it.to.include. 
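
As an illustration only, here is a sketch of gathering the requested details and then assembling with an explicit device list. It prints the commands rather than running them. The member names (/dev/sdA1, /dev/sdB1, /dev/sdC1, /dev/sdY1) are hypothetical placeholders for the three surviving disks plus the clone; substitute the real ones.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands rather than running them.
MD=/dev/md1
# Hypothetical member list: three surviving disks plus the ddrescue clone.
MEMBERS="/dev/sdA1 /dev/sdB1 /dev/sdC1 /dev/sdY1"

assemble_plan() {
    # Dump each member's superblock so roles and event counts can be compared.
    for dev in $MEMBERS; do
        echo "mdadm --examine $dev"
    done
    # Assemble verbosely from exactly the devices listed.
    echo "mdadm --assemble $MD -vv $MEMBERS"
}

assemble_plan
```

Naming the devices explicitly avoids relying on whatever mdadm finds by scanning, which helps when an old disk and its clone carry the same superblock.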


NeilBrown

* Re: unable to recover RAID 5 due to bad block [solved]
  2011-11-10 20:11     ` NeilBrown
@ 2011-11-11  0:15       ` Chris Purves
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Purves @ 2011-11-11  0:15 UTC (permalink / raw)
  To: linux-raid@vger.kernel.org; +Cc: NeilBrown, Alexander Kühn, jeromepoulin

On 2011-11-10 16:11, NeilBrown wrote:
> On Thu, 10 Nov 2011 12:50:46 -0400 Chris Purves<chris@northfolk.ca>  wrote:
>
>> On 2011-11-10 12:20, Alexander Kühn wrote:
>>> ddrescue to the rescue!
>>> Get another new disk, then ddrescue the one with the read error to the new disk.
>>> Assemble the array using the new disk instead of the one with the read error.
>>> You will lose the blocks that can't be read, of course.
>>> And in the future do run raid check/scrubbing at regular intervals. ;)
>>
>> I have tried this already.  After cloning the disk with errors, I replaced it with the clone and tried to re-start the array using
>>
>> mdadm --assemble --force /dev/md1
>>
>> mdadm assigned the new disk as a spare and said there were only three disks to start the array and so couldn't start it.
>>
>> After I clone the disk with the error, how precisely should I re-start the array?
>>
>>
>
> This should have worked.  So presumably some unstated requirement wasn't met.
>
> Give details.  Names of devices, sizes of devices, "mdadm --examine" of
> device.  Are you using partitions or whole disk ....
>
> The best assemble command would be:
>    mdadm --assemble /dev/md1 -vv ...list...of.devices..you...want.it.to.include.
>
>
> NeilBrown

After ddrescue finished, I unplugged the disk with the bad sector (all 4 KB worth), and ran mdadm --assemble --force.

This time the array started up with four disks (three old, one new).  I'm not sure why it didn't work the previous time; I must have done something differently.

In any case, it's working now.  Thanks to everyone for your assistance.  It was a big help knowing that replacing the damaged disk with a clone should work.  It saved me a lot of hassle restoring from backup.  Now I just need to add a fifth disk and I'll be back to normal.
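
For anyone following along, a minimal sketch of that last step, adding the fifth disk and watching the rebuild. It prints the commands rather than running them. /dev/sdZ1 is a hypothetical name for the new disk's partition; /dev/md1 is the array from this thread.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands rather than running them.
MD=/dev/md1
NEW=/dev/sdZ1   # hypothetical name for the new fifth disk

rebuild_plan() {
    # Add the new disk; md starts recovering onto it automatically.
    echo "mdadm $MD --add $NEW"
    # Watch recovery progress.
    echo "cat /proc/mdstat"
}

rebuild_plan
```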


-- 
Chris Purves
