linux-raid.vger.kernel.org archive mirror
* RAID-5 degraded mode question
@ 2009-12-16 20:20 Tirumala Reddy Marri
  2009-12-16 21:12 ` Robin Hill
  0 siblings, 1 reply; 7+ messages in thread
From: Tirumala Reddy Marri @ 2009-12-16 20:20 UTC (permalink / raw)
  To: linux-raid

All,
  I have a question about degraded-mode RAID-5 operation in the md
driver. If I understand correctly, reads on a degraded md array trigger
parity calculations (reconstruction of the missing data) for every
request.

 Also, as soon as a disk fails, the md driver marks that drive as faulty
and continues operating in degraded mode, right? Is there a way to get
out of degraded mode without adding a spare drive? Assume a 5-disk
system with one failed drive.

Thanks in advance,
Marri


* Re: RAID-5 degraded mode question
  2009-12-16 20:20 RAID-5 degraded mode question Tirumala Reddy Marri
@ 2009-12-16 21:12 ` Robin Hill
  2009-12-16 21:29   ` Tirumala Reddy Marri
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Hill @ 2009-12-16 21:12 UTC (permalink / raw)
  To: linux-raid


On Wed Dec 16, 2009 at 12:20:04PM -0800, Tirumala Reddy Marri wrote:

> All,
>   I have question on degraded mode RAID-5 operation md driver. If I
> understand reads from the degraded mode md would cause  parity
> calculations or rebuild of  data happens for every request. 
> 
For pretty much every read, yes - some requests will find all their data
on the surviving disks, but most will need to reconstruct the missing
chunk from the remaining data and parity.
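
If it helps to picture it, the recovery is just an XOR across the
surviving chunks of the stripe. A toy sketch (plain Python, nothing to do
with the actual md code; the 4+1 layout and chunk contents are made up):

    CHUNK = 4096  # pretend chunk size

    def xor_chunks(chunks):
        # Byte-wise XOR of equal-sized chunks.
        out = bytearray(len(chunks[0]))
        for chunk in chunks:
            for i, b in enumerate(chunk):
                out[i] ^= b
        return bytes(out)

    # Parity was computed over the 4 data chunks when the stripe was written:
    data = [bytes([d]) * CHUNK for d in (1, 2, 3, 4)]
    parity = xor_chunks(data)

    # The disk holding data[2] has failed; XOR of everything else recovers it:
    recovered = xor_chunks([data[0], data[1], data[3], parity])
    assert recovered == data[2]

So a read that touches the failed disk's chunk costs a read of every
other disk in that stripe plus the XOR, which is why a degraded array is
noticeably slower.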

>  Also as soon as disk failed md drivers marks that drive as faulty and
> continue operation in degraded mode right ? Is there a way to get out
> the degraded mode without adding spare drive. Assuming we have 5 disk
> system with one failed drive.
> 
I'm not sure what you want to happen here.  The only way to get out of
degraded mode is to replace the drive in the array (if it's not actually
faulty then you can add it back, otherwise you need to add a new drive).
What were you thinking might happen otherwise?
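
For the record, the two cases look roughly like this (a sketch only - the
array and device names are just examples, and you'd normally type the
mdadm commands directly rather than wrap them in Python):

    import subprocess

    ARRAY = "/dev/md0"    # example array
    MEMBER = "/dev/sdc1"  # example member that got kicked out

    # Drive was kicked for a transient reason but is actually fine:
    # remove the failed slot and re-add the same device.
    subprocess.run(["mdadm", ARRAY, "--remove", MEMBER], check=True)
    subprocess.run(["mdadm", ARRAY, "--re-add", MEMBER], check=True)

    # Drive really is dead: add a replacement instead.
    # subprocess.run(["mdadm", ARRAY, "--add", "/dev/sde1"], check=True)

Either way the array only leaves degraded mode once the recovery
finishes (watch /proc/mdstat).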

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |



* RE: RAID-5 degraded mode question
  2009-12-16 21:12 ` Robin Hill
@ 2009-12-16 21:29   ` Tirumala Reddy Marri
  2009-12-21 12:41     ` Goswin von Brederlow
  0 siblings, 1 reply; 7+ messages in thread
From: Tirumala Reddy Marri @ 2009-12-16 21:29 UTC (permalink / raw)
  To: Robin Hill, linux-raid


Thanks for the response.

>>  Also as soon as disk failed md drivers marks that drive as faulty and
>> continue operation in degraded mode right ? Is there a way to get out
>> the degraded mode without adding spare drive. Assuming we have 5 disk
>> system with one failed drive.
>> 
> I'm not sure what you want to happen here.  The only way to get out of
> degraded mode is to replace the drive in the array (if it's not
> actually faulty then you can add it back, otherwise you need to add a
> new drive).
> What were you thinking might happen otherwise?


I was thinking we could recover from this using a re-sync or a resize.
After running I/O to the degraded RAID-5 (/dev/md0), I am seeing an issue
where e2fsck reports an inconsistent filesystem and corrects it. I am
trying to debug whether the issue is data not being written, or wrong
data being read back, in degraded mode.

I suspect the problem happens during the write, because after running
e2fsck once I don't see the inconsistency any more.
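
To narrow it down I am planning something like the following - write a
known pattern through the filesystem on the degraded array, sync, drop
the page cache, then read it back and compare, so a mismatch points at
the write or read path rather than at e2fsck. Just a sketch (run as
root; the path and sizes are arbitrary):

    import hashlib, os

    PATH = "/mnt/md0/pattern.bin"   # a file on the degraded /dev/md0 fs
    BLOCK = 1024 * 1024
    COUNT = 256                     # 256 MiB of patterned data

    written = hashlib.sha1()
    with open(PATH, "wb") as f:
        for i in range(COUNT):
            block = bytes([i & 0xFF]) * BLOCK
            written.update(block)
            f.write(block)
        f.flush()
        os.fsync(f.fileno())

    # Drop the page cache so the read-back really hits the array.
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")

    read_back = hashlib.sha1()
    with open(PATH, "rb") as f:
        for block in iter(lambda: f.read(BLOCK), b""):
            read_back.update(block)

    print("match" if read_back.digest() == written.digest() else "MISMATCH")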

Regards,
Marri



* Re: RAID-5 degraded mode question
  2009-12-16 21:29   ` Tirumala Reddy Marri
@ 2009-12-21 12:41     ` Goswin von Brederlow
  2009-12-22  6:51       ` Michael Evans
  0 siblings, 1 reply; 7+ messages in thread
From: Goswin von Brederlow @ 2009-12-21 12:41 UTC (permalink / raw)
  To: Tirumala Reddy Marri; +Cc: Robin Hill, linux-raid

"Tirumala Reddy Marri" <tmarri@amcc.com> writes:

> Thanks for the response.
>
>>>  Also as soon as disk failed md drivers marks that drive as faulty
> and 
>>> continue operation in degraded mode right ? Is there a way to get out
>
>>> the degraded mode without adding spare drive. Assuming we have 5 disk
>
>>> system with one failed drive.
>>> 
>>I'm not sure what you want to happen here.  The only way to get out of
> degraded mode is to replace the drive in the >array (if it's not
> actually faulty then you can add it back, otherwise you need to add a
> new drive).
>>What were you thinking might happen otherwise?
>
>
> I was thinking we can recover from this using re-sync or resize .After

Theoretically you could shrink the array by one disk and then use that
spare disk to resync the parity. But that is a lengthy process with a
much higher failure chance than resyncing to a new disk. Note that you
also need to shrink the filesystem on the raid first, adding even more
stress and risk. So I really wouldn't recommend it.
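
Just so it is clear what that would involve, a rough outline (the sizes
and backup path below are placeholders, and this deliberately only prints
the steps instead of running them - see the warning above):

    # Shrinking a 5-disk RAID-5 (~4 TB usable) down to 4 disks (~3 TB usable).
    # Outline only, NOT a recipe - the resize and reshape steps rewrite a lot
    # of data on an array with no redundancy left, which is exactly the risk.
    steps = [
        "umount /dev/md0",
        "e2fsck -f /dev/md0",
        "resize2fs /dev/md0 2800G",   # shrink the fs below the new array size
        "mdadm --grow /dev/md0 --array-size=<new size>",
        "mdadm --grow /dev/md0 --raid-devices=4 --backup-file=/root/md0.bak",
        "resize2fs /dev/md0",         # afterwards, grow the fs to fill the device
    ]
    for step in steps:
        print(step)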

> running IO to degraded (RAID-5) /dev/md0, I am seeing an issue where
> e2fsck reports inconsistent file system and corrects it. I am trying to
> debug  to see if the issue is because of data not being written or
> reading wrong data in degraded mode. 
>
> I guess problem happening during the write. Reason is , after ran e2fsck
> I don't see inconsistency any more.
>
> Regards,
> Marri

A degraded raid5 might get corrupted if your system crashes. When you
write to one of the remaining disks, md also has to update the parity
block at the same time. If the system crashes between writing the data
and the parity, the block that gets reconstructed for the failed drive
will appear changed. I'm not sure whether the raid will even assemble on
its own in such a case; it might just complain about not having enough
in-sync disks.
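
To make that concrete, a toy example (pure XOR arithmetic with one-byte
"chunks", nothing md-specific):

    # 4-disk RAID-5 stripe: data d0, d1, d2 plus parity p; the disk with d2 failed.
    d0, d1, d2 = 0x11, 0x22, 0x33
    p = d0 ^ d1 ^ d2                # parity as originally written

    assert d0 ^ d1 ^ p == d2        # degraded reads always reconstruct d2 this way

    # d0 is rewritten, but the system crashes before the parity update lands:
    d0_new = 0x44                   # new data is on disk, p still has the old value

    reconstructed = d0_new ^ d1 ^ p
    assert reconstructed != d2      # d2 now "reads back" as 0x66 instead of 0x33

On a healthy array the same crash would only leave the parity stale,
which a resync can recompute from the data; on a degraded array the
damage lands in a block that was never written to at all.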

Apart from that there should never be any corruption unless one of
your disks returns bad data on read.

MfG
        Goswin

PS: This is not a bug in linux raid but a fundamental limitation of
raid.


* Re: RAID-5 degraded mode question
  2009-12-21 12:41     ` Goswin von Brederlow
@ 2009-12-22  6:51       ` Michael Evans
  2009-12-22 13:37         ` Goswin von Brederlow
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Evans @ 2009-12-22  6:51 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Tirumala Reddy Marri, Robin Hill, linux-raid

On Mon, Dec 21, 2009 at 4:41 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> "Tirumala Reddy Marri" <tmarri@amcc.com> writes:
>
>> Thanks for the response.
>>
>>>>  Also as soon as disk failed md drivers marks that drive as faulty
>> and
>>>> continue operation in degraded mode right ? Is there a way to get out
>>
>>>> the degraded mode without adding spare drive. Assuming we have 5 disk
>>
>>>> system with one failed drive.
>>>>
>>>I'm not sure what you want to happen here.  The only way to get out of
>> degraded mode is to replace the drive in the >array (if it's not
>> actually faulty then you can add it back, otherwise you need to add a
>> new drive).
>>>What were you thinking might happen otherwise?
>>
>>
>> I was thinking we can recover from this using re-sync or resize .After
>
> Theoretically you could shrink the array by one disk and then use that
> spare disk to resync the parity. But that is a lengthy process with a
> lot higher failure chance than resyncing to a new disk. Note that you
> also need to shrink the filesystem on the raid first adding even more
> stress and failure chance. So I really wouldn't recommend that.
>
>> running IO to degraded (RAID-5) /dev/md0, I am seeing an issue where
>> e2fsck reports inconsistent file system and corrects it. I am trying to
>> debug  to see if the issue is because of data not being written or
>> reading wrong data in degraded mode.
>>
>> I guess problem happening during the write. Reason is , after ran e2fsck
>> I don't see inconsistency any more.
>>
>> Regards,
>> Marri
>
> A degraded raid5 might get corrupted if your system crashes. If you
> are writing to one of the remaining disks then it also needs to update
> the parity block simultaneously. If it crashed between writing the
> data and the parity then the data block on the failed drive will
> appear changed. I'm not sure though if the raid will even assemble on
> its own in such a case though. It might just complain about not having
> enough in-sync disks.
>
> Apart from that there should never be any corruption unless one of
> your disks returns bad data on read.
>
> MfG
>        Goswin
>
> PS: This is not a bug in linux raid but a fundamental limitation of
> raid.

You're forgetting the ever-present, horrid possibility of failed or
corrupted hardware.  I've had IO cards go bad due to a prior bug that let
an experimental 'debugging' option in the kernel write to random memory
locations in the rare case of an unusual error.  Not just the occasional
rare chance of a buffer being corrupted, but the actual hardware going
bad.  One of the cards could not even be recovered by an attempt at
software-flashing the firmware (it must have been too far gone for the
utility to recognize, and replacing it was the least expensive route
remaining).

However, in general I've seen that hardware which is actually failing
tends to do so with enough grace to either refuse to operate outright,
or to operate with obvious and persistent symptoms.


* Re: RAID-5 degraded mode question
  2009-12-22  6:51       ` Michael Evans
@ 2009-12-22 13:37         ` Goswin von Brederlow
  2009-12-22 21:59           ` Michael Evans
  0 siblings, 1 reply; 7+ messages in thread
From: Goswin von Brederlow @ 2009-12-22 13:37 UTC (permalink / raw)
  To: Michael Evans
  Cc: Goswin von Brederlow, Tirumala Reddy Marri, Robin Hill,
	linux-raid

Michael Evans <mjevans1983@gmail.com> writes:

> On Mon, Dec 21, 2009 at 4:41 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> "Tirumala Reddy Marri" <tmarri@amcc.com> writes:
>>
>>> Thanks for the response.
>>>
>>>>>  Also as soon as disk failed md drivers marks that drive as faulty
>>> and
>>>>> continue operation in degraded mode right ? Is there a way to get out
>>>
>>>>> the degraded mode without adding spare drive. Assuming we have 5 disk
>>>
>>>>> system with one failed drive.
>>>>>
>>>>I'm not sure what you want to happen here.  The only way to get out of
>>> degraded mode is to replace the drive in the >array (if it's not
>>> actually faulty then you can add it back, otherwise you need to add a
>>> new drive).
>>>>What were you thinking might happen otherwise?
>>>
>>>
>>> I was thinking we can recover from this using re-sync or resize .After
>>
>> Theoretically you could shrink the array by one disk and then use that
>> spare disk to resync the parity. But that is a lengthy process with a
>> lot higher failure chance than resyncing to a new disk. Note that you
>> also need to shrink the filesystem on the raid first adding even more
>> stress and failure chance. So I really wouldn't recommend that.
>>
>>> running IO to degraded (RAID-5) /dev/md0, I am seeing an issue where
>>> e2fsck reports inconsistent file system and corrects it. I am trying to
>>> debug  to see if the issue is because of data not being written or
>>> reading wrong data in degraded mode.
>>>
>>> I guess problem happening during the write. Reason is , after ran e2fsck
>>> I don't see inconsistency any more.
>>>
>>> Regards,
>>> Marri
>>
>> A degraded raid5 might get corrupted if your system crashes. If you
>> are writing to one of the remaining disks then it also needs to update
>> the parity block simultaneously. If it crashed between writing the
>> data and the parity then the data block on the failed drive will
>> appear changed. I'm not sure though if the raid will even assemble on
>> its own in such a case though. It might just complain about not having
>> enough in-sync disks.
>>
>> Apart from that there should never be any corruption unless one of
>> your disks returns bad data on read.
>>
>> MfG
>>        Goswin
>>
>> PS: This is not a bug in linux raid but a fundamental limitation of
>> raid.
>
> You're forgetting the every horrid possibility of failed/corrupted
> hardware.  I've had IO cards go bad due to a prior bug that let an
> experimental 'debugging' option in the kernel write to random memory
> locations in the rare case of an unusual error.  Not just the
> occasional rare chance of a buffer being corrupted, but the actual
> hardware going bad.  One of the cards could not even be recovered by
> an attempt at software-flashing the firmware (it must have been too
> far gone for the utility to recognize, and replacing it was the least
> expensive route remaining).
>
> However in general I've seen hardware that's actually failing will
> tend to do so with enough grace to either outright refuse to operate,
> or operate with obvious and persistent symptoms.

And how is that relevant to the raid-5 being degraded? If the hardware
goes bad you just get errors no matter what.

MfG
        Goswin


* Re: RAID-5 degraded mode question
  2009-12-22 13:37         ` Goswin von Brederlow
@ 2009-12-22 21:59           ` Michael Evans
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Evans @ 2009-12-22 21:59 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Tirumala Reddy Marri, Robin Hill, linux-raid

On Tue, Dec 22, 2009 at 5:37 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Michael Evans <mjevans1983@gmail.com> writes:
>
>> On Mon, Dec 21, 2009 at 4:41 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>> "Tirumala Reddy Marri" <tmarri@amcc.com> writes:
>>>
>>>> Thanks for the response.
>>>>
>>>>>>  Also as soon as disk failed md drivers marks that drive as faulty
>>>> and
>>>>>> continue operation in degraded mode right ? Is there a way to get out
>>>>
>>>>>> the degraded mode without adding spare drive. Assuming we have 5 disk
>>>>
>>>>>> system with one failed drive.
>>>>>>
>>>>>I'm not sure what you want to happen here.  The only way to get out of
>>>> degraded mode is to replace the drive in the >array (if it's not
>>>> actually faulty then you can add it back, otherwise you need to add a
>>>> new drive).
>>>>>What were you thinking might happen otherwise?
>>>>
>>>>
>>>> I was thinking we can recover from this using re-sync or resize .After
>>>
>>> Theoretically you could shrink the array by one disk and then use that
>>> spare disk to resync the parity. But that is a lengthy process with a
>>> lot higher failure chance than resyncing to a new disk. Note that you
>>> also need to shrink the filesystem on the raid first adding even more
>>> stress and failure chance. So I really wouldn't recommend that.
>>>
>>>> running IO to degraded (RAID-5) /dev/md0, I am seeing an issue where
>>>> e2fsck reports inconsistent file system and corrects it. I am trying to
>>>> debug  to see if the issue is because of data not being written or
>>>> reading wrong data in degraded mode.
>>>>
>>>> I guess problem happening during the write. Reason is , after ran e2fsck
>>>> I don't see inconsistency any more.
>>>>
>>>> Regards,
>>>> Marri
>>>
>>> A degraded raid5 might get corrupted if your system crashes. If you
>>> are writing to one of the remaining disks then it also needs to update
>>> the parity block simultaneously. If it crashed between writing the
>>> data and the parity then the data block on the failed drive will
>>> appear changed. I'm not sure though if the raid will even assemble on
>>> its own in such a case though. It might just complain about not having
>>> enough in-sync disks.
>>>
>>> Apart from that there should never be any corruption unless one of
>>> your disks returns bad data on read.
>>>
>>> MfG
>>>        Goswin
>>>
>>> PS: This is not a bug in linux raid but a fundamental limitation of
>>> raid.
>>
>> You're forgetting the every horrid possibility of failed/corrupted
>> hardware.  I've had IO cards go bad due to a prior bug that let an
>> experimental 'debugging' option in the kernel write to random memory
>> locations in the rare case of an unusual error.  Not just the
>> occasional rare chance of a buffer being corrupted, but the actual
>> hardware going bad.  One of the cards could not even be recovered by
>> an attempt at software-flashing the firmware (it must have been too
>> far gone for the utility to recognize, and replacing it was the least
>> expensive route remaining).
>>
>> However in general I've seen hardware that's actually failing will
>> tend to do so with enough grace to either outright refuse to operate,
>> or operate with obvious and persistent symptoms.
>
> And how is that relevant to the raid-5 being degraded? If the hardware
> goes bad you just get errors no matter what.
>
> MfG
>        Goswin
>

It could be the reason the array degraded; but yes, if the hardware
fails your data is lost/at extreme risk regardless.



Thread overview: 7+ messages:
2009-12-16 20:20 RAID-5 degraded mode question Tirumala Reddy Marri
2009-12-16 21:12 ` Robin Hill
2009-12-16 21:29   ` Tirumala Reddy Marri
2009-12-21 12:41     ` Goswin von Brederlow
2009-12-22  6:51       ` Michael Evans
2009-12-22 13:37         ` Goswin von Brederlow
2009-12-22 21:59           ` Michael Evans
