RAID 5 recovery to not degrade device on bad block

public inbox for linux-raid@vger.kernel.org

* RAID 5 recovery to not degrade device on bad block
@ 2009-08-23  8:16 Anshuman Aggarwal
  2009-08-24 12:54 ` Goswin von Brederlow
  0 siblings, 1 reply; 7+ messages in thread
From: Anshuman Aggarwal @ 2009-08-23  8:16 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Here is a simple feature request which I assume would not require much
of a logic change for kernel devs familiar with the code.

Essentially, if I understand correctly, the kernel raid code will try
to let the drive fix a bad sector and otherwise fail the device and
degrade the array.
However, if an array is already degraded then this behaviour can be
very limiting because typically you are in recovery mode and want to
get as much data out to your new disk as you can.

I would say that for an already degraded array, a single bad block
should *NOT* by default fail the whole array...instead just log the
bad blocks to the syslog and let the admin take care of it.

Right now, the big benefit of RAID5 is being affected.

Ideally, I'd like to see the bad block device handler from Neil's road
map implemented (I have often thought of tinkering with the block device
code in the kernel to do just that)...but till then, a simple check
that an array is degraded before failing a device, which would render
the whole array inoperable, should suffice? This could throw big errors
in the syslog but at least a 2 TB MD array won't be down because
of one 512-byte sector?
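
The requested policy could be sketched roughly as follows. This is a hypothetical model in Python, not md driver code: `Array`, `on_read_error`, and the returned action strings are all invented for illustration, and it assumes an array stops working once it loses more members than its redundancy allows (one for RAID 5).

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """Toy model of an md array; every name here is illustrative."""
    max_failures: int                       # 1 for RAID 5
    failed: set = field(default_factory=set)
    bad_block_log: list = field(default_factory=list)

    def on_read_error(self, member: int, sector: int) -> str:
        """Decide what to do when a member reports an unrecoverable sector."""
        would_kill_array = len(self.failed | {member}) > self.max_failures
        if would_kill_array:
            # Failing this member would take the whole array down, so
            # record the bad sector and report an I/O error instead.
            self.bad_block_log.append((member, sector))
            return "log-and-return-io-error"
        # Redundancy is still available: fail the member as happens today.
        self.failed.add(member)
        return "fail-member"


raid5 = Array(max_failures=1)
first = raid5.on_read_error(member=0, sector=1000)   # healthy array
second = raid5.on_read_error(member=1, sector=2048)  # already degraded
print(first, second, raid5.bad_block_log)
```

In this model the first error degrades the array as usual, while the second is only logged, so the array stays up and the admin can deal with the recorded sector.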

Thanks,
Anshuman


* Re: RAID 5 recovery to not degrade device on bad block
  2009-08-23  8:16 RAID 5 recovery to not degrade device on bad block Anshuman Aggarwal
@ 2009-08-24 12:54 ` Goswin von Brederlow
  2009-08-24 14:39   ` Write intent bitmaps Simon Jackson
  0 siblings, 1 reply; 7+ messages in thread
From: Goswin von Brederlow @ 2009-08-24 12:54 UTC (permalink / raw)
  To: Anshuman Aggarwal; +Cc: NeilBrown, linux-raid

Anshuman Aggarwal <anshuman.aggarwal@gmail.com> writes:

> Here is a simple feature request which I assume would not be much
> logic change for kernel devs familiar with the code.
>
> Essentially, if I understand correctly, the kernel raid code will try
> to let the drive fix a bad sector and otherwise fail the device and
> degrade the array.
> However, if an array is already degraded then this behaviour can be
> very limiting because typically you are in recovery mode and want to
> get as much data out to your new disk as you can.
>
> I would say that for an already degraded array, a single bad block
> should *NOT* by default fail the whole array...instead just log the
> bad blocks to the syslog and let the admin take care of it.

Big problem there.

As long as the raid is degraded, a bad block can be reported to the
system as an I/O error.

But consider what happens when you resync the drive and don't stop on
a bad block. The block on the new drive corresponding to the bad block
cannot be initialized correctly. But a later read of the bad block would
trigger the block to be recomputed from the remaining disks. Instead
of an I/O error you would get invalid data.

What would be needed is the ability to mark individual blocks as bad.
Even with bitmap support, each bit covers too large an area.
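
The failure mode described above can be illustrated with a toy model. This is only a sketch, assuming a 3-disk RAID 5 stripe with byte-wise XOR parity and assuming the resync writes zeros in place of the block it could not read; the names (`xor_blocks`, `d0`, `d1`, `parity`) are made up for illustration and nothing here is taken from the md driver.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (the RAID 5 parity operation)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


# One stripe of a healthy 3-disk array: two data blocks plus parity.
d0 = b"AAAA"
d1 = b"BBBB"
parity = xor_blocks(d0, d1)

# Disk 0 has failed and is being rebuilt onto a new drive.
# Normal case: d1 and parity are readable, so d0 is recovered exactly.
rebuilt_ok = xor_blocks(d1, parity)

# Bad-block case: d1 cannot be read during the resync. If the resync
# carries on anyway it has to write *something* to the new drive;
# zeros stand in for the unreadable block here (an assumption).
placeholder = bytes(len(d1))
rebuilt_bad = xor_blocks(placeholder, parity)

# Later the bad sector on disk 1 is read again and recomputed from the
# other members, which now include the wrongly initialised new block.
d1_recomputed = xor_blocks(rebuilt_bad, parity)

print(rebuilt_ok == d0)      # normal rebuild recovers the data
print(rebuilt_bad == d0)     # new drive holds wrong data
print(d1_recomputed == d1)   # reader gets wrong data, not an I/O error
```

The last comparison is the point of the objection: once the resync has completed, nothing records that the stripe was built from an unreadable block, so the recomputed data is returned as if it were valid.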

> Right now, the big benefit of RAID5 is being affected.
>
> Ideally, I'd like to see the bad block device handler from Neil's road
> map implemented (I have often thought of tinkering with the block device
> code in the kernel to do just that)...but till then, a simple check
> that an array is degraded before failing a device, which would render
> the whole array inoperable, should suffice? This could throw big errors
> in the syslog but at least a 2 TB MD array won't be down because
> of one 512-byte sector?
>
> Thanks,
> Anshuman

MfG
        Goswin


* Write intent bitmaps.
  2009-08-24 12:54 ` Goswin von Brederlow
@ 2009-08-24 14:39   ` Simon Jackson
       [not found]     ` <ABFC24E4C13D81489F7F624E14891C860D1F15EF@uk-ex-mbx1.terastack.bluearc .com>
  0 siblings, 1 reply; 7+ messages in thread
From: Simon Jackson @ 2009-08-24 14:39 UTC (permalink / raw)
  To: linux-raid@vger.kernel.org



I am trying to use write intent bitmaps on some RAID 1 volumes to reduce the rebuild times in the event of hard resets that cause the md driver to kick members out of my arrays.

I ran mdadm --grow /dev/md0 --bitmap=internal and this appeared to succeed, but when I tried to examine the bitmap I got an error.


:~$ sudo mdadm --grow /dev/md0 --bitmap=internal
:~$ sudo mdadm -X /dev/md0
        Filename : /dev/md0
           Magic : 00000000
mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
         Version : 0
mdadm: unknown bitmap version 0, either the bitmap file is corrupted or you need to upgrade your tools

cat /proc/mdstat
Personalities : [raid1] 
      
md0 : active raid1 sda5[0] sdb5[1]
      7823552 blocks [2/2] [UU]
      bitmap: 29/239 pages [116KB], 16KB chunk
      
unused devices: <none>
sjackson@mercuryst5:~$ sudo mdadm -X /dev/md0
        Filename : /dev/md0
           Magic : 00000000
mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
         Version : 0
mdadm: unknown bitmap version 0, either the bitmap file is corrupted or you need to upgrade your tools

Do I really have a usable bitmap on the device in this case?

Thanks for any input.  
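
As a cross-check, the `bitmap:` line in the /proc/mdstat output above is consistent with an active in-memory bitmap. The arithmetic below is a sketch that assumes mdstat reports the array size in 1 KiB blocks, that the kernel keeps one 16-bit counter per bitmap chunk in memory, and that pages are 4096 bytes; these assumptions are not stated in the thread, and the variable names are illustrative.

```python
import math

array_kib = 7823552     # "7823552 blocks" from /proc/mdstat (assumed 1 KiB each)
chunk_kib = 16          # "16KB chunk"
page_bytes = 4096       # assumed page size
counter_bytes = 2       # assumed 16-bit in-memory counter per chunk

# Number of bitmap chunks needed to cover the array.
chunks = math.ceil(array_kib / chunk_kib)

# Pages needed if every chunk's counter were allocated.
counters_per_page = page_bytes // counter_bytes
total_pages = math.ceil(chunks / counters_per_page)

# Memory used by the pages that are currently allocated.
allocated_pages = 29    # "29/239 pages"
allocated_kib = allocated_pages * page_bytes // 1024

print(chunks, total_pages, allocated_kib)
```

Under these assumptions the totals come out to 239 pages and 116 KiB, matching the `29/239 pages [116KB]` figures that mdstat reports.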




[parent not found: <ABFC24E4C13D81489F7F624E14891C860D1F15EF@uk-ex-mbx1.terastack.bluearc .com>]

* Re: Write intent bitmaps.
       [not found]     ` <ABFC24E4C13D81489F7F624E14891C860D1F15EF@uk-ex-mbx1.terastack.bluearc .com>
@ 2009-08-24 20:25       ` NeilBrown
  2009-09-02 16:10         ` Bill Davidsen
  0 siblings, 1 reply; 7+ messages in thread
From: NeilBrown @ 2009-08-24 20:25 UTC (permalink / raw)
  To: Simon Jackson; +Cc: linux-raid@vger.kernel.org

On Tue, August 25, 2009 12:39 am, Simon Jackson wrote:
>
>
> I am trying to use write intent bitmaps on some RAID 1 volumes to reduce
> the rebuild times in the event of hard resets that cause the md driver to
> kick members out of my arrays.
>
> I used the mdadm --grow /dev/md0 --bitmap=internal  and this appeared to
> succeed, but when I tried to examine the bitmap I get an error.
>
>
> :~$ sudo mdadm --grow /dev/md0 --bitmap=internal
> :~$ sudo mdadm -X /dev/md0
>         Filename : /dev/md0
>            Magic : 00000000
> mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
>          Version : 0
> mdadm: unknown bitmap version 0, either the bitmap file is corrupted or
> you need to upgrade your tools

Quoting from the man page:

       -X, --examine-bitmap
              Report information about a bitmap file.  The argument is either
              an external bitmap file or an array component in case of an
              internal bitmap.  Note that running this on an array device
              (e.g. /dev/md0) does not report the bitmap for that array.


Particularly read the last sentence.
Then try
   mdadm -X /dev/sda5

NeilBrown
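
To apply this to every member of an array, the component names can be taken from /proc/mdstat. A minimal sketch, assuming the mdstat line format shown in this thread (`md0 : active raid1 sda5[0] sdb5[1]`); the helper name `component_devices` is hypothetical, and the only mdadm invocation it relies on is `mdadm -X <component>` as suggested above.

```python
import re


def component_devices(mdstat_text: str, array: str) -> list[str]:
    """Return /dev paths of the members listed for `array` in mdstat text."""
    for line in mdstat_text.splitlines():
        if line.startswith(array + " :"):
            # Members are written as "sda5[0]": a name followed by a slot number.
            names = re.findall(r"(\S+)\[\d+\]", line)
            return ["/dev/" + name for name in names]
    return []


# Sample text copied from the report; on a live system read /proc/mdstat instead.
sample = """Personalities : [raid1]
md0 : active raid1 sda5[0] sdb5[1]
      7823552 blocks [2/2] [UU]
      bitmap: 29/239 pages [116KB], 16KB chunk
"""

for dev in component_devices(sample, "md0"):
    print("mdadm -X", dev)
```

The script only prints the commands; running them (with the necessary privileges) is left to the admin.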


>
> cat /proc/mdstat
> Personalities : [raid1]
>
> md0 : active raid1 sda5[0] sdb5[1]
>       7823552 blocks [2/2] [UU]
>       bitmap: 29/239 pages [116KB], 16KB chunk
>
> unused devices: <none>
> sjackson@mercuryst5:~$ sudo mdadm -X /dev/md0
>         Filename : /dev/md0
>            Magic : 00000000
> mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
>          Version : 0
> mdadm: unknown bitmap version 0, either the bitmap file is corrupted or
> you need to upgrade your tools
>
> Do I really have a usable bitmap on the device in this case?
>
> Thanks for any input.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



* Re: Write intent bitmaps.
  2009-08-24 20:25       ` NeilBrown
@ 2009-09-02 16:10         ` Bill Davidsen
  2009-09-02 16:28           ` Paul Clements
  0 siblings, 1 reply; 7+ messages in thread
From: Bill Davidsen @ 2009-09-02 16:10 UTC (permalink / raw)
  To: NeilBrown; +Cc: Simon Jackson, linux-raid@vger.kernel.org

NeilBrown wrote:
> On Tue, August 25, 2009 12:39 am, Simon Jackson wrote:
>   
>> I am trying to use write intent bitmaps on some RAID 1 volumes to reduce
>> the rebuild times in the event of hard resets that cause the md driver to
>> kick members out of my arrays.
>>
>> I used the mdadm --grow /dev/md0 --bitmap=internal  and this appeared to
>> succeed, but when I tried to examine the bitmap I get an error.
>>
>>
>> :~$ sudo mdadm --grow /dev/md0 --bitmap=internal
>> :~$ sudo mdadm -X /dev/md0
>>         Filename : /dev/md0
>>            Magic : 00000000
>> mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
>>          Version : 0
>> mdadm: unknown bitmap version 0, either the bitmap file is corrupted or
>> you need to upgrade your tools
>>     
>
> Quoting from the man page:
>
>        -X, --examine-bitmap
>               Report information about a bitmap file.  The argument is either
>               an external bitmap file or an array component in case of an
>               internal bitmap.  Note that running this on an array device
>               (e.g. /dev/md0) does not report the bitmap for that array.
>
>
> Particularly read the last sentence.
> Then try
>    mdadm -X /dev/sda5
>   

Well that's nice and clear, but raises the question "why not?" This 
would seem to be one of the most common things someone would do, to look 
at the bitmap for an array.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc

"Now we have another quarterback besides Kurt Warner telling us during postgame
interviews that he owes every great thing that happens to him on a football
field to his faith in Jesus. I knew there had to be a reason why the Almighty
included a mute button on my remote."
			-- Arthur Troyer on Tim Tebow (Sports Illustrated)



* Re: Write intent bitmaps.
  2009-09-02 16:10         ` Bill Davidsen
@ 2009-09-02 16:28           ` Paul Clements
  2009-09-02 17:36             ` Ryan Wagoner
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Clements @ 2009-09-02 16:28 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: NeilBrown, Simon Jackson, linux-raid@vger.kernel.org

Bill Davidsen wrote:
> NeilBrown wrote:
>> On Tue, August 25, 2009 12:39 am, Simon Jackson wrote:
>>  
>>> I am trying to use write intent bitmaps on some RAID 1 volumes to reduce
>>> the rebuild times in the event of hard resets that cause the md driver to
>>> kick members out of my arrays.
>>>
>>> I used the mdadm --grow /dev/md0 --bitmap=internal  and this appeared to
>>> succeed, but when I tried to examine the bitmap I get an error.
>>>
>>>
>>> :~$ sudo mdadm --grow /dev/md0 --bitmap=internal
>>> :~$ sudo mdadm -X /dev/md0
>>>         Filename : /dev/md0
>>>            Magic : 00000000
>>> mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
>>>          Version : 0
>>> mdadm: unknown bitmap version 0, either the bitmap file is corrupted or
>>> you need to upgrade your tools
>>>     
>>
>> Quoting from the man page:
>>
>>        -X, --examine-bitmap
>>               Report information about a bitmap file.  The argument is either
>>               an external bitmap file or an array component in case of an
>>               internal bitmap.  Note that running this on an array device
>>               (e.g. /dev/md0) does not report the bitmap for that array.
>>
>>
>> Particularly read the last sentence.
>> Then try
>>    mdadm -X /dev/sda5
>>   
> 
> Well that's nice and clear, but raises the question "why not?" This 
> would seem to be one of the most common things someone would do, to look 
> at the bitmap for an array.

Two reasons why not:

The examine code simply takes the device or file you give it and looks 
for a bitmap in that file or device. You'd have to do some hand-waving 
to "read the bitmap for /dev/md0". There actually is no bitmap on 
/dev/md0; there is a bitmap stored either in a file or on each of the 
component devices. So which version of the bitmap do you read? From the 
first, second, third ... component disk?

Also, mdadm's behavior would be ambiguous if you implemented the above. 
What if /dev/md0 is itself a component of another md device? Then how is 
mdadm to know which bitmap you want? The one that actually physically 
exists on md0, or the ones that the components of md0 contain?

Perhaps better would be to simply throw an error in this case?
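
Such a check might look like the following. This is a sketch in Python rather than mdadm's C, assuming that the kernel exposes an `md` directory under `/sys/block/<name>/` for md array devices; `is_md_array`, `check_examine_target`, and the `sysfs_root` parameter are hypothetical names, and the message text is only an example.

```python
import os


def is_md_array(device: str, sysfs_root: str = "/sys") -> bool:
    """True if `device` (e.g. /dev/md0) looks like an md array in sysfs."""
    name = os.path.basename(device)
    return os.path.isdir(os.path.join(sysfs_root, "block", name, "md"))


def check_examine_target(device: str, sysfs_root: str = "/sys") -> None:
    """Refuse to examine a bitmap when given an array device."""
    if is_md_array(device, sysfs_root):
        raise ValueError(
            f"{device} is an md array, not an array component; "
            "run --examine-bitmap on a component device or a bitmap file"
        )
```

Note that a check this simple would also reject the nested case mentioned above, where /dev/md0 is itself a component of another array, so a real implementation would need to handle that separately.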

--
Paul


* Re: Write intent bitmaps.
  2009-09-02 16:28           ` Paul Clements
@ 2009-09-02 17:36             ` Ryan Wagoner
  0 siblings, 0 replies; 7+ messages in thread
From: Ryan Wagoner @ 2009-09-02 17:36 UTC (permalink / raw)
  To: Paul Clements
  Cc: Bill Davidsen, NeilBrown, Simon Jackson,
	linux-raid@vger.kernel.org

You could just provide more information in the mdadm bitmap error message.

Current message:

mdadm: unknown bitmap version 0, either the bitmap file is corrupted
or you need to upgrade your tools

Suggested change:

mdadm: unknown bitmap version 0, either the bitmap file is corrupted,
you are looking at the array and not an array component, or
you need to upgrade your tools

Ryan

On Wed, Sep 2, 2009 at 12:28 PM, Paul Clements
<paul.clements@steeleye.com> wrote:
> Bill Davidsen wrote:
>>
>> NeilBrown wrote:
>>>
>>> On Tue, August 25, 2009 12:39 am, Simon Jackson wrote:
>>>
>>>>
>>>> I am trying to use write intent bitmaps on some RAID 1 volumes to reduce
>>>> the rebuild times in the event of hard resets that cause the md driver to
>>>> kick members out of my arrays.
>>>>
>>>> I used the mdadm --grow /dev/md0 --bitmap=internal  and this appeared to
>>>> succeed, but when I tried to examine the bitmap I get an error.
>>>>
>>>>
>>>> :~$ sudo mdadm --grow /dev/md0 --bitmap=internal
>>>> :~$ sudo mdadm -X /dev/md0
>>>>        Filename : /dev/md0
>>>>           Magic : 00000000
>>>> mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
>>>>         Version : 0
>>>> mdadm: unknown bitmap version 0, either the bitmap file is corrupted or
>>>> you need to upgrade your tools
>>>>
>>>
>>> Quoting from the man page:
>>>
>>>       -X, --examine-bitmap
>>>              Report information about a bitmap file.  The argument is either
>>>              an external bitmap file or an array component in case of an
>>>              internal bitmap.  Note that running this on an array device
>>>              (e.g. /dev/md0) does not report the bitmap for that array.
>>>
>>>
>>> Particularly read the last sentence.
>>> Then try
>>>   mdadm -X /dev/sda5
>>>
>>
>> Well that's nice and clear, but raises the question "why not?" This would
>> seem to be one of the most common things someone would do, to look at the
>> bitmap for an array.
>
> Two reasons why not:
>
> The examine code simply takes the device or file you give it and looks for a
> bitmap in that file or device. You'd have to do some hand-waving to "read
> the bitmap for /dev/md0". There actually is no bitmap on /dev/md0; there is
> a bitmap stored either in a file or on each of the component devices. So
> which version of the bitmap do you read? From the first, second, third ...
> component disk?
>
> Also, mdadm's behavior would be ambiguous if you implemented the above. What
> if /dev/md0 is itself a component of another md device? Then how is mdadm to
> know which bitmap you want? The one that actually physically exists on md0,
> or the ones that the components of md0 contain?
>
> Perhaps better would be to simply throw an error in this case?
>
> --
> Paul



Thread overview: 7+ messages
2009-08-23  8:16 RAID 5 recovery to not degrade device on bad block Anshuman Aggarwal
2009-08-24 12:54 ` Goswin von Brederlow
2009-08-24 14:39   ` Write intent bitmaps Simon Jackson
     [not found]     ` <ABFC24E4C13D81489F7F624E14891C860D1F15EF@uk-ex-mbx1.terastack.bluearc .com>
2009-08-24 20:25       ` NeilBrown
2009-09-02 16:10         ` Bill Davidsen
2009-09-02 16:28           ` Paul Clements
2009-09-02 17:36             ` Ryan Wagoner
