linux-raid.vger.kernel.org archive mirror
* Are we forced to use bad blocks list?
@ 2014-07-31 14:31 Ethan Wilson
  2014-08-04  1:38 ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: Ethan Wilson @ 2014-07-31 14:31 UTC (permalink / raw)
  To: linux-raid

Dear MD developers,
it seems that with mdadm 3.3.1, if an array has bad blocks disabled
(e.g. "--update=no-bbl" was invoked) and we want to add a disk to that
array, e.g. a spare, that disk will be set up by mdadm with BBL enabled
during the --add operation.

There is apparently no "--add --no-bbl" option in mdadm, so it seems to
me that the BBL ends up forcibly enabled on that disk.

It is indeed possible to "--stop" the array and then "--assemble
--update=no-bbl" so as to clear the BBL flag on all disks, but this
requires stopping the array, which is often not possible on a production
system, and is not justified just for adding a spare.
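
For reference, the offline workaround described above could be scripted
roughly as follows. This is only a sketch: /dev/md0 and the member
partitions are placeholder names, and the run helper and the DRY_RUN
variable are made up for illustration so that nothing touches a real
array unless DRY_RUN=0 is set. Note that, per the mdadm man page,
--update=no-bbl fails if a bad block list already contains entries.

```shell
#!/bin/sh
# Sketch of the stop / re-assemble workaround for clearing the BBL flag.
# MD_DEV and MEMBERS are placeholders; adjust them to the real array.
MD_DEV="${MD_DEV:-/dev/md0}"
MEMBERS="${MEMBERS:-/dev/sda1 /dev/sdb1}"
# Hypothetical safety switch: with DRY_RUN=1 (the default) commands are
# only printed, never executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

clear_bbl() {
    # The array must be stopped first: this is the downtime the message
    # complains about.
    run mdadm --stop "$MD_DEV"
    # Re-assemble while dropping the bad block list reservation on every
    # member. $MEMBERS is left unquoted on purpose so it splits into one
    # argument per device.
    run mdadm --assemble "$MD_DEV" --update=no-bbl $MEMBERS
}

clear_bbl
```

Run as is, the script only prints the two mdadm commands; running it
with DRY_RUN=0 as root would actually stop and re-assemble the array.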

Can I add a "feature request" to have BBL optional, and/or to default 
BBL presence/absence so that it conforms to the presence/absence of BBLs 
in the other disks of the array which is already running?

The same problem probably happens when the mdadm monitor daemon moves
spares within a spare-group: it should probably check whether the
receiving array is configured for BBL or not, and add a spare of the
same type.

Thank you
EW


* Re: Are we forced to use bad blocks list?
  2014-07-31 14:31 Are we forced to use bad blocks list? Ethan Wilson
@ 2014-08-04  1:38 ` NeilBrown
  2014-08-04 12:37   ` Ethan Wilson
  0 siblings, 1 reply; 4+ messages in thread
From: NeilBrown @ 2014-08-04  1:38 UTC (permalink / raw)
  To: Ethan Wilson; +Cc: linux-raid

On Thu, 31 Jul 2014 16:31:28 +0200 Ethan Wilson <ethan.wilson@shiftmail.org>
wrote:

> Dear MD developers,
> it seems that with mdadm 3.3.1, if an array has bad blocks disabled
> (e.g. "--update=no-bbl" was invoked) and we want to add a disk to that
> array, e.g. a spare, that disk will be set up by mdadm with BBL enabled
> during the --add operation.
> 
> There is apparently no "--add --no-bbl" option in mdadm, so it seems to
> me that the BBL ends up forcibly enabled on that disk.
> 
> It is indeed possible to "--stop" the array and then "--assemble
> --update=no-bbl" so as to clear the BBL flag on all disks, but this
> requires stopping the array, which is often not possible on a production
> system, and is not justified just for adding a spare.
> 
> Can I add a "feature request" to have BBL optional, and/or to default 
> BBL presence/absence so that it conforms to the presence/absence of BBLs 
> in the other disks of the array which is already running?
> 
> The same problem probably happens when the mdadm monitor daemon moves
> spares within a spare-group: it should probably check whether the
> receiving array is configured for BBL or not, and add a spare of the
> same type.
> 

Why don't you want bad-block-lists?

I'm not necessarily against having some way to avoid getting them
automatically ... possibly a 'policy' option in mdadm.conf.
But I'd like to make sure I understand all of your thinking first.

Thanks,
NeilBrown


* Re: Are we forced to use bad blocks list?
  2014-08-04  1:38 ` NeilBrown
@ 2014-08-04 12:37   ` Ethan Wilson
  2014-08-07  2:27     ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: Ethan Wilson @ 2014-08-04 12:37 UTC (permalink / raw)
  To: linux-raid

On 04/08/2014 03:38, NeilBrown wrote:
> On Thu, 31 Jul 2014 16:31:28 +0200 Ethan Wilson<ethan.wilson@shiftmail.org>
> wrote:
>
>> Dear MD developers,
>> it seems that with mdadm 3.3.1, if an array has bad blocks disabled
>> ....
>> array is configured for BBL or not, and add a spare of the same type.
>>
> Why don't you want bad-block-lists?
>
> I'm not necessarily against having some way to avoid getting them
> automatically ... possibly a 'policy' option in mdadm.conf.
> But I'd like to make sure I understand all of your thinking first.
>
> Thanks,
> NeilBrown

Hello Neil,

Well... on the ML, I think we have seen the bad blocks code triggered only
once, in the recent thread from Pedro Teixeira.

It seemed to me that his error condition could indicate that there might 
be a bug in the bad blocks code. It's not clear to me how those zillions 
of bad sectors could have been stored without some bug such as an 
erroneous propagation of bad blocks, or erroneous handling of degraded
mode (he said he operated with a doubly degraded raid6 after 3 disks 
dropped out).

Additionally, when he did fsck, that should have cleared the bad blocks 
which were being written over, but he said that
"When doing a fsck.ext4 of /dev/md0 it returns the following ( and I can 
do it over and over again with the exact same errors) ..... "
I think 'exact same errors' is not supposed to happen if I understand 
the intent of BBL correctly.

So, I can't be sure, but I have the feeling that there may still be a
few bugs in the BBL code. MD RAID in general is very stable and I like
it very much, but on production systems I'd prefer to keep the BBL
disabled for a while longer, if possible.

Thanks,
EW


* Re: Are we forced to use bad blocks list?
  2014-08-04 12:37   ` Ethan Wilson
@ 2014-08-07  2:27     ` NeilBrown
  0 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2014-08-07  2:27 UTC (permalink / raw)
  To: Ethan Wilson; +Cc: linux-raid

On Mon, 04 Aug 2014 14:37:59 +0200 Ethan Wilson <ethan.wilson@shiftmail.org>
wrote:

> On 04/08/2014 03:38, NeilBrown wrote:
> > On Thu, 31 Jul 2014 16:31:28 +0200 Ethan Wilson<ethan.wilson@shiftmail.org>
> > wrote:
> >
> >> Dear MD developers,
> >> it seems that with mdadm 3.3.1, if an array has bad blocks disabled
> >> ....
> >> array is configured for BBL or not, and add a spare of the same type.
> >>
> > Why don't you want bad-block-lists?
> >
> > I'm not necessarily against having some way to avoid getting them
> > automatically ... possibly a 'policy' option in mdadm.conf.
> > But I'd like to make sure I understand all of your thinking first.
> >
> > Thanks,
> > NeilBrown
> 
> Hello Neil,
> 
> Well... on the ML, I think we have seen the bad blocks code triggered only
> once, in the recent thread from Pedro Teixeira.
> 
> It seemed to me that his error condition could indicate that there might 
> be a bug in the bad blocks code. It's not clear to me how those zillions 
> of bad sectors could have been stored without some bug such as an 
> erroneous propagation of bad blocks, or erroneous handling of degraded
> mode (he said he operated with a doubly degraded raid6 after 3 disks 
> dropped out).
> 
> Additionally, when he did fsck, that should have cleared the bad blocks 
> which were being written over, but he said that
> "When doing a fsck.ext4 of /dev/md0 it returns the following ( and I can 
> do it over and over again with the exact same errors) ..... "
> I think 'exact same errors' is not supposed to happen if I understand 
> the intent of BBL correctly.
> 
> So, I can't be sure, but I have the feeling that there may still be a
> few bugs in the BBL code. MD RAID in general is very stable and I like
> it very much, but on production systems I'd prefer to keep the BBL
> disabled for a while longer, if possible.
> 

Fair enough.  Thanks for the explanation.

http://git.neil.brown.name/?p=mdadm.git;a=commitdiff;h=e2efe9e7bc73307f74a4c2e2197d6d4498dd46f0

will be in mdadm-3.3.2.

NeilBrown

