From: Nate Dailey <nate.dailey@stratus.com>
To: Neil Brown <neilb@suse.de>, Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: linux-raid@vger.kernel.org, William.Kuzeja@stratus.com, xni@redhat.com
Subject: Re: [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error()
Date: Thu, 22 Oct 2015 18:37:04 -0400
Message-ID: <56296510.4030702@stratus.com>
In-Reply-To: <87r3kmziux.fsf@notabene.neil.brown.name>

The problem is that we aren't getting true write (medium) errors.

In this case we're testing device removals. The write errors happen because
the disk goes away. narrow_write_error() returns 1, the bitmap bit is
cleared, and then when the device is re-added the resync might not include
the sectors in that chunk. There's some luck involved: if other writes to
that chunk happen while the disk is removed, we're okay. Because of this,
the bug is easier to hit with smaller bitmap chunks.
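
To make the sequence concrete, here is a rough user-space model of it. This
is only a sketch, not the raid1.c code: every model_* name is made up for
illustration, the caller logic is heavily simplified, and it assumes that
recording the bad block still succeeds after the disk is gone, which matches
the return value of 1 described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only. All model_* names are hypothetical, not kernel APIs. */
struct model_rdev {
	bool present;       /* false once the disk has been pulled */
	bool badblocks_ok;  /* assumption: can a bad block still be recorded? */
};

/* Stand-in for submit_bio_wait(): 0 on success, negative on error. */
int model_write(const struct model_rdev *rdev)
{
	return rdev->present ? 0 : -5; /* -5 plays the role of -EIO */
}

/* Stand-in for rdev_set_badblocks(): 1 if the range was recorded. */
int model_set_badblocks(const struct model_rdev *rdev)
{
	return rdev->badblocks_ok ? 1 : 0;
}

/*
 * Same shape as the loop in narrow_write_error(): retry the failed range
 * block by block and return 1 if every failure could be handled.
 * force_fail selects the proposed behaviour (ok = 0 on any write error).
 */
int model_narrow_write_error(const struct model_rdev *rdev, int nblocks,
			     bool force_fail)
{
	int ok = 1;

	assert(nblocks > 0);
	for (int i = 0; i < nblocks; i++) {
		if (model_write(rdev) < 0) {
			if (force_fail) {
				ok = 0;
				model_set_badblocks(rdev);
			} else {
				ok = model_set_badblocks(rdev) && ok;
			}
		}
	}
	return ok;
}

/*
 * Simplified caller: a non-zero result is treated as "handled", so the
 * bitmap bit for the chunk gets cleared and a later resync may skip it.
 */
bool model_bitmap_bit_cleared(int narrow_ok)
{
	return narrow_ok != 0;
}
```

With a removed disk (present = false, badblocks_ok = true), the current
behaviour (force_fail = false) returns 1 and the model clears the bitmap
bit, while force_fail = true returns 0 and leaves the bit set.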




On 10/22/2015 05:36 PM, Neil Brown wrote:
> Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>
>> Neil Brown <neilb@suse.de> writes:
>>> Jes.Sorensen@redhat.com writes:
>>>
>>>> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>>>>
>>>> Hi,
>>>>
>>>> Bill Kuzeja reported a problem to me about data corruption when
>>>> repeatedly removing and re-adding devices in raid1 arrays. It showed
>>>> up to be caused by the return value of submit_bio_wait() being handled
>>>> incorrectly. Tracking this down is credit of Bill!
>>>>
>>>> Looks like commit 9e882242c6193ae6f416f2d8d8db0d9126bd996b changed the
>>>> return of submit_bio_wait() to return != 0 on error, whereas before it
>>>> returned 0 on error.
>>>>
>>>> This fix should be suitable for -stable as far back as 3.9
>>> 3.10?
>>>
>>> Thanks to both of you!
>>>
>>> I took the liberty of changing the patches a little so they are now:
>>>
>>> -               if (submit_bio_wait(WRITE, wbio) == 0)
>>> +               if (submit_bio_wait(WRITE, wbio) < 0)
>>>
>>> because when there is no explicit test I tend to expect a Bool but these
>>> values are not Bool.
>>>
>>> Patches are in my for-linus branch and will be forwarded sometime this
>>> week.
>>>
>>> This bug only causes a problem when bad-block logs are active, so
>>> hopefully it won't have caused too much corruption yet -- you would need
>>> to be using a newish mdadm.
>> Neil,
>>
>> An additional twist on this one - Nate ran more tests on this, but was
>> still able to hit data corruption. He suggests that it is a mistake to
>> set 'ok = rdev_set_badblocks()' and it should instead be set to 0 if
>> submit_bio_wait() fails. Like this:
>>
>> --- raid1.c
>> +++ raid1.c
>> @@ -2234,11 +2234,12 @@
>>   		bio_trim(wbio, sector - r1_bio->sector, sectors);
>>   		wbio->bi_sector += rdev->data_offset;
>>   		wbio->bi_bdev = rdev->bdev;
>>   		if (submit_bio_wait(WRITE, wbio) < 0) {
>>   			/* failure! */
>> -			ok = rdev_set_badblocks(rdev, sector,
>> -						sectors, 0)
>> -				&& ok;
>> +			ok = 0;
>> +			rdev_set_badblocks(rdev, sector,
>> +					   sectors, 0);
>> +		}
>>
>> Question is whether this change has any negative impact in case of a
>> real write failure?
>>
>> I have actual patches, I'll send as a reply to this one.
>>
> If we unconditionally set ok to 0 on a write error, then
> narrow_write_error() will return 0 and handle_write_finished() will call
> md_error() to kick the device out of the array.
>
> And given that we only call narrow_write_error() when we got a write
> error, we strongly expect at least one sector to give an error.
>
> So it seems to me that the net result of this patch is to make
> bad-block-lists completely ineffective.
>
> What sort of tests are you running, and what sort of corruption do you
> see?
>
> NeilBrown


Thread overview: 14+ messages
2015-10-20 16:09 [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error() Jes.Sorensen
2015-10-20 16:09 ` [PATCH 1/2] md/raid1: submit_bio_wait() returns 0 on success Jes.Sorensen
2015-10-20 16:09 ` [PATCH 2/2] md/raid10: " Jes.Sorensen
2015-10-20 20:29 ` [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error() Neil Brown
2015-10-20 23:12   ` Jes Sorensen
2015-10-22 15:59   ` Jes Sorensen
2015-10-22 16:01     ` [PATCH 1/2] md/raid1: Do not clear bitmap bit if submit_bio_wait() fails Jes.Sorensen
2015-10-22 16:01     ` [PATCH 2/2] md/raid10: " Jes.Sorensen
2015-10-22 21:36     ` [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error() Neil Brown
2015-10-22 22:37       ` Nate Dailey [this message]
2015-10-23  0:09         ` Neil Brown
2015-10-23 14:30           ` Nate Dailey
2015-10-23 18:02             ` Jes Sorensen
2015-10-24  5:31               ` Neil Brown
