From: Nate Dailey <nate.dailey@stratus.com>
To: Neil Brown <neilb@suse.de>, Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: linux-raid@vger.kernel.org, William.Kuzeja@stratus.com, xni@redhat.com
Subject: Re: [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error()
Date: Fri, 23 Oct 2015 10:30:13 -0400
Message-ID: <562A4475.1000904@stratus.com>
In-Reply-To: <87d1w6zbrv.fsf@notabene.neil.brown.name>

Thank you!

I confirmed that this patch prevents the bug.

Nate

On 10/22/2015 08:09 PM, Neil Brown wrote:
> Nate Dailey <nate.dailey@stratus.com> writes:
>
>> The problem is that we aren't getting true write (medium) errors.
>>
>> In this case we're testing device removals. The write errors happen because the
>> disk goes away. Narrow_write_error returns 1, the bitmap bit is cleared, and
>> then when the device is re-added the resync might not include the sectors in
>> that chunk (there's some luck involved; if other writes to that chunk happen
>> while the disk is removed, we're okay--bug is easier to hit with smaller bitmap
>> chunks because of this).
>>
>>
> OK, that makes sense.
>
> The device removal will be noticed when the bad block log is written
> out.
> When a bad-block is recorded we make sure to write that out promptly
> before bio_endio() gets called. But not before close_write() has called
> bitmap_end_write().
>
> So I guess we need to delay the close_write() call until the
> bad-block-log has been written.
>
> I think this patch should do it. Can you test?
>
> Thanks,
> NeilBrown
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index c1ad0b075807..1a1c5160c930 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -2269,8 +2269,6 @@ static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
>  			rdev_dec_pending(conf->mirrors[m].rdev,
>  					 conf->mddev);
>  		}
> -	if (test_bit(R1BIO_WriteError, &r1_bio->state))
> -		close_write(r1_bio);
>  	if (fail) {
>  		spin_lock_irq(&conf->device_lock);
>  		list_add(&r1_bio->retry_list, &conf->bio_end_io_list);
> @@ -2396,6 +2394,9 @@ static void raid1d(struct md_thread *thread)
>  			r1_bio = list_first_entry(&tmp, struct r1bio,
>  						  retry_list);
>  			list_del(&r1_bio->retry_list);
> +			if (mddev->degraded)
> +				set_bit(R1BIO_Degraded, &r1_bio->state);
> +			close_write(r1_bio);
>  			raid_end_bio_io(r1_bio);
>  		}
>  	}
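
Editor's illustration (not part of the original messages): the ordering issue discussed in the quoted text can be sketched with a small user-space toy model. This is not kernel code. Every name below (toy_array, toy_flush_badblock_log, toy_close_write, toy_run) is hypothetical, and the model assumes only what the thread describes: the device removal is noticed when the bad-block log is written out, and the write-intent bit is left set only if the array is already known to be degraded when the write is closed.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_array {
	bool device_present;	/* false once the disk has been pulled */
	bool degraded;		/* set when the missing device is noticed */
	bool bitmap_dirty;	/* write-intent bit for the chunk being written */
};

/* Stand-in for writing out the bad-block log.
 * Assumption: this is the point where a removed device is detected. */
static bool toy_flush_badblock_log(struct toy_array *a)
{
	if (!a->device_present) {
		a->degraded = true;
		return false;
	}
	return true;
}

/* Stand-in for close_write() ending the bitmap write.
 * Assumption: the bit is cleared only if the array is not degraded. */
static void toy_close_write(struct toy_array *a)
{
	if (!a->degraded)
		a->bitmap_dirty = false;
}

/* Run one failed write against a removed device and report whether the
 * chunk is still marked dirty, i.e. whether a later resync would cover it. */
static bool toy_run(bool close_before_flush)
{
	struct toy_array a = {
		.device_present = false,
		.degraded = false,
		.bitmap_dirty = true,
	};

	if (close_before_flush) {
		/* Old ordering: bit is cleared before the removal is noticed. */
		toy_close_write(&a);
		toy_flush_badblock_log(&a);
	} else {
		/* Ordering proposed in the patch: flush first, then close. */
		toy_flush_badblock_log(&a);
		toy_close_write(&a);
	}
	return a.bitmap_dirty;
}
```

In this model, toy_run(true) returns false (the bit was cleared, so the chunk could be skipped on re-add), while toy_run(false) returns true (the bit survives and the chunk would be resynced). The real code paths involve more state than this sketch captures.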
Thread overview: 14+ messages
2015-10-20 16:09 [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error() Jes.Sorensen
2015-10-20 16:09 ` [PATCH 1/2] md/raid1: submit_bio_wait() returns 0 on success Jes.Sorensen
2015-10-20 16:09 ` [PATCH 2/2] md/raid10: " Jes.Sorensen
2015-10-20 20:29 ` [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error() Neil Brown
2015-10-20 23:12 ` Jes Sorensen
2015-10-22 15:59 ` Jes Sorensen
2015-10-22 16:01 ` [PATCH 1/2] md/raid1: Do not clear bitmap bit if submit_bio_wait() fails Jes.Sorensen
2015-10-22 16:01 ` [PATCH 2/2] md/raid10: " Jes.Sorensen
2015-10-22 21:36 ` [PATCH 0/2] raid1/10: Handle write errors correctly in narrow_write_error() Neil Brown
2015-10-22 22:37 ` Nate Dailey
2015-10-23 0:09 ` Neil Brown
2015-10-23 14:30 ` Nate Dailey [this message]
2015-10-23 18:02 ` Jes Sorensen
2015-10-24 5:31 ` Neil Brown