From: Neil Brown <neilb@suse.de>
To: Shaohua Li <shli@fb.com>
Cc: linux-raid@vger.kernel.org, Kernel-team@fb.com,
songliubraving@fb.com, hch@infradead.org,
dan.j.williams@intel.com
Subject: Re: [PATCH 1/2] md: clear CHANGE_PENDING in readonly array
Date: Wed, 30 Sep 2015 16:59:37 +1000
Message-ID: <87fv1wl7dy.fsf@notabene.neil.brown.name>
In-Reply-To: <20150924164720.GA3706482@devbig084.prn1.facebook.com>
Shaohua Li <shli@fb.com> writes:
> On Thu, Sep 24, 2015 at 02:03:53PM +1000, Neil Brown wrote:
>> Shaohua Li <shli@fb.com> writes:
>>
>> > On Wed, Sep 23, 2015 at 04:05:33PM +1000, Neil Brown wrote:
>> >> Shaohua Li <shli@fb.com> writes:
>> >>
>> >> > If an array has more faulty disks than its allowed degraded count, the
>> >> > array enters error handling. It is marked read-only with
>> >> > MD_CHANGE_PENDING/RECOVERY_NEEDED set. But currently recovery doesn't
>> >> > clear the CHANGE_PENDING bit for a read-only array. If MD_CHANGE_PENDING
>> >> > is set for a raid5 array, all returned IO is held on a list until the
>> >> > bit is cleared. Since recovery never clears this bit, the IO stays
>> >> > pending forever and never finishes. This has bad effects: the upper
>> >> > layer can't get an IO error and the array can't be stopped.
>> >> >
>> >> > Signed-off-by: Shaohua Li <shli@fb.com>
>> >> > ---
>> >> > drivers/md/md.c | 1 +
>> >> > 1 file changed, 1 insertion(+)
>> >> >
>> >> > diff --git a/drivers/md/md.c b/drivers/md/md.c
>> >> > index 95824fb..c596b73 100644
>> >> > --- a/drivers/md/md.c
>> >> > +++ b/drivers/md/md.c
>> >> > @@ -8209,6 +8209,7 @@ void md_check_recovery(struct mddev *mddev)
>> >> >  			md_reap_sync_thread(mddev);
>> >> >  			clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
>> >> >  			clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
>> >> > +			clear_bit(MD_CHANGE_PENDING, &mddev->flags);
>> >> >  			goto unlock;
>> >> >  		}
>> >> >
>> >> > --
>> >> > 1.8.1
>> >>
>> >> Hi,
>> >> I can see that clearing MD_CHANGE_PENDING there is probably correct -
>> >> bug introduced by
>> >> Commit: c3cce6cda162 ("md/raid5: ensure device failure recorded before write request returns.")
>> >>
>> >> However I don't understand your reasoning. You say that the array is
>> >> marked as read-only, but I don't see how that would happen. What
>> >> causes the array to be marked "read-only"?
>> >
>> > It's set read-only by mdadm. I didn't look carefully, but it looks like
>> > there was a disk failure event and mdadm was invoked automatically by
>> > some background daemon. It's an Ubuntu distribution.
>>
>> Thanks.
>> This raises a couple of questions.
>>
>> 1/ What should md_set_readonly do if it finds that MD_CHANGE_PENDING is
>> set?
>> Maybe it should wait for md_check_recovery to get run which should
>> clear the bit, after probably writing out the metadata.
>>
>> 2/ Why didn't md_check_recovery already do that before mdadm had a
>> chance to set the array read-only?
>> I guess that is just a timing thing. md_check_recovery could be
>> delayed, and mdadm could get called by udev rather quickly.
>>
>> I think I'll get md_set_readonly to
>>     wait_event(mddev->sb_wait,
>>                !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>>
>> because I think that is the right thing to do. But if the array is
>> already read-only that won't help, so I'll still need your patch.
>>
>> Would you be able to test that the following patch (without your patch)
>> also fixes the symptom?
>
> Yes, the wait_event patch fixes the issue (without my patch).
>
> Thanks,
> Shaohua
Thanks for testing. I've changed "Reported-by:" to
"Reported-and-tested-by:" :-)
NeilBrown
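
For readers following the thread: the wait_event() idea quoted above
("wait until MD_CHANGE_PENDING is clear before going read-only") can be
modelled in user space. The sketch below is only an analogy, not the
kernel patch. It assumes a pthread condition variable standing in for
mddev->sb_wait, and every name in it (model_array, model_set_readonly,
model_metadata_writer, model_run, MODEL_CHANGE_PENDING) is hypothetical.
It builds with a C99 compiler and -pthread.

```c
/* User-space model only: this is NOT kernel code. All names here
 * are hypothetical stand-ins for the md structures in the thread. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MODEL_CHANGE_PENDING (1UL << 0) /* stands in for MD_CHANGE_PENDING */

struct model_array {
	pthread_mutex_t lock;
	pthread_cond_t sb_wait;  /* stands in for mddev->sb_wait */
	unsigned long flags;
	bool ro;
};

/* Plays the role of the metadata update finishing:
 * clear the pending bit and wake anyone waiting on it. */
static void *model_metadata_writer(void *arg)
{
	struct model_array *a = arg;

	pthread_mutex_lock(&a->lock);
	a->flags &= ~MODEL_CHANGE_PENDING;
	pthread_cond_broadcast(&a->sb_wait);
	pthread_mutex_unlock(&a->lock);
	return NULL;
}

/* Block until the pending bit is clear, then switch to read-only.
 * The loop re-checks the condition after every wakeup. */
static void model_set_readonly(struct model_array *a)
{
	pthread_mutex_lock(&a->lock);
	while (a->flags & MODEL_CHANGE_PENDING)
		pthread_cond_wait(&a->sb_wait, &a->lock);
	assert(!(a->flags & MODEL_CHANGE_PENDING));
	a->ro = true;
	pthread_mutex_unlock(&a->lock);
}

/* Returns 1 if the array ended up read-only with no pending change. */
static int model_run(void)
{
	struct model_array a;
	pthread_t writer;
	int ok;

	pthread_mutex_init(&a.lock, NULL);
	pthread_cond_init(&a.sb_wait, NULL);
	a.flags = MODEL_CHANGE_PENDING;
	a.ro = false;

	if (pthread_create(&writer, NULL, model_metadata_writer, &a) != 0)
		return 0;
	model_set_readonly(&a);
	pthread_join(writer, NULL);

	ok = a.ro && !(a.flags & MODEL_CHANGE_PENDING);
	pthread_cond_destroy(&a.sb_wait);
	pthread_mutex_destroy(&a.lock);
	return ok;
}
```

Calling model_run() from a main() exercises both orderings safely: if the
writer runs first, the waiter sees the bit already clear and does not
block. The model also shows why the wait alone is not enough once
nothing is left to clear the bit, which is the case the one-line
clear_bit() patch covers.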