linux-raid.vger.kernel.org archive mirror
From: Neil Brown <neilb@suse.de>
To: Shaohua Li <shli@fb.com>
Cc: linux-raid@vger.kernel.org, Kernel-team@fb.com,
	songliubraving@fb.com, hch@infradead.org,
	dan.j.williams@intel.com
Subject: Re: [PATCH 1/2] md: clear CHANGE_PENDING in readonly array
Date: Wed, 30 Sep 2015 16:59:37 +1000
Message-ID: <87fv1wl7dy.fsf@notabene.neil.brown.name>
In-Reply-To: <20150924164720.GA3706482@devbig084.prn1.facebook.com>

Shaohua Li <shli@fb.com> writes:

> On Thu, Sep 24, 2015 at 02:03:53PM +1000, Neil Brown wrote:
>> Shaohua Li <shli@fb.com> writes:
>> 
>> > On Wed, Sep 23, 2015 at 04:05:33PM +1000, Neil Brown wrote:
>> >> Shaohua Li <shli@fb.com> writes:
>> >> 
>> >> > If an array has more faulty disks than the allowed degraded number, the
>> >> > array enters error handling. It will be marked as read-only with
>> >> > MD_CHANGE_PENDING/RECOVERY_NEEDED set. But currently recovery doesn't
>> >> > clear the CHANGE_PENDING bit for a read-only array.  If MD_CHANGE_PENDING
>> >> > is set for a raid5 array, all returned IO will be held on a list until
>> >> > the bit is cleared. Because recovery never clears this bit, the IO stays
>> >> > pending forever and never finishes. This has bad effects: the upper layer
>> >> > can't get an IO error and the array can't be stopped.
>> >> >
>> >> > Signed-off-by: Shaohua Li <shli@fb.com>
>> >> > ---
>> >> >  drivers/md/md.c | 1 +
>> >> >  1 file changed, 1 insertion(+)
>> >> >
>> >> > diff --git a/drivers/md/md.c b/drivers/md/md.c
>> >> > index 95824fb..c596b73 100644
>> >> > --- a/drivers/md/md.c
>> >> > +++ b/drivers/md/md.c
>> >> > @@ -8209,6 +8209,7 @@ void md_check_recovery(struct mddev *mddev)
>> >> >  			md_reap_sync_thread(mddev);
>> >> >  			clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
>> >> >  			clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
>> >> > +			clear_bit(MD_CHANGE_PENDING, &mddev->flags);
>> >> >  			goto unlock;
>> >> >  		}
>> >> >  
>> >> > -- 
>> >> > 1.8.1
>> >> 
>> >> Hi,
>> >>  I can see that clearing MD_CHANGE_PENDING there is probably correct -
>> >>  bug introduced by
>> >>    Commit: c3cce6cda162 ("md/raid5: ensure device failure recorded before write request returns.")
>> >> 
>> >>  However I don't understand your reasoning.  You say that the array is
>> >>  marked as read-only, but I don't see how that would happen.  What
>> >>  causes the array to be marked "read-only"?
>> >
>> > It's set read-only by mdadm. I didn't look carefully, but it looks like
>> > there is a disk failure event and mdadm is invoked automatically by some
>> > background daemon. It's an Ubuntu distribution.
>> 
>> Thanks.
>> This raises a couple of questions.
>> 
>> 1/ What should md_set_readonly do if it finds that MD_CHANGE_PENDING is
>>    set?
>>   Maybe it should wait for md_check_recovery to get run which should
>>   clear the bit, after probably writing out the metadata.
>> 
>> 2/ Why didn't md_check_recovery already do that before mdadm had a
>>    chance to set the array read-only?
>>   I guess that is just a timing thing.  md_check_recovery could be
>>   delayed, and mdadm could get called by udev rather quickly.
>> 
>> I think I'll get md_set_readonly to
>> 	wait_event(mddev->sb_wait,
>> 		   !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>> 
>> because I think that is the right thing to do.  But if the array is
>> already read-only that won't help, so I'll still need your patch.
>> 
>> Would you be able to test that the following patch (without your patch)
>> also fixes the symptom?
>
> Yes, the wait_event patch fixes the issue (without my patch).
>
> Thanks,
> Shaohua

Thanks for testing.  I've changed "Reported-by:" to
"Reported-and-tested-by:" :-)

NeilBrown
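
For readers following the thread, here is a minimal userspace sketch of the
hang described in the quoted patch description. It is only a model under
stated assumptions, not the kernel code: struct fake_array and all function
names are hypothetical, a plain bool stands in for MD_CHANGE_PENDING, and a
counter stands in for the list of held IO. It shows how completed IO stays
parked when the read-only recovery path leaves the flag set, and how it
drains once that path clears the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the array state; NOT the kernel's struct mddev. */
struct fake_array {
	bool read_only;
	bool change_pending;	/* plays the role of MD_CHANGE_PENDING */
	int held_io;		/* completed requests parked while flag is set */
	int returned_io;	/* requests handed back to the upper layer */
};

/* A completed request is parked while a metadata change is pending. */
static void complete_io(struct fake_array *a)
{
	assert(a != NULL);
	if (a->change_pending)
		a->held_io++;
	else
		a->returned_io++;
}

/* Parked requests are only released once the flag is clear. */
static void release_held_io(struct fake_array *a)
{
	if (!a->change_pending) {
		a->returned_io += a->held_io;
		a->held_io = 0;
	}
}

/*
 * Model of the read-only branch of the recovery check.
 * clear_pending == false models the behaviour before the patch,
 * clear_pending == true models the behaviour with the one-line fix.
 */
static void check_recovery_readonly(struct fake_array *a, bool clear_pending)
{
	if (!a->read_only)
		return;
	if (clear_pending)
		a->change_pending = false;
	release_held_io(a);
}

/* Returns how many requests are still parked after the recovery check. */
int stuck_io_after_recovery(bool patched)
{
	struct fake_array a = { .read_only = true, .change_pending = true };

	complete_io(&a);
	complete_io(&a);
	check_recovery_readonly(&a, patched);
	return a.held_io;
}
```

In this model, stuck_io_after_recovery(false) leaves both requests parked,
while stuck_io_after_recovery(true) releases them. The wait_event() approach
discussed above addresses the same flag from the other side, by delaying the
switch to read-only until the flag has been cleared.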

