From: Shaohua Li <shli@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: Guoqing Jiang <gqjiang@suse.com>, linux-raid@vger.kernel.org
Subject: Re: [PATCH V3 08/13] md: set MD_CHANGE_PENDING in a spinlocked region
Date: Fri, 29 Apr 2016 11:21:37 -0700
Message-ID: <20160429182137.GA103471@kernel.org>
In-Reply-To: <87eg9purw7.fsf@notabene.neil.brown.name>

On Fri, Apr 29, 2016 at 11:26:32AM +1000, NeilBrown wrote:
> On Thu, Apr 28 2016, Shaohua Li wrote:
>
> > On Wed, Apr 27, 2016 at 10:55:43PM -0400, Guoqing Jiang wrote:
> >>
> >>
> >> On 04/27/2016 11:27 AM, Shaohua Li wrote:
> >> >On Tue, Apr 26, 2016 at 09:56:26PM -0400, Guoqing Jiang wrote:
> >> >>Some code waits for a metadata update by:
> >> >>
> >> >>1. flagging that it is needed (MD_CHANGE_DEVS or MD_CHANGE_CLEAN)
> >> >>2. setting MD_CHANGE_PENDING and waking the management thread
> >> >>3. waiting for MD_CHANGE_PENDING to be cleared
> >> >>
> >> >>If the first two are done without locking, the code in md_update_sb()
> >> >>which checks if it needs to repeat might test if an update is needed
> >> >>before step 1, then clear MD_CHANGE_PENDING after step 2, resulting
> >> >>in the wait returning early.
> >> >>
> >> >>So make sure all places that set MD_CHANGE_PENDING are protected by
> >> >>mddev->lock.
> >> >>
> >> >>Reviewed-by: NeilBrown <neilb@suse.com>
> >> >>Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
> >> >>---
> >> >>V3 changes:
> >> >>1. use spin_lock_irqsave/spin_unlock_irqrestore in error funcs and
> >> >> raid10's __make_request
> >> >Shouldn't the other places use spin_lock_irq/spin_unlock_irq? An interrupt
> >> >can occur after you do spin_lock(), and if it calls md_error, we deadlock.
> >>
> >> It is possible in theory if the function is interrupted by md_error after
> >> it has called spin_lock, but lots of places in md.c also use
> >> spin_lock/unlock for mddev->lock. Take md_do_sync and md_update_sb as
> >> examples: both of them already used spin_lock(&mddev->lock) and
> >> spin_unlock(&mddev->lock) before.
> >>
> >> So I guess it will not cause trouble. Otherwise, we need to change all
> >> the usages of spin_lock/unlock(&mddev->lock), or introduce a new lock for
> >> this scenario. I am not sure which one is more acceptable.
> >
> > It didn't cause trouble before, because no interrupt/bh used the lock. But
> > now we use it in softirq; that's the difference. Please enable lockdep, I
> > think it will complain. Either we change all the locking to irqsave or we
> > introduce a new lock. Either is OK.
>
> Thanks for catching this!
>
> As you say, the straightforward solution is to change all the current
>     spin_lock(&mddev->lock);
> to
>     spin_lock_irq(&mddev->lock);
>
> and similarly for unlock. Except where it can be called from interrupts,
> where we have to use spin_lock_irqsave().
>
> There is another option that occurs to me. Not sure if it is elegant or
> ugly, so I'm keen to see what you think.
>
> In the places where we set MD_CHANGE_PENDING and one other bit -
> either MD_CHANGE_DEVS or MD_CHANGE_CLEAN - we could use set_mask_bits
> to set them both atomically.
>
> set_mask_bits(&mddev->flags, BIT(MD_CHANGE_PENDING) | BIT(MD_CHANGE_DEVS));
>
> Then in md_update_sb, when deciding whether to loop back to "repeat:" we
> use a new "bit_clear_unless".
>
> #define bit_clear_unless(ptr, _clear, _test) \
> ({ \
>         const typeof(*ptr) clear = (_clear), test = (_test); \
>         typeof(*ptr) old, new; \
>  \
>         do { \
>                 old = ACCESS_ONCE(*ptr); \
>                 new = old & ~clear; \
>         } while (!(old & test) && cmpxchg(ptr, old, new) != old); \
>  \
>         !(old & test); \
> })
>
> The code in md_update_sb() would be
>
>         if (mddev->in_sync != sync_req ||
>             !bit_clear_unless(&mddev->flags, BIT(MD_CHANGE_PENDING),
>                               BIT(MD_CHANGE_DEVS)|BIT(MD_CHANGE_CLEAN)))
>                 goto repeat;
>
>
> So if either DEVS or CLEAN were set, PENDING would not be cleared and the
> code would goto repeat.
>
> What do you think?

Looks great, I like this idea. Thanks!
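
For readers following the thread, the semantics of the quoted proposal can
be sketched in userspace with C11 atomics. This is only an illustration
under stated assumptions, not the kernel code: the CHANGE_* constants, the
plain flags word, and the helper request_update() are hypothetical
stand-ins for the MD_CHANGE_* bits, mddev->flags, and the set_mask_bits()
call, and atomic_compare_exchange_weak() stands in for cmpxchg().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MD_CHANGE_* bits (already shifted). */
enum {
        CHANGE_DEVS    = 1u << 0,
        CHANGE_CLEAN   = 1u << 1,
        CHANGE_PENDING = 1u << 2,
};

/*
 * Steps 1 and 2 of the waiter in a single atomic operation: flag what
 * needs updating and set PENDING together, so no observer can see one
 * bit without the other.
 */
static void request_update(atomic_ulong *flags, unsigned long what)
{
        atomic_fetch_or(flags, what | CHANGE_PENDING);
}

/*
 * Clear the bits in 'clear' unless any bit in 'test' is set.
 * Returns true if the clear happened, false if a 'test' bit was seen
 * (in which case the flags word is left untouched).
 */
static bool bit_clear_unless(atomic_ulong *flags, unsigned long clear,
                             unsigned long test)
{
        unsigned long old = atomic_load(flags);

        do {
                if (old & test)
                        return false;
                /* On failure, 'old' is reloaded with the current value. */
        } while (!atomic_compare_exchange_weak(flags, &old, old & ~clear));

        return true;
}
```

With this shape, an updater that calls bit_clear_unless(&flags,
CHANGE_PENDING, CHANGE_DEVS | CHANGE_CLEAN) and gets false back knows a new
request arrived and should repeat, which is the case the locking in the
original patch was meant to cover.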