linux-raid.vger.kernel.org archive mirror
From: Shaohua Li <shli@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: Guoqing Jiang <gqjiang@suse.com>, linux-raid@vger.kernel.org
Subject: Re: [PATCH V3 08/13] md: set MD_CHANGE_PENDING in a spinlocked region
Date: Fri, 29 Apr 2016 11:21:37 -0700
Message-ID: <20160429182137.GA103471@kernel.org>
In-Reply-To: <87eg9purw7.fsf@notabene.neil.brown.name>

On Fri, Apr 29, 2016 at 11:26:32AM +1000, NeilBrown wrote:
> On Thu, Apr 28 2016, Shaohua Li wrote:
> 
> > On Wed, Apr 27, 2016 at 10:55:43PM -0400, Guoqing Jiang wrote:
> >> 
> >> 
> >> On 04/27/2016 11:27 AM, Shaohua Li wrote:
> >> >On Tue, Apr 26, 2016 at 09:56:26PM -0400, Guoqing Jiang wrote:
> >> >>Some code waits for a metadata update by:
> >> >>
> >> >>1. flagging that it is needed (MD_CHANGE_DEVS or MD_CHANGE_CLEAN)
> >> >>2. setting MD_CHANGE_PENDING and waking the management thread
> >> >>3. waiting for MD_CHANGE_PENDING to be cleared
> >> >>
> >> >>If the first two are done without locking, the code in md_update_sb()
> >> >>which checks if it needs to repeat might test if an update is needed
> >> >>before step 1, then clear MD_CHANGE_PENDING after step 2, resulting
> >> >>in the wait returning early.
> >> >>
> >> >>So make sure all places that set MD_CHANGE_PENDING are protected by
> >> >>mddev->lock.
> >> >>
> >> >>Reviewed-by: NeilBrown <neilb@suse.com>
> >> >>Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
> >> >>---
> >> >>V3 changes:
> >> >>1. use spin_lock_irqsave/spin_unlock_irqrestore in error funcs and
> >> >>    raid10's __make_request
> >> >Shouldn't other places use spin_lock_irq/spin_unlock_irq? An interrupt can
> >> >occur after you do spin_lock(), and if it calls md_error, we deadlock.
> >> 
> >> It could be possible in theory if a function was interrupted by md_error
> >> after it called spin_lock, but lots of places in md.c also use
> >> spin_lock/unlock for mddev->lock. Take md_do_sync and md_update_sb as
> >> examples: both of them used spin_lock(&mddev->lock) and
> >> spin_unlock(&mddev->lock) before.
> >> 
> >> So I guess it will not cause trouble; otherwise, we need to change all
> >> the usages of spin_lock/unlock(&mddev->lock), or introduce a new lock for
> >> this scenario. I am not sure which one is more acceptable.
> >
> > It didn't cause trouble before, because no interrupt/bh context used the
> > lock. But now we use it in softirq; that's the difference. Please enable
> > lockdep, I think it will complain. Either we change all the locking to
> > irqsave or we introduce a new lock. Either is ok.
> 
> Thanks for catching this!
> 
> As you say, the straightforward solution is to change all the current
>   spin_lock(&mddev->lock);
> to
>   spin_lock_irq(&mddev->lock);
> 
> and similar for unlock.  Except where it can be called from interrupts
> we have to use spin_lock_irqsave().
> 
> There is another option that occurs to me.  Not sure if it is elegant or
> ugly, so I'm keen to see what you think.
> 
> In the places where we set MD_CHANGE_PENDING and one other bit -
> either MD_CHANGE_DEVS or MD_CHANGE_CLEAN - we could use set_mask_bits
> to set them both atomically.
> 
>   set_mask_bits(&mddev->flags, 0, BIT(MD_CHANGE_PENDING) | BIT(MD_CHANGE_DEVS));
> 
> Then in md_update_sb, when deciding whether to loop back to "repeat:" we
> use a new "bit_clear_unless".
> 
> #define bit_clear_unless(ptr, _clear, _test)			\
> ({								\
> 	const typeof(*ptr) clear = (_clear), test = (_test);	\
> 	typeof(*ptr) old, new;					\
> 								\
> 	do {							\
> 		old = ACCESS_ONCE(*ptr);			\
> 		new = old & ~clear;				\
> 	} while (!(old & test) &&				\
> 		 cmpxchg(ptr, old, new) != old);		\
> 								\
> 	!(old & test);						\
> })
> 
> The code in md_update_sb() would be
> 
>  if (mddev->in_sync != sync_req ||
>      !bit_clear_unless(&mddev->flags, BIT(MD_CHANGE_PENDING),
>                        BIT(MD_CHANGE_DEVS)|BIT(MD_CHANGE_CLEAN)))
>         goto repeat;
> 
> 
> So if either DEVS or CLEAN were set, PENDING would not be cleared and the
> code would goto repeat.
> 
> What do you think?

Looks great, I like this idea. Thanks!
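
For anyone following along, here is a minimal userspace sketch of the
semantics proposed above. It assumes C11 atomics as a stand-in for the
kernel's cmpxchg() and ACCESS_ONCE(). The names set_bits(), clear_unless(),
example() and the CHANGE_* bit numbers are hypothetical and chosen only for
illustration; this is not the md code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers standing in for the MD_CHANGE_* flags. */
enum { CHANGE_DEVS = 0, CHANGE_CLEAN = 1, CHANGE_PENDING = 2 };
#define BIT(n) (1UL << (n))

/* Requester side: set several bits in one atomic operation. */
static void set_bits(_Atomic unsigned long *flags, unsigned long bits)
{
	atomic_fetch_or(flags, bits);
}

/*
 * Updater side: clear 'clear' in *flags unless any bit in 'test' is set.
 * Returns true if the bits were cleared, false if a 'test' bit blocked it.
 */
static bool clear_unless(_Atomic unsigned long *flags,
			 unsigned long clear, unsigned long test)
{
	unsigned long old = atomic_load(flags);

	do {
		if (old & test)
			return false;	/* an update is still needed */
		/* On failure the CAS reloads 'old', so the test is redone. */
	} while (!atomic_compare_exchange_weak(flags, &old, old & ~clear));

	return true;
}

static void example(void)
{
	_Atomic unsigned long flags = 0;
	const unsigned long need = BIT(CHANGE_DEVS) | BIT(CHANGE_CLEAN);

	/* Steps 1 and 2 of the patch description happen as one atomic step. */
	set_bits(&flags, BIT(CHANGE_DEVS) | BIT(CHANGE_PENDING));

	/* DEVS is still set, so PENDING survives: the "goto repeat" case. */
	assert(!clear_unless(&flags, BIT(CHANGE_PENDING), need));
	assert(atomic_load(&flags) & BIT(CHANGE_PENDING));

	/* Once the update has consumed DEVS, PENDING can be cleared. */
	atomic_fetch_and(&flags, ~BIT(CHANGE_DEVS));
	assert(clear_unless(&flags, BIT(CHANGE_PENDING), need));
	assert(atomic_load(&flags) == 0);
}
```

Since the test of DEVS/CLEAN and the clear of PENDING are a single
compare-and-swap, a requester cannot slip in between them, which is why
this approach would not need mddev->lock on the setting side.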
