public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, rcu <rcu@vger.kernel.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Neeraj Upadhyay <quic_neeraju@quicinc.com>,
	Boqun Feng <boqun.feng@gmail.com>
Subject: Re: [PATCH 1/4] rcu/nocb: Protect lazy shrinker against concurrent (de-)offloading
Date: Fri, 24 Mar 2023 00:55:23 +0000	[thread overview]
Message-ID: <20230324005523.GB723582@google.com> (raw)
In-Reply-To: <c614c542-f2b5-4b39-bbc4-ae5f0a125c81@paulmck-laptop>

On Wed, Mar 22, 2023 at 04:18:24PM -0700, Paul E. McKenney wrote:
> On Wed, Mar 22, 2023 at 08:44:53PM +0100, Frederic Weisbecker wrote:
> > The shrinker may run concurrently with callbacks (de-)offloading. As
> > such, calling rcu_nocb_lock() is very dangerous because it does a
> > conditional locking. The worst outcome is that rcu_nocb_lock() doesn't
> > lock but rcu_nocb_unlock() eventually unlocks, or the reverse, creating
> > an imbalance.
> > 
> > Fix this with protecting against (de-)offloading using the barrier mutex.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> 
> Good catch!!!  A few questions, comments, and speculations below.

Added a few more. ;)

> > ---
> >  kernel/rcu/tree_nocb.h | 17 ++++++++++++++++-
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index f2280616f9d5..dd9b655ae533 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -1336,13 +1336,25 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> >  	unsigned long flags;
> >  	unsigned long count = 0;
> >  
> > +	/*
> > +	 * Protect against concurrent (de-)offloading. Otherwise nocb locking
> > +	 * may be ignored or imbalanced.
> > +	 */
> > +	mutex_lock(&rcu_state.barrier_mutex);
> 
> I was worried about this possibly leading to out-of-memory deadlock,
> but if I recall correctly, the (de-)offloading process never allocates
> memory, so this should be OK?

Maybe trylock is better then? If we can't make progress, may be better to let
kswapd free memory by other means than blocking on the mutex.

ISTR, from my Android days that there are weird lockdep issues that happen
when locking in a shrinker (due to the 'fake lock' dependency added during
reclaim).

> The other concern was that the (de-)offloading operation might take a
> long time, but the usual cause for that is huge numbers of callbacks,
> in which case letting them free their memory is not necessarily a bad
> strategy.
> 
> > +
> >  	/* Snapshot count of all CPUs */
> >  	for_each_possible_cpu(cpu) {
> >  		struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> > -		int _count = READ_ONCE(rdp->lazy_len);
> > +		int _count;
> > +
> > +		if (!rcu_rdp_is_offloaded(rdp))
> > +			continue;
> 
> If the CPU is offloaded, isn't ->lazy_len guaranteed to be zero?

Did you mean de-offloaded? If it is offloaded, that means nocb is active so
there could be lazy CBs queued. Or did I miss something?

thanks,

 - Joel


> Or can it contain garbage after a de-offloading operation?
> 
> > +		_count = READ_ONCE(rdp->lazy_len);
> >  
> >  		if (_count == 0)
> >  			continue;
> > +
> >  		rcu_nocb_lock_irqsave(rdp, flags);
> >  		WRITE_ONCE(rdp->lazy_len, 0);
> >  		rcu_nocb_unlock_irqrestore(rdp, flags);
> > @@ -1352,6 +1364,9 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> >  		if (sc->nr_to_scan <= 0)
> >  			break;
> >  	}
> > +
> > +	mutex_unlock(&rcu_state.barrier_mutex);
> > +
> >  	return count ? count : SHRINK_STOP;
> >  }
> >  
> > -- 
> > 2.34.1
> > 

  reply	other threads:[~2023-03-24  0:55 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-22 19:44 [PATCH 0/4] rcu/nocb: Shrinker related boring fixes Frederic Weisbecker
2023-03-22 19:44 ` [PATCH 1/4] rcu/nocb: Protect lazy shrinker against concurrent (de-)offloading Frederic Weisbecker
2023-03-22 23:18   ` Paul E. McKenney
2023-03-24  0:55     ` Joel Fernandes [this message]
2023-03-24  1:06       ` Paul E. McKenney
2023-03-24 22:09     ` Frederic Weisbecker
2023-03-24 22:51       ` Paul E. McKenney
2023-03-26 20:01         ` Frederic Weisbecker
2023-03-26 21:45           ` Paul E. McKenney
2023-03-29 16:07             ` Frederic Weisbecker
2023-03-29 20:45               ` Paul E. McKenney
2023-03-22 19:44 ` [PATCH 2/4] rcu/nocb: Fix shrinker race against callback enqueuer Frederic Weisbecker
2023-03-22 23:19   ` Paul E. McKenney
2023-03-22 19:44 ` [PATCH 3/4] rcu/nocb: Recheck lazy callbacks under the ->nocb_lock from shrinker Frederic Weisbecker
2023-03-22 23:21   ` Paul E. McKenney
2023-03-22 19:44 ` [PATCH 4/4] rcu/nocb: Make shrinker to iterate only NOCB CPUs Frederic Weisbecker
2023-03-24  0:41   ` Joel Fernandes
  -- strict thread matches above, loose matches on Subject: below --
2023-03-29 16:01 [PATCH 0/4 v2] rcu/nocb: Shrinker related boring fixes Frederic Weisbecker
2023-03-29 16:02 ` [PATCH 1/4] rcu/nocb: Protect lazy shrinker against concurrent (de-)offloading Frederic Weisbecker
2023-03-29 20:44   ` Paul E. McKenney
2023-03-29 21:18     ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230324005523.GB723582@google.com \
    --to=joel@joelfernandes.org \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox