public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	rushikesh.s.kadam@intel.com, urezki@gmail.com,
	neeraj.iitr10@gmail.com, paulmck@kernel.org, rostedt@goodmis.org,
	vineeth@bitbyteword.org, boqun.feng@gmail.com
Subject: Re: [PATCH v5 06/18] rcu: Introduce call_rcu_lazy() API implementation
Date: Mon, 5 Sep 2022 14:59:49 +0200	[thread overview]
Message-ID: <20220905125949.GA173859@lothringen> (raw)
In-Reply-To: <67122ae3-d69e-438c-18fc-a8de6e40201e@joelfernandes.org>

On Fri, Sep 02, 2022 at 07:09:39PM -0400, Joel Fernandes wrote:
> On 9/2/2022 11:21 AM, Frederic Weisbecker wrote:
> >> @@ -3904,7 +3943,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
> >>  	rdp->barrier_head.func = rcu_barrier_callback;
> >>  	debug_rcu_head_queue(&rdp->barrier_head);
> >>  	rcu_nocb_lock(rdp);
> >> -	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
> >> +	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false,
> >> +		     /* wake gp thread */ true));
> > 
> > It's a bad sign when you need to start commenting your boolean parameters :)
> > Perhaps use a single two-bit flag instead of two booleans, for readability?
> 
> That's fair, what do you mean 2-bit flag? Are you saying, we encode the last 2
> parameters to flush bypass in a u*?

Yeah exactly. Such as rcu_nocb_flush_bypass(rdp, NULL, jiffies, FLUSH_LAZY | FLUSH_WAKE)

> 
> > Also that's a subtle change which purpose isn't explained. It means that
> > rcu_barrier_entrain() used to wait for the bypass timer in the worst case
> > but now we force rcuog into it immediately. Should that be a separate change?
> 
> It could be split, but it is laziness that amplifies the issue so I thought of
> keeping it in the same patch. I don't mind one way or the other.

Ok then lets keep it here but please add a comment for the reason to
force wake here.

> >> +	case RCU_NOCB_WAKE_BYPASS:
> >> +		mod_jif = 2;
> >> +		break;
> >> +
> >> +	case RCU_NOCB_WAKE:
> >> +	case RCU_NOCB_WAKE_FORCE:
> >> +		// For these, make it wake up the soonest if we
> >> +		// were in a bypass or lazy sleep before.
> >>  		if (rdp_gp->nocb_defer_wakeup < RCU_NOCB_WAKE)
> >> -			mod_timer(&rdp_gp->nocb_timer, jiffies + 1);
> >> -		if (rdp_gp->nocb_defer_wakeup < waketype)
> >> -			WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
> >> +			mod_jif = 1;
> >> +		break;
> >>  	}
> >>  
> >> +	if (mod_jif)
> >> +		mod_timer(&rdp_gp->nocb_timer, jiffies + mod_jif);
> >> +
> >> +	if (rdp_gp->nocb_defer_wakeup < waketype)
> >> +		WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
> > 
> > So RCU_NOCB_WAKE_BYPASS and RCU_NOCB_WAKE_LAZY don't override the timer state
> > anymore? Looks like something is missing.
> 
> My goal was to make sure that NOCB_WAKE_LAZY wake keeps the timer lazy. If I
> don't do this, then when CPU enters idle, it will immediately do a wake up via
> this call:
> 
> 	rcu_nocb_need_deferred_wakeup(rdp_gp, RCU_NOCB_WAKE)

But if the timer is in RCU_NOCB_WAKE_LAZY mode, that shouldn't be a problem.

> 
> That was almost always causing lazy CBs to be non-lazy thus negating all the
> benefits.
> 
> Note that bypass will also have same issue where the bypass CB will not wait for
> intended bypass duration. To make it consistent with lazy, I made bypass also
> not override nocb_defer_wakeup.

I'm surprised because rcu_nocb_flush_deferred_wakeup() should only do the wake up
if the timer is RCU_NOCB_WAKE or RCU_NOCB_WAKE_FORCE. Or is that code buggy
somehow?

Actually your change is modifying the timer delay without changing the timer
mode, which may shortcut rcu_nocb_flush_deferred_wakeup() check and actually
make it perform early upon idle loop entry.

Or am I missing something?


> 
> I agree its not pretty, but it works and I could not find any code path where it
> does not work. That said, I am open to ideas for changing this and perhaps some
> of these unneeded delays with bypass CBs can be split into separate patches.
> 
> Regarding your point about nocb_defer_wakeup state diverging from the timer
> programming, that happens anyway here in current code:
> 
>  283        } else {
>  284                if (rdp_gp->nocb_defer_wakeup < RCU_NOCB_WAKE)
>  285                        mod_timer(&rdp_gp->nocb_timer, jiffies + 1);
>  286                if (rdp_gp->nocb_defer_wakeup < waketype)
>  287                        WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
>  288        }

How so?

> >> @@ -705,12 +816,21 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
> >>  	my_rdp->nocb_gp_gp = needwait_gp;
> >>  	my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
> >>  
> >> -	if (bypass && !rcu_nocb_poll) {
> >> -		// At least one child with non-empty ->nocb_bypass, so set
> >> -		// timer in order to avoid stranding its callbacks.
> >> -		wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
> >> -				   TPS("WakeBypassIsDeferred"));
> >> +	// At least one child with non-empty ->nocb_bypass, so set
> >> +	// timer in order to avoid stranding its callbacks.
> >> +	if (!rcu_nocb_poll) {
> >> +		// If bypass list only has lazy CBs. Add a deferred
> >> +		// lazy wake up.
> >> +		if (lazy && !bypass) {
> >> +			wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_LAZY,
> >> +					TPS("WakeLazyIsDeferred"));
> > 
> > What if:
> > 
> > 1) rdp(1) has lazy callbacks
> > 2) all other rdp's have no callback at all
> > 3) nocb_gp_wait() runs through all rdp's, everything is handled, except for
> >    these lazy callbacks
> > 4) It reaches the above path, ready to arm the RCU_NOCB_WAKE_LAZY timer,
> >    but it hasn't yet called wake_nocb_gp_defer()
> > 5) Oh but rdp(2) queues a non-lazy callback. interrupts are disabled so it defers
> >    the wake up to nocb_gp_wait() with arming the timer in RCU_NOCB_WAKE.
> > 6) nocb_gp_wait() finally calls wake_nocb_gp_defer() and override the timeout
> >    to several seconds ahead.
> > 7) No more callbacks queued, the non-lazy callback will have to wait several
> >    seconds to complete.
> > 
> > Or did I miss something?
> 
> In theory, I can see this being an issue. In practice, I have not seen it to
> be.

What matters is that the issue looks plausible.

> In my view, the nocb GP thread should not go to sleep in the first place if
> there are any non-bypass CBs being queued. If it does, then that seems an
> optimization-related bug.

Yeah, it's a constraint introduced by the optimized delayed wake up.
By why would RCU_NOCB_WAKE_LAZY need to overwrite RCU_NOCB_WAKE in the first
place?

> 
> That said, we can make wake_nocb_gp_defer() more robust perhaps by making it not
> overwrite the timer if the wake-type requested is weaker than RCU_NOCB_WAKE,
> however that should not cause the going-into-idle issues I pointed. Whether the
> idle time issue will happen, I have no idea. But in theory, will that address
> your concern above?

Yes, but I'm still confused by this idle time issue.

> 
> > Note that the race exists with RCU_NOCB_WAKE_BYPASS
> > but it's only about one jiffy delay, not seconds.
> 
> Well, 2 jiffies. But yeah.
> 
> Thanks, so far I do not see anything that cannot be fixed on top of this patch
> but you raised some good points. Maybe we ought to rewrite the idle path to not
> disturb lazy CBs in a different way, or something (while keeping the timer state
> consistent with the programming of the timer in wake_nocb_gp_defer()).

I'd rather see an updated patch (not the whole patchset but just this one) rather
than deltas, just to make sure I'm not missing something in the whole picture.

Thanks.

  reply	other threads:[~2022-09-05 13:00 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-01 22:17 [PATCH v5 00/18] Implement call_rcu_lazy() and miscellaneous fixes Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 01/18] mm/slub: perform free consistency checks before call_rcu Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 02/18] mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head Joel Fernandes (Google)
2022-09-02  9:26   ` Vlastimil Babka
2022-09-02  9:30     ` Vlastimil Babka
2022-09-02 15:09       ` Joel Fernandes
2022-09-03 13:53         ` Paul E. McKenney
2022-09-01 22:17 ` [PATCH v5 03/18] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask Joel Fernandes (Google)
2022-09-06 22:26   ` Boqun Feng
2022-09-06 22:31     ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 04/18] rcu: Fix late wakeup when flush of bypass cblist happens Joel Fernandes (Google)
2022-09-02 11:35   ` Frederic Weisbecker
2022-09-02 23:58     ` Joel Fernandes
2022-09-03 15:10       ` Paul E. McKenney
2022-09-04 21:13       ` Frederic Weisbecker
2022-09-03 14:04   ` Paul E. McKenney
2022-09-03 14:05     ` Joel Fernandes
2022-09-06  3:07   ` Joel Fernandes
2022-09-06  9:48     ` Frederic Weisbecker
2022-09-07  2:43       ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 05/18] rcu: Move trace_rcu_callback() before bypassing Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 06/18] rcu: Introduce call_rcu_lazy() API implementation Joel Fernandes (Google)
2022-09-02 15:21   ` Frederic Weisbecker
2022-09-02 23:09     ` Joel Fernandes
2022-09-05 12:59       ` Frederic Weisbecker [this message]
2022-09-05 20:18         ` Joel Fernandes
2022-09-05 20:32         ` Joel Fernandes
2022-09-06  8:55           ` Paul E. McKenney
2022-09-06 16:16             ` Joel Fernandes
2022-09-06 17:05               ` Paul E. McKenney
2022-09-03 22:00     ` Joel Fernandes
2022-09-04 21:01       ` Frederic Weisbecker
2022-09-05 20:20         ` Joel Fernandes
2022-09-06  3:05     ` Joel Fernandes
2022-09-06 15:17       ` Frederic Weisbecker
2022-09-06 16:15         ` Joel Fernandes
2022-09-06 16:31           ` Joel Fernandes
2022-09-06 16:38             ` Joel Fernandes
2022-09-06 16:43               ` Joel Fernandes
2022-09-06 19:11                 ` Frederic Weisbecker
2022-09-07  2:56                   ` Joel Fernandes
2022-09-07  9:56                     ` Frederic Weisbecker
2022-09-07 10:03           ` Frederic Weisbecker
2022-09-07 14:01             ` Joel Fernandes
2022-09-07  0:06         ` Joel Fernandes
2022-09-07  9:40           ` Frederic Weisbecker
2022-09-07 13:44             ` Joel Fernandes
2022-09-07 15:38               ` Frederic Weisbecker
2022-09-07 15:39                 ` Joel Fernandes
2022-09-21 23:52             ` Joel Fernandes
2022-09-06 18:16       ` Uladzislau Rezki
2022-09-06 18:21         ` Joel Fernandes
2022-09-07  8:52           ` Uladzislau Rezki
2022-09-07 15:23             ` Joel Fernandes
2022-09-03 14:03   ` Paul E. McKenney
2022-09-03 14:05     ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 07/18] rcu: shrinker for lazy rcu Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 08/18] rcu: Add per-CB tracing for queuing, flush and invocation Joel Fernandes (Google)
2022-09-02 16:48   ` kernel test robot
2022-09-03 12:39     ` Paul E. McKenney
2022-09-03 14:07       ` Joel Fernandes
2022-09-02 19:01   ` kernel test robot
2022-09-01 22:17 ` [PATCH v5 09/18] rcuscale: Add laziness and kfree tests Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 10/18] rcutorture: Add test code for call_rcu_lazy() Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 11/18] fs: Move call_rcu() to call_rcu_lazy() in some paths Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 12/18] cred: Move call_rcu() to call_rcu_lazy() Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 13/18] security: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 14/18] net/core: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 15/18] kernel: Move various core kernel usages " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 16/18] lib: Move call_rcu() " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 17/18] i915: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 18/18] fork: Move thread_stack_free_rcu() " Joel Fernandes (Google)
2022-09-03 15:22 ` [PATCH v5 00/18] Implement call_rcu_lazy() and miscellaneous fixes Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220905125949.GA173859@lothringen \
    --to=frederic@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraj.iitr10@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=rushikesh.s.kadam@intel.com \
    --cc=urezki@gmail.com \
    --cc=vineeth@bitbyteword.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox