Linux Kernel Selftest development
 help / color / mirror / Atom feed
From: Joel Fernandes <joelagnelf@nvidia.com>
To: "Paul E . McKenney" <paulmck@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	rcu@vger.kernel.org
Cc: Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang@linux.dev>, Shuah Khan <shuah@kernel.org>,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH -next 5/8] rcu/nocb: Add warning to detect if overload advancement is ever useful
Date: Tue, 13 Jan 2026 20:09:35 -0500	[thread overview]
Message-ID: <654e3bde-764b-49ae-8faa-bab8199b1f15@nvidia.com> (raw)
In-Reply-To: <20260101163417.1065705-6-joelagnelf@nvidia.com>

Since I am resubmitting the nocb patches in this series (3 of them from this
series) for the next merge window, I thought I'll replace this particular patch
with just a deletion of the rcu_advance_cbs_nowake() call itself instead of
bloating the code path with warnings and comments.

linux-next and many days of testing on my side are also looking good.

Thoughts?  Once I get any opinions, I'll change this patch to do the deletion.
Also I am adding one other (trivial) patch to this series:
https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=nocb-7.0&id=84669d678b9cb28ff8774a3b6457186a4a187c75

Running overnight tests on all 4 patches now...

thanks,

 - Joel

On 1/1/2026 11:34 AM, Joel Fernandes wrote:
> During callback overload, the NOCB code attempts an opportunistic
> advancement via rcu_advance_cbs_nowake().
> 
> Analysis via tracing with 300,000 callbacks flooded shows this
> optimization is likely dead code:
> - 30 overload conditions triggered
> - 0 advancements actually occurred
> - 100% of time no advancement due to current GP not done.
> 
> I also ran TREE05 and TREE08 for 2 hours and cannot trigger it.
> 
> When callbacks overflow (exceed qhimark), they are waiting for a grace
> period that hasn't completed yet. The optimization requires the GP to be
> complete to advance callbacks, but the overload condition itself is
> caused by callbacks piling up faster than GPs can complete. This creates
> a logical contradiction where the advancement cannot happen.
> 
> In *theory* this might be possible, the GP completed just in the nick of
> time as we hit the overload, but this is just so rare that it can be
> considered impossible when we cannot even hit it with synthetic callback
> flooding even, it is a waste of cycles to even try to advance, let alone
> be useful and is a maintenance burden complexity we don't need.
> 
> I suggest deletion. However, add a WARN_ON_ONCE for a merge window or 2
> and delete it after out of extreme caution.
> 
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> ---
>  kernel/rcu/tree_nocb.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 7e9d465c8ab1..d3e6a0e77210 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -571,8 +571,20 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
>  		if (j != rdp->nocb_gp_adv_time &&
>  		    rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
>  		    rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
> +			long done_before = rcu_segcblist_get_seglen(&rdp->cblist, RCU_DONE_TAIL);
> +
>  			rcu_advance_cbs_nowake(rdp->mynode, rdp);
>  			rdp->nocb_gp_adv_time = j;
> +
> +			/*
> +			 * The advance_cbs call above is not useful. Under an
> +			 * overload condition, nocb_gp_wait() is always waiting
> +			 * for GP completion, due to this nothing can be moved
> +			 * from WAIT to DONE, in the list. WARN if an
> +			 * advancement happened (next step is deletion of advance).
> +			 */
> +			WARN_ON_ONCE(rcu_segcblist_get_seglen(&rdp->cblist,
> +				     RCU_DONE_TAIL) > done_before);
>  		}
>  	}
>  


  reply	other threads:[~2026-01-14  1:09 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-01 16:34 [PATCH -next 0/8] RCU updates from me for next merge window Joel Fernandes
2026-01-01 16:34 ` [PATCH -next 1/8] rcu: Fix rcu_read_unlock() deadloop due to softirq Joel Fernandes
2026-01-02 17:28   ` Steven Rostedt
2026-01-02 17:30     ` Steven Rostedt
2026-01-02 19:51       ` Paul E. McKenney
2026-01-03  0:41         ` Joel Fernandes
2026-01-04  3:20           ` Yao Kai
2026-01-05 17:16             ` Steven Rostedt
2026-01-09 16:38               ` Steven Rostedt
2026-01-04 10:00           ` Boqun Feng
2026-01-07 23:14   ` Frederic Weisbecker
2026-01-08  1:02     ` Joel Fernandes
2026-01-08  1:35       ` Joel Fernandes
2026-01-08  3:35         ` Joel Fernandes
2026-01-08 15:39           ` Frederic Weisbecker
2026-01-08 15:57             ` Mathieu Desnoyers
2026-01-08 15:25       ` Frederic Weisbecker
2026-01-09  1:12         ` Joel Fernandes
2026-01-09 14:23           ` Frederic Weisbecker
2026-01-01 16:34 ` [PATCH -next 2/8] srcu: Use suitable gfp_flags for the init_srcu_struct_nodes() Joel Fernandes
2026-01-01 16:34 ` [PATCH -next 3/8] rcu/nocb: Remove unnecessary WakeOvfIsDeferred wake path Joel Fernandes
2026-01-08 15:57   ` Frederic Weisbecker
2026-01-09  1:39     ` Joel Fernandes
2026-01-09 10:32       ` Boqun Feng
2026-01-09 11:20         ` Joel Fernandes
2026-01-11 12:14           ` Boqun Feng
2026-01-01 16:34 ` [PATCH -next 4/8] rcu/nocb: Add warning if no rcuog wake up attempt happened during overload Joel Fernandes
2026-01-08 17:22   ` Frederic Weisbecker
2026-01-09  3:49     ` Joel Fernandes
2026-01-09 14:36       ` Frederic Weisbecker
2026-01-09 21:20         ` Joel Fernandes
2026-01-01 16:34 ` [PATCH -next 5/8] rcu/nocb: Add warning to detect if overload advancement is ever useful Joel Fernandes
2026-01-14  1:09   ` Joel Fernandes [this message]
2026-01-01 16:34 ` [PATCH -next 6/8] rcu: Reduce synchronize_rcu() latency by reporting GP kthread's CPU QS early Joel Fernandes
2026-01-01 16:34 ` [PATCH -next 7/8] rcutorture: Prevent concurrent kvm.sh runs on same source tree Joel Fernandes
2026-01-01 16:34 ` [PATCH -next 8/8] rcutorture: Add --kill-previous option to terminate previous kvm.sh runs Joel Fernandes
2026-01-01 22:48   ` Paul E. McKenney
2026-01-04 10:55 ` [PATCH -next 0/8] RCU updates from me for next merge window Boqun Feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=654e3bde-764b-49ae-8faa-bab8199b1f15@nvidia.com \
    --to=joelagnelf@nvidia.com \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=jiangshanlai@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=neeraj.upadhyay@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=qiang.zhang@linux.dev \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox