From: Joel Fernandes <joel@joelfernandes.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: "Paul E . McKenney" <paulmck@kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier()
Date: Tue, 11 Oct 2022 02:01:23 +0000 [thread overview]
Message-ID: <Y0TOc8eigdzanBGQ@google.com> (raw)
In-Reply-To: <20221010223956.1041247-2-frederic@kernel.org>
On Tue, Oct 11, 2022 at 12:39:55AM +0200, Frederic Weisbecker wrote:
> Upon entraining a callback to a NOCB CPU, no further wake up is
> issued on the corresponding nocb_gp kthread. As a result, the callback
> and all the subsequent ones on that CPU may be ignored, at least until
> an RCU_NOCB_WAKE_FORCE timer is ever armed or another NOCB CPU belonging
> to the same group enqueues a callback on an empty queue.
>
> Here is a possible bad scenario:
>
> 1) CPU 0 is NOCB unlike all other CPUs.
> 2) CPU 0 queues a callback
> 2) The grace period related to that callback elapses
> 3) The callback is moved to the done list (but is not invoked yet),
> there are no more pending callbacks for CPU 0
> 4) CPU 1 calls rcu_barrier() and sends an IPI to CPU 0
> 5) CPU 0 entrains the callback but doesn't wake up nocb_gp
> 6) CPU 1 blocks forever, unless CPU 0 ever queues enough further
> callbacks to arm an RCU_NOCB_WAKE_FORCE timer.
>
> Make sure the necessary wake up is produced whenever necessary.
>
> Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Fixes: 5d6742b37727 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs")
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
And if Paul is taking this, I'll rebase and drop this patch from the lazy
series.
thanks,
- Joel
> ---
> kernel/rcu/tree.c | 6 ++++++
> kernel/rcu/tree.h | 1 +
> kernel/rcu/tree_nocb.h | 5 +++++
> 3 files changed, 12 insertions(+)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 96d678c9cfb6..025f59f6f97f 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3914,6 +3914,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
> {
> unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence);
> unsigned long lseq = READ_ONCE(rdp->barrier_seq_snap);
> + bool wake_nocb = false;
> + bool was_alldone = false;
>
> lockdep_assert_held(&rcu_state.barrier_lock);
> if (rcu_seq_state(lseq) || !rcu_seq_state(gseq) || rcu_seq_ctr(lseq) != rcu_seq_ctr(gseq))
> @@ -3922,6 +3924,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
> rdp->barrier_head.func = rcu_barrier_callback;
> debug_rcu_head_queue(&rdp->barrier_head);
> rcu_nocb_lock(rdp);
> + was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
> WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
> if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
> atomic_inc(&rcu_state.barrier_cpu_count);
> @@ -3929,7 +3932,10 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
> debug_rcu_head_unqueue(&rdp->barrier_head);
> rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence);
> }
> + wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist);
> rcu_nocb_unlock(rdp);
> + if (wake_nocb)
> + wake_nocb_gp(rdp, false);
> smp_store_release(&rdp->barrier_seq_snap, gseq);
> }
>
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index d4a97e40ea9c..925dd98f8b23 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -439,6 +439,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
> static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
> static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
> static void rcu_init_one_nocb(struct rcu_node *rnp);
> +static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
> static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> unsigned long j);
> static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index f77a6d7e1356..094fd454b6c3 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1558,6 +1558,11 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
> {
> }
>
> +static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
> +{
> + return false;
> +}
> +
> static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> unsigned long j)
> {
> --
> 2.25.1
>
next prev parent reply other threads:[~2022-10-11 2:01 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-10 22:39 [PATCH 0/2] rcu/nocb fix and optimization Frederic Weisbecker
2022-10-10 22:39 ` [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier() Frederic Weisbecker
2022-10-11 2:01 ` Joel Fernandes [this message]
2022-10-11 7:25 ` Paul E. McKenney
2022-10-11 7:33 ` Joel Fernandes
2022-10-10 22:39 ` [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue Frederic Weisbecker
2022-10-11 2:00 ` Joel Fernandes
2022-10-11 2:08 ` Joel Fernandes
2022-10-11 19:21 ` Frederic Weisbecker
2022-10-11 23:47 ` Joel Fernandes
2022-10-12 10:23 ` Frederic Weisbecker
2022-10-12 14:49 ` Joel Fernandes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y0TOc8eigdzanBGQ@google.com \
--to=joel@joelfernandes.org \
--cc=frederic@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=paulmck@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.