From: Uladzislau Rezki <urezki@gmail.com>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
linux-kernel@vger.kernel.org, frederic@kernel.org,
boqun.feng@gmail.com, neeraj.iitr10@gmail.com,
rcu@vger.kernel.org, rostedt@goodmis.org,
"Paul E. McKenney" <paulmck@kernel.org>,
Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
Josh Triplett <josh@joshtriplett.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Zqiang <qiang.zhang1211@gmail.com>
Subject: Re: [PATCH v2 rcu/dev 1/2] rcu/tree: Reduce wake up for synchronize_rcu() common case
Date: Tue, 19 Mar 2024 10:53:02 +0100
Message-ID: <ZflgfrjZSZdqrLLw@pc636>
In-Reply-To: <404E28F5-B018-4CE7-BE57-D0362B0C9969@joelfernandes.org>
On Mon, Mar 18, 2024 at 05:05:31PM -0400, Joel Fernandes wrote:
>
>
> > On Mar 18, 2024, at 2:58 PM, Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > Hello, Joel!
> >
> > Sorry for late checking, see below few comments:
> >
> >> In the synchronize_rcu() common case, we will have less than
> >> SR_MAX_USERS_WAKE_FROM_GP number of users per GP. Waking up the kworker
> >> is pointless just to free the last injected wait head since at that point,
> >> all the users have already been awakened.
> >>
> >> Introduce a new counter to track this and prevent the wakeup in the
> >> common case.
> >>
> >> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> >> ---
> >> Rebased on paul/dev of today.
> >>
> >> kernel/rcu/tree.c | 36 +++++++++++++++++++++++++++++++-----
> >> kernel/rcu/tree.h | 1 +
> >> 2 files changed, 32 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> >> index 9fbb5ab57c84..bd29fe3c76bf 100644
> >> --- a/kernel/rcu/tree.c
> >> +++ b/kernel/rcu/tree.c
> >> @@ -96,6 +96,7 @@ static struct rcu_state rcu_state = {
> >> .ofl_lock = __ARCH_SPIN_LOCK_UNLOCKED,
> >> .srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
> >> rcu_sr_normal_gp_cleanup_work),
> >> + .srs_cleanups_pending = ATOMIC_INIT(0),
> >> };
> >>
> >> /* Dump rcu_node combining tree at boot to verify correct setup. */
> >> @@ -1642,8 +1643,11 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> >> * the done tail list manipulations are protected here.
> >> */
> >> done = smp_load_acquire(&rcu_state.srs_done_tail);
> >> - if (!done)
> >> + if (!done) {
> >> + /* See comments below. */
> >> + atomic_dec_return_release(&rcu_state.srs_cleanups_pending);
> >> return;
> >> + }
> >>
> >> WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
> >> head = done->next;
> >> @@ -1666,6 +1670,9 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> >>
> >> rcu_sr_put_wait_head(rcu);
> >> }
> >> +
> >> + /* Order list manipulations with atomic access. */
> >> + atomic_dec_return_release(&rcu_state.srs_cleanups_pending);
> >> }
> >>
> >> /*
> >> @@ -1673,7 +1680,7 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> >> */
> >> static void rcu_sr_normal_gp_cleanup(void)
> >> {
> >> - struct llist_node *wait_tail, *next, *rcu;
> >> + struct llist_node *wait_tail, *next = NULL, *rcu = NULL;
> >> int done = 0;
> >>
> >> wait_tail = rcu_state.srs_wait_tail;
> >> @@ -1699,16 +1706,35 @@ static void rcu_sr_normal_gp_cleanup(void)
> >> break;
> >> }
> >>
> >> - // concurrent sr_normal_gp_cleanup work might observe this update.
> >> - smp_store_release(&rcu_state.srs_done_tail, wait_tail);
> >> + /*
> >> + * Fast path, no more users to process. Remove the last wait head
> >> + * if no inflight-workers. If there are in-flight workers, let them
> >> + * remove the last wait head.
> >> + */
> >> + WARN_ON_ONCE(!rcu);
> >>
> > This assumption is not correct. An "rcu" can be NULL in fact.
>
> Hmm I could never trigger that. Are you saying that is true after Neeraj recent patch or something else?
> Note, after Neeraj patch to handle the lack of heads availability, it could be true so I requested
> him to rebase his patch on top of this one.
>
> However I will revisit my patch and look for if it could occur but please let me know if you knew of a sequence of events to make it NULL.
> >
I think we should agree on your patch first, otherwise it becomes a bit
messy; or go with Neeraj's as the first step and then work on yours. So, I
reviewed this patch based on Paul's latest dev branch. I see that Neeraj's
patch needs further work.

So this is true without Neeraj's patch. Consider the following case:
 3     2     1     0
wh -> cb -> cb -> cb -> NULL
We start processing from 2 and handle all clients. In the end, "rcu"
points to NULL and triggers the WARN_ON_ONCE. I see the splat during
boot:
<snip>
[ 0.927699][ T16] ------------[ cut here ]------------
[ 0.930867][ T16] WARNING: CPU: 0 PID: 16 at kernel/rcu/tree.c:1721 rcu_gp_cleanup+0x37b/0x4a0
[ 0.930490][ T1] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[ 0.931401][ T16] Modules linked in:
[ 0.932400][ T1] PCI: Using configuration type 1 for base access
[ 0.932771][ T16]
[ 0.932773][ T16] CPU: 0 PID: 16 Comm: rcu_sched Not tainted 6.8.0-rc2-00089-g65ae0a6b86f0-dirty #1156
[ 0.937780][ T16] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 0.939402][ T16] RIP: 0010:rcu_gp_cleanup+0x37b/0x4a0
[ 0.940636][ T16] Code: b0 4b bd 72 09 48 81 ff e8 b0 4b bd 76 1e 4c 8b 27 48 83 c7 10 e8 a5 8e fb ff 4c 89 23 83 ed 01 74 0a 4c 89 e7 48 85 ff 75 d2 <0f> 0b 48 8b 35 14 d0 fd 02 48 89 1d 8d 64 d0 01 48 83 c4 08 48 c7
[ 0.942402][ T16] RSP: 0018:ffff9b4a8008fe88 EFLAGS: 00010246
[ 0.943648][ T16] RAX: 0000000000000000 RBX: ffffffffbd4bb0a8 RCX: 6c9b26c9b26c9b27
[ 0.944751][ T16] RDX: 0000000000000000 RSI: 00000000374b92b6 RDI: 0000000000000000
[ 0.945757][ T16] RBP: 0000000000000004 R08: fffffffffff54ea1 R09: 0000000000000000
[ 0.946753][ T16] R10: ffff89070098c278 R11: 0000000000000001 R12: 0000000000000000
[ 0.947752][ T16] R13: fffffffffffffcbc R14: 0000000000000000 R15: ffffffffbd3f1300
[ 0.948764][ T16] FS: 0000000000000000(0000) GS:ffff8915efe00000(0000) knlGS:0000000000000000
[ 0.950403][ T16] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.951656][ T16] CR2: ffff89163ffff000 CR3: 00000002eae26000 CR4: 00000000000006f0
[ 0.952755][ T16] Call Trace:
[ 0.953597][ T16] <TASK>
[ 0.955404][ T16] ? __warn+0x80/0x140
[ 0.956608][ T16] ? rcu_gp_cleanup+0x37b/0x4a0
[ 0.957621][ T16] ? report_bug+0x15d/0x180
[ 0.959403][ T16] ? handle_bug+0x3c/0x70
[ 0.960616][ T16] ? exc_invalid_op+0x17/0x70
[ 0.961620][ T16] ? asm_exc_invalid_op+0x1a/0x20
[ 0.962627][ T16] ? rcu_gp_cleanup+0x37b/0x4a0
[ 0.963622][ T16] ? rcu_gp_cleanup+0x36b/0x4a0
[ 0.965403][ T16] ? __pfx_rcu_gp_kthread+0x10/0x10
[ 0.967402][ T16] rcu_gp_kthread+0xf7/0x180
[ 0.968619][ T16] kthread+0xd3/0x100
[ 0.969602][ T16] ? __pfx_kthread+0x10/0x10
[ 0.971402][ T16] ret_from_fork+0x34/0x50
[ 0.972613][ T16] ? __pfx_kthread+0x10/0x10
[ 0.973615][ T16] ret_from_fork_asm+0x1b/0x30
[ 0.974624][ T16] </TASK>
[ 0.975587][ T16] ---[ end trace 0000000000000000 ]---
<snip>
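
To make the sequence above easier to follow, here is a minimal userspace
sketch of the walk. It is only a model, not the kernel code: struct node,
walk_users(), cursor_is_null_after_walk() and MAX_USERS_WAKE are
hypothetical stand-ins (the last one for SR_MAX_USERS_WAKE_FROM_GP, with
an assumed value), and the real loop uses llist_for_each_safe() and
rcu_sr_is_wait_head().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stand-in for SR_MAX_USERS_WAKE_FROM_GP; the value is illustrative. */
#define MAX_USERS_WAKE 5
#define MAX_NODES 16

struct node {
	struct node *next;
	bool is_wait_head;	/* models rcu_sr_is_wait_head() */
};

/*
 * Model of the user-processing loop: visit the nodes after wait_tail,
 * stop at a wait head or after MAX_USERS_WAKE users. Returns the
 * cursor ("rcu") as it stands when the loop ends.
 */
static struct node *walk_users(struct node *wait_tail)
{
	struct node *rcu, *next;
	int done = 0;

	for (rcu = wait_tail->next; rcu; rcu = next) {
		next = rcu->next;
		if (rcu->is_wait_head)
			break;
		wait_tail->next = next;	/* user handled, unlink it */
		if (++done == MAX_USERS_WAKE)
			break;
	}
	return rcu;
}

/* Build wh -> cb -> ... -> cb -> NULL with nr_users users and walk it. */
static bool cursor_is_null_after_walk(int nr_users)
{
	struct node n[MAX_NODES] = { 0 };
	int i;

	assert(nr_users >= 0 && nr_users < MAX_NODES);
	n[0].is_wait_head = true;
	for (i = 0; i < nr_users; i++)
		n[i].next = &n[i + 1];
	n[nr_users].next = NULL;
	return walk_users(&n[0]) == NULL;
}
```

In this model, cursor_is_null_after_walk(3) is true: the walk handles
all three users and runs off the end of the list, so the cursor ends as
NULL, which is the case the WARN_ON_ONCE(!rcu) does not expect. With
MAX_USERS_WAKE or more users the loop breaks on the last handled node
instead, and the cursor stays non-NULL.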
--
Uladzislau Rezki