From: Uladzislau Rezki <urezki@gmail.com>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	linux-kernel@vger.kernel.org, frederic@kernel.org,
	boqun.feng@gmail.com, neeraj.iitr10@gmail.com,
	rcu@vger.kernel.org, rostedt@goodmis.org,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang1211@gmail.com>
Subject: Re: [PATCH v2 rcu/dev 1/2] rcu/tree: Reduce wake up for synchronize_rcu() common case
Date: Tue, 19 Mar 2024 10:53:02 +0100
Message-ID: <ZflgfrjZSZdqrLLw@pc636>
In-Reply-To: <404E28F5-B018-4CE7-BE57-D0362B0C9969@joelfernandes.org>

On Mon, Mar 18, 2024 at 05:05:31PM -0400, Joel Fernandes wrote:
> 
> 
> > On Mar 18, 2024, at 2:58 PM, Uladzislau Rezki <urezki@gmail.com> wrote:
> > 
> > Hello, Joel!
> > 
> > Sorry for late checking, see below few comments:
> > 
> >> In the synchronize_rcu() common case, we will have less than
> >> SR_MAX_USERS_WAKE_FROM_GP number of users per GP. Waking up the kworker
> >> is pointless just to free the last injected wait head since at that point,
> >> all the users have already been awakened.
> >> 
> >> Introduce a new counter to track this and prevent the wakeup in the
> >> common case.
> >> 
> >> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> >> ---
> >> Rebased on paul/dev of today.
> >> 
> >> kernel/rcu/tree.c | 36 +++++++++++++++++++++++++++++++-----
> >> kernel/rcu/tree.h |  1 +
> >> 2 files changed, 32 insertions(+), 5 deletions(-)
> >> 
> >> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> >> index 9fbb5ab57c84..bd29fe3c76bf 100644
> >> --- a/kernel/rcu/tree.c
> >> +++ b/kernel/rcu/tree.c
> >> @@ -96,6 +96,7 @@ static struct rcu_state rcu_state = {
> >>    .ofl_lock = __ARCH_SPIN_LOCK_UNLOCKED,
> >>    .srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
> >>        rcu_sr_normal_gp_cleanup_work),
> >> +    .srs_cleanups_pending = ATOMIC_INIT(0),
> >> };
> >> 
> >> /* Dump rcu_node combining tree at boot to verify correct setup. */
> >> @@ -1642,8 +1643,11 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> >>     * the done tail list manipulations are protected here.
> >>     */
> >>    done = smp_load_acquire(&rcu_state.srs_done_tail);
> >> -    if (!done)
> >> +    if (!done) {
> >> +        /* See comments below. */
> >> +        atomic_dec_return_release(&rcu_state.srs_cleanups_pending);
> >>        return;
> >> +    }
> >> 
> >>    WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
> >>    head = done->next;
> >> @@ -1666,6 +1670,9 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> >> 
> >>        rcu_sr_put_wait_head(rcu);
> >>    }
> >> +
> >> +    /* Order list manipulations with atomic access. */
> >> +    atomic_dec_return_release(&rcu_state.srs_cleanups_pending);
> >> }
> >> 
> >> /*
> >> @@ -1673,7 +1680,7 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> >>  */
> >> static void rcu_sr_normal_gp_cleanup(void)
> >> {
> >> -    struct llist_node *wait_tail, *next, *rcu;
> >> +    struct llist_node *wait_tail, *next = NULL, *rcu = NULL;
> >>    int done = 0;
> >> 
> >>    wait_tail = rcu_state.srs_wait_tail;
> >> @@ -1699,16 +1706,35 @@ static void rcu_sr_normal_gp_cleanup(void)
> >>            break;
> >>    }
> >> 
> >> -    // concurrent sr_normal_gp_cleanup work might observe this update.
> >> -    smp_store_release(&rcu_state.srs_done_tail, wait_tail);
> >> +    /*
> >> +     * Fast path, no more users to process. Remove the last wait head
> >> +     * if no inflight-workers. If there are in-flight workers, let them
> >> +     * remove the last wait head.
> >> +     */
> >> +    WARN_ON_ONCE(!rcu);
> >> 
> > This assumption is not correct. An "rcu" can be NULL in fact.
> 
> Hmm I could never trigger that. Are you saying that is true after Neeraj recent patch or something else?
> Note, after Neeraj patch to handle the lack of heads availability, it could be true so I requested
> him to rebase his patch on top of this one.
> 
> However I will revisit my patch and look for if it could occur but please let me know if you knew of a sequence of events to make it NULL.
> > 
I think we should agree on your patch first, otherwise it becomes a bit
messy, or go with Neeraj's as a first step and then work on yours. So I
reviewed this patch based on Paul's latest dev branch. I see that Neeraj's
patch needs further work.

So this is true even without Neeraj's patch. Consider the following case:

3     2     1     0
wh -> cb -> cb -> cb -> NULL

We start processing from 2 and handle all clients. The loop then runs
off the end of the list, so "rcu" points to NULL and triggers the
WARN_ON_ONCE. I see the splat during boot:

<snip>
[    0.927699][   T16] ------------[ cut here ]------------
[    0.930867][   T16] WARNING: CPU: 0 PID: 16 at kernel/rcu/tree.c:1721 rcu_gp_cleanup+0x37b/0x4a0
[    0.930490][    T1] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.931401][   T16] Modules linked in:
[    0.932400][    T1] PCI: Using configuration type 1 for base access
[    0.932771][   T16]
[    0.932773][   T16] CPU: 0 PID: 16 Comm: rcu_sched Not tainted 6.8.0-rc2-00089-g65ae0a6b86f0-dirty #1156
[    0.937780][   T16] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[    0.939402][   T16] RIP: 0010:rcu_gp_cleanup+0x37b/0x4a0
[    0.940636][   T16] Code: b0 4b bd 72 09 48 81 ff e8 b0 4b bd 76 1e 4c 8b 27 48 83 c7 10 e8 a5 8e fb ff 4c 89 23 83 ed 01 74 0a 4c 89 e7 48 85 ff 75 d2 <0f> 0b 48 8b 35 14 d0 fd 02 48 89 1d 8d 64 d0 01 48 83 c4 08 48 c7
[    0.942402][   T16] RSP: 0018:ffff9b4a8008fe88 EFLAGS: 00010246
[    0.943648][   T16] RAX: 0000000000000000 RBX: ffffffffbd4bb0a8 RCX: 6c9b26c9b26c9b27
[    0.944751][   T16] RDX: 0000000000000000 RSI: 00000000374b92b6 RDI: 0000000000000000
[    0.945757][   T16] RBP: 0000000000000004 R08: fffffffffff54ea1 R09: 0000000000000000
[    0.946753][   T16] R10: ffff89070098c278 R11: 0000000000000001 R12: 0000000000000000
[    0.947752][   T16] R13: fffffffffffffcbc R14: 0000000000000000 R15: ffffffffbd3f1300
[    0.948764][   T16] FS:  0000000000000000(0000) GS:ffff8915efe00000(0000) knlGS:0000000000000000
[    0.950403][   T16] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.951656][   T16] CR2: ffff89163ffff000 CR3: 00000002eae26000 CR4: 00000000000006f0
[    0.952755][   T16] Call Trace:
[    0.953597][   T16]  <TASK>
[    0.955404][   T16]  ? __warn+0x80/0x140
[    0.956608][   T16]  ? rcu_gp_cleanup+0x37b/0x4a0
[    0.957621][   T16]  ? report_bug+0x15d/0x180
[    0.959403][   T16]  ? handle_bug+0x3c/0x70
[    0.960616][   T16]  ? exc_invalid_op+0x17/0x70
[    0.961620][   T16]  ? asm_exc_invalid_op+0x1a/0x20
[    0.962627][   T16]  ? rcu_gp_cleanup+0x37b/0x4a0
[    0.963622][   T16]  ? rcu_gp_cleanup+0x36b/0x4a0
[    0.965403][   T16]  ? __pfx_rcu_gp_kthread+0x10/0x10
[    0.967402][   T16]  rcu_gp_kthread+0xf7/0x180
[    0.968619][   T16]  kthread+0xd3/0x100
[    0.969602][   T16]  ? __pfx_kthread+0x10/0x10
[    0.971402][   T16]  ret_from_fork+0x34/0x50
[    0.972613][   T16]  ? __pfx_kthread+0x10/0x10
[    0.973615][   T16]  ret_from_fork_asm+0x1b/0x30
[    0.974624][   T16]  </TASK>
[    0.975587][   T16] ---[ end trace 0000000000000000 ]---
<snip>
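
To make the sequence concrete, here is a minimal user-space sketch of the
walk, not the kernel code itself. The names struct node, walk_clients(),
three_clients_leave_null_cursor() and MAX_USERS_WAKE are hypothetical
stand-ins, and the limit of 5 is an assumed value for illustration only.
It models a loop that stops at a wait head or at the per-GP limit, and
otherwise runs until the cursor becomes NULL:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct llist_node plus a wait-head marker. */
struct node {
	struct node *next;
	bool is_wait_head;	/* models rcu_sr_is_wait_head() */
};

/* Assumed illustrative limit, models SR_MAX_USERS_WAKE_FROM_GP. */
#define MAX_USERS_WAKE 5

/*
 * Simplified model of the client walk: start at wait_tail->next, stop at
 * a wait head or after MAX_USERS_WAKE clients. Returns the cursor ("rcu")
 * as it is left when the loop ends.
 */
static struct node *walk_clients(struct node *wait_tail, int *done)
{
	struct node *rcu, *next;

	assert(wait_tail && wait_tail->is_wait_head);

	*done = 0;
	for (rcu = wait_tail->next; rcu; rcu = next) {
		next = rcu->next;
		if (rcu->is_wait_head)
			break;
		/* The client is completed here; unlink it from the list. */
		wait_tail->next = next;
		if (++(*done) == MAX_USERS_WAKE)
			break;
	}
	return rcu;
}

/* Builds wh -> cb -> cb -> cb -> NULL and checks that "rcu" ends up NULL. */
static bool three_clients_leave_null_cursor(void)
{
	struct node cb0 = { NULL, false };
	struct node cb1 = { &cb0, false };
	struct node cb2 = { &cb1, false };
	struct node wh  = { &cb2, true };
	int done;

	return walk_clients(&wh, &done) == NULL && done == 3;
}
```

In this model, three clients are fewer than the limit and no further wait
head follows them, so neither break is taken and the cursor is NULL when
the loop exits. That is the state in which a WARN_ON_ONCE(!rcu) placed
after the loop would fire.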

--
Uladzislau Rezki
