All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	RCU <rcu@vger.kernel.org>,
	Neeraj upadhyay <Neeraj.Upadhyay@amd.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Hillf Danton <hdanton@sina.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Subject: Re: [PATCH v5 2/4] rcu: Reduce synchronize_rcu() latency
Date: Mon, 4 Mar 2024 17:23:13 +0100	[thread overview]
Message-ID: <ZeX1cXWKv2kirDXg@pc638.lan> (raw)
In-Reply-To: <ZeW2w08WZo4yapQp@localhost.localdomain>

On Mon, Mar 04, 2024 at 12:55:47PM +0100, Frederic Weisbecker wrote:
> Le Wed, Feb 28, 2024 at 07:04:21PM +0100, Uladzislau Rezki a écrit :
> > On Tue, Feb 27, 2024 at 12:07:32AM +0100, Frederic Weisbecker wrote:
> > > On Tue, Feb 20, 2024 at 07:31:13PM +0100, Uladzislau Rezki (Sony) wrote:
> > > > +static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> > > > +{
> > > > +	struct llist_node *done, *rcu, *next, *head;
> > > > +
> > > > +	/*
> > > > +	 * This work execution can potentially execute
> > > > +	 * while a new done tail is being updated by
> > > > +	 * grace period kthread in rcu_sr_normal_gp_cleanup().
> > > > +	 * So, read and updates of done tail need to
> > > > +	 * follow acq-rel semantics.
> > > > +	 *
> > > > +	 * Given that wq semantics guarantees that a single work
> > > > +	 * cannot execute concurrently by multiple kworkers,
> > > > +	 * the done tail list manipulations are protected here.
> > > > +	 */
> > > > +	done = smp_load_acquire(&rcu_state.srs_done_tail);
> > > > +	if (!done)
> > > > +		return;
> > > > +
> > > > +	WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
> > > > +	head = done->next;
> > > > +	done->next = NULL;
> > > 
> > > Can the following race happen?
> > > 
> > > CPU 0                                                   CPU 1
> > > -----                                                   -----
> > > 
> > > // wait_tail == HEAD1
> > > rcu_sr_normal_gp_cleanup() {
> > >     // has passed SR_MAX_USERS_WAKE_FROM_GP
> > >     wait_tail->next = next;
> > >     // done_tail = HEAD1
> > >     smp_store_release(&rcu_state.srs_done_tail, wait_tail);
> > >     queue_work() {
> > >         test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work)
> > >         __queue_work()
> > >     }
> > > }
> > > 
> > >                                                       set_work_pool_and_clear_pending()
> > >                                                       rcu_sr_normal_gp_cleanup_work() {
> > > // new GP, wait_tail == HEAD2
> > > rcu_sr_normal_gp_cleanup() {
> > >     // executes all completion, but stop at HEAD1
> > >     wait_tail->next = HEAD1;
> > >     // done_tail = HEAD2
> > >     smp_store_release(&rcu_state.srs_done_tail, wait_tail);
> > >     queue_work() {
> > >         test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work)
> > >         __queue_work()
> > >     }
> > > }
> > >                                                           // done = HEAD2
> > >                                                           done = smp_load_acquire(&rcu_state.srs_done_tail);
> > >                                                           // head = HEAD1
> > >                                                           head = done->next;
> > >                                                           done->next = NULL;
> > >                                                           llist_for_each_safe() {
> > >                                                               // completes all callbacks, release HEAD1
> > >                                                           }
> > >                                                       }
> > >                                                       // Process second queue
> > >                                                       set_work_pool_and_clear_pending()
> > >                                                       rcu_sr_normal_gp_cleanup_work() {
> > >                                                           // done = HEAD2
> > >                                                           done = smp_load_acquire(&rcu_state.srs_done_tail);
> > > 
> > > // new GP, wait_tail == HEAD3
> > > rcu_sr_normal_gp_cleanup() {
> > >     // Finds HEAD2 with ->next == NULL at the end
> > >     rcu_sr_put_wait_head(HEAD2)
> > >     ...
> > > 
> > > // A few more GPs later
> > > rcu_sr_normal_gp_init() {
> > >      HEAD2 = rcu_sr_get_wait_head();
> > >      llist_add(HEAD2, &rcu_state.srs_next);
> > >                                                           // head == rcu_state.srs_next
> > >                                                           head = done->next;
> > >                                                           done->next = NULL;
> > >                                                           llist_for_each_safe() {
> > >                                                               // EXECUTE CALLBACKS TOO EARLY!!!
> > >                                                           }
> > >                                                       }
> > Looks like that. To address this, we should not release the head in the GP
> > > kthread.
> 
> But then you have to unconditionally schedule the work, right? Otherwise the
> HEADs are not released. And that means dropping this patch (right now I don't
> have a better idea).
>
The easiest way is to drop the patch. To address it we can go with:

<snip>
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 31f3a61f9c38..9aa2cd46583e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1661,16 +1661,8 @@ static void rcu_sr_normal_gp_cleanup(void)
 	 * wait-head is released if last. The worker is not kicked.
 	 */
 	llist_for_each_safe(rcu, next, wait_tail->next) {
-		if (rcu_sr_is_wait_head(rcu)) {
-			if (!rcu->next) {
-				rcu_sr_put_wait_head(rcu);
-				wait_tail->next = NULL;
-			} else {
-				wait_tail->next = rcu;
-			}
-
+		if (rcu_sr_is_wait_head(rcu))
 			break;
-		}
 
 		rcu_sr_normal_complete(rcu);
 		// It can be last, update a next on this step.
<snip>

i.e. the process of users from GP is still there. The work is triggered
to perform a final complete(if there are users) + releasing wait-heads
so we do not race anymore.

I am OK with both cases. Dropping the patch will make it more simple
for sure.

--
Uladzislau Rezki


  reply	other threads:[~2024-03-04 16:23 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-20 18:31 [PATCH v5 0/4] Reduce synchronize_rcu() latency(v5) Uladzislau Rezki (Sony)
2024-02-20 18:31 ` [PATCH v5 1/4] rcu: Add data structures for synchronize_rcu() Uladzislau Rezki (Sony)
2024-02-20 18:31 ` [PATCH v5 2/4] rcu: Reduce synchronize_rcu() latency Uladzislau Rezki (Sony)
2024-02-26 23:07   ` Frederic Weisbecker
2024-02-27  6:39     ` Z qiang
2024-02-27 14:37       ` Frederic Weisbecker
2024-02-27 16:16         ` Frederic Weisbecker
2024-02-27 19:35     ` Uladzislau Rezki
2024-02-28 18:04     ` Uladzislau Rezki
2024-03-04 11:55       ` Frederic Weisbecker
2024-03-04 16:23         ` Uladzislau Rezki [this message]
2024-03-04 20:07           ` Paul E. McKenney
2024-03-05  9:35             ` Uladzislau Rezki
2024-03-04 22:56           ` Frederic Weisbecker
2024-03-05  9:38             ` Uladzislau Rezki
2024-03-05 11:36               ` Frederic Weisbecker
2024-02-27 16:15   ` Frederic Weisbecker
2024-02-27 17:03   ` Frederic Weisbecker
2024-02-27 20:51   ` Joel Fernandes
2024-02-28  9:28     ` Uladzislau Rezki
     [not found]   ` <4b932245-2825-4e53-87a4-44d2892e7c13@joelfernandes.org>
2024-02-27 22:50     ` Joel Fernandes
2024-02-27 22:53       ` Joel Fernandes
2024-02-28 14:32   ` Joel Fernandes
2024-02-28 16:44     ` Joel Fernandes
2024-02-20 18:31 ` [PATCH v5 3/4] rcu: Add a trace event for synchronize_rcu_normal() Uladzislau Rezki (Sony)
2024-02-20 18:31 ` [PATCH v5 4/4] rcu: Support direct wake-up of synchronize_rcu() users Uladzislau Rezki (Sony)
2024-02-21  1:53 ` [PATCH v5 0/4] Reduce synchronize_rcu() latency(v5) Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZeX1cXWKv2kirDXg@pc638.lan \
    --to=urezki@gmail.com \
    --cc=Neeraj.Upadhyay@amd.com \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=hdanton@sina.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.