All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	RCU <rcu@vger.kernel.org>,
	Neeraj upadhyay <Neeraj.Upadhyay@amd.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Hillf Danton <hdanton@sina.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
	Frederic Weisbecker <frederic@kernel.org>
Subject: Re: [PATCH v3 4/7] rcu: Improve handling of synchronize_rcu() users
Date: Thu, 21 Dec 2023 11:52:33 +0100	[thread overview]
Message-ID: <ZYQY8bB3zpywfBxO@pc636> (raw)
In-Reply-To: <579f86e0-e03e-4ab3-9a85-a62064bcf2a1@paulmck-laptop>

On Tue, Dec 19, 2023 at 05:37:56PM -0800, Paul E. McKenney wrote:
> On Tue, Nov 28, 2023 at 09:00:30AM +0100, Uladzislau Rezki (Sony) wrote:
> > From: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> > 
> > Currently, processing of the next batch of rcu_synchronize nodes
> > for the new grace period, requires doing a llist reversal operation
> > to find the tail element of the list. This can be a very costly
> > operation (high number of cache misses) for a long list.
> > 
> > To address this, this patch introduces a "dummy-wait-node" entity.
> > At every grace period init, a new wait node is added to the llist.
> > This wait node is used as wait tail for this new grace period.
> > 
> > This allows lockless additions of new rcu_synchronize nodes in the
> > rcu_sr_normal_add_req(), while the cleanup work executes and does
> > the progress. The dummy nodes are removed on next round of cleanup
> > work execution.
> > 
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> 
> This says that Uladzislau created the patch and that Neeraj
> acted as maintainer.  I am guessing that you both worked on it,
> in which case is should have the Co-developed-by tags as shown in
> Documentation/process/submitting-patches.rst.  Could you please update
> these to reflect the actual origin?
> 
Right. We both worked on it. Neeraj is an author whereas i should mark
myself as a Co-developed-by. This is a correct way. Thank you for
pointing on it!

>
> One question below toward the end.  There are probably others that I
> should be asking, but I have to start somewhere.  ;-)
> 
Good :)

> >  
> >  /*
> >   * Helper function for rcu_gp_init().
> >   */
> > -static void rcu_sr_normal_gp_init(void)
> > +static bool rcu_sr_normal_gp_init(void)
> >  {
> > -	struct llist_node *head, *tail;
> > +	struct llist_node *first;
> > +	struct llist_node *wait_head;
> > +	bool start_new_poll = false;
> >  
> > -	if (llist_empty(&sr.srs_next))
> > -		return;
> > +	first = READ_ONCE(sr.srs_next.first);
> > +	if (!first || rcu_sr_is_wait_head(first))
> > +		return start_new_poll;
> > +
> > +	wait_head = rcu_sr_get_wait_head();
> > +	if (!wait_head) {
> > +		// Kick another GP to retry.
> > +		start_new_poll = true;
> > +		return start_new_poll;
> > +	}
> >  
> > -	tail = llist_del_all(&sr.srs_next);
> > -	head = llist_reverse_order(tail);
> > +	/* Inject a wait-dummy-node. */
> > +	llist_add(wait_head, &sr.srs_next);
> >  
> >  	/*
> > -	 * A waiting list of GP should be empty on this step,
> > -	 * since a GP-kthread, rcu_gp_init() -> gp_cleanup(),
> > +	 * A waiting list of rcu_synchronize nodes should be empty on
> > +	 * this step, since a GP-kthread, rcu_gp_init() -> gp_cleanup(),
> >  	 * rolls it over. If not, it is a BUG, warn a user.
> >  	 */
> > -	WARN_ON_ONCE(!llist_empty(&sr.srs_wait));
> > +	WARN_ON_ONCE(sr.srs_wait_tail != NULL);
> > +	sr.srs_wait_tail = wait_head;
> > +	ASSERT_EXCLUSIVE_WRITER(sr.srs_wait_tail);
> >  
> > -	WRITE_ONCE(sr.srs_wait_tail, tail);
> > -	__llist_add_batch(head, tail, &sr.srs_wait);
> > +	return start_new_poll;
> >  }
> >  
> >  static void rcu_sr_normal_add_req(struct rcu_synchronize *rs)
> > @@ -1493,6 +1684,7 @@ static noinline_for_stack bool rcu_gp_init(void)
> >  	unsigned long mask;
> >  	struct rcu_data *rdp;
> >  	struct rcu_node *rnp = rcu_get_root();
> > +	bool start_new_poll;
> >  
> >  	WRITE_ONCE(rcu_state.gp_activity, jiffies);
> >  	raw_spin_lock_irq_rcu_node(rnp);
> > @@ -1517,11 +1709,15 @@ static noinline_for_stack bool rcu_gp_init(void)
> >  	/* Record GP times before starting GP, hence rcu_seq_start(). */
> >  	rcu_seq_start(&rcu_state.gp_seq);
> >  	ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq);
> > -	rcu_sr_normal_gp_init();
> > +	start_new_poll = rcu_sr_normal_gp_init();
> >  	trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start"));
> >  	rcu_poll_gp_seq_start(&rcu_state.gp_seq_polled_snap);
> >  	raw_spin_unlock_irq_rcu_node(rnp);
> >  
> > +	// New poll request after rnp unlock
> > +	if (start_new_poll)
> > +		(void) start_poll_synchronize_rcu();
> 
> You lost me on this one.  Anything that got moved to the wait list
> should be handled by the current grace period, right?  Or is the
> problem that rcu_sr_normal_gp_init() is being invoked after the call
> to rcu_seq_start()?  If that is the case, could it be moved ahead so
> that we don't need the extra grace period?
> 
> Or am I missing something subtle here?
> 
The problem is that, we are limited in number of "wait-heads" which we
add as a marker node for this/current grace period. If there are more clients
and there is no a wait-head available it means that a system, the deferred
kworker, is slow in processing callbacks, thus all wait-nodes are in use.

That is why we need an extra grace period. Basically to repeat our try one
more time, i.e. it might be that a current grace period is not able to handle
users due to the fact that a system is doing really slow, but this is rather
a corner case and is not a problem.

--
Uladzislau Rezki

  reply	other threads:[~2023-12-21 10:52 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-28  8:00 [PATCH v3 0/7] Reduce synchronize_rcu() latency(V3) Uladzislau Rezki (Sony)
2023-11-28  8:00 ` [PATCH v3 1/7] rcu: Reduce synchronize_rcu() latency Uladzislau Rezki (Sony)
2023-11-28  8:00 ` [PATCH v3 2/7] rcu: Add a trace event for synchronize_rcu_normal() Uladzislau Rezki (Sony)
2023-11-28  8:00 ` [PATCH v3 3/7] doc: Add rcutree.rcu_normal_wake_from_gp to kernel-parameters.txt Uladzislau Rezki (Sony)
2023-12-20  1:17   ` Paul E. McKenney
2023-12-21 10:28     ` Uladzislau Rezki
2023-11-28  8:00 ` [PATCH v3 4/7] rcu: Improve handling of synchronize_rcu() users Uladzislau Rezki (Sony)
2023-12-20  1:37   ` Paul E. McKenney
2023-12-21 10:52     ` Uladzislau Rezki [this message]
2023-12-21 18:40       ` Paul E. McKenney
2023-12-22  9:27         ` Uladzislau Rezki
2023-12-22 18:58           ` Paul E. McKenney
2024-01-02 12:52             ` Uladzislau Rezki
2024-01-02 19:25               ` Paul E. McKenney
2024-01-03 13:16                 ` Uladzislau Rezki
2024-01-03 14:47                   ` Paul E. McKenney
2024-01-03 17:35                     ` Uladzislau Rezki
2024-01-03 17:56                       ` Paul E. McKenney
2024-01-03 19:02                         ` Uladzislau Rezki
2024-01-03 19:03                           ` Uladzislau Rezki
2024-01-03 19:33                             ` Paul E. McKenney
2024-01-04 11:17                               ` Uladzislau Rezki
2023-11-28  8:00 ` [PATCH v3 5/7] rcu: Support direct wake-up " Uladzislau Rezki (Sony)
2023-12-20  1:46   ` Paul E. McKenney
2023-12-21 11:22     ` Uladzislau Rezki
2023-11-28  8:00 ` [PATCH v3 6/7] rcu: Move sync related data to rcu_state structure Uladzislau Rezki (Sony)
2023-12-20  1:47   ` Paul E. McKenney
2023-12-21 10:56     ` Uladzislau Rezki
2023-11-28  8:00 ` [PATCH v3 7/7] rcu: Add CONFIG_RCU_SR_NORMAL_DEBUG_GP Uladzislau Rezki (Sony)
2023-12-20  1:14   ` Paul E. McKenney
2023-12-21 10:27     ` Uladzislau Rezki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZYQY8bB3zpywfBxO@pc636 \
    --to=urezki@gmail.com \
    --cc=Neeraj.Upadhyay@amd.com \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=hdanton@sina.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.