Re: [PATCH v3 1/2] rcu/tree: Add basic support for kfree_rcu batching

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: linux-kernel@vger.kernel.org, Rao Shoaib <rao.shoaib@oracle.com>,
	max.byungchul.park@gmail.com, byungchul.park@lge.com,
	kernel-team@android.com, kernel-team@lge.com,
	Andrew Morton <akpm@linux-foundation.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Jonathan Corbet <corbet@lwn.net>,
	Josh Triplett <josh@joshtriplett.org>,
	Kees Cook <keescook@chromium.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	linux-doc@vger.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	rcu@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH v3 1/2] rcu/tree: Add basic support for kfree_rcu batching
Date: Wed, 14 Aug 2019 16:02:24 -0700	[thread overview]
Message-ID: <20190814230224.GB28441@linux.ibm.com> (raw)
In-Reply-To: <20190814223413.GB69375@google.com>

On Wed, Aug 14, 2019 at 06:34:13PM -0400, Joel Fernandes wrote:
> On Wed, Aug 14, 2019 at 11:44:29AM -0700, Paul E. McKenney wrote:
> > On Wed, Aug 14, 2019 at 01:22:33PM -0400, Joel Fernandes wrote:
> > > On Wed, Aug 14, 2019 at 10:38:17AM -0400, Joel Fernandes wrote:
> > > > On Tue, Aug 13, 2019 at 12:07:38PM -0700, Paul E. McKenney wrote:
> > >  [snip]
> > > > > > - * Queue an RCU callback for lazy invocation after a grace period.
> > > > > > - * This will likely be later named something like "call_rcu_lazy()",
> > > > > > - * but this change will require some way of tagging the lazy RCU
> > > > > > - * callbacks in the list of pending callbacks. Until then, this
> > > > > > - * function may only be called from __kfree_rcu().
> > > > > > + * Maximum number of kfree(s) to batch, if this limit is hit then the batch of
> > > > > > + * kfree(s) is queued for freeing after a grace period, right away.
> > > > > >   */
> > > > > > -void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > > > > +struct kfree_rcu_cpu {
> > > > > > +	/* The rcu_work node for queuing work with queue_rcu_work(). The work
> > > > > > +	 * is done after a grace period.
> > > > > > +	 */
> > > > > > +	struct rcu_work rcu_work;
> > > > > > +
> > > > > > +	/* The list of objects being queued in a batch but are not yet
> > > > > > +	 * scheduled to be freed.
> > > > > > +	 */
> > > > > > +	struct rcu_head *head;
> > > > > > +
> > > > > > +	/* The list of objects that have now left ->head and are queued for
> > > > > > +	 * freeing after a grace period.
> > > > > > +	 */
> > > > > > +	struct rcu_head *head_free;
> > > > > 
> > > > > So this is not yet the one that does multiple batches concurrently
> > > > > awaiting grace periods, correct?  Or am I missing something subtle?
> > > > 
> > > > Yes, it is not. I honestly, still did not understand that idea. Or how it
> > > > would improve things. May be we can discuss at LPC on pen and paper? But I
> > > > think that can also be a follow-up optimization.
> > > 
> > > I got it now. Basically we can benefit a bit more by having another list
> > > (that is have multiple kfree_rcu batches in flight). I will think more about
> > > it - but hopefully we don't need to gate this patch by that.
> > 
> > I am willing to take this as a later optimization.
> > 
> > > It'll be interesting to see what rcuperf says about such an improvement :)
> > 
> > Indeed, no guarantees either way.  The reason for hope assumes a busy
> > system where each grace period is immediately followed by another
> > grace period.  On such a system, the current setup allows each CPU to
> > make use only of every second grace period for its kfree_rcu() work.
> > The hope would therefore be that this would reduce the memory footprint
> > substantially with no increase in overhead.
> 
> Good news! I was able to bring down memory foot print by almost 30% by adding
> another batch. Below is the patch. Thanks for the suggestion!

Nice!

> I can add this as a patch on top of the initial one, for easier review.

Yes, please!

> The memory consumed drops from 300-350MB to 200-250MB. Increasing
> KFREE_N_BATCHES did not cause a reduction in memory, though.

OK, good to know.

						Thanx, Paul

> ---8<-----------------------
> 
> From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> Subject: [PATCH] WIP: Multiple batches
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/tree.c | 58 +++++++++++++++++++++++++++++++++--------------
>  1 file changed, 41 insertions(+), 17 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 1d1847cadea2..a272c893dbdc 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2596,26 +2596,35 @@ EXPORT_SYMBOL_GPL(call_rcu);
>  
>  /* Maximum number of jiffies to wait before draining a batch. */
>  #define KFREE_DRAIN_JIFFIES (HZ / 50)
> +#define KFREE_N_BATCHES 2
> +
> +struct kfree_rcu_work {
> +	/* The rcu_work node for queuing work with queue_rcu_work(). The work
> +	 * is done after a grace period.
> +	 */
> +	struct rcu_work rcu_work;
> +
> +	/* The list of objects that have now left ->head and are queued for
> +	 * freeing after a grace period.
> +	 */
> +	struct rcu_head *head_free;
> +
> +	struct kfree_rcu_cpu *krc;
> +};
> +static DEFINE_PER_CPU(__typeof__(struct kfree_rcu_work)[KFREE_N_BATCHES], krw);
>  
>  /*
>   * Maximum number of kfree(s) to batch, if this limit is hit then the batch of
>   * kfree(s) is queued for freeing after a grace period, right away.
>   */
>  struct kfree_rcu_cpu {
> -	/* The rcu_work node for queuing work with queue_rcu_work(). The work
> -	 * is done after a grace period.
> -	 */
> -	struct rcu_work rcu_work;
>  
>  	/* The list of objects being queued in a batch but are not yet
>  	 * scheduled to be freed.
>  	 */
>  	struct rcu_head *head;
>  
> -	/* The list of objects that have now left ->head and are queued for
> -	 * freeing after a grace period.
> -	 */
> -	struct rcu_head *head_free;
> +	struct kfree_rcu_work *krw;
>  
>  	/* Protect concurrent access to this structure. */
>  	spinlock_t lock;
> @@ -2638,12 +2647,15 @@ static void kfree_rcu_work(struct work_struct *work)
>  {
>  	unsigned long flags;
>  	struct rcu_head *head, *next;
> -	struct kfree_rcu_cpu *krcp = container_of(to_rcu_work(work),
> -					struct kfree_rcu_cpu, rcu_work);
> +	struct kfree_rcu_work *krw = container_of(to_rcu_work(work),
> +					struct kfree_rcu_work, rcu_work);
> +	struct kfree_rcu_cpu *krcp;
> +
> +	krcp = krw->krc;
>  
>  	spin_lock_irqsave(&krcp->lock, flags);
> -	head = krcp->head_free;
> -	krcp->head_free = NULL;
> +	head = krw->head_free;
> +	krw->head_free = NULL;
>  	spin_unlock_irqrestore(&krcp->lock, flags);
>  
>  	/*
> @@ -2666,19 +2678,30 @@ static void kfree_rcu_work(struct work_struct *work)
>   */
>  static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
>  {
> +	int i = 0;
> +	struct kfree_rcu_work *krw = NULL;
> +
>  	lockdep_assert_held(&krcp->lock);
> +	while (i < KFREE_N_BATCHES) {
> +		if (!krcp->krw[i].head_free) {
> +			krw = &(krcp->krw[i]);
> +			break;
> +		}
> +		i++;
> +	}
>  
> -	/* If a previous RCU batch work is already in progress, we cannot queue
> +	/* If both RCU batches are already in progress, we cannot queue
>  	 * another one, just refuse the optimization and it will be retried
>  	 * again in KFREE_DRAIN_JIFFIES time.
>  	 */
> -	if (krcp->head_free)
> +	if (!krw)
>  		return false;
>  
> -	krcp->head_free = krcp->head;
> +	krw->head_free = krcp->head;
> +	krw->krc = krcp;   /* Should need to do only once, optimize later. */
>  	krcp->head = NULL;
> -	INIT_RCU_WORK(&krcp->rcu_work, kfree_rcu_work);
> -	queue_rcu_work(system_wq, &krcp->rcu_work);
> +	INIT_RCU_WORK(&krw->rcu_work, kfree_rcu_work);
> +	queue_rcu_work(system_wq, &krw->rcu_work);
>  
>  	return true;
>  }
> @@ -3631,6 +3654,7 @@ static void __init kfree_rcu_batch_init(void)
>  		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
>  
>  		spin_lock_init(&krcp->lock);
> +		krcp->krw = &(per_cpu(krw, cpu)[0]);
>  		INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
>  	}
>  }
> -- 
> 2.23.0.rc1.153.gdeed80330f-goog
>

     prev parent reply	other threads:[~2019-08-14 23:02 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-13 17:00 [PATCH v3 1/2] rcu/tree: Add basic support for kfree_rcu batching Joel Fernandes (Google)
2019-08-13 17:00 ` [PATCH v3 2/2] rcuperf: Add kfree_rcu performance Tests Joel Fernandes (Google)
2019-08-13 19:07 ` [PATCH v3 1/2] rcu/tree: Add basic support for kfree_rcu batching Paul E. McKenney
2019-08-14 14:38   ` Joel Fernandes
2019-08-14 17:22     ` Joel Fernandes
2019-08-14 18:44       ` Paul E. McKenney
2019-08-14 22:34         ` Joel Fernandes
2019-08-14 23:02           ` Paul E. McKenney [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190814230224.GB28441@linux.ibm.com \
    --to=paulmck@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=byungchul.park@lge.com \
    --cc=corbet@lwn.net \
    --cc=dave@stgolabs.net \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=keescook@chromium.org \
    --cc=kernel-team@android.com \
    --cc=kernel-team@lge.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=max.byungchul.park@gmail.com \
    --cc=mchehab+samsung@kernel.org \
    --cc=rao.shoaib@oracle.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.