* Re: [patch 1/2] rcu batch tuning
@ 2006-01-27 19:57 Oleg Nesterov
  2006-01-27 23:42 ` Paul E. McKenney
  2006-01-28 17:08 ` Dipankar Sarma
  0 siblings, 2 replies; 12+ messages in thread
From: Oleg Nesterov @ 2006-01-27 19:57 UTC (permalink / raw)
To: Dipankar Sarma
Cc: linux-kernel, Paul E. McKenney, Andrew Morton, Linus Torvalds
Dipankar Sarma wrote:
>
> +static void force_quiescent_state(struct rcu_data *rdp,
> +			struct rcu_ctrlblk *rcp)
> +{
> +	int cpu;
> +	cpumask_t cpumask;
> +	set_need_resched();
> +	if (unlikely(rdp->qlen - rdp->last_rs_qlen > rsinterval)) {
> +		rdp->last_rs_qlen = rdp->qlen;
> +		/*
> +		 * Don't send IPI to itself. With irqs disabled,
> +		 * rdp->cpu is the current cpu.
> +		 */
> +		cpumask = rcp->cpumask;
> +		cpu_clear(rdp->cpu, cpumask);
> +		for_each_cpu_mask(cpu, cpumask)
> +			smp_send_reschedule(cpu);
> +	}
> +}
> [ ... snip ... ]
>
> @@ -92,17 +128,13 @@ void fastcall call_rcu(struct rcu_head *
> 	rdp = &__get_cpu_var(rcu_data);
> 	*rdp->nxttail = head;
> 	rdp->nxttail = &head->next;
> -
> -	if (unlikely(++rdp->count > 10000))
> -		set_need_resched();
> -
> +	if (unlikely(++rdp->qlen > qhimark)) {
> +		rdp->blimit = INT_MAX;
> +		force_quiescent_state(rdp, &rcu_ctrlblk);
> +	}
When ->qlen exceeds qhimark for the first time we send a reschedule IPI to
the other CPUs and force_quiescent_state() records ->last_rs_qlen = ->qlen.
But we don't reset ->last_rs_qlen when ->qlen goes to 0. This means that
next time we need ++rdp->qlen > qhimark + rsinterval to force the other CPUs
to pass quiescent state, no?
Also, it seems to me it's better to have 2 counters, one for length(->donelist)
and another for length(->curlist + ->nxtlist). I think we don't need
force_quiescent_state() when all rcu callbacks are placed in ->donelist;
we only need to increase rdp->blimit in this case.
Oleg.
^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch 1/2] rcu batch tuning
  2006-01-27 19:57 [patch 1/2] rcu batch tuning Oleg Nesterov
@ 2006-01-27 23:42 ` Paul E. McKenney
  2006-01-28 18:07   ` Oleg Nesterov
  2006-01-28 17:08 ` Dipankar Sarma
  1 sibling, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2006-01-27 23:42 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Dipankar Sarma, linux-kernel, Andrew Morton, Linus Torvalds

On Fri, Jan 27, 2006 at 10:57:59PM +0300, Oleg Nesterov wrote:
> Dipankar Sarma wrote:
> >
> > +static void force_quiescent_state(struct rcu_data *rdp,
> > +			struct rcu_ctrlblk *rcp)
> > +{
> > +	int cpu;
> > +	cpumask_t cpumask;
> > +	set_need_resched();
> > +	if (unlikely(rdp->qlen - rdp->last_rs_qlen > rsinterval)) {
> > +		rdp->last_rs_qlen = rdp->qlen;
> > +		/*
> > +		 * Don't send IPI to itself. With irqs disabled,
> > +		 * rdp->cpu is the current cpu.
> > +		 */
> > +		cpumask = rcp->cpumask;
> > +		cpu_clear(rdp->cpu, cpumask);
> > +		for_each_cpu_mask(cpu, cpumask)
> > +			smp_send_reschedule(cpu);
> > +	}
> > +}
> > [ ... snip ... ]
> >
> > @@ -92,17 +128,13 @@ void fastcall call_rcu(struct rcu_head *
> > 	rdp = &__get_cpu_var(rcu_data);
> > 	*rdp->nxttail = head;
> > 	rdp->nxttail = &head->next;
> > -
> > -	if (unlikely(++rdp->count > 10000))
> > -		set_need_resched();
> > -
> > +	if (unlikely(++rdp->qlen > qhimark)) {
> > +		rdp->blimit = INT_MAX;
> > +		force_quiescent_state(rdp, &rcu_ctrlblk);
> > +	}
>
> When ->qlen exceeds qhimark for the first time we send reschedule IPI to
> other CPUs and force_quiescent_state() records ->last_rs_qlen = ->qlen.
> But we don't reset ->last_rs_qlen when ->qlen goes to 0, this means that
> next time we need ++rdp->qlen > qhimark + rsinterval to force other CPUs
> to pass quiescent state, no?

Good catch -- this could well explain Lee's continuing to hit latency
problems.  Although this would not cause the first latency event, only
subsequent ones, it seems to me that ->last_rs_qlen should be reset
whenever ->blimit is reset.

Dipankar, thoughts?

> Also, it seems to me it's better to have 2 counters, one for length(->donelist)
> and another for length(->curlist + ->nxtlist). I think we don't need
> force_quiescent_state() when all rcu callbacks are placed in ->donelist,
> we only need to increase rdp->blimit in this case.

True, currently the patch keeps the sum of the length of all three lists,
and takes both actions when the sum gets too large.  But the only way
you would get unneeded IPIs would be if callback processing was
stalled, but callback generation and grace-period processing was
still proceeding.  Seems at first glance to be an unusual corner
case, with the only downside being some extra IPIs.  Or am I missing
some aspect?

						Thanx, Paul

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [patch 1/2] rcu batch tuning
  2006-01-27 23:42 ` Paul E. McKenney
@ 2006-01-28 18:07   ` Oleg Nesterov
  2006-01-30  3:30     ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2006-01-28 18:07 UTC (permalink / raw)
To: paulmck; +Cc: Dipankar Sarma, linux-kernel, Andrew Morton, Linus Torvalds

"Paul E. McKenney" wrote:
>
> On Fri, Jan 27, 2006 at 10:57:59PM +0300, Oleg Nesterov wrote:
> >
> > When ->qlen exceeds qhimark for the first time we send reschedule IPI to
> > other CPUs and force_quiescent_state() records ->last_rs_qlen = ->qlen.
> > But we don't reset ->last_rs_qlen when ->qlen goes to 0, this means that
> > next time we need ++rdp->qlen > qhimark + rsinterval to force other CPUs
> > to pass quiescent state, no?
>
> Good catch -- this could well explain Lee's continuing to hit
> latency problems.  Although this would not cause the first
> latency event, only subsequent ones, it seems to me that ->last_rs_qlen
> should be reset whenever ->blimit is reset.

Maybe it's better to do it another way?

	struct rcu_ctrlblk {
		...
		int	signaled;
		...
	};

	void force_quiescent_state(rdp, rcp)
	{
		if (!rcp->signaled) {	// racy, but tolerable
			rcp->signaled = 1;

			for_each_cpu_mask(cpu, cpumask)
				smp_send_reschedule(cpu);
		}
	}

	void rcu_start_batch(rcp, rdp)
	{
		if (->next_pending && ->completed == ->cur) {
			...
			rcp->signaled = 0;
			...
		}
	}

Probably it also makes sense to tasklet_schedule(rcu_tasklet)
in call_rcu() when ++rdp->qlen > qhimark; this way we can detect
that we need to start the next batch earlier.

> > Also, it seems to me it's better to have 2 counters, one for length(->donelist)
> > and another for length(->curlist + ->nxtlist). I think we don't need
> > force_quiescent_state() when all rcu callbacks are placed in ->donelist,
> > we only need to increase rdp->blimit in this case.
>
> True, currently the patch keeps the sum of the length of all three lists,
> and takes both actions when the sum gets too large.  But the only way
> you would get unneeded IPIs would be if callback processing was
> stalled, but callback generation and grace-period processing was
> still proceeding.  Seems at first glance to be an unusual corner
> case, with the only downside being some extra IPIs.  Or am I missing
> some aspect?

Yes, it is probably not worth complicating the code.

Oleg.

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [patch 1/2] rcu batch tuning
  2006-01-28 18:07   ` Oleg Nesterov
@ 2006-01-30  3:30     ` Paul E. McKenney
  0 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2006-01-30  3:30 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Dipankar Sarma, linux-kernel, Andrew Morton, Linus Torvalds

On Sat, Jan 28, 2006 at 09:07:13PM +0300, Oleg Nesterov wrote:
> "Paul E. McKenney" wrote:
> >
> > On Fri, Jan 27, 2006 at 10:57:59PM +0300, Oleg Nesterov wrote:
> > >
> > > When ->qlen exceeds qhimark for the first time we send reschedule IPI to
> > > other CPUs and force_quiescent_state() records ->last_rs_qlen = ->qlen.
> > > But we don't reset ->last_rs_qlen when ->qlen goes to 0, this means that
> > > next time we need ++rdp->qlen > qhimark + rsinterval to force other CPUs
> > > to pass quiescent state, no?
> >
> > Good catch -- this could well explain Lee's continuing to hit
> > latency problems.  Although this would not cause the first
> > latency event, only subsequent ones, it seems to me that ->last_rs_qlen
> > should be reset whenever ->blimit is reset.
>
> May be it's better to do it in other way?
>
> 	struct rcu_ctrlblk {
> 		...
> 		int	signaled;
> 		...
> 	};
>
> 	void force_quiescent_state(rdp, rcp)
> 	{
> 		if (!rcp->signaled) {	// racy, but tolerable
> 			rcp->signaled = 1;
>
> 			for_each_cpu_mask(cpu, cpumask)
> 				smp_send_reschedule(cpu);
> 		}
> 	}
>
> 	void rcu_start_batch(rcp, rdp)
> 	{
> 		if (->next_pending && ->completed == ->cur) {
> 			...
> 			rcp->signaled = 0;
> 			...
> 		}
> 	}

Possibly...  But the best thing would be for you and Dipankar to get
together to work out the best strategy for this.

						Thanx, Paul

> Probably it is also makes sense to tasklet_schedule(rcu_tasklet)
> in call_rcu() when ++rdp->qlen > qhimark, this way we can detect
> that we need to start the next batch earlier.
>
> > > Also, it seems to me it's better to have 2 counters, one for length(->donelist)
> > > and another for length(->curlist + ->nxtlist). I think we don't need
> > > force_quiescent_state() when all rcu callbacks are placed in ->donelist,
> > > we only need to increase rdp->blimit in this case.
> >
> > True, currently the patch keeps the sum of the length of all three lists,
> > and takes both actions when the sum gets too large.  But the only way
> > you would get unneeded IPIs would be if callback processing was
> > stalled, but callback generation and grace-period processing was
> > still proceeding.  Seems at first glance to be an unusual corner
> > case, with the only downside being some extra IPIs.  Or am I missing
> > some aspect?
>
> Yes, it is probably not worth to complicate the code.
>
> Oleg.

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [patch 1/2] rcu batch tuning
  2006-01-27 19:57 [patch 1/2] rcu batch tuning Oleg Nesterov
  2006-01-27 23:42 ` Paul E. McKenney
@ 2006-01-28 17:08 ` Dipankar Sarma
  1 sibling, 0 replies; 12+ messages in thread
From: Dipankar Sarma @ 2006-01-28 17:08 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, Paul E. McKenney, Andrew Morton, Linus Torvalds

On Fri, Jan 27, 2006 at 10:57:59PM +0300, Oleg Nesterov wrote:
> Dipankar Sarma wrote:
> >
> > @@ -92,17 +128,13 @@ void fastcall call_rcu(struct rcu_head *
> > 	rdp = &__get_cpu_var(rcu_data);
> > 	*rdp->nxttail = head;
> > 	rdp->nxttail = &head->next;
> > -
> > -	if (unlikely(++rdp->count > 10000))
> > -		set_need_resched();
> > -
> > +	if (unlikely(++rdp->qlen > qhimark)) {
> > +		rdp->blimit = INT_MAX;
> > +		force_quiescent_state(rdp, &rcu_ctrlblk);
> > +	}
>
> When ->qlen exceeds qhimark for the first time we send reschedule IPI to
> other CPUs and force_quiescent_state() records ->last_rs_qlen = ->qlen.
> But we don't reset ->last_rs_qlen when ->qlen goes to 0, this means that
> next time we need ++rdp->qlen > qhimark + rsinterval to force other CPUs
> to pass quiescent state, no?

Yes. I have fixed it in my code.

> Also, it seems to me it's better to have 2 counters, one for length(->donelist)
> and another for length(->curlist + ->nxtlist). I think we don't need
> force_quiescent_state() when all rcu callbacks are placed in ->donelist,
> we only need to increase rdp->blimit in this case.

We could, but I am not sure that ->qlen isn't a good measure of the
grace period not happening. In fact, in the past, long donelists often
resulted in longer grace periods. So, no harm in forcing a reschedule
and allowing ksoftirqd to relinquish the cpu.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 12+ messages in thread
* [PATCH 0/2] RCU updates
@ 2006-02-17 15:41 Dipankar Sarma
  2006-02-17 15:43 ` [PATCH 1/2] rcu batch tuning Dipankar Sarma
  0 siblings, 1 reply; 12+ messages in thread
From: Dipankar Sarma @ 2006-02-17 15:41 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Paul E.McKenney

Here are the patches that I have been testing that should help some of
the latency and OOM issues (file limit) that we had discussed in the
past. If the patchset looks ok, we should queue them up in -mm for
some testing before merging. I have lightly tested the patchset on
both ppc64 and x86_64 using ltp, dbench etc.

Update since the last time I published - the file counting stuff now
uses percpu_counter.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] rcu batch tuning
  2006-02-17 15:41 [PATCH 0/2] RCU updates Dipankar Sarma
@ 2006-02-17 15:43 ` Dipankar Sarma
  2006-02-17 20:33   ` Dipankar Sarma
  2006-02-18  8:45   ` Andrew Morton
  0 siblings, 2 replies; 12+ messages in thread
From: Dipankar Sarma @ 2006-02-17 15:43 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Paul E.McKenney

This patch adds new tunables for RCU queue and finished batches.
There are two types of controls - number of completed RCU updates
invoked in a batch (blimit) and monitoring for high rate of
incoming RCUs on a cpu (qhimark, qlowmark). By default,
the per-cpu batch limit is set to a small value. If
the input RCU rate exceeds the high watermark, we do two things -
force quiescent state on all cpus and set the batch limit
of the CPU to INTMAX. Setting batch limit to INTMAX forces all
finished RCUs to be processed in one shot. If we have more than
INTMAX RCUs queued up, then we have bigger problems anyway.
Once the incoming queued RCUs fall below the low watermark, the
batch limit is set to the default.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
---

 include/linux/rcupdate.h |    6 +++
 kernel/rcupdate.c        |   76 +++++++++++++++++++++++++++++++++++------------
 2 files changed, 63 insertions(+), 19 deletions(-)

diff -puN include/linux/rcupdate.h~rcu-batch-tuning include/linux/rcupdate.h
--- linux-2.6.16-rc3-rcu/include/linux/rcupdate.h~rcu-batch-tuning	2006-02-15 16:00:02.000000000 +0530
+++ linux-2.6.16-rc3-rcu-dipankar/include/linux/rcupdate.h	2006-02-15 16:00:02.000000000 +0530
@@ -98,13 +98,17 @@ struct rcu_data {
 	long		batch;		 /* Batch # for current RCU batch */
 	struct rcu_head *nxtlist;
 	struct rcu_head **nxttail;
-	long		count;		 /* # of queued items */
+	long		qlen;		 /* # of queued callbacks */
 	struct rcu_head *curlist;
 	struct rcu_head **curtail;
 	struct rcu_head *donelist;
 	struct rcu_head **donetail;
+	long		blimit;		 /* Upper limit on a processed batch */
 	int cpu;
 	struct rcu_head barrier;
+#ifdef CONFIG_SMP
+	long		last_rs_qlen;	 /* qlen during the last resched */
+#endif
 };

 DECLARE_PER_CPU(struct rcu_data, rcu_data);
diff -puN kernel/rcupdate.c~rcu-batch-tuning kernel/rcupdate.c
--- linux-2.6.16-rc3-rcu/kernel/rcupdate.c~rcu-batch-tuning	2006-02-15 16:00:02.000000000 +0530
+++ linux-2.6.16-rc3-rcu-dipankar/kernel/rcupdate.c	2006-02-15 16:00:02.000000000 +0530
@@ -67,7 +67,43 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_d

 /* Fake initialization required by compiler */
 static DEFINE_PER_CPU(struct tasklet_struct, rcu_tasklet) = {NULL};
-static int maxbatch = 10000;
+static int blimit = 10;
+static int qhimark = 10000;
+static int qlowmark = 100;
+#ifdef CONFIG_SMP
+static int rsinterval = 1000;
+#endif
+
+static atomic_t rcu_barrier_cpu_count;
+static struct semaphore rcu_barrier_sema;
+static struct completion rcu_barrier_completion;
+
+#ifdef CONFIG_SMP
+static void force_quiescent_state(struct rcu_data *rdp,
+			struct rcu_ctrlblk *rcp)
+{
+	int cpu;
+	cpumask_t cpumask;
+	set_need_resched();
+	if (unlikely(rdp->qlen - rdp->last_rs_qlen > rsinterval)) {
+		rdp->last_rs_qlen = rdp->qlen;
+		/*
+		 * Don't send IPI to itself. With irqs disabled,
+		 * rdp->cpu is the current cpu.
+		 */
+		cpumask = rcp->cpumask;
+		cpu_clear(rdp->cpu, cpumask);
+		for_each_cpu_mask(cpu, cpumask)
+			smp_send_reschedule(cpu);
+	}
+}
+#else
+static inline void force_quiescent_state(struct rcu_data *rdp,
+			struct rcu_ctrlblk *rcp)
+{
+	set_need_resched();
+}
+#endif

 /**
  * call_rcu - Queue an RCU callback for invocation after a grace period.
@@ -92,17 +128,13 @@ void fastcall call_rcu(struct rcu_head *
 	rdp = &__get_cpu_var(rcu_data);
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;
-
-	if (unlikely(++rdp->count > 10000))
-		set_need_resched();
-
+	if (unlikely(++rdp->qlen > qhimark)) {
+		rdp->blimit = INT_MAX;
+		force_quiescent_state(rdp, &rcu_ctrlblk);
+	}
 	local_irq_restore(flags);
 }

-static atomic_t rcu_barrier_cpu_count;
-static struct semaphore rcu_barrier_sema;
-static struct completion rcu_barrier_completion;
-
 /**
  * call_rcu_bh - Queue an RCU for invocation after a quicker grace period.
  * @head: structure to be used for queueing the RCU updates.
@@ -131,12 +163,12 @@ void fastcall call_rcu_bh(struct rcu_hea
 	rdp = &__get_cpu_var(rcu_bh_data);
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;
-	rdp->count++;
-/*
- *  Should we directly call rcu_do_batch() here ?
- *  if (unlikely(rdp->count > 10000))
- *      rcu_do_batch(rdp);
- */
+
+	if (unlikely(++rdp->qlen > qhimark)) {
+		rdp->blimit = INT_MAX;
+		force_quiescent_state(rdp, &rcu_bh_ctrlblk);
+	}
+
 	local_irq_restore(flags);
 }

@@ -199,10 +231,12 @@ static void rcu_do_batch(struct rcu_data
 		next = rdp->donelist = list->next;
 		list->func(list);
 		list = next;
-		rdp->count--;
-		if (++count >= maxbatch)
+		rdp->qlen--;
+		if (++count >= rdp->blimit)
 			break;
 	}
+	if (rdp->blimit == INT_MAX && rdp->qlen <= qlowmark)
+		rdp->blimit = blimit;
 	if (!rdp->donelist)
 		rdp->donetail = &rdp->donelist;
 	else
@@ -473,6 +507,7 @@ static void rcu_init_percpu_data(int cpu
 	rdp->quiescbatch = rcp->completed;
 	rdp->qs_pending = 0;
 	rdp->cpu = cpu;
+	rdp->blimit = blimit;
 }

 static void __devinit rcu_online_cpu(int cpu)
@@ -567,7 +602,12 @@ void synchronize_kernel(void)
 	synchronize_rcu();
 }

-module_param(maxbatch, int, 0);
+module_param(blimit, int, 0);
+module_param(qhimark, int, 0);
+module_param(qlowmark, int, 0);
+#ifdef CONFIG_SMP
+module_param(rsinterval, int, 0);
+#endif
 EXPORT_SYMBOL_GPL(rcu_batches_completed);
 EXPORT_SYMBOL(call_rcu);  /* WARNING: GPL-only in April 2006. */
 EXPORT_SYMBOL(call_rcu_bh);  /* WARNING: GPL-only in April 2006. */
_

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] rcu batch tuning
  2006-02-17 15:43 ` [PATCH 1/2] rcu batch tuning Dipankar Sarma
@ 2006-02-17 20:33   ` Dipankar Sarma
  2006-02-18  8:45   ` Andrew Morton
  1 sibling, 0 replies; 12+ messages in thread
From: Dipankar Sarma @ 2006-02-17 20:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Paul E.McKenney

Andrew,

The earlier version of this patch was against 2.6.16-rc3 and didn't
apply cleanly to 2.6.16-rc3-mm1. This version applies cleanly to
2.6.16-rc3-mm1.

This patch adds new tunables for RCU queue and finished batches.
There are two types of controls - number of completed RCU updates
invoked in a batch (blimit) and monitoring for high rate of
incoming RCUs on a cpu (qhimark, qlowmark). By default,
the per-cpu batch limit is set to a small value. If
the input RCU rate exceeds the high watermark, we do two things -
force quiescent state on all cpus and set the batch limit
of the CPU to INTMAX. Setting batch limit to INTMAX forces all
finished RCUs to be processed in one shot. If we have more than
INTMAX RCUs queued up, then we have bigger problems anyway.
Once the incoming queued RCUs fall below the low watermark, the
batch limit is set to the default.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
---

 include/linux/rcupdate.h |    6 +++
 kernel/rcupdate.c        |   74 ++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 63 insertions(+), 17 deletions(-)

diff -puN include/linux/rcupdate.h~rcu-batch-tuning include/linux/rcupdate.h
--- linux-2.6.16-rc3-mm1-rcu/include/linux/rcupdate.h~rcu-batch-tuning	2006-02-18 01:18:24.000000000 +0530
+++ linux-2.6.16-rc3-mm1-rcu-dipankar/include/linux/rcupdate.h	2006-02-18 01:18:24.000000000 +0530
@@ -98,13 +98,17 @@ struct rcu_data {
 	long		batch;		 /* Batch # for current RCU batch */
 	struct rcu_head *nxtlist;
 	struct rcu_head **nxttail;
-	long		count;		 /* # of queued items */
+	long		qlen;		 /* # of queued callbacks */
 	struct rcu_head *curlist;
 	struct rcu_head **curtail;
 	struct rcu_head *donelist;
 	struct rcu_head **donetail;
+	long		blimit;		 /* Upper limit on a processed batch */
 	int cpu;
 	struct rcu_head barrier;
+#ifdef CONFIG_SMP
+	long		last_rs_qlen;	 /* qlen during the last resched */
+#endif
 };

 DECLARE_PER_CPU(struct rcu_data, rcu_data);
diff -puN kernel/rcupdate.c~rcu-batch-tuning kernel/rcupdate.c
--- linux-2.6.16-rc3-mm1-rcu/kernel/rcupdate.c~rcu-batch-tuning	2006-02-18 01:18:24.000000000 +0530
+++ linux-2.6.16-rc3-mm1-rcu-dipankar/kernel/rcupdate.c	2006-02-18 01:24:07.000000000 +0530
@@ -68,7 +68,43 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_d

 /* Fake initialization required by compiler */
 static DEFINE_PER_CPU(struct tasklet_struct, rcu_tasklet) = {NULL};
-static int maxbatch = 10000;
+static int blimit = 10;
+static int qhimark = 10000;
+static int qlowmark = 100;
+#ifdef CONFIG_SMP
+static int rsinterval = 1000;
+#endif
+
+static atomic_t rcu_barrier_cpu_count;
+static DEFINE_MUTEX(rcu_barrier_mutex);
+static struct completion rcu_barrier_completion;
+
+#ifdef CONFIG_SMP
+static void force_quiescent_state(struct rcu_data *rdp,
+			struct rcu_ctrlblk *rcp)
+{
+	int cpu;
+	cpumask_t cpumask;
+	set_need_resched();
+	if (unlikely(rdp->qlen - rdp->last_rs_qlen > rsinterval)) {
+		rdp->last_rs_qlen = rdp->qlen;
+		/*
+		 * Don't send IPI to itself. With irqs disabled,
+		 * rdp->cpu is the current cpu.
+		 */
+		cpumask = rcp->cpumask;
+		cpu_clear(rdp->cpu, cpumask);
+		for_each_cpu_mask(cpu, cpumask)
+			smp_send_reschedule(cpu);
+	}
+}
+#else
+static inline void force_quiescent_state(struct rcu_data *rdp,
+			struct rcu_ctrlblk *rcp)
+{
+	set_need_resched();
+}
+#endif

 /**
  * call_rcu - Queue an RCU callback for invocation after a grace period.
@@ -94,16 +130,14 @@ void fastcall call_rcu(struct rcu_head *
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;

-	if (unlikely(++rdp->count > 10000))
-		set_need_resched();
+	if (unlikely(++rdp->qlen > qhimark)) {
+		rdp->blimit = INT_MAX;
+		force_quiescent_state(rdp, &rcu_ctrlblk);
+	}

 	local_irq_restore(flags);
 }

-static atomic_t rcu_barrier_cpu_count;
-static DEFINE_MUTEX(rcu_barrier_mutex);
-static struct completion rcu_barrier_completion;
-
 /**
  * call_rcu_bh - Queue an RCU for invocation after a quicker grace period.
  * @head: structure to be used for queueing the RCU updates.
@@ -132,12 +166,12 @@ void fastcall call_rcu_bh(struct rcu_hea
 	rdp = &__get_cpu_var(rcu_bh_data);
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;
-	rdp->count++;
-/*
- *  Should we directly call rcu_do_batch() here ?
- *  if (unlikely(rdp->count > 10000))
- *      rcu_do_batch(rdp);
- */
+
+	if (unlikely(++rdp->qlen > qhimark)) {
+		rdp->blimit = INT_MAX;
+		force_quiescent_state(rdp, &rcu_bh_ctrlblk);
+	}
+
 	local_irq_restore(flags);
 }

@@ -200,10 +234,12 @@ static void rcu_do_batch(struct rcu_data
 		next = rdp->donelist = list->next;
 		list->func(list);
 		list = next;
-		rdp->count--;
-		if (++count >= maxbatch)
+		rdp->qlen--;
+		if (++count >= rdp->blimit)
 			break;
 	}
+	if (rdp->blimit == INT_MAX && rdp->qlen <= qlowmark)
+		rdp->blimit = blimit;
 	if (!rdp->donelist)
 		rdp->donetail = &rdp->donelist;
 	else
@@ -473,6 +509,7 @@ static void rcu_init_percpu_data(int cpu
 	rdp->quiescbatch = rcp->completed;
 	rdp->qs_pending = 0;
 	rdp->cpu = cpu;
+	rdp->blimit = blimit;
 }

 static void __devinit rcu_online_cpu(int cpu)
@@ -566,7 +603,12 @@ void synchronize_kernel(void)
 	synchronize_rcu();
 }

-module_param(maxbatch, int, 0);
+module_param(blimit, int, 0);
+module_param(qhimark, int, 0);
+module_param(qlowmark, int, 0);
+#ifdef CONFIG_SMP
+module_param(rsinterval, int, 0);
+#endif
 EXPORT_SYMBOL_GPL(rcu_batches_completed);
 EXPORT_SYMBOL_GPL_FUTURE(call_rcu);  /* WARNING: GPL-only in April 2006. */
 EXPORT_SYMBOL_GPL_FUTURE(call_rcu_bh);  /* WARNING: GPL-only in April 2006. */
_

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] rcu batch tuning
  2006-02-17 15:43 ` [PATCH 1/2] rcu batch tuning Dipankar Sarma
  2006-02-17 20:33   ` Dipankar Sarma
@ 2006-02-18  8:45   ` Andrew Morton
  2006-02-18  9:15     ` Dipankar Sarma
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2006-02-18  8:45 UTC (permalink / raw)
To: dipankar; +Cc: linux-kernel, paulmck

Dipankar Sarma <dipankar@in.ibm.com> wrote:
>
> -module_param(maxbatch, int, 0);
> +module_param(blimit, int, 0);
> +module_param(qhimark, int, 0);
> +module_param(qlowmark, int, 0);
> +#ifdef CONFIG_SMP
> +module_param(rsinterval, int, 0);
> +#endif

It's a bit unusual to add boot-time tunables via module_param, but there's
no law against it.

But you do get arrested for not adding them to
Documentation/kernel-parameters.txt.  That's if you think they're permanent
(I hope they aren't).  If they are, they'll probably need a more extensive
description than kernel-parameters.txt entries normally provide.

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] rcu batch tuning
  2006-02-18  8:45   ` Andrew Morton
@ 2006-02-18  9:15     ` Dipankar Sarma
  0 siblings, 0 replies; 12+ messages in thread
From: Dipankar Sarma @ 2006-02-18  9:15 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, paulmck

On Sat, Feb 18, 2006 at 12:45:51AM -0800, Andrew Morton wrote:
> Dipankar Sarma <dipankar@in.ibm.com> wrote:
> >
> > -module_param(maxbatch, int, 0);
> > +module_param(blimit, int, 0);
> > +module_param(qhimark, int, 0);
> > +module_param(qlowmark, int, 0);
> > +#ifdef CONFIG_SMP
> > +module_param(rsinterval, int, 0);
> > +#endif
>
> It's a bit unusual to add boot-time tunables via module_param, but there's
> no law against it.
>
> But you do get arrested for not adding them to
> Documentation/kernel-parameters.txt.  That's if you think they're permanent
> (I hope they aren't).  If they are, they'll probably need a more extensive
> description than kernel-parameters.txt entries normally provide.

I hope that we will not need that many tunables eventually. But my
theory has been that widespread use of, and experimentation with, the
tunables in the event of OOM and latency problems will allow us to
figure out what kind of automatic tuning really works. Regardless, I
think there should be documentation. I will send a documentation patch.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 12+ messages in thread
* [patch 0/2] RCU: fix various latency/oom issues
@ 2006-01-26 18:40 Dipankar Sarma
  2006-01-26 18:41 ` [patch 1/2] rcu batch tuning Dipankar Sarma
  0 siblings, 1 reply; 12+ messages in thread
From: Dipankar Sarma @ 2006-01-26 18:40 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linus Torvalds, Paul E.McKenney, linux-kernel

Here are the patches that I have been testing that should help some of
the latency and OOM issues (file limit) that we had discussed in the
past. If the patchset looks ok, we should probably queue them up in -mm
for some testing before merging. I have lightly tested the patchset on
both ppc64 and x86_64 using ltp, dbench etc.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [patch 1/2] rcu batch tuning
  2006-01-26 18:40 [patch 0/2] RCU: fix various latency/oom issues Dipankar Sarma
@ 2006-01-26 18:41 ` Dipankar Sarma
  2006-01-26 19:33   ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Dipankar Sarma @ 2006-01-26 18:41 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linus Torvalds, Paul E.McKenney, linux-kernel

This patch adds new tunables for RCU queue and finished batches.
There are two types of controls - number of completed RCU updates
invoked in a batch (blimit) and monitoring for high rate of
incoming RCUs on a cpu (qhimark, qlowmark). By default,
the per-cpu batch limit is set to a small value. If
the input RCU rate exceeds the high watermark, we do two things -
force quiescent state on all cpus and set the batch limit
of the CPU to INTMAX. Setting batch limit to INTMAX forces all
finished RCUs to be processed in one shot. If we have more than
INTMAX RCUs queued up, then we have bigger problems anyway.
Once the incoming queued RCUs fall below the low watermark, the
batch limit is set to the default.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
---

 include/linux/rcupdate.h |    6 +++
 kernel/rcupdate.c        |   76 +++++++++++++++++++++++++++++++++++------------
 2 files changed, 63 insertions(+), 19 deletions(-)

diff -puN include/linux/rcupdate.h~rcu-batch-tuning include/linux/rcupdate.h
--- linux-2.6.16-rc1-rcu/include/linux/rcupdate.h~rcu-batch-tuning	2006-01-25 00:09:54.000000000 +0530
+++ linux-2.6.16-rc1-rcu-dipankar/include/linux/rcupdate.h	2006-01-25 01:07:39.000000000 +0530
@@ -98,13 +98,17 @@ struct rcu_data {
 	long		batch;		 /* Batch # for current RCU batch */
 	struct rcu_head *nxtlist;
 	struct rcu_head **nxttail;
-	long		count;		 /* # of queued items */
+	long		qlen;		 /* # of queued callbacks */
 	struct rcu_head *curlist;
 	struct rcu_head **curtail;
 	struct rcu_head *donelist;
 	struct rcu_head **donetail;
+	long		blimit;		 /* Upper limit on a processed batch */
 	int cpu;
 	struct rcu_head barrier;
+#ifdef CONFIG_SMP
+	long		last_rs_qlen;	 /* qlen during the last resched */
+#endif
 };

 DECLARE_PER_CPU(struct rcu_data, rcu_data);
diff -puN kernel/rcupdate.c~rcu-batch-tuning kernel/rcupdate.c
--- linux-2.6.16-rc1-rcu/kernel/rcupdate.c~rcu-batch-tuning	2006-01-25 00:09:54.000000000 +0530
+++ linux-2.6.16-rc1-rcu-dipankar/kernel/rcupdate.c	2006-01-25 23:08:03.000000000 +0530
@@ -67,7 +67,43 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_d

 /* Fake initialization required by compiler */
 static DEFINE_PER_CPU(struct tasklet_struct, rcu_tasklet) = {NULL};
-static int maxbatch = 10000;
+static int blimit = 10;
+static int qhimark = 10000;
+static int qlowmark = 100;
+#ifdef CONFIG_SMP
+static int rsinterval = 1000;
+#endif
+
+static atomic_t rcu_barrier_cpu_count;
+static struct semaphore rcu_barrier_sema;
+static struct completion rcu_barrier_completion;
+
+#ifdef CONFIG_SMP
+static void force_quiescent_state(struct rcu_data *rdp,
+			struct rcu_ctrlblk *rcp)
+{
+	int cpu;
+	cpumask_t cpumask;
+	set_need_resched();
+	if (unlikely(rdp->qlen - rdp->last_rs_qlen > rsinterval)) {
+		rdp->last_rs_qlen = rdp->qlen;
+		/*
+		 * Don't send IPI to itself. With irqs disabled,
+		 * rdp->cpu is the current cpu.
+		 */
+		cpumask = rcp->cpumask;
+		cpu_clear(rdp->cpu, cpumask);
+		for_each_cpu_mask(cpu, cpumask)
+			smp_send_reschedule(cpu);
+	}
+}
+#else
+static inline void force_quiescent_state(struct rcu_data *rdp,
+			struct rcu_ctrlblk *rcp)
+{
+	set_need_resched();
+}
+#endif

 /**
  * call_rcu - Queue an RCU callback for invocation after a grace period.
@@ -92,17 +128,13 @@ void fastcall call_rcu(struct rcu_head *
 	rdp = &__get_cpu_var(rcu_data);
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;
-
-	if (unlikely(++rdp->count > 10000))
-		set_need_resched();
-
+	if (unlikely(++rdp->qlen > qhimark)) {
+		rdp->blimit = INT_MAX;
+		force_quiescent_state(rdp, &rcu_ctrlblk);
+	}
 	local_irq_restore(flags);
 }

-static atomic_t rcu_barrier_cpu_count;
-static struct semaphore rcu_barrier_sema;
-static struct completion rcu_barrier_completion;
-
 /**
  * call_rcu_bh - Queue an RCU for invocation after a quicker grace period.
  * @head: structure to be used for queueing the RCU updates.
@@ -131,12 +163,12 @@ void fastcall call_rcu_bh(struct rcu_hea
 	rdp = &__get_cpu_var(rcu_bh_data);
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;
-	rdp->count++;
-/*
- *  Should we directly call rcu_do_batch() here ?
- *  if (unlikely(rdp->count > 10000))
- *      rcu_do_batch(rdp);
- */
+
+	if (unlikely(++rdp->qlen > qhimark)) {
+		rdp->blimit = INT_MAX;
+		force_quiescent_state(rdp, &rcu_bh_ctrlblk);
+	}
+
 	local_irq_restore(flags);
 }

@@ -199,10 +231,12 @@ static void rcu_do_batch(struct rcu_data
 		next = rdp->donelist = list->next;
 		list->func(list);
 		list = next;
-		rdp->count--;
-		if (++count >= maxbatch)
+		rdp->qlen--;
+		if (++count >= rdp->blimit)
 			break;
 	}
+	if (rdp->blimit == INT_MAX && rdp->qlen <= qlowmark)
+		rdp->blimit = blimit;
 	if (!rdp->donelist)
 		rdp->donetail = &rdp->donelist;
 	else
@@ -473,6 +507,7 @@ static void rcu_init_percpu_data(int cpu
 	rdp->quiescbatch = rcp->completed;
 	rdp->qs_pending = 0;
 	rdp->cpu = cpu;
+	rdp->blimit = blimit;
 }

 static void __devinit rcu_online_cpu(int cpu)
@@ -567,7 +602,12 @@ void synchronize_kernel(void)
 	synchronize_rcu();
 }

-module_param(maxbatch, int, 0);
+module_param(blimit, int, 0);
+module_param(qhimark, int, 0);
+module_param(qlowmark, int, 0);
+#ifdef CONFIG_SMP
+module_param(rsinterval, int, 0);
+#endif
 EXPORT_SYMBOL_GPL(rcu_batches_completed);
 EXPORT_SYMBOL(call_rcu);  /* WARNING: GPL-only in April 2006. */
 EXPORT_SYMBOL(call_rcu_bh);  /* WARNING: GPL-only in April 2006. */
_

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [patch 1/2] rcu batch tuning
  2006-01-26 18:41 ` [patch 1/2] rcu batch tuning Dipankar Sarma
@ 2006-01-26 19:33   ` Paul E. McKenney
  2006-01-26 19:42     ` Dipankar Sarma
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2006-01-26 19:33 UTC (permalink / raw)
  To: Dipankar Sarma; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

On Fri, Jan 27, 2006 at 12:11:27AM +0530, Dipankar Sarma wrote:
> 
> This patch adds new tunables for RCU queue and finished batches.
> There are two types of controls - number of completed RCU updates
> invoked in a batch (blimit) and monitoring for high rate of
> incoming RCUs on a cpu (qhimark, qlowmark). By default,
> the per-cpu batch limit is set to a small value. If
> the input RCU rate exceeds the high watermark, we do two things -
> force quiescent state on all cpus and set the batch limit
> of the CPU to INT_MAX. Setting the batch limit to INT_MAX forces all
> finished RCUs to be processed in one shot. If we have more than
> INT_MAX RCUs queued up, then we have bigger problems anyway.
> Once the incoming queued RCUs fall below the low watermark, the
> batch limit is set back to the default.

Looks good to me!  We might have to have more sophisticated adjustment
of blimit, but starting simple is definitely the right way to go.
						Thanx, Paul

Acked-by: <paulmck@us.ibm.com>

> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
> ---
> 
> 
>  include/linux/rcupdate.h |    6 +++
>  kernel/rcupdate.c        |   76 +++++++++++++++++++++++++++++++++++------------
>  2 files changed, 63 insertions(+), 19 deletions(-)
> 
> diff -puN include/linux/rcupdate.h~rcu-batch-tuning include/linux/rcupdate.h
> --- linux-2.6.16-rc1-rcu/include/linux/rcupdate.h~rcu-batch-tuning	2006-01-25 00:09:54.000000000 +0530
> +++ linux-2.6.16-rc1-rcu-dipankar/include/linux/rcupdate.h	2006-01-25 01:07:39.000000000 +0530
> @@ -98,13 +98,17 @@ struct rcu_data {
> 	long		batch;		/* Batch # for current RCU batch */
> 	struct rcu_head *nxtlist;
> 	struct rcu_head **nxttail;
> -	long		count;		/* # of queued items */
> +	long		qlen;		/* # of queued callbacks */
> 	struct rcu_head *curlist;
> 	struct rcu_head **curtail;
> 	struct rcu_head *donelist;
> 	struct rcu_head **donetail;
> +	long		blimit;		/* Upper limit on a processed batch */
> 	int cpu;
> 	struct rcu_head barrier;
> +#ifdef CONFIG_SMP
> +	long		last_rs_qlen;	/* qlen during the last resched */
> +#endif
> };
> 
> DECLARE_PER_CPU(struct rcu_data, rcu_data);
> diff -puN kernel/rcupdate.c~rcu-batch-tuning kernel/rcupdate.c
> --- linux-2.6.16-rc1-rcu/kernel/rcupdate.c~rcu-batch-tuning	2006-01-25 00:09:54.000000000 +0530
> +++ linux-2.6.16-rc1-rcu-dipankar/kernel/rcupdate.c	2006-01-25 23:08:03.000000000 +0530
> @@ -67,7 +67,43 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_d
> 
> /* Fake initialization required by compiler */
> static DEFINE_PER_CPU(struct tasklet_struct, rcu_tasklet) = {NULL};
> -static int maxbatch = 10000;
> +static int blimit = 10;
> +static int qhimark = 10000;
> +static int qlowmark = 100;
> +#ifdef CONFIG_SMP
> +static int rsinterval = 1000;
> +#endif
> +
> +static atomic_t rcu_barrier_cpu_count;
> +static struct semaphore rcu_barrier_sema;
> +static struct completion rcu_barrier_completion;
> +
> +#ifdef CONFIG_SMP
> +static void force_quiescent_state(struct rcu_data *rdp,
> +			struct rcu_ctrlblk *rcp)
> +{
> +	int cpu;
> +	cpumask_t cpumask;
> +	set_need_resched();
> +	if (unlikely(rdp->qlen - rdp->last_rs_qlen > rsinterval)) {
> +		rdp->last_rs_qlen = rdp->qlen;
> +		/*
> +		 * Don't send IPI to itself. With irqs disabled,
> +		 * rdp->cpu is the current cpu.
> +		 */
> +		cpumask = rcp->cpumask;
> +		cpu_clear(rdp->cpu, cpumask);
> +		for_each_cpu_mask(cpu, cpumask)
> +			smp_send_reschedule(cpu);
> +	}
> +}
> +#else
> +static inline void force_quiescent_state(struct rcu_data *rdp,
> +			struct rcu_ctrlblk *rcp)
> +{
> +	set_need_resched();
> +}
> +#endif
> 
> /**
>  * call_rcu - Queue an RCU callback for invocation after a grace period.
> @@ -92,17 +128,13 @@ void fastcall call_rcu(struct rcu_head *
> 	rdp = &__get_cpu_var(rcu_data);
> 	*rdp->nxttail = head;
> 	rdp->nxttail = &head->next;
> -
> -	if (unlikely(++rdp->count > 10000))
> -		set_need_resched();
> -
> +	if (unlikely(++rdp->qlen > qhimark)) {
> +		rdp->blimit = INT_MAX;
> +		force_quiescent_state(rdp, &rcu_ctrlblk);
> +	}
> 	local_irq_restore(flags);
> }
> 
> -static atomic_t rcu_barrier_cpu_count;
> -static struct semaphore rcu_barrier_sema;
> -static struct completion rcu_barrier_completion;
> -
> /**
>  * call_rcu_bh - Queue an RCU for invocation after a quicker grace period.
>  * @head: structure to be used for queueing the RCU updates.
> @@ -131,12 +163,12 @@ void fastcall call_rcu_bh(struct rcu_hea
> 	rdp = &__get_cpu_var(rcu_bh_data);
> 	*rdp->nxttail = head;
> 	rdp->nxttail = &head->next;
> -	rdp->count++;
> -/*
> - * Should we directly call rcu_do_batch() here ?
> - * if (unlikely(rdp->count > 10000))
> - *	rcu_do_batch(rdp);
> - */
> +
> +	if (unlikely(++rdp->qlen > qhimark)) {
> +		rdp->blimit = INT_MAX;
> +		force_quiescent_state(rdp, &rcu_bh_ctrlblk);
> +	}
> +
> 	local_irq_restore(flags);
> }
> 
> @@ -199,10 +231,12 @@ static void rcu_do_batch(struct rcu_data
> 		next = rdp->donelist = list->next;
> 		list->func(list);
> 		list = next;
> -		rdp->count--;
> -		if (++count >= maxbatch)
> +		rdp->qlen--;
> +		if (++count >= rdp->blimit)
> 			break;
> 	}
> +	if (rdp->blimit == INT_MAX && rdp->qlen <= qlowmark)
> +		rdp->blimit = blimit;
> 	if (!rdp->donelist)
> 		rdp->donetail = &rdp->donelist;
> 	else
> @@ -473,6 +507,7 @@ static void rcu_init_percpu_data(int cpu
> 	rdp->quiescbatch = rcp->completed;
> 	rdp->qs_pending = 0;
> 	rdp->cpu = cpu;
> +	rdp->blimit = blimit;
> }
> 
> static void __devinit rcu_online_cpu(int cpu)
> @@ -567,7 +602,12 @@ void synchronize_kernel(void)
> 	synchronize_rcu();
> }
> 
> -module_param(maxbatch, int, 0);
> +module_param(blimit, int, 0);
> +module_param(qhimark, int, 0);
> +module_param(qlowmark, int, 0);
> +#ifdef CONFIG_SMP
> +module_param(rsinterval, int, 0);
> +#endif
> EXPORT_SYMBOL_GPL(rcu_batches_completed);
> EXPORT_SYMBOL(call_rcu);  /* WARNING: GPL-only in April 2006. */
> EXPORT_SYMBOL(call_rcu_bh);  /* WARNING: GPL-only in April 2006. */
> 
> _
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread
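The watermark scheme described in the changelog quoted above boils down to a small per-CPU state machine: cross `qhimark` and batches become unlimited; drain below `qlowmark` and the limit snaps back to the default. A hedged user-space sketch follows -- the `model_` names are invented for illustration, only the default constants come from the patch, and this is not the kernel implementation.

```c
#include <assert.h>
#include <limits.h>

/* Toy model of the blimit/qhimark/qlowmark state machine; constants are
 * the patch's defaults, everything else is illustrative. */
enum { MODEL_BLIMIT = 10, MODEL_QHIMARK = 10000, MODEL_QLOWMARK = 100 };

struct model_cpu {
	long qlen;	/* # of queued callbacks */
	long blimit;	/* upper limit on a processed batch */
};

/* Mirrors the call_rcu() hot path: crossing the high watermark switches
 * this CPU to effectively unlimited batches. */
static void model_enqueue(struct model_cpu *c)
{
	if (++c->qlen > MODEL_QHIMARK)
		c->blimit = INT_MAX;
}

/* Mirrors rcu_do_batch(): invoke at most blimit callbacks, then fall back
 * to the default limit once the queue is at or below the low watermark.
 * Returns the number of callbacks invoked. */
static long model_do_batch(struct model_cpu *c)
{
	long count = 0;

	while (c->qlen > 0 && count < c->blimit) {
		c->qlen--;
		count++;
	}
	if (c->blimit == INT_MAX && c->qlen <= MODEL_QLOWMARK)
		c->blimit = MODEL_BLIMIT;
	return count;
}
```

Enqueueing one callback past the high watermark flips the model's limit to INT_MAX, the next batch drains the whole backlog in one shot, and the limit then returns to the small default -- the behavior the changelog describes.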
* Re: [patch 1/2] rcu batch tuning
  2006-01-26 19:33 ` Paul E. McKenney
@ 2006-01-26 19:42   ` Dipankar Sarma
  0 siblings, 0 replies; 12+ messages in thread
From: Dipankar Sarma @ 2006-01-26 19:42 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

On Thu, Jan 26, 2006 at 11:33:17AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 27, 2006 at 12:11:27AM +0530, Dipankar Sarma wrote:
> > 
> > This patch adds new tunables for RCU queue and finished batches.
> > There are two types of controls - number of completed RCU updates
> > invoked in a batch (blimit) and monitoring for high rate of
> > incoming RCUs on a cpu (qhimark, qlowmark). By default,
> > the per-cpu batch limit is set to a small value. If
> > the input RCU rate exceeds the high watermark, we do two things -
> > force quiescent state on all cpus and set the batch limit
> > of the CPU to INT_MAX. Setting the batch limit to INT_MAX forces all
> > finished RCUs to be processed in one shot. If we have more than
> > INT_MAX RCUs queued up, then we have bigger problems anyway.
> > Once the incoming queued RCUs fall below the low watermark, the
> > batch limit is set back to the default.
> 
> Looks good to me!  We might have to have more sophisticated adjustment
> of blimit, but starting simple is definitely the right way to go.

Yes. Once I am set up to do latency measurements, I will look at
the possible need for more sophisticated adjustment of blimit.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 12+ messages in thread
end of thread, other threads:[~2006-02-18  9:16 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-27 19:57 [patch 1/2] rcu batch tuning Oleg Nesterov
2006-01-27 23:42 ` Paul E. McKenney
2006-01-28 18:07   ` Oleg Nesterov
2006-01-30  3:30     ` Paul E. McKenney
2006-01-28 17:08 ` Dipankar Sarma
  -- strict thread matches above, loose matches on Subject: below --
2006-02-17 15:41 [PATCH 0/2] RCU updates Dipankar Sarma
2006-02-17 15:43 ` [PATCH 1/2] rcu batch tuning Dipankar Sarma
2006-02-17 20:33   ` Dipankar Sarma
2006-02-18  8:45     ` Andrew Morton
2006-02-18  9:15       ` Dipankar Sarma
2006-01-26 18:40 [patch 0/2] RCU: fix various latency/oom issues Dipankar Sarma
2006-01-26 18:41 ` [patch 1/2] rcu batch tuning Dipankar Sarma
2006-01-26 19:33   ` Paul E. McKenney
2006-01-26 19:42     ` Dipankar Sarma