From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
peterz@infradead.org, rostedt@goodmis.org,
Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
eric.dumazet@gmail.com, darren@dvhart.com,
"Paul E. McKenney" <paul.mckenney@linaro.org>
Subject: Re: [PATCH RFC tip/core/rcu 11/11] rcu: move TREE_RCU from softirq to kthread
Date: Fri, 25 Feb 2011 16:17:58 +0800
Message-ID: <4D6765B6.1030401@cn.fujitsu.com>
In-Reply-To: <1298425183-21265-11-git-send-email-paulmck@linux.vnet.ibm.com>
On 02/23/2011 09:39 AM, Paul E. McKenney wrote:
> From: Paul E. McKenney <paul.mckenney@linaro.org>
>
> If RCU priority boosting is to be meaningful, callback invocation must
> be boosted in addition to preempted RCU readers. Otherwise, in presence
> of CPU real-time threads, the grace period ends, but the callbacks don't
> get invoked. If the callbacks don't get invoked, the associated memory
> doesn't get freed, so the system is still subject to OOM.
>
> But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
> moves the callback invocations to a kthread, which can be boosted easily.
>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
> include/linux/interrupt.h | 1 -
> include/trace/events/irq.h | 3 +-
> kernel/rcutree.c | 324 ++++++++++++++++++++++++++++++++++-
> kernel/rcutree.h | 8 +
> kernel/rcutree_plugin.h | 4 +-
> tools/perf/util/trace-event-parse.c | 1 -
> 6 files changed, 331 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 79d0c4f..ed47deb 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -385,7 +385,6 @@ enum
> TASKLET_SOFTIRQ,
> SCHED_SOFTIRQ,
> HRTIMER_SOFTIRQ,
> - RCU_SOFTIRQ, /* Preferable RCU should always be the last softirq */
>
> NR_SOFTIRQS
> };
> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
> index 1c09820..ae045ca 100644
> --- a/include/trace/events/irq.h
> +++ b/include/trace/events/irq.h
> @@ -20,8 +20,7 @@ struct softirq_action;
> softirq_name(BLOCK_IOPOLL), \
> softirq_name(TASKLET), \
> softirq_name(SCHED), \
> - softirq_name(HRTIMER), \
> - softirq_name(RCU))
> + softirq_name(HRTIMER))
>
> /**
> * irq_handler_entry - called immediately before the irq action handler
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 0ac1cc0..2241f28 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -47,6 +47,8 @@
> #include <linux/mutex.h>
> #include <linux/time.h>
> #include <linux/kernel_stat.h>
> +#include <linux/wait.h>
> +#include <linux/kthread.h>
>
> #include "rcutree.h"
>
> @@ -82,6 +84,18 @@ DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
> int rcu_scheduler_active __read_mostly;
> EXPORT_SYMBOL_GPL(rcu_scheduler_active);
>
> +/* Control variables for per-CPU and per-rcu_node kthreads. */
I think "per-leaf-rcu_node" would be more accurate. It seems that only the
leaf rcu_node structures of rcu_sched are used for the rcu_node kthreads,
and that these kthreads also serve the other RCU flavors (rcu_bh,
rcu_preempt). Is that right? If so, I think we should add a comment
explaining it.
> +/*
> + * Timer handler to initiate the waking up of per-CPU kthreads that
> + * have yielded the CPU due to excess numbers of RCU callbacks.
> + */
> +static void rcu_cpu_kthread_timer(unsigned long arg)
> +{
> + unsigned long flags;
> + struct rcu_data *rdp = (struct rcu_data *)arg;
> + struct rcu_node *rnp = rdp->mynode;
> + struct task_struct *t;
> +
> + raw_spin_lock_irqsave(&rnp->lock, flags);
> + rnp->wakemask |= rdp->grpmask;
I don't see any reason for rnp->lock to protect rnp->node_kthread_task
as well. The "raw_spin_unlock_irqrestore(&rnp->lock, flags);" could be
moved up to here.
> + t = rnp->node_kthread_task;
> + if (t == NULL) {
> + raw_spin_unlock_irqrestore(&rnp->lock, flags);
> + return;
> + }
> + wake_up_process(t);
> + raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +}
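To make the suggestion about rcu_cpu_kthread_timer() concrete, here is a minimal userspace sketch of the reordering. It is only a model under stated assumptions: a pthread mutex stands in for rnp->lock, and struct task_model, struct node_model, and timer_model() are hypothetical names, not kernel code. It shows the shape of the change (only the wakemask update stays under the lock); it does not settle whether waking the task outside the lock is safe against kthread teardown.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a task_struct. */
struct task_model {
	int wakeups;                 /* counts wake_up_process()-style calls */
};

/* Hypothetical userspace stand-in for struct rcu_node. */
struct node_model {
	pthread_mutex_t lock;        /* stands in for rnp->lock */
	unsigned long wakemask;      /* protected by lock */
	struct task_model *kthread;  /* stands in for rnp->node_kthread_task */
};

/*
 * Model of rcu_cpu_kthread_timer() with the unlock moved up: only the
 * wakemask update is done under the lock. The task pointer is read and
 * the task is "woken" after the lock has been dropped.
 */
static void timer_model(struct node_model *np, unsigned long grpmask)
{
	struct task_model *t;

	pthread_mutex_lock(&np->lock);
	np->wakemask |= grpmask;
	pthread_mutex_unlock(&np->lock);

	t = np->kthread;
	if (t == NULL)
		return;
	t->wakeups++;                /* stands in for wake_up_process(t) */
}
```

With this shape the early-return path no longer needs its own unlock, which is the main simplification.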
> +
> +/*
> + * Drop to non-real-time priority and yield, but only after posting a
> + * timer that will cause us to regain our real-time priority if we
> + * remain preempted. Either way, we restore our real-time priority
> + * before returning.
> + */
> +static void rcu_yield(int cpu)
> +{
> + struct rcu_data *rdp = per_cpu_ptr(rcu_sched_state.rda, cpu);
> + struct sched_param sp;
> + struct timer_list yield_timer;
> +
> + setup_timer(&yield_timer, rcu_cpu_kthread_timer, (unsigned long)rdp);
> + mod_timer(&yield_timer, jiffies + 2);
> + sp.sched_priority = 0;
> + sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
> + schedule();
> + sp.sched_priority = RCU_KTHREAD_PRIO;
> + sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
> + del_timer(&yield_timer);
> +}
> +
> +/*
> + * Handle cases where the rcu_cpu_kthread() ends up on the wrong CPU.
> + * This can happen while the corresponding CPU is either coming online
> + * or going offline. We cannot wait until the CPU is fully online
> + * before starting the kthread, because the various notifier functions
> + * can wait for RCU grace periods. So we park rcu_cpu_kthread() until
> + * the corresponding CPU is online.
> + *
> + * Return 1 if the kthread needs to stop, 0 otherwise.
> + *
> + * Caller must disable bh. This function can momentarily enable it.
> + */
> +static int rcu_cpu_kthread_should_stop(int cpu)
> +{
> + while (cpu_is_offline(cpu) || smp_processor_id() != cpu) {
> + if (kthread_should_stop())
> + return 1;
> + local_bh_enable();
> + schedule_timeout_uninterruptible(1);
> + if (smp_processor_id() != cpu)
> + set_cpus_allowed_ptr(current, cpumask_of(cpu));
The current task is PF_THREAD_BOUND, so why call
"set_cpus_allowed_ptr(current, cpumask_of(cpu));" here?
> + local_bh_disable();
> + }
> + return 0;
> +}
> +
> +/*
> + * Per-CPU kernel thread that invokes RCU callbacks. This replaces the
> + * earlier RCU softirq.
> + */
> +static int rcu_cpu_kthread(void *arg)
> +{
> + int cpu = (int)(long)arg;
> + unsigned long flags;
> + int spincnt = 0;
> + wait_queue_head_t *wqp = &per_cpu(rcu_cpu_wq, cpu);
> + char work;
> + char *workp = &per_cpu(rcu_cpu_has_work, cpu);
> +
> + for (;;) {
> + wait_event_interruptible(*wqp,
> + *workp != 0 || kthread_should_stop());
> + local_bh_disable();
> + if (rcu_cpu_kthread_should_stop(cpu)) {
> + local_bh_enable();
> + break;
> + }
> + local_irq_save(flags);
> + work = *workp;
> + *workp = 0;
> + local_irq_restore(flags);
> + if (work)
> + rcu_process_callbacks();
> + local_bh_enable();
> + if (*workp != 0)
> + spincnt++;
> + else
> + spincnt = 0;
> + if (spincnt > 10) {
"10" is a magic number here.
> + rcu_yield(cpu);
> + spincnt = 0;
> + }
> + }
> + return 0;
> +}
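Regarding the "spincnt > 10" check in rcu_cpu_kthread() above, one way to avoid the bare number is a named constant. The sketch below is a standalone model of just the spin accounting, not a proposed patch: RCU_KTHREAD_MAX_SPIN and spin_should_yield() are hypothetical names, and the value 10 is simply carried over from the patch.

```c
/* Hypothetical name; the patch uses a bare 10 here. */
#define RCU_KTHREAD_MAX_SPIN 10

/*
 * Model of the spin accounting at the bottom of the rcu_cpu_kthread()
 * loop. "more_work" stands in for (*workp != 0). Returns 1 when the
 * caller should yield (rcu_yield() in the patch) and resets the
 * counter; returns 0 otherwise.
 */
static int spin_should_yield(int *spincnt, int more_work)
{
	if (more_work)
		(*spincnt)++;
	else
		*spincnt = 0;
	if (*spincnt > RCU_KTHREAD_MAX_SPIN) {
		*spincnt = 0;
		return 1;
	}
	return 0;
}
```

As in the patch, the yield happens on the eleventh consecutive pass that finds more work pending, and any pass with no pending work resets the count.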
> +
> +/*
> + * Per-rcu_node kthread, which is in charge of waking up the per-CPU
> + * kthreads when needed.
> + */
> +static int rcu_node_kthread(void *arg)
> +{
> + int cpu;
> + unsigned long flags;
> + unsigned long mask;
> + struct rcu_node *rnp = (struct rcu_node *)arg;
> + struct sched_param sp;
> + struct task_struct *t;
> +
> + for (;;) {
> + wait_event_interruptible(rnp->node_wq, rnp->wakemask != 0 ||
> + kthread_should_stop());
> + if (kthread_should_stop())
> + break;
> + raw_spin_lock_irqsave(&rnp->lock, flags);
> + mask = rnp->wakemask;
> + rnp->wakemask = 0;
> + raw_spin_unlock_irqrestore(&rnp->lock, flags);
> + for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++, mask <<= 1) {
> + if ((mask & 0x1) == 0)
> + continue;
> + preempt_disable();
> + per_cpu(rcu_cpu_has_work, cpu) = 1;
> + t = per_cpu(rcu_cpu_kthread_task, cpu);
> + if (t == NULL) {
> + preempt_enable();
> + continue;
> + }
Obviously preempt_disable() is not for protecting remote per-CPU data.
Is it for preventing CPU hotplug? I am afraid that the task @t may exit
and become invalid.
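To illustrate the lifetime concern, here is a minimal userspace sketch of one common remedy: take a reference on the task while holding a lock that the teardown path also takes, and drop it after the wakeup. This is an assumption-laden model, not a claim about what the patch should do. struct task_model, struct cpu_slot, and wake_slot() are hypothetical names, the pthread mutex is a stand-in for whatever lock would serialize against teardown, and the refs counter only mimics get_task_struct()/put_task_struct().

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a reference-counted task_struct. */
struct task_model {
	int refs;                    /* protected by the slot lock in this model */
	int wakeups;                 /* counts wake_up_process()-style calls */
};

/* Hypothetical stand-in for per_cpu(rcu_cpu_kthread_task, cpu). */
struct cpu_slot {
	pthread_mutex_t lock;        /* teardown would take this before clearing task */
	struct task_model *task;
};

/* Returns 1 if a task was woken, 0 if the slot was empty. */
static int wake_slot(struct cpu_slot *slot)
{
	struct task_model *t;

	/* Pin the task while the slot cannot change underneath us. */
	pthread_mutex_lock(&slot->lock);
	t = slot->task;
	if (t != NULL)
		t->refs++;           /* mimics get_task_struct(t) */
	pthread_mutex_unlock(&slot->lock);
	if (t == NULL)
		return 0;

	t->wakeups++;                /* mimics wake_up_process(t) */

	/* Drop the pin; a real teardown would free only at refs == 0. */
	pthread_mutex_lock(&slot->lock);
	t->refs--;                   /* mimics put_task_struct(t) */
	pthread_mutex_unlock(&slot->lock);
	return 1;
}
```

The point of the model is only that the pointer is never dereferenced without either the lock or a reference held; freeing the task is not modelled.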