Re: [GIT PULL rcu/next] RCU commits for 3.1


From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Li Zefan <lizf@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	eric.dumazet@gmail.com, shaohua.li@intel.com, ak@linux.intel.com,
	mhocko@suse.cz, alex.shi@intel.com, efault@gmx.de,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Paul Turner <pjt@google.com>,
	Stephane Eranian <eranian@google.com>
Subject: Re: [GIT PULL rcu/next] RCU commits for 3.1
Date: Wed, 2 Nov 2011 12:23:12 -0700
Message-ID: <20111102192312.GS2287@linux.vnet.ibm.com>
In-Reply-To: <4EAF5B68.8090005@cn.fujitsu.com>

On Tue, Nov 01, 2011 at 10:37:28AM +0800, Li Zefan wrote:
> (I should have CCed Stephane Eranian instead of Turner..)
> 
> Paul E. McKenney wrote:
> > On Mon, Oct 31, 2011 at 04:09:19PM +0800, Li Zefan wrote:
> >> (Let's cc Peter and Paul Turner for this perf cgroup issue.)
> >>
> >>> Thank you for the analysis.  Does the following patch fix this problem?
> >>>
> >>> 							Thanx, Paul
> >>>
> >>> ------------------------------------------------------------------------
> >>>
> >>> fs: Add RCU protection in set_task_comm()
> >>>
> >>> Running "perf stat true" results in the following RCU-lockdep splat:
> >>>
> >>> ===============================
> >>> [ INFO: suspicious RCU usage. ]
> >>> -------------------------------
> >>> include/linux/cgroup.h:548 suspicious rcu_dereference_check() usage!
> >>>
> >>> other info that might help us debug this:
> >>>
> >>> rcu_scheduler_active = 1, debug_locks = 0
> >>> 1 lock held by true/655:
> >>> #0:  (&sig->cred_guard_mutex){+.+.+.}, at: [<810d1bd7>] prepare_bprm_creds+0x27/0x70
> >>>
> >>> stack backtrace:
> >>> Pid: 655, comm: true Not tainted 3.1.0-tip-01868-g1271bd2-dirty #161079
> >>> Call Trace:
> >>> [<81abe239>] ? printk+0x18/0x1a
> >>> [<81064920>] lockdep_rcu_suspicious+0xc0/0xd0
> >>> [<8108aa02>] perf_event_enable_on_exec+0x1d2/0x1e0
> >>> [<81063764>] ? __lock_release+0x54/0xb0
> >>> [<8108cca8>] perf_event_comm+0x18/0x60
> >>> [<810d1abd>] ? set_task_comm+0x5d/0x80
> >>> [<81af622d>] ? _raw_spin_unlock+0x1d/0x40
> >>> [<810d1ac4>] set_task_comm+0x64/0x80
> >>> [<810d25fd>] setup_new_exec+0xbd/0x1d0
> >>> [<810d1b61>] ? flush_old_exec+0x81/0xa0
> >>> [<8110753e>] load_elf_binary+0x28e/0xa00
> >>> [<810d2101>] ? search_binary_handler+0xd1/0x1d0
> >>> [<81063764>] ? __lock_release+0x54/0xb0
> >>> [<811072b0>] ? load_elf_library+0x260/0x260
> >>> [<810d2108>] search_binary_handler+0xd8/0x1d0
> >>> [<810d2060>] ? search_binary_handler+0x30/0x1d0
> >>> [<810d242f>] do_execve_common+0x22f/0x2a0
> >>> [<810d24b2>] do_execve+0x12/0x20
> >>> [<81009592>] sys_execve+0x32/0x70
> >>> [<81af7752>] ptregs_execve+0x12/0x20
> >>> [<81af76d4>] ? sysenter_do_call+0x12/0x36
> >>>
> >>> Li Zefan noted that this is due to set_task_comm() dropping the task
> >>> lock before invoking perf_event_comm(), which could in fact result in
> >>> the task being freed up before perf_event_comm() completed tracing in
> >>> the case where one task invokes set_task_comm() on another task -- which
> >>> actually does occur via comm_write(), which can be invoked via /proc.
> >>>
> >>
> >> This is not true. The caller should ensure @tsk is valid during
> >> set_task_comm().
> >>
> >> The warning comes from perf_cgroup_from_task(). We can trigger this warning
> >> in some other cases where perf cgroup is used, for example:
> > 
> > I must defer to your greater knowledge of this situation.  What patch
> > would you propose?
> > 
> 
> With the following patch, we should see no rcu warning from perf, but as I
> don't know the internals of perf, I guess we have to defer to Peter and
> Stephane. ;)
> 
> I have two doubts:
> 
> - in perf_cgroup_sched_out/in(), we retrieve the task's cgroup twice: once in the
> function and once in its callee perf_cgroup_switch(). The task can move to another cgroup
> between the two calls, so they might return two different cgroup pointers. Does it matter?
> 
> - in perf_cgroup_switch():
> 
> 	 cpuctx->cgrp = perf_cgroup_from_task(task);
> 
> but it seems the cgroup is not pinned, so cpuctx->cgrp can be invalid in later use.

Looks sane to me, for whatever that might be worth.

								Thanx, Paul

> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index d1a1bee..f5e05ce 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -302,7 +302,10 @@ static inline void update_cgrp_time_from_event(struct perf_event *event)
>  	if (!is_cgroup_event(event))
>  		return;
> 
> +	rcu_read_lock();
>  	cgrp = perf_cgroup_from_task(current);
> +	rcu_read_unlock();
> +
>  	/*
>  	 * Do not update time when cgroup is not active
>  	 */
> @@ -325,9 +328,11 @@ perf_cgroup_set_timestamp(struct task_struct *task,
>  	if (!task || !ctx->nr_cgroups)
>  		return;
> 
> +	rcu_read_lock();
>  	cgrp = perf_cgroup_from_task(task);
>  	info = this_cpu_ptr(cgrp->info);
>  	info->timestamp = ctx->timestamp;
> +	rcu_read_unlock();
>  }
> 
>  #define PERF_CGROUP_SWOUT	0x1 /* cgroup switch out every event */
> @@ -406,6 +411,8 @@ static inline void perf_cgroup_sched_out(struct task_struct *task,
>  	struct perf_cgroup *cgrp1;
>  	struct perf_cgroup *cgrp2 = NULL;
> 
> +	rcu_read_lock();
> +
>  	/*
>  	 * we come here when we know perf_cgroup_events > 0
>  	 */
> @@ -418,6 +425,8 @@ static inline void perf_cgroup_sched_out(struct task_struct *task,
>  	if (next)
>  		cgrp2 = perf_cgroup_from_task(next);
> 
> +	rcu_read_unlock();
> +
>  	/*
>  	 * only schedule out current cgroup events if we know
>  	 * that we are switching to a different cgroup. Otherwise,
> @@ -433,6 +442,8 @@ static inline void perf_cgroup_sched_in(struct task_struct *prev,
>  	struct perf_cgroup *cgrp1;
>  	struct perf_cgroup *cgrp2 = NULL;
> 
> +	rcu_read_lock();
> +
>  	/*
>  	 * we come here when we know perf_cgroup_events > 0
>  	 */
> @@ -441,6 +452,8 @@ static inline void perf_cgroup_sched_in(struct task_struct *prev,
>  	/* prev can never be NULL */
>  	cgrp2 = perf_cgroup_from_task(prev);
> 
> +	rcu_read_unlock();
> +
>  	/*
>  	 * only need to schedule in cgroup events if we are changing
>  	 * cgroup during ctxsw. Cgroup events were not scheduled
>
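
A note on the first doubt quoted above (the task's cgroup being looked up once in perf_cgroup_sched_out/in() and again in perf_cgroup_switch()): the following is a minimal user-space sketch, not kernel code, of why two lookups can disagree if the task migrates in between. All names in it (task_model, cgroup_model, cgroup_from_task(), migrate(), lookup_twice(), lookup_once()) are hypothetical stand-ins. The migration is simulated inline rather than racing on another CPU, and the sketch does not model RCU or cgroup lifetime at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel's types. */
struct cgroup_model { int id; };
struct task_model { struct cgroup_model *cgrp; };

/* Stand-in for perf_cgroup_from_task(): returns the task's current cgroup. */
static struct cgroup_model *cgroup_from_task(const struct task_model *t)
{
	return t->cgrp;
}

/* Simulates a cgroup migration landing between two lookups. */
static void migrate(struct task_model *t, struct cgroup_model *to)
{
	t->cgrp = to;
}

/*
 * Pattern questioned in the mail: caller and callee each do their own lookup.
 * Returns 1 if both lookups saw the same cgroup, 0 otherwise.
 */
static int lookup_twice(struct task_model *t, struct cgroup_model *move_to)
{
	struct cgroup_model *in_caller = cgroup_from_task(t);

	if (move_to)
		migrate(t, move_to);	/* models the window between the calls */

	struct cgroup_model *in_callee = cgroup_from_task(t);

	return in_caller == in_callee;
}

/*
 * Alternative: look the cgroup up once and hand that pointer to the callee,
 * so both sides agree by construction even if the task has since moved.
 */
static int lookup_once(struct task_model *t, struct cgroup_model *move_to)
{
	struct cgroup_model *cgrp = cgroup_from_task(t);

	if (move_to)
		migrate(t, move_to);

	struct cgroup_model *in_callee = cgrp;	/* passed down, not re-read */

	return cgrp == in_callee;
}

/* Small self-check exercising both patterns. */
static void demo(void)
{
	struct cgroup_model a = { 1 }, b = { 2 };
	struct task_model t = { &a };

	assert(lookup_twice(&t, NULL) == 1);	/* no migration: lookups agree */
	assert(lookup_twice(&t, &b) == 0);	/* migration: lookups differ */

	t.cgrp = &a;
	assert(lookup_once(&t, &b) == 1);	/* single lookup stays consistent */
}
```

Whether the disagreement actually matters for perf is exactly the question the mail asks. The sketch only shows that it can happen, and that a single lookup avoids it at the cost of possibly acting on a cgroup the task has already left.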
