From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754886Ab1J3I11 (ORCPT ); Sun, 30 Oct 2011 04:27:27 -0400 Received: from e4.ny.us.ibm.com ([32.97.182.144]:44119 "EHLO e4.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752994Ab1J3I1Z (ORCPT ); Sun, 30 Oct 2011 04:27:25 -0400 Date: Sat, 29 Oct 2011 11:27:10 -0700 From: "Paul E. McKenney" To: Li Zefan Cc: Ingo Molnar , eric.dumazet@gmail.com, shaohua.li@intel.com, ak@linux.intel.com, mhocko@suse.cz, alex.shi@intel.com, efault@gmx.de, linux-kernel@vger.kernel.org Subject: Re: [GIT PULL rcu/next] RCU commits for 3.1 Message-ID: <20111029182710.GG6160@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20111001152514.GA16930@elte.hu> <20111003055302.GA23527@elte.hu> <20111003161335.GA2403@linux.vnet.ibm.com> <20111004074637.GA14061@elte.hu> <20111024100501.GA24913@linux.vnet.ibm.com> <20111024114806.GA3340@linux.vnet.ibm.com> <20111026203020.GA10285@elte.hu> <20111027075901.GB2313@linux.vnet.ibm.com> <20111027080016.GA16885@elte.hu> <4EAA14A1.5060204@cn.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4EAA14A1.5060204@cn.fujitsu.com> User-Agent: Mutt/1.5.20 (2009-06-14) x-cbid: 11103008-3534-0000-0000-0000010AA511 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Oct 28, 2011 at 10:34:09AM +0800, Li Zefan wrote: > >>>> > >>>> And I cannot reproduce after merging into 3.1. :-( > >>> > >>> Here's another one i just got with latest -tip: > >>> > >>> PM: Adding info for No Bus:vcsa2 > >>> > >>> =============================== > >>> [ INFO: suspicious RCU usage. ] > >>> ------------------------------- > >>> include/linux/cgroup.h:548 suspicious rcu_dereference_check() usage! > >>> > >>> other info that might help us debug this: > >>> > >>> > >>> rcu_scheduler_active = 1, debug_locks = 0 > >>> 1 lock held by true/655: > >>> #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<810d1bd7>] prepare_bprm_creds+0x27/0x70 > >>> > >>> stack backtrace: > >>> Pid: 655, comm: true Not tainted 3.1.0-tip-01868-g1271bd2-dirty #161079 > >>> Call Trace: > >>> [<81abe239>] ? printk+0x18/0x1a > >>> [<81064920>] lockdep_rcu_suspicious+0xc0/0xd0 > >>> [<8108aa02>] perf_event_enable_on_exec+0x1d2/0x1e0 > >>> [<81063764>] ? __lock_release+0x54/0xb0 > >>> [<8108cca8>] perf_event_comm+0x18/0x60 > >>> [<810d1abd>] ? set_task_comm+0x5d/0x80 > > void set_task_comm(struct task_struct *tsk, char *buf) > { > task_lock(tsk); > ... > task_unlock(tsk); > perf_event_comm(tsk); > } > > see, perf_event_comm() is called after releasing task_lock. > > perf_event_comm() > perf_event_enable_on_exec() > perf_cgroup_sched_out() > perf_cgroup_from_task() > task_subsys_state() > > No proper lock is held, hence the warning. Thank you for the analysis. Does the following patch fix this problem? Thanx, Paul ------------------------------------------------------------------------ fs: Add RCU protection in set_task_comm() Running "perf stat true" results in the following RCU-lockdep splat: =============================== [ INFO: suspicious RCU usage. ] ------------------------------- include/linux/cgroup.h:548 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by true/655: #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<810d1bd7>] prepare_bprm_creds+0x27/0x70 stack backtrace: Pid: 655, comm: true Not tainted 3.1.0-tip-01868-g1271bd2-dirty #161079 Call Trace: [<81abe239>] ? printk+0x18/0x1a [<81064920>] lockdep_rcu_suspicious+0xc0/0xd0 [<8108aa02>] perf_event_enable_on_exec+0x1d2/0x1e0 [<81063764>] ? __lock_release+0x54/0xb0 [<8108cca8>] perf_event_comm+0x18/0x60 [<810d1abd>] ? set_task_comm+0x5d/0x80 [<81af622d>] ? _raw_spin_unlock+0x1d/0x40 [<810d1ac4>] set_task_comm+0x64/0x80 [<810d25fd>] setup_new_exec+0xbd/0x1d0 [<810d1b61>] ? flush_old_exec+0x81/0xa0 [<8110753e>] load_elf_binary+0x28e/0xa00 [<810d2101>] ? search_binary_handler+0xd1/0x1d0 [<81063764>] ? __lock_release+0x54/0xb0 [<811072b0>] ? load_elf_library+0x260/0x260 [<810d2108>] search_binary_handler+0xd8/0x1d0 [<810d2060>] ? search_binary_handler+0x30/0x1d0 [<810d242f>] do_execve_common+0x22f/0x2a0 [<810d24b2>] do_execve+0x12/0x20 [<81009592>] sys_execve+0x32/0x70 [<81af7752>] ptregs_execve+0x12/0x20 [<81af76d4>] ? sysenter_do_call+0x12/0x36 Li Zefan noted that this is due to set_task_comm() dropping the task lock before invoking perf_event_comm(), which could in fact result in the task being freed up before perf_event_comm() completed tracing in the case where one task invokes set_task_comm() on another task -- which actually does occur via comm_write(), which can be invoked via /proc. This commit fixes this problem by entering an RCU read-side critical section before acquiring the task lock and exiting this critical section after perf_event_comm() returns. Reported-by: Ingo Molnar Signed-off-by: Paul E. McKenney diff --git a/fs/exec.c b/fs/exec.c index 25dcbe5..fb928d3 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1056,6 +1056,7 @@ EXPORT_SYMBOL_GPL(get_task_comm); void set_task_comm(struct task_struct *tsk, char *buf) { + rcu_read_lock(); /* protect task pointer through tracing. */ task_lock(tsk); /* @@ -1069,6 +1070,7 @@ void set_task_comm(struct task_struct *tsk, char *buf) strlcpy(tsk->comm, buf, sizeof(tsk->comm)); task_unlock(tsk); perf_event_comm(tsk); + rcu_read_unlock(); } int flush_old_exec(struct linux_binprm * bprm)