From: Tejun Heo
Subject: Re: [PATCH V2] mm/mempolicy: fix sleeping function called from invalid context
Date: Wed, 25 Jun 2014 09:43:45 -0400
Message-ID: <20140625134345.GA26883@htj.dyndns.org>
References: <53AA2C7E.3050707@cn.fujitsu.com>
In-Reply-To: <53AA2C7E.3050707@cn.fujitsu.com>
To: Gu Zheng
Cc: linux-kernel, Andrew Morton, linux-mm@kvack.org, Cgroups, stable@vger.kernel.org, Li Zefan, David Rientjes

On Wed, Jun 25, 2014 at 09:57:18AM +0800, Gu Zheng wrote:
> When running with the kernel (3.15-rc7+), the following bug occurs:
> [ 9969.258987] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
> [ 9969.359906] in_atomic(): 1, irqs_disabled(): 0, pid: 160655, name: python
> [ 9969.441175] INFO: lockdep is turned off.
> [ 9969.488184] CPU: 26 PID: 160655 Comm: python Tainted: G A 3.15.0-rc7+ #85
> [ 9969.581032] Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012
> [ 9969.706052] ffffffff81a20e60 ffff8803e941fbd0 ffffffff8162f523 ffff8803e941fd18
> [ 9969.795323] ffff8803e941fbe0 ffffffff8109995a ffff8803e941fc58 ffffffff81633e6c
> [ 9969.884710] ffffffff811ba5dc ffff880405c6b480 ffff88041fdd90a0 0000000000002000
> [ 9969.974071] Call Trace:
> [ 9970.003403] [] dump_stack+0x4d/0x66
> [ 9970.065074] [] __might_sleep+0xfa/0x130
> [ 9970.130743] [] mutex_lock_nested+0x3c/0x4f0
> [ 9970.200638] [] ? kmem_cache_alloc+0x1bc/0x210
> [ 9970.272610] [] cpuset_mems_allowed+0x27/0x140
> [ 9970.344584] [] ? __mpol_dup+0x63/0x150
> [ 9970.409282] [] __mpol_dup+0xe5/0x150
> [ 9970.471897] [] ? __mpol_dup+0x63/0x150
> [ 9970.536585] [] ? copy_process.part.23+0x606/0x1d40
> [ 9970.613763] [] ? trace_hardirqs_on+0xd/0x10
> [ 9970.683660] [] ? monotonic_to_bootbased+0x2f/0x50
> [ 9970.759795] [] copy_process.part.23+0x670/0x1d40
> [ 9970.834885] [] do_fork+0xd8/0x380
> [ 9970.894375] [] ? __audit_syscall_entry+0x9c/0xf0
> [ 9970.969470] [] SyS_clone+0x16/0x20
> [ 9971.030011] [] stub_clone+0x69/0x90
> [ 9971.091573] [] ? system_call_fastpath+0x16/0x1b
>
> The cause is that cpuset_mems_allowed() tries to take
> mutex_lock(&callback_mutex) under the rcu_read_lock (which is held in
> __mpol_dup()). In cpuset_mems_allowed(), the access to the cpuset is
> itself under rcu_read_lock, so in __mpol_dup() we can shrink the
> rcu_read_lock-protected region to cover only the cpuset access in
> current_cpuset_is_being_rebound(), which avoids this bug.
>
> This patch is a temporary solution that only addresses the bug
> mentioned above; it does not fix the long-standing issue with
> cpuset.mems rebinding on fork():
> "
> When the forker's task_struct is duplicated (which includes ->mems_allowed)
> and it races with an update to cpuset_being_rebound in update_tasks_nodemask()
> then the task's mems_allowed doesn't get updated. And the child task's
> mems_allowed can be wrong if the cpuset's nodemask changes before the
> child has been added to the cgroup's tasklist.
> "
>
> Signed-off-by: Gu Zheng
> Cc: stable

Applied to cgroup/for-3.16-fixes w/ minor updates to patch subject and
description. Please format the text to 80 columns. The error
messages are fine but it's usually nicer to remove the timestamps.

Thanks.

--
tejun
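
For readers following the thread, the locking problem described in the quoted patch can be modelled in userspace. The block below is only a sketch in plain C, not kernel code: every model_* function, the rcu_depth and sleep_violations counters, and run_model() are hypothetical stand-ins introduced here for illustration. It assumes that "inside an RCU read-side section" can be represented by a nesting counter and that taking a mutex is the only operation that may sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static int rcu_depth;         /* > 0 while inside a modelled RCU read-side section */
static int sleep_violations;  /* number of "may sleep" calls made while rcu_depth > 0 */
static bool being_rebound;    /* models the cpuset "being rebound" condition */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }

/* Models the check done by __might_sleep(): sleeping under RCU is a bug. */
static void model_might_sleep(void)
{
    if (rcu_depth > 0)
        sleep_violations++;
}

/* A mutex may sleep, so taking it performs the might-sleep check first. */
static void model_mutex_lock(void)   { model_might_sleep(); }
static void model_mutex_unlock(void) { }

/* Models cpuset_mems_allowed(): it takes a sleeping mutex. */
static void model_cpuset_mems_allowed(void)
{
    model_mutex_lock();
    model_mutex_unlock();
}

/* Before the fix: the whole body runs under the RCU read lock. */
static void model_mpol_dup_before(void)
{
    model_rcu_read_lock();
    if (being_rebound)
        model_cpuset_mems_allowed();  /* mutex taken under RCU: violation */
    model_rcu_read_unlock();
}

/* After the fix: the RCU section covers only the cpuset check. */
static bool model_current_cpuset_is_being_rebound(void)
{
    bool ret;

    model_rcu_read_lock();
    ret = being_rebound;
    model_rcu_read_unlock();
    return ret;
}

static void model_mpol_dup_after(void)
{
    if (model_current_cpuset_is_being_rebound())
        model_cpuset_mems_allowed();  /* mutex taken outside RCU: fine */
}

/* Runs one variant and returns the number of violations it recorded. */
int run_model(bool fixed)
{
    rcu_depth = 0;
    sleep_violations = 0;
    being_rebound = true;

    if (fixed)
        model_mpol_dup_after();
    else
        model_mpol_dup_before();
    return sleep_violations;
}
```

Calling run_model(false) exercises the pre-patch ordering and records one violation; run_model(true) exercises the narrowed RCU section and records none. The model says nothing about the separate fork()/rebind race mentioned in the patch description.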