From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linuxfoundation.org ([140.211.169.12]:52714 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750966AbdAWQEQ (ORCPT ); Mon, 23 Jan 2017 11:04:16 -0500 Subject: Patch "ceph: fix scheduler warning due to nested blocking" has been added to the 4.9-stable tree To: kernel@kyup.com, gregkh@linuxfoundation.org, zyan@redhat.com Cc: , From: Date: Mon, 23 Jan 2017 17:02:53 +0100 Message-ID: <14851873733877@kroah.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ANSI_X3.4-1968 Content-Transfer-Encoding: 8bit Sender: stable-owner@vger.kernel.org List-ID: This is a note to let you know that I've just added the patch titled ceph: fix scheduler warning due to nested blocking to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: ceph-fix-scheduler-warning-due-to-nested-blocking.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let know about it. >>From 5c341ee32881c554727ec14b71ec3e8832f01989 Mon Sep 17 00:00:00 2001 From: Nikolay Borisov Date: Tue, 11 Oct 2016 12:04:09 +0300 Subject: ceph: fix scheduler warning due to nested blocking From: Nikolay Borisov commit 5c341ee32881c554727ec14b71ec3e8832f01989 upstream. try_get_cap_refs can be used as a condition in a wait_event* calls. This is all fine until it has to call __ceph_do_pending_vmtruncate, which in turn acquires the i_truncate_mutex. This leads to a situation in which a task's state is !TASK_RUNNING and at the same time it's trying to acquire a sleeping primitive. In essence a nested sleeping primitives are being used. This causes the following warning: WARNING: CPU: 22 PID: 11064 at kernel/sched/core.c:7631 __might_sleep+0x9f/0xb0() do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait_event+0x5d/0x110 ipmi_msghandler tcp_scalable ib_qib dca ib_mad ib_core ib_addr ipv6 CPU: 22 PID: 11064 Comm: fs_checker.pl Tainted: G O 4.4.20-clouder2 #6 Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1a 10/16/2015 0000000000000000 ffff8838b416fa88 ffffffff812f4409 ffff8838b416fad0 ffffffff81a034f2 ffff8838b416fac0 ffffffff81052b46 ffffffff81a0432c 0000000000000061 0000000000000000 0000000000000000 ffff88167bda54a0 Call Trace: [] dump_stack+0x67/0x9e [] warn_slowpath_common+0x86/0xc0 [] warn_slowpath_fmt+0x4c/0x50 [] ? prepare_to_wait_event+0x5d/0x110 [] ? prepare_to_wait_event+0x5d/0x110 [] __might_sleep+0x9f/0xb0 [] mutex_lock+0x20/0x40 [] __ceph_do_pending_vmtruncate+0x44/0x1a0 [ceph] [] try_get_cap_refs+0xa2/0x320 [ceph] [] ceph_get_caps+0x255/0x2b0 [ceph] [] ? wait_woken+0xb0/0xb0 [] ceph_write_iter+0x2b1/0xde0 [ceph] [] ? schedule_timeout+0x202/0x260 [] ? kmem_cache_free+0x1ea/0x200 [] ? iput+0x9e/0x230 [] ? __might_sleep+0x52/0xb0 [] ? __might_fault+0x37/0x40 [] ? cp_new_stat+0x153/0x170 [] __vfs_write+0xaa/0xe0 [] vfs_write+0xa9/0x190 [] ? set_close_on_exec+0x31/0x70 [] SyS_write+0x46/0xa0 This happens since wait_event_interruptible can interfere with the mutex locking code, since they both fiddle with the task state. Fix the issue by using the newly-added nested blocking infrastructure in 61ada528dea0 ("sched/wait: Provide infrastructure to deal with nested blocking") Link: https://lwn.net/Articles/628628/ Signed-off-by: Nikolay Borisov Signed-off-by: Yan, Zheng Signed-off-by: Greg Kroah-Hartman --- fs/ceph/caps.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2507,9 +2507,15 @@ int ceph_get_caps(struct ceph_inode_info if (err < 0) ret = err; } else { - ret = wait_event_interruptible(ci->i_cap_wq, - try_get_cap_refs(ci, need, want, endoff, - true, &_got, &err)); + DEFINE_WAIT_FUNC(wait, woken_wake_function); + add_wait_queue(&ci->i_cap_wq, &wait); + + while (!try_get_cap_refs(ci, need, want, endoff, + true, &_got, &err)) + wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT); + + remove_wait_queue(&ci->i_cap_wq, &wait); + if (err == -EAGAIN) continue; if (err < 0) Patches currently in stable-queue which might be from kernel@kyup.com are queue-4.9/ceph-fix-scheduler-warning-due-to-nested-blocking.patch