From mboxrd@z Thu Jan 1 00:00:00 1970 From: Fengguang Wu Subject: Re: livelock in __writeback_inodes_wb ? Date: Tue, 11 Dec 2012 16:23:27 +0800 Message-ID: <20121211082327.GA15706@localhost> References: <20121128145515.GA26564@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org To: Dave Jones , linux-mm@kvack.org, Linux Kernel Return-path: Content-Disposition: inline In-Reply-To: <20121128145515.GA26564@redhat.com> Sender: owner-linux-mm@kvack.org List-Id: linux-fsdevel.vger.kernel.org On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > We had a user report the soft lockup detector kicked after 22 > seconds of no progress, with this trace.. Where is the original report? The reporter may help provide some clues on the workload that triggered the bug. > :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137] > :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1 > :RIP: 0010:[] [] __list_del_entry+0x2c/0xd0 > :Call Trace: > : [] redirty_tail+0x5e/0x80 > : [] __writeback_inodes_wb+0x72/0xd0 > : [] wb_writeback+0x23b/0x2d0 > : [] wb_do_writeback+0xac/0x1f0 > : [] ? __internal_add_timer+0x130/0x130 > : [] bdi_writeback_thread+0x8b/0x230 > : [] ? wb_do_writeback+0x1f0/0x1f0 > : [] kthread+0x93/0xa0 > : [] kernel_thread_helper+0x4/0x10 > : [] ? kthread_freezable_should_stop+0x70/0x70 > : [] ? gs_change+0x13/0x13 > > Looking over the code, is it possible that something could be > dirtying pages faster than writeback can get them written out, > keeping us in this loop indefitely ? The bug reporter should know best whether there are heavy IO. However I suspect it's not directly caused by heavy IO: we will release &wb->list_lock before each __writeback_single_inode() call, which starts writeback IO for each inode. > Should there be something in this loop periodically poking > the watchdog perhaps ? It seems we failed to release &wb->list_lock in wb_writeback() for long time (dozens of seconds). That is, the inode_sleep_on_writeback() is somehow not called. However it's not obvious to me how come this can happen.. Thanks, Fengguang -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kara Subject: Re: livelock in __writeback_inodes_wb ? Date: Tue, 11 Dec 2012 14:41:13 +0100 Message-ID: <20121211134113.GA15801@quack.suse.cz> References: <20121128145515.GA26564@redhat.com> <20121211082327.GA15706@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Dave Jones , linux-mm@kvack.org, Linux Kernel , Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org To: Fengguang Wu Return-path: Content-Disposition: inline In-Reply-To: <20121211082327.GA15706@localhost> Sender: owner-linux-mm@kvack.org List-Id: linux-fsdevel.vger.kernel.org On Tue 11-12-12 16:23:27, Wu Fengguang wrote: > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > We had a user report the soft lockup detector kicked after 22 > > seconds of no progress, with this trace.. > > Where is the original report? The reporter may help provide some clues > on the workload that triggered the bug. > > > :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137] > > :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1 > > :RIP: 0010:[] [] __list_del_entry+0x2c/0xd0 > > :Call Trace: > > : [] redirty_tail+0x5e/0x80 > > : [] __writeback_inodes_wb+0x72/0xd0 > > : [] wb_writeback+0x23b/0x2d0 > > : [] wb_do_writeback+0xac/0x1f0 > > : [] ? __internal_add_timer+0x130/0x130 > > : [] bdi_writeback_thread+0x8b/0x230 > > : [] ? wb_do_writeback+0x1f0/0x1f0 > > : [] kthread+0x93/0xa0 > > : [] kernel_thread_helper+0x4/0x10 > > : [] ? kthread_freezable_should_stop+0x70/0x70 > > : [] ? gs_change+0x13/0x13 > > > > Looking over the code, is it possible that something could be > > dirtying pages faster than writeback can get them written out, > > keeping us in this loop indefitely ? > > The bug reporter should know best whether there are heavy IO. > > However I suspect it's not directly caused by heavy IO: we will > release &wb->list_lock before each __writeback_single_inode() call, > which starts writeback IO for each inode. Umm, it's not about releasing wb->list_lock I think. Softlockup will trigger whenever we are looping in a kernel for more than given timeout (e.g. those 22 s) without sleeping. > > Should there be something in this loop periodically poking > > the watchdog perhaps ? > > It seems we failed to release &wb->list_lock in wb_writeback() for > long time (dozens of seconds). That is, the inode_sleep_on_writeback() > is somehow not called. However it's not obvious to me how come this > can happen.. Maybe, progress is always non-zero but small and nr_pages is high (e.g. when writeback is triggered by wakeup_flusher_threads()). What filesystem is the guy using? I remember e.g. btrfs used to have always-dirty inodes which could confuse us. >>From the backtrace it is clear there's some superblock which has s_umount locked and we cannot writeback inodes there. So if this superblock contains most of the dirty pages we need to write and there's another superblock with always dirty inode we would livelock like observed... So my question would be about what filesystems are there in the system (/proc/mounts), what load does trigger this, trigger sysrq-w when the lockup happens. Honza -- Jan Kara SUSE Labs, CR -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave Jones Subject: Re: livelock in __writeback_inodes_wb ? Date: Tue, 11 Dec 2012 09:29:42 -0500 Message-ID: <20121211142942.GA1943@redhat.com> References: <20121128145515.GA26564@redhat.com> <20121211082327.GA15706@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: linux-mm@kvack.org, Linux Kernel , Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org To: Fengguang Wu Return-path: Content-Disposition: inline In-Reply-To: <20121211082327.GA15706@localhost> Sender: owner-linux-mm@kvack.org List-Id: linux-fsdevel.vger.kernel.org On Tue, Dec 11, 2012 at 04:23:27PM +0800, Fengguang Wu wrote: > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > We had a user report the soft lockup detector kicked after 22 > > seconds of no progress, with this trace.. > > Where is the original report? The reporter may help provide some clues > on the workload that triggered the bug. https://bugzilla.redhat.com/show_bug.cgi?id=880949 > The bug reporter should know best whether there are heavy IO. > > However I suspect it's not directly caused by heavy IO: we will > release &wb->list_lock before each __writeback_single_inode() call, > which starts writeback IO for each inode. > > > Should there be something in this loop periodically poking > > the watchdog perhaps ? > > It seems we failed to release &wb->list_lock in wb_writeback() for > long time (dozens of seconds). That is, the inode_sleep_on_writeback() > is somehow not called. However it's not obvious to me how come this > can happen.. Right, it seems that we only drop the lock when there is more work to do. And if there is no work to do, then we would have bailed from the loop. mysterious. Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Simon Jeons Subject: Re: livelock in __writeback_inodes_wb ? Date: Tue, 11 Dec 2012 20:26:22 -0600 Message-ID: <1355279182.1532.4.camel@kernel.cn.ibm.com> References: <20121128145515.GA26564@redhat.com> <20121211082327.GA15706@localhost> <20121211142942.GA1943@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit Cc: Fengguang Wu , linux-mm@kvack.org, Linux Kernel , Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org To: Dave Jones Return-path: In-Reply-To: <20121211142942.GA1943@redhat.com> Sender: owner-linux-mm@kvack.org List-Id: linux-fsdevel.vger.kernel.org On Tue, 2012-12-11 at 09:29 -0500, Dave Jones wrote: > On Tue, Dec 11, 2012 at 04:23:27PM +0800, Fengguang Wu wrote: > > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > > We had a user report the soft lockup detector kicked after 22 > > > seconds of no progress, with this trace.. > > > > Where is the original report? The reporter may help provide some clues > > on the workload that triggered the bug. > > https://bugzilla.redhat.com/show_bug.cgi?id=880949 > > > The bug reporter should know best whether there are heavy IO. > > > > However I suspect it's not directly caused by heavy IO: we will > > release &wb->list_lock before each __writeback_single_inode() call, > > which starts writeback IO for each inode. > > > > > Should there be something in this loop periodically poking > > > the watchdog perhaps ? > > > > It seems we failed to release &wb->list_lock in wb_writeback() for > > long time (dozens of seconds). That is, the inode_sleep_on_writeback() > > is somehow not called. However it's not obvious to me how come this > > can happen.. > > Right, it seems that we only drop the lock when there is more work to do. > And if there is no work to do, then we would have bailed from the loop. If no work to do, lock will be dropped after for loop. > > mysterious. > > Dave > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: email@kvack.org -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Fengguang Wu Subject: Re: livelock in __writeback_inodes_wb ? Date: Fri, 14 Dec 2012 17:13:46 +0800 Message-ID: <20121214091346.GA8149@localhost> References: <20121128145515.GA26564@redhat.com> <20121211082327.GA15706@localhost> <20121211134113.GA15801@quack.suse.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Dave Jones , linux-mm@kvack.org, Linux Kernel , Andrew Morton , linux-fsdevel@vger.kernel.org To: Jan Kara Return-path: Content-Disposition: inline In-Reply-To: <20121211134113.GA15801@quack.suse.cz> Sender: owner-linux-mm@kvack.org List-Id: linux-fsdevel.vger.kernel.org On Tue, Dec 11, 2012 at 02:41:13PM +0100, Jan Kara wrote: > On Tue 11-12-12 16:23:27, Wu Fengguang wrote: > > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > > We had a user report the soft lockup detector kicked after 22 > > > seconds of no progress, with this trace.. > > > > Where is the original report? The reporter may help provide some clues > > on the workload that triggered the bug. > > > > > :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137] > > > :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1 > > > :RIP: 0010:[] [] __list_del_entry+0x2c/0xd0 > > > :Call Trace: > > > : [] redirty_tail+0x5e/0x80 > > > : [] __writeback_inodes_wb+0x72/0xd0 > > > : [] wb_writeback+0x23b/0x2d0 > > > : [] wb_do_writeback+0xac/0x1f0 > > > : [] ? __internal_add_timer+0x130/0x130 > > > : [] bdi_writeback_thread+0x8b/0x230 > > > : [] ? wb_do_writeback+0x1f0/0x1f0 > > > : [] kthread+0x93/0xa0 > > > : [] kernel_thread_helper+0x4/0x10 > > > : [] ? kthread_freezable_should_stop+0x70/0x70 > > > : [] ? gs_change+0x13/0x13 > > > > > > Looking over the code, is it possible that something could be > > > dirtying pages faster than writeback can get them written out, > > > keeping us in this loop indefitely ? > > > > The bug reporter should know best whether there are heavy IO. > > > > However I suspect it's not directly caused by heavy IO: we will > > release &wb->list_lock before each __writeback_single_inode() call, > > which starts writeback IO for each inode. > Umm, it's not about releasing wb->list_lock I think. Softlockup will > trigger whenever we are looping in a kernel for more than given timeout > (e.g. those 22 s) without sleeping. Yes (and the spinlock is the typical reason to prevent it from sleeping). > > > Should there be something in this loop periodically poking > > > the watchdog perhaps ? > > > > It seems we failed to release &wb->list_lock in wb_writeback() for > > long time (dozens of seconds). That is, the inode_sleep_on_writeback() > > is somehow not called. However it's not obvious to me how come this > > can happen.. > Maybe, progress is always non-zero but small and nr_pages is high (e.g. > when writeback is triggered by wakeup_flusher_threads()). What filesystem > is the guy using? I remember e.g. btrfs used to have always-dirty inodes > which could confuse us. Judging from the comm "flush-8:16", it's not btrfs. I cannot find the reporter's email address in the original bug report. So we may at best guess it's likely running the fedora's default filesystem. Also device mapper is used judging from the boot param "rd.lvm.lv=vg_san-pc/lv_root". > >From the backtrace it is clear there's some superblock which has s_umount > locked and we cannot writeback inodes there. So if this superblock contains > most of the dirty pages we need to write and there's another superblock > with always dirty inode we would livelock like observed... So my question > would be about what filesystems are there in the system (/proc/mounts), > what load does trigger this, trigger sysrq-w when the lockup happens. Yeah, it's likely related to some busy superblock. Thanks, Fengguang -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx199.postini.com [74.125.245.199]) by kanga.kvack.org (Postfix) with SMTP id F248E6B0083 for ; Wed, 28 Nov 2012 09:55:18 -0500 (EST) Date: Wed, 28 Nov 2012 09:55:15 -0500 From: Dave Jones Subject: livelock in __writeback_inodes_wb ? Message-ID: <20121128145515.GA26564@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-linux-mm@kvack.org List-ID: To: linux-mm@kvack.org Cc: Linux Kernel We had a user report the soft lockup detector kicked after 22 seconds of no progress, with this trace.. :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137] :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1 :RIP: 0010:[] [] __list_del_entry+0x2c/0xd0 :Call Trace: : [] redirty_tail+0x5e/0x80 : [] __writeback_inodes_wb+0x72/0xd0 : [] wb_writeback+0x23b/0x2d0 : [] wb_do_writeback+0xac/0x1f0 : [] ? __internal_add_timer+0x130/0x130 : [] bdi_writeback_thread+0x8b/0x230 : [] ? wb_do_writeback+0x1f0/0x1f0 : [] kthread+0x93/0xa0 : [] kernel_thread_helper+0x4/0x10 : [] ? kthread_freezable_should_stop+0x70/0x70 : [] ? gs_change+0x13/0x13 Looking over the code, is it possible that something could be dirtying pages faster than writeback can get them written out, keeping us in this loop indefitely ? Should there be something in this loop periodically poking the watchdog perhaps ? Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755539Ab2K1OzV (ORCPT ); Wed, 28 Nov 2012 09:55:21 -0500 Received: from mx1.redhat.com ([209.132.183.28]:40867 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754538Ab2K1OzT (ORCPT ); Wed, 28 Nov 2012 09:55:19 -0500 Date: Wed, 28 Nov 2012 09:55:15 -0500 From: Dave Jones To: linux-mm@kvack.org Cc: Linux Kernel Subject: livelock in __writeback_inodes_wb ? Message-ID: <20121128145515.GA26564@redhat.com> Mail-Followup-To: Dave Jones , linux-mm@kvack.org, Linux Kernel MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org We had a user report the soft lockup detector kicked after 22 seconds of no progress, with this trace.. :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137] :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1 :RIP: 0010:[] [] __list_del_entry+0x2c/0xd0 :Call Trace: : [] redirty_tail+0x5e/0x80 : [] __writeback_inodes_wb+0x72/0xd0 : [] wb_writeback+0x23b/0x2d0 : [] wb_do_writeback+0xac/0x1f0 : [] ? __internal_add_timer+0x130/0x130 : [] bdi_writeback_thread+0x8b/0x230 : [] ? wb_do_writeback+0x1f0/0x1f0 : [] kthread+0x93/0xa0 : [] kernel_thread_helper+0x4/0x10 : [] ? kthread_freezable_should_stop+0x70/0x70 : [] ? gs_change+0x13/0x13 Looking over the code, is it possible that something could be dirtying pages faster than writeback can get them written out, keeping us in this loop indefitely ? Should there be something in this loop periodically poking the watchdog perhaps ? Dave From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752474Ab2LKIXi (ORCPT ); Tue, 11 Dec 2012 03:23:38 -0500 Received: from mga11.intel.com ([192.55.52.93]:24227 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752163Ab2LKIXg (ORCPT ); Tue, 11 Dec 2012 03:23:36 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.84,256,1355126400"; d="scan'208";a="260375460" Date: Tue, 11 Dec 2012 16:23:27 +0800 From: Fengguang Wu To: Dave Jones , linux-mm@kvack.org, Linux Kernel Cc: Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org Subject: Re: livelock in __writeback_inodes_wb ? Message-ID: <20121211082327.GA15706@localhost> References: <20121128145515.GA26564@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20121128145515.GA26564@redhat.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > We had a user report the soft lockup detector kicked after 22 > seconds of no progress, with this trace.. Where is the original report? The reporter may help provide some clues on the workload that triggered the bug. > :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137] > :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1 > :RIP: 0010:[] [] __list_del_entry+0x2c/0xd0 > :Call Trace: > : [] redirty_tail+0x5e/0x80 > : [] __writeback_inodes_wb+0x72/0xd0 > : [] wb_writeback+0x23b/0x2d0 > : [] wb_do_writeback+0xac/0x1f0 > : [] ? __internal_add_timer+0x130/0x130 > : [] bdi_writeback_thread+0x8b/0x230 > : [] ? wb_do_writeback+0x1f0/0x1f0 > : [] kthread+0x93/0xa0 > : [] kernel_thread_helper+0x4/0x10 > : [] ? kthread_freezable_should_stop+0x70/0x70 > : [] ? gs_change+0x13/0x13 > > Looking over the code, is it possible that something could be > dirtying pages faster than writeback can get them written out, > keeping us in this loop indefitely ? The bug reporter should know best whether there are heavy IO. However I suspect it's not directly caused by heavy IO: we will release &wb->list_lock before each __writeback_single_inode() call, which starts writeback IO for each inode. > Should there be something in this loop periodically poking > the watchdog perhaps ? It seems we failed to release &wb->list_lock in wb_writeback() for long time (dozens of seconds). That is, the inode_sleep_on_writeback() is somehow not called. However it's not obvious to me how come this can happen.. Thanks, Fengguang From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753532Ab2LKNlR (ORCPT ); Tue, 11 Dec 2012 08:41:17 -0500 Received: from cantor2.suse.de ([195.135.220.15]:45227 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753104Ab2LKNlQ (ORCPT ); Tue, 11 Dec 2012 08:41:16 -0500 Date: Tue, 11 Dec 2012 14:41:13 +0100 From: Jan Kara To: Fengguang Wu Cc: Dave Jones , linux-mm@kvack.org, Linux Kernel , Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org Subject: Re: livelock in __writeback_inodes_wb ? Message-ID: <20121211134113.GA15801@quack.suse.cz> References: <20121128145515.GA26564@redhat.com> <20121211082327.GA15706@localhost> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20121211082327.GA15706@localhost> User-Agent: Mutt/1.5.20 (2009-06-14) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue 11-12-12 16:23:27, Wu Fengguang wrote: > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > We had a user report the soft lockup detector kicked after 22 > > seconds of no progress, with this trace.. > > Where is the original report? The reporter may help provide some clues > on the workload that triggered the bug. > > > :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137] > > :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1 > > :RIP: 0010:[] [] __list_del_entry+0x2c/0xd0 > > :Call Trace: > > : [] redirty_tail+0x5e/0x80 > > : [] __writeback_inodes_wb+0x72/0xd0 > > : [] wb_writeback+0x23b/0x2d0 > > : [] wb_do_writeback+0xac/0x1f0 > > : [] ? __internal_add_timer+0x130/0x130 > > : [] bdi_writeback_thread+0x8b/0x230 > > : [] ? wb_do_writeback+0x1f0/0x1f0 > > : [] kthread+0x93/0xa0 > > : [] kernel_thread_helper+0x4/0x10 > > : [] ? kthread_freezable_should_stop+0x70/0x70 > > : [] ? gs_change+0x13/0x13 > > > > Looking over the code, is it possible that something could be > > dirtying pages faster than writeback can get them written out, > > keeping us in this loop indefitely ? > > The bug reporter should know best whether there are heavy IO. > > However I suspect it's not directly caused by heavy IO: we will > release &wb->list_lock before each __writeback_single_inode() call, > which starts writeback IO for each inode. Umm, it's not about releasing wb->list_lock I think. Softlockup will trigger whenever we are looping in a kernel for more than given timeout (e.g. those 22 s) without sleeping. > > Should there be something in this loop periodically poking > > the watchdog perhaps ? > > It seems we failed to release &wb->list_lock in wb_writeback() for > long time (dozens of seconds). That is, the inode_sleep_on_writeback() > is somehow not called. However it's not obvious to me how come this > can happen.. Maybe, progress is always non-zero but small and nr_pages is high (e.g. when writeback is triggered by wakeup_flusher_threads()). What filesystem is the guy using? I remember e.g. btrfs used to have always-dirty inodes which could confuse us. >>From the backtrace it is clear there's some superblock which has s_umount locked and we cannot writeback inodes there. So if this superblock contains most of the dirty pages we need to write and there's another superblock with always dirty inode we would livelock like observed... So my question would be about what filesystems are there in the system (/proc/mounts), what load does trigger this, trigger sysrq-w when the lockup happens. Honza -- Jan Kara SUSE Labs, CR From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753661Ab2LKOj2 (ORCPT ); Tue, 11 Dec 2012 09:39:28 -0500 Received: from mx1.redhat.com ([209.132.183.28]:21216 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752867Ab2LKOj1 (ORCPT ); Tue, 11 Dec 2012 09:39:27 -0500 Date: Tue, 11 Dec 2012 09:29:42 -0500 From: Dave Jones To: Fengguang Wu Cc: linux-mm@kvack.org, Linux Kernel , Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org Subject: Re: livelock in __writeback_inodes_wb ? Message-ID: <20121211142942.GA1943@redhat.com> Mail-Followup-To: Dave Jones , Fengguang Wu , linux-mm@kvack.org, Linux Kernel , Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org References: <20121128145515.GA26564@redhat.com> <20121211082327.GA15706@localhost> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20121211082327.GA15706@localhost> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Dec 11, 2012 at 04:23:27PM +0800, Fengguang Wu wrote: > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > We had a user report the soft lockup detector kicked after 22 > > seconds of no progress, with this trace.. > > Where is the original report? The reporter may help provide some clues > on the workload that triggered the bug. https://bugzilla.redhat.com/show_bug.cgi?id=880949 > The bug reporter should know best whether there are heavy IO. > > However I suspect it's not directly caused by heavy IO: we will > release &wb->list_lock before each __writeback_single_inode() call, > which starts writeback IO for each inode. > > > Should there be something in this loop periodically poking > > the watchdog perhaps ? > > It seems we failed to release &wb->list_lock in wb_writeback() for > long time (dozens of seconds). That is, the inode_sleep_on_writeback() > is somehow not called. However it's not obvious to me how come this > can happen.. Right, it seems that we only drop the lock when there is more work to do. And if there is no work to do, then we would have bailed from the loop. mysterious. Dave From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754911Ab2LLCcW (ORCPT ); Tue, 11 Dec 2012 21:32:22 -0500 Received: from mail-pa0-f46.google.com ([209.85.220.46]:40022 "EHLO mail-pa0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754243Ab2LLCcV (ORCPT ); Tue, 11 Dec 2012 21:32:21 -0500 Message-ID: <1355279182.1532.4.camel@kernel.cn.ibm.com> Subject: Re: livelock in __writeback_inodes_wb ? From: Simon Jeons To: Dave Jones Cc: Fengguang Wu , linux-mm@kvack.org, Linux Kernel , Andrew Morton , Jan Kara , linux-fsdevel@vger.kernel.org Date: Tue, 11 Dec 2012 20:26:22 -0600 In-Reply-To: <20121211142942.GA1943@redhat.com> References: <20121128145515.GA26564@redhat.com> <20121211082327.GA15706@localhost> <20121211142942.GA1943@redhat.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.4.4 (3.4.4-2.fc17) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2012-12-11 at 09:29 -0500, Dave Jones wrote: > On Tue, Dec 11, 2012 at 04:23:27PM +0800, Fengguang Wu wrote: > > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > > We had a user report the soft lockup detector kicked after 22 > > > seconds of no progress, with this trace.. > > > > Where is the original report? The reporter may help provide some clues > > on the workload that triggered the bug. > > https://bugzilla.redhat.com/show_bug.cgi?id=880949 > > > The bug reporter should know best whether there are heavy IO. > > > > However I suspect it's not directly caused by heavy IO: we will > > release &wb->list_lock before each __writeback_single_inode() call, > > which starts writeback IO for each inode. > > > > > Should there be something in this loop periodically poking > > > the watchdog perhaps ? > > > > It seems we failed to release &wb->list_lock in wb_writeback() for > > long time (dozens of seconds). That is, the inode_sleep_on_writeback() > > is somehow not called. However it's not obvious to me how come this > > can happen.. > > Right, it seems that we only drop the lock when there is more work to do. > And if there is no work to do, then we would have bailed from the loop. If no work to do, lock will be dropped after for loop. > > mysterious. > > Dave > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755475Ab2LNJNw (ORCPT ); Fri, 14 Dec 2012 04:13:52 -0500 Received: from mga11.intel.com ([192.55.52.93]:28826 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753501Ab2LNJNu (ORCPT ); Fri, 14 Dec 2012 04:13:50 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.84,279,1355126400"; d="scan'208";a="262066038" Date: Fri, 14 Dec 2012 17:13:46 +0800 From: Fengguang Wu To: Jan Kara Cc: Dave Jones , linux-mm@kvack.org, Linux Kernel , Andrew Morton , linux-fsdevel@vger.kernel.org Subject: Re: livelock in __writeback_inodes_wb ? Message-ID: <20121214091346.GA8149@localhost> References: <20121128145515.GA26564@redhat.com> <20121211082327.GA15706@localhost> <20121211134113.GA15801@quack.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20121211134113.GA15801@quack.suse.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Dec 11, 2012 at 02:41:13PM +0100, Jan Kara wrote: > On Tue 11-12-12 16:23:27, Wu Fengguang wrote: > > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote: > > > We had a user report the soft lockup detector kicked after 22 > > > seconds of no progress, with this trace.. > > > > Where is the original report? The reporter may help provide some clues > > on the workload that triggered the bug. > > > > > :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137] > > > :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1 > > > :RIP: 0010:[] [] __list_del_entry+0x2c/0xd0 > > > :Call Trace: > > > : [] redirty_tail+0x5e/0x80 > > > : [] __writeback_inodes_wb+0x72/0xd0 > > > : [] wb_writeback+0x23b/0x2d0 > > > : [] wb_do_writeback+0xac/0x1f0 > > > : [] ? __internal_add_timer+0x130/0x130 > > > : [] bdi_writeback_thread+0x8b/0x230 > > > : [] ? wb_do_writeback+0x1f0/0x1f0 > > > : [] kthread+0x93/0xa0 > > > : [] kernel_thread_helper+0x4/0x10 > > > : [] ? kthread_freezable_should_stop+0x70/0x70 > > > : [] ? gs_change+0x13/0x13 > > > > > > Looking over the code, is it possible that something could be > > > dirtying pages faster than writeback can get them written out, > > > keeping us in this loop indefitely ? > > > > The bug reporter should know best whether there are heavy IO. > > > > However I suspect it's not directly caused by heavy IO: we will > > release &wb->list_lock before each __writeback_single_inode() call, > > which starts writeback IO for each inode. > Umm, it's not about releasing wb->list_lock I think. Softlockup will > trigger whenever we are looping in a kernel for more than given timeout > (e.g. those 22 s) without sleeping. Yes (and the spinlock is the typical reason to prevent it from sleeping). > > > Should there be something in this loop periodically poking > > > the watchdog perhaps ? > > > > It seems we failed to release &wb->list_lock in wb_writeback() for > > long time (dozens of seconds). That is, the inode_sleep_on_writeback() > > is somehow not called. However it's not obvious to me how come this > > can happen.. > Maybe, progress is always non-zero but small and nr_pages is high (e.g. > when writeback is triggered by wakeup_flusher_threads()). What filesystem > is the guy using? I remember e.g. btrfs used to have always-dirty inodes > which could confuse us. Judging from the comm "flush-8:16", it's not btrfs. I cannot find the reporter's email address in the original bug report. So we may at best guess it's likely running the fedora's default filesystem. Also device mapper is used judging from the boot param "rd.lvm.lv=vg_san-pc/lv_root". > >From the backtrace it is clear there's some superblock which has s_umount > locked and we cannot writeback inodes there. So if this superblock contains > most of the dirty pages we need to write and there's another superblock > with always dirty inode we would livelock like observed... So my question > would be about what filesystems are there in the system (/proc/mounts), > what load does trigger this, trigger sysrq-w when the lockup happens. Yeah, it's likely related to some busy superblock. Thanks, Fengguang