From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755593AbbCFWYd (ORCPT ); Fri, 6 Mar 2015 17:24:33 -0500
Received: from cantor2.suse.de ([195.135.220.15]:53442 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755237AbbCFWYb (ORCPT ); Fri, 6 Mar 2015 17:24:31 -0500
Message-ID: <1425680663.19505.65.camel@stgolabs.net>
Subject: Re: [PATCH -next] locking/rwsem: don't spin in heavy contention
From: Davidlohr Bueso
To: Dave Chinner
Cc: Ming Lei , linux-kernel@vger.kernel.org,
	"Peter Zijlstra (Intel)" , Jason Low , Linus Torvalds ,
	Michel Lespinasse , "Paul E. McKenney" , Tim Chen ,
	Ingo Molnar , "Theodore Ts'o"
Date: Fri, 06 Mar 2015 14:24:23 -0800
In-Reply-To: <20150306215417.GD13958@dastard>
References: <1425654790-10727-1-git-send-email-ming.lei@canonical.com>
	<20150306215417.GD13958@dastard>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.12.9
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 2015-03-07 at 08:54 +1100, Dave Chinner wrote:
> On Fri, Mar 06, 2015 at 11:13:10PM +0800, Ming Lei wrote:
> > Before commit b3fd4f03ca0b995(locking/rwsem: Avoid deceiving lock
> > spinners), rwsem_spin_on_owner() returns false if the owner is changed.
> > This commit just returns true under the situation, then kernel
> > softlock can be triggered easily in xfstest.
> >
> > So this patch recovers to previous behaviour, and it should be
> > reasonable to stop spining in case of heavy contention.
> >
> > The soft lockup can be reproduced easily in xfstests(generic/299)
> > over ext4:
> >
> > [ 236.417011] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [kworker/5:80:3288]
> > [ 236.417011] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw
> > [ 236.417011] CPU: 5 PID: 3288 Comm: kworker/5:80 Not tainted 4.0.0-rc1-next-20150303+ #69
> > [ 236.417011] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > [ 236.417011] Workqueue: dio/sda dio_aio_complete_work
> > [ 236.417011] task: ffff8800b87c0000 ti: ffff8800b703c000 task.ti: ffff8800b703c000
> > [ 236.417011] RIP: 0010:[] [] __rcu_read_unlock+0x47/0x55
> > [ 236.417011] RSP: 0018:ffff8800b703fb98 EFLAGS: 00000246
> > [ 236.417011] RAX: 0000000000000000 RBX: ffff8800b703fb48 RCX: 000000000003b080
> > [ 236.417011] RDX: fffffffe00000001 RSI: ffff880231f03a20 RDI: ffff8800bb755568
> > [ 236.417011] RBP: ffff8800b703fba8 R08: ffff880227908078 R09: ffff8800b87c0000
> > [ 236.417011] R10: 0000000000000001 R11: 0000000000000020 R12: ffff8800b703c000
> > [ 236.417011] R13: ffff8800b87c0000 R14: 000000000000000f R15: 0000000000000101
> > [ 236.417011] FS: 0000000000000000(0000) GS:ffff88023eca0000(0000) knlGS:0000000000000000
> > [ 236.417011] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [ 236.417011] CR2: 00007f549369f948 CR3: 00000000ba891000 CR4: 00000000000007e0
> > [ 236.417011] Stack:
> > [ 236.417011] fffffffe00000001 ffff8800bb755568 ffff8800b703fbc8 ffffffff81073917
> > [ 236.417011] ffff8800bb755568 ffff8800bb755584 ffff8800b703fc48 ffffffff814d1ba4
> > [ 236.417011] ffff8800b703fbe8 ffff8800b87c0000 ffff8800b703fc78 ffffffff811cf51b
> > [ 236.417011] Call Trace:
> > [ 236.417011] [] rwsem_spin_on_owner+0x2b/0x79
> > [ 236.417011] [] rwsem_down_write_failed+0xc0/0x2f1
> > [ 236.417011] [] ? start_this_handle+0x494/0x4bd
> > [ 236.417011] [] ? trace_preempt_on+0x12/0x2f
> > [ 236.417011] [] call_rwsem_down_write_failed+0x13/0x20
> > [ 236.417011] [] ? down_write+0x24/0x33
> > [ 236.417011] [] ext4_map_blocks+0x236/0x3cb
> > [ 236.417011] [] ? ext4_convert_unwritten_extents+0xd2/0x19c
> > [ 236.417011] [] ? __ext4_journal_start_sb+0x77/0xb8
> > [ 236.417011] [] ext4_convert_unwritten_extents+0xf9/0x19c
> > [ 236.417011] [] ext4_put_io_end+0x3a/0x5d
> > [ 236.417011] [] ext4_end_io_dio+0x2a/0x2c
> > [ 236.417011] [] dio_complete+0x97/0x12d
> > [ 236.417011] [] dio_aio_complete_work+0x21/0x23
>
> If you're getting stuff there, I'd be looking for a bug in ext4, not
> the rwsem code. There's no way there should be enough unwritten
> extent conversion pending to lock up the system for that length of
> time. Especially considering the test has concurrent truncates
> running which should drain the entire IO queue every couple of
> seconds at worst....

FYI, this issue is being handled here:
https://lkml.org/lkml/2015/3/6/811

Thanks,
Davidlohr