From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.2 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E942FC63777 for ; Mon, 23 Nov 2020 19:52:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8E7C020719 for ; Mon, 23 Nov 2020 19:52:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730874AbgKWTwE (ORCPT ); Mon, 23 Nov 2020 14:52:04 -0500 Received: from mx2.suse.de ([195.135.220.15]:44110 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728631AbgKWTwE (ORCPT ); Mon, 23 Nov 2020 14:52:04 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id CCE7CAC65; Mon, 23 Nov 2020 19:52:02 +0000 (UTC) Received: by ds.suse.cz (Postfix, from userid 10065) id 861C4DA818; Mon, 23 Nov 2020 20:50:14 +0100 (CET) Date: Mon, 23 Nov 2020 20:50:14 +0100 From: David Sterba To: fdmanana@kernel.org Cc: linux-btrfs@vger.kernel.org Subject: Re: [PATCH] btrfs: fix lockdep splat when enabling and disabling qgroups Message-ID: <20201123195014.GO8669@twin.jikos.cz> Reply-To: dsterba@suse.cz Mail-Followup-To: dsterba@suse.cz, fdmanana@kernel.org, linux-btrfs@vger.kernel.org References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23.1-rc1 (2014-03-12) Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On Mon, Nov 23, 2020 at 06:31:02PM +0000, fdmanana@kernel.org wrote: > From: Filipe Manana > > When running test case btrfs/017 from fstests, lockdep reported the > following splat: > > [ 1297.067385] ====================================================== > [ 1297.067708] WARNING: possible circular locking dependency detected > [ 1297.068022] 5.10.0-rc4-btrfs-next-73 #1 Not tainted > [ 1297.068322] ------------------------------------------------------ > [ 1297.068629] btrfs/189080 is trying to acquire lock: > [ 1297.068929] ffff9f2725731690 (sb_internal#2){.+.+}-{0:0}, at: btrfs_quota_enable+0xaf/0xa70 [btrfs] > [ 1297.069274] > but task is already holding lock: > [ 1297.069868] ffff9f2702b61a08 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}, at: btrfs_quota_enable+0x3b/0xa70 [btrfs] > [ 1297.070219] > which lock already depends on the new lock. > > [ 1297.071131] > the existing dependency chain (in reverse order) is: > [ 1297.071721] > -> #1 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}: > [ 1297.072375] lock_acquire+0xd8/0x490 > [ 1297.072710] __mutex_lock+0xa3/0xb30 > [ 1297.073061] btrfs_qgroup_inherit+0x59/0x6a0 [btrfs] > [ 1297.073421] create_subvol+0x194/0x990 [btrfs] > [ 1297.073780] btrfs_mksubvol+0x3fb/0x4a0 [btrfs] > [ 1297.074133] __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs] > [ 1297.074498] btrfs_ioctl_snap_create+0x58/0x80 [btrfs] > [ 1297.074872] btrfs_ioctl+0x1a90/0x36f0 [btrfs] > [ 1297.075245] __x64_sys_ioctl+0x83/0xb0 > [ 1297.075617] do_syscall_64+0x33/0x80 > [ 1297.075993] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 1297.076380] > -> #0 (sb_internal#2){.+.+}-{0:0}: > [ 1297.077166] check_prev_add+0x91/0xc60 > [ 1297.077572] __lock_acquire+0x1740/0x3110 > [ 1297.077984] lock_acquire+0xd8/0x490 > [ 1297.078411] start_transaction+0x3c5/0x760 [btrfs] > [ 1297.078853] btrfs_quota_enable+0xaf/0xa70 [btrfs] > [ 1297.079323] btrfs_ioctl+0x2c60/0x36f0 [btrfs] > [ 1297.079789] __x64_sys_ioctl+0x83/0xb0 > [ 1297.080232] do_syscall_64+0x33/0x80 > [ 1297.080680] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 1297.081139] > other info that might help us debug this: > > [ 1297.082536] Possible unsafe locking scenario: > > [ 1297.083510] CPU0 CPU1 > [ 1297.084005] ---- ---- > [ 1297.084500] lock(&fs_info->qgroup_ioctl_lock); > [ 1297.084994] lock(sb_internal#2); > [ 1297.085485] lock(&fs_info->qgroup_ioctl_lock); > [ 1297.085974] lock(sb_internal#2); > [ 1297.086454] > *** DEADLOCK *** > [ 1297.087880] 3 locks held by btrfs/189080: > [ 1297.088324] #0: ffff9f2725731470 (sb_writers#14){.+.+}-{0:0}, at: btrfs_ioctl+0xa73/0x36f0 [btrfs] > [ 1297.088799] #1: ffff9f2702b60cc0 (&fs_info->subvol_sem){++++}-{3:3}, at: btrfs_ioctl+0x1f4d/0x36f0 [btrfs] > [ 1297.089284] #2: ffff9f2702b61a08 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}, at: btrfs_quota_enable+0x3b/0xa70 [btrfs] > [ 1297.089771] > stack backtrace: > [ 1297.090662] CPU: 5 PID: 189080 Comm: btrfs Not tainted 5.10.0-rc4-btrfs-next-73 #1 > [ 1297.091132] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 > [ 1297.092123] Call Trace: > [ 1297.092629] dump_stack+0x8d/0xb5 > [ 1297.093115] check_noncircular+0xff/0x110 > [ 1297.093596] check_prev_add+0x91/0xc60 > [ 1297.094076] ? kvm_clock_read+0x14/0x30 > [ 1297.094553] ? kvm_sched_clock_read+0x5/0x10 > [ 1297.095029] __lock_acquire+0x1740/0x3110 > [ 1297.095510] lock_acquire+0xd8/0x490 > [ 1297.095993] ? btrfs_quota_enable+0xaf/0xa70 [btrfs] > [ 1297.096476] start_transaction+0x3c5/0x760 [btrfs] > [ 1297.096962] ? btrfs_quota_enable+0xaf/0xa70 [btrfs] > [ 1297.097451] btrfs_quota_enable+0xaf/0xa70 [btrfs] > [ 1297.097941] ? btrfs_ioctl+0x1f4d/0x36f0 [btrfs] > [ 1297.098429] btrfs_ioctl+0x2c60/0x36f0 [btrfs] > [ 1297.098904] ? do_user_addr_fault+0x20c/0x430 > [ 1297.099382] ? kvm_clock_read+0x14/0x30 > [ 1297.099854] ? kvm_sched_clock_read+0x5/0x10 > [ 1297.100328] ? sched_clock+0x5/0x10 > [ 1297.100801] ? sched_clock_cpu+0x12/0x180 > [ 1297.101272] ? __x64_sys_ioctl+0x83/0xb0 > [ 1297.101739] __x64_sys_ioctl+0x83/0xb0 > [ 1297.102207] do_syscall_64+0x33/0x80 > [ 1297.102673] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 1297.103148] RIP: 0033:0x7f773ff65d87 > > This is because during the quota enable ioctl we lock first the mutex > qgroup_ioctl_lock and then start a transaction, and starting a transaction > acquires a fs freeze semaphore (at the vfs level). However, every other > code path, except for the quota disable ioctl path, we do the opposite: > we start a transaction and then lock the mutex. > > So fix this by making the quota enable and disable paths to start the > transaction without having the mutex locked, and then, after starting the > transaction, lock the mutex and check if some other task already enabled > or disabled the quotas, bailing with success if that was the case. > > Signed-off-by: Filipe Manana Added to misc-next, thanks.