From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Nikolay Borisov <nborisov@suse.com>,
Qu Wenruo <wqu@suse.com>, David Sterba <dsterba@suse.com>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 16/74] btrfs: qgroup: Dont hold qgroup_ioctl_lock in btrfs_qgroup_inherit()
Date: Mon, 5 Aug 2019 15:02:29 +0200
Message-ID: <20190805124937.103902140@linuxfoundation.org>
In-Reply-To: <20190805124935.819068648@linuxfoundation.org>
[ Upstream commit e88439debd0a7f969b3ddba6f147152cd0732676 ]
[BUG]
Lockdep will report the following circular locking dependency:
WARNING: possible circular locking dependency detected
5.2.0-rc2-custom #24 Tainted: G O
------------------------------------------------------
btrfs/8631 is trying to acquire lock:
000000002536438c (&fs_info->qgroup_ioctl_lock#2){+.+.}, at: btrfs_qgroup_inherit+0x40/0x620 [btrfs]
but task is already holding lock:
000000003d52cc23 (&fs_info->tree_log_mutex){+.+.}, at: create_pending_snapshot+0x8b6/0xe60 [btrfs]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&fs_info->tree_log_mutex){+.+.}:
__mutex_lock+0x76/0x940
mutex_lock_nested+0x1b/0x20
btrfs_commit_transaction+0x475/0xa00 [btrfs]
btrfs_commit_super+0x71/0x80 [btrfs]
close_ctree+0x2bd/0x320 [btrfs]
btrfs_put_super+0x15/0x20 [btrfs]
generic_shutdown_super+0x72/0x110
kill_anon_super+0x18/0x30
btrfs_kill_super+0x16/0xa0 [btrfs]
deactivate_locked_super+0x3a/0x80
deactivate_super+0x51/0x60
cleanup_mnt+0x3f/0x80
__cleanup_mnt+0x12/0x20
task_work_run+0x94/0xb0
exit_to_usermode_loop+0xd8/0xe0
do_syscall_64+0x210/0x240
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #1 (&fs_info->reloc_mutex){+.+.}:
__mutex_lock+0x76/0x940
mutex_lock_nested+0x1b/0x20
btrfs_commit_transaction+0x40d/0xa00 [btrfs]
btrfs_quota_enable+0x2da/0x730 [btrfs]
btrfs_ioctl+0x2691/0x2b40 [btrfs]
do_vfs_ioctl+0xa9/0x6d0
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x65/0x240
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (&fs_info->qgroup_ioctl_lock#2){+.+.}:
lock_acquire+0xa7/0x190
__mutex_lock+0x76/0x940
mutex_lock_nested+0x1b/0x20
btrfs_qgroup_inherit+0x40/0x620 [btrfs]
create_pending_snapshot+0x9d7/0xe60 [btrfs]
create_pending_snapshots+0x94/0xb0 [btrfs]
btrfs_commit_transaction+0x415/0xa00 [btrfs]
btrfs_mksubvol+0x496/0x4e0 [btrfs]
btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs]
btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs]
btrfs_ioctl+0xa90/0x2b40 [btrfs]
do_vfs_ioctl+0xa9/0x6d0
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x65/0x240
entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Chain exists of:
&fs_info->qgroup_ioctl_lock#2 --> &fs_info->reloc_mutex --> &fs_info->tree_log_mutex
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&fs_info->tree_log_mutex);
                               lock(&fs_info->reloc_mutex);
                               lock(&fs_info->tree_log_mutex);
  lock(&fs_info->qgroup_ioctl_lock#2);
*** DEADLOCK ***
6 locks held by btrfs/8631:
#0: 00000000ed8f23f6 (sb_writers#12){.+.+}, at: mnt_want_write_file+0x28/0x60
#1: 000000009fb1597a (&type->i_mutex_dir_key#10/1){+.+.}, at: btrfs_mksubvol+0x70/0x4e0 [btrfs]
#2: 0000000088c5ad88 (&fs_info->subvol_sem){++++}, at: btrfs_mksubvol+0x128/0x4e0 [btrfs]
#3: 000000009606fc3e (sb_internal#2){.+.+}, at: start_transaction+0x37a/0x520 [btrfs]
#4: 00000000f82bbdf5 (&fs_info->reloc_mutex){+.+.}, at: btrfs_commit_transaction+0x40d/0xa00 [btrfs]
#5: 000000003d52cc23 (&fs_info->tree_log_mutex){+.+.}, at: create_pending_snapshot+0x8b6/0xe60 [btrfs]
[CAUSE]
Because subvolume creation is delayed, we need to call
btrfs_qgroup_inherit() inside the commit transaction code, with a lot of
other mutexes held.
This long lock chain can lead to the problem above.
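The cycle lockdep complains about can be reproduced on paper with a tiny
dependency graph. The following is only an illustrative userspace sketch,
not kernel or lockdep code: the enum names, dep[][] and would_deadlock()
are hypothetical helpers, and the two recorded edges are the ones shown in
stacks #1 and #2 of the report above.

```c
#include <stdbool.h>

/* Hypothetical labels for the three locks named in the lockdep report. */
enum { QGROUP_IOCTL_LOCK, RELOC_MUTEX, TREE_LOG_MUTEX, NR_LOCKS };

/* dep[a][b] is true when lock b has been taken while lock a was held. */
static bool dep[NR_LOCKS][NR_LOCKS];

/* Depth-first search: can we get from 'from' to 'to' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Taking 'wanted' while holding 'held' adds the edge held -> wanted.
 * That closes a cycle if 'held' is already reachable from 'wanted'.
 */
static bool would_deadlock(int held, int wanted)
{
	bool seen[NR_LOCKS] = { false };

	return reachable(wanted, held, seen);
}

/* Record the existing chain from the report. */
static void record_existing_chain(void)
{
	/* btrfs_quota_enable() commits a transaction under qgroup_ioctl_lock */
	dep[QGROUP_IOCTL_LOCK][RELOC_MUTEX] = true;
	/* btrfs_commit_transaction() takes tree_log_mutex under reloc_mutex */
	dep[RELOC_MUTEX][TREE_LOG_MUTEX] = true;
}
```

With the chain recorded, would_deadlock(TREE_LOG_MUTEX, QGROUP_IOCTL_LOCK)
reports a cycle, which is the situation btrfs_qgroup_inherit() creates when
it is called from create_pending_snapshot().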
[FIX]
On the other hand, we don't really need to hold qgroup_ioctl_lock if
we're in the context of create_pending_snapshot().
In that context, we're the only one able to modify qgroups.
All other qgroup functions which need qgroup_ioctl_lock are either
holding a transaction handle, or will start a new transaction:
Functions that will start a new transaction:
* btrfs_quota_enable()
* btrfs_quota_disable()
Functions that hold a transaction handle:
* btrfs_add_qgroup_relation()
* btrfs_del_qgroup_relation()
* btrfs_create_qgroup()
* btrfs_remove_qgroup()
* btrfs_limit_qgroup()
* btrfs_qgroup_inherit() call inside create_subvol()
So we have a higher level protection provided by transaction, thus we
don't need to always hold qgroup_ioctl_lock in btrfs_qgroup_inherit().
Only the btrfs_qgroup_inherit() call in create_subvol() needs to hold
qgroup_ioctl_lock, while the btrfs_qgroup_inherit() call in
create_pending_snapshot() is already protected by transaction.
So the fix is to detect the context by checking
trans->transaction->state.
If we're at TRANS_STATE_COMMIT_DOING, then we're in commit transaction
context and there is no need to take the mutex.
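The pattern can be tried outside the kernel. The following is a minimal
userspace sketch under stated assumptions, not the btrfs code: struct
fs_info, enum trans_state, fs_info_init() and qgroup_inherit() are
hypothetical stand-ins, pthread mutexes replace both the trans_lock
spinlock and the kernel mutex, and ioctl_lock_taken exists only so the
behaviour can be observed.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the kernel definitions. */
enum trans_state { TRANS_RUNNING, TRANS_COMMIT_DOING };

struct fs_info {
	pthread_mutex_t trans_lock;        /* stands in for the trans_lock spinlock */
	pthread_mutex_t qgroup_ioctl_lock;
	enum trans_state state;
	int ioctl_lock_taken;              /* how often qgroup_ioctl_lock was taken */
};

static void fs_info_init(struct fs_info *fs)
{
	pthread_mutex_init(&fs->trans_lock, NULL);
	pthread_mutex_init(&fs->qgroup_ioctl_lock, NULL);
	fs->state = TRANS_RUNNING;
	fs->ioctl_lock_taken = 0;
}

/* Returns true when called in commit context, i.e. when the lock was skipped. */
static bool qgroup_inherit(struct fs_info *fs)
{
	bool committing;

	/* Sample the transaction state under its own lock. */
	pthread_mutex_lock(&fs->trans_lock);
	committing = (fs->state == TRANS_COMMIT_DOING);
	pthread_mutex_unlock(&fs->trans_lock);

	/* Only the ioctl-context caller needs the ioctl lock. */
	if (!committing) {
		pthread_mutex_lock(&fs->qgroup_ioctl_lock);
		fs->ioctl_lock_taken++;
	}

	/* ... the qgroup inheritance work would go here ... */

	/* Unlock on the same condition that was used for locking. */
	if (!committing)
		pthread_mutex_unlock(&fs->qgroup_ioctl_lock);
	return committing;
}
```

The point of caching the result in a local variable is that the unlock
path uses exactly the same decision as the lock path, even if the
transaction state changes in between.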
Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/qgroup.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index e46e83e876001..734866ab51941 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2249,6 +2249,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid,
int ret = 0;
int i;
u64 *i_qgroups;
+ bool committing = false;
struct btrfs_fs_info *fs_info = trans->fs_info;
struct btrfs_root *quota_root;
struct btrfs_qgroup *srcgroup;
@@ -2256,7 +2257,25 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid,
u32 level_size = 0;
u64 nums;
- mutex_lock(&fs_info->qgroup_ioctl_lock);
+ /*
+ * There are only two callers of this function.
+ *
+ * One in create_subvol() in the ioctl context, which needs to hold
+ * the qgroup_ioctl_lock.
+ *
+ * The other one in create_pending_snapshot() where no other qgroup
+ * code can modify the fs as they all need to either start a new trans
+ * or hold a trans handler, thus we don't need to hold
+ * qgroup_ioctl_lock.
+ * This would avoid long and complex lock chain and make lockdep happy.
+ */
+ spin_lock(&fs_info->trans_lock);
+ if (trans->transaction->state == TRANS_STATE_COMMIT_DOING)
+ committing = true;
+ spin_unlock(&fs_info->trans_lock);
+
+ if (!committing)
+ mutex_lock(&fs_info->qgroup_ioctl_lock);
if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
goto out;
@@ -2420,7 +2439,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid,
unlock:
spin_unlock(&fs_info->qgroup_lock);
out:
- mutex_unlock(&fs_info->qgroup_ioctl_lock);
+ if (!committing)
+ mutex_unlock(&fs_info->qgroup_ioctl_lock);
return ret;
}
--
2.20.1