From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Qu Wenruo <wqu@suse.com>,
David Sterba <dsterba@suse.com>
Subject: [PATCH 4.19 31/37] btrfs: transaction: Avoid deadlock due to bad initialization timing of fs_info::journal_info
Date: Mon, 4 May 2020 19:57:44 +0200 [thread overview]
Message-ID: <20200504165451.463599438@linuxfoundation.org> (raw)
In-Reply-To: <20200504165448.264746645@linuxfoundation.org>
From: Qu Wenruo <wqu@suse.com>
commit fcc99734d1d4ced30167eb02e17f656735cb9928 upstream.
[BUG]
One run of btrfs/063 triggered the following lockdep warning:
============================================
WARNING: possible recursive locking detected
5.6.0-rc7-custom+ #48 Not tainted
--------------------------------------------
kworker/u24:0/7 is trying to acquire lock:
ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
but task is already holding lock:
ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(sb_internal#2);
lock(sb_internal#2);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by kworker/u24:0/7:
#0: ffff88817b495948 ((wq_completion)btrfs-endio-write){+.+.}, at: process_one_work+0x557/0xb80
#1: ffff888189ea7db8 ((work_completion)(&work->normal_work)){+.+.}, at: process_one_work+0x557/0xb80
#2: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
#3: ffff888174ca4da8 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x83/0xd0 [btrfs]
stack backtrace:
CPU: 0 PID: 7 Comm: kworker/u24:0 Not tainted 5.6.0-rc7-custom+ #48
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
Call Trace:
dump_stack+0xc2/0x11a
__lock_acquire.cold+0xce/0x214
lock_acquire+0xe6/0x210
__sb_start_write+0x14e/0x290
start_transaction+0x66c/0x890 [btrfs]
btrfs_join_transaction+0x1d/0x20 [btrfs]
find_free_extent+0x1504/0x1a50 [btrfs]
btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
btrfs_alloc_tree_block+0x1ac/0x570 [btrfs]
btrfs_copy_root+0x213/0x580 [btrfs]
create_reloc_root+0x3bd/0x470 [btrfs]
btrfs_init_reloc_root+0x2d2/0x310 [btrfs]
record_root_in_trans+0x191/0x1d0 [btrfs]
btrfs_record_root_in_trans+0x90/0xd0 [btrfs]
start_transaction+0x16e/0x890 [btrfs]
btrfs_join_transaction+0x1d/0x20 [btrfs]
btrfs_finish_ordered_io+0x55d/0xcd0 [btrfs]
finish_ordered_fn+0x15/0x20 [btrfs]
btrfs_work_helper+0x116/0x9a0 [btrfs]
process_one_work+0x632/0xb80
worker_thread+0x80/0x690
kthread+0x1a3/0x1f0
ret_from_fork+0x27/0x50
It's pretty hard to reproduce, only one hit so far.
[CAUSE]
This is because we're calling btrfs_join_transaction() without re-using
the current running one:
btrfs_finish_ordered_io()
|- btrfs_join_transaction() <<< Call #1
|- btrfs_record_root_in_trans()
|- btrfs_reserve_extent()
|- btrfs_join_transaction() <<< Call #2
Normally such btrfs_join_transaction() call should re-use the existing
one, without trying to re-start a transaction.
But the problem is, in btrfs_join_transaction() call #1, we call
btrfs_record_root_in_trans() before initializing current::journal_info.
And in btrfs_join_transaction() call #2, we're relying on
current::journal_info to avoid such deadlock.
[FIX]
Call btrfs_record_root_in_trans() after we have initialized
current::journal_info.
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/transaction.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -572,10 +572,19 @@ again:
}
got_it:
- btrfs_record_root_in_trans(h, root);
-
if (!current->journal_info)
current->journal_info = h;
+
+ /*
+ * btrfs_record_root_in_trans() needs to alloc new extents, and may
+ * call btrfs_join_transaction() while we're also starting a
+ * transaction.
+ *
+ * Thus it need to be called after current->journal_info initialized,
+ * or we can deadlock.
+ */
+ btrfs_record_root_in_trans(h, root);
+
return h;
join_fail:
next prev parent reply other threads:[~2020-05-04 18:02 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-04 17:57 [PATCH 4.19 00/37] 4.19.121-rc1 review Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 01/37] drm/edid: Fix off-by-one in DispID DTD pixel clock Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 02/37] drm/qxl: qxl_release leak in qxl_draw_dirty_fb() Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 03/37] drm/qxl: qxl_release leak in qxl_hw_surface_alloc() Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 04/37] drm/qxl: qxl_release use after free Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 05/37] btrfs: fix block group leak when removing fails Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 06/37] ALSA: hda/realtek - Two front mics on a Lenovo ThinkCenter Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 07/37] ALSA: usb-audio: Correct a typo of NuPrime DAC-10 USB ID Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 08/37] ALSA: hda/hdmi: fix without unlocked before return Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 09/37] ALSA: pcm: oss: Place the plugin buffer overflow checks correctly Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 10/37] PM: ACPI: Output correct message on target power state Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 11/37] PM: hibernate: Freeze kernel threads in software_resume() Greg Kroah-Hartman
2020-05-05 12:09 ` Pavel Machek
2020-05-05 16:57 ` Dexuan Cui
2020-05-04 17:57 ` [PATCH 4.19 12/37] dm verity fec: fix hash block number in verity_fec_decode Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 13/37] dm writecache: fix data corruption when reloading the target Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 14/37] dm multipath: use updated MPATHF_QUEUE_IO on mapping for bio-based mpath Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 15/37] scsi: qla2xxx: set UNLOADING before waiting for session deletion Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 16/37] scsi: qla2xxx: check UNLOADING before posting async work Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 17/37] RDMA/mlx5: Set GRH fields in query QP on RoCE Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 18/37] RDMA/mlx4: Initialize ib_spec on the stack Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 19/37] RDMA/core: Prevent mixed use of FDs between shared ufiles Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 20/37] RDMA/core: Fix race between destroy and release FD object Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 21/37] vfio: avoid possible overflow in vfio_iommu_type1_pin_pages Greg Kroah-Hartman
2020-05-05 12:17 ` Pavel Machek
2020-05-04 17:57 ` [PATCH 4.19 22/37] vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn() Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 23/37] iommu/qcom: Fix local_base status check Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 24/37] scsi: target/iblock: fix WRITE SAME zeroing Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 25/37] iommu/amd: Fix legacy interrupt remapping for x2APIC-enabled system Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 26/37] ALSA: opti9xx: shut up gcc-10 range warning Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 27/37] nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 28/37] dmaengine: dmatest: Fix iteration non-stop logic Greg Kroah-Hartman
2020-05-05 12:31 ` Pavel Machek
2020-05-05 12:51 ` Andy Shevchenko
2020-05-05 12:58 ` Pavel Machek
2020-05-05 13:19 ` Andy Shevchenko
2020-05-05 13:37 ` Pavel Machek
2020-05-05 14:05 ` Andy Shevchenko
2020-05-05 14:53 ` Pavel Machek
2020-05-05 15:32 ` Sasha Levin
2020-05-05 15:57 ` Pavel Machek
2020-05-05 21:37 ` Andy Shevchenko
2020-05-04 17:57 ` [PATCH 4.19 29/37] selinux: properly handle multiple messages in selinux_netlink_send() Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 30/37] btrfs: fix partial loss of prealloc extent past i_size after fsync Greg Kroah-Hartman
2020-05-04 17:57 ` Greg Kroah-Hartman [this message]
2020-05-04 17:57 ` [PATCH 4.19 32/37] mmc: cqhci: Avoid false "cqhci: CQE stuck on" by not open-coding timeout loop Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 33/37] mmc: sdhci-xenon: fix annoying 1.8V regulator warning Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 34/37] mmc: sdhci-pci: Fix eMMC driver strength for BYT-based controllers Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 35/37] mmc: sdhci-msm: Enable host capabilities pertains to R1b response Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 36/37] mmc: meson-mx-sdio: Set MMC_CAP_WAIT_WHILE_BUSY Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 37/37] mmc: meson-mx-sdio: remove the broken ->card_busy() op Greg Kroah-Hartman
2020-05-05 7:42 ` [PATCH 4.19 00/37] 4.19.121-rc1 review Chris Paterson
2020-05-05 9:17 ` Greg Kroah-Hartman
2020-05-05 8:37 ` Jon Hunter
2020-05-05 15:24 ` Naresh Kamboju
2020-05-05 15:50 ` shuah
2020-05-05 16:02 ` Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200504165451.463599438@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=dsterba@suse.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox