public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Qu Wenruo <wqu@suse.com>,
	David Sterba <dsterba@suse.com>
Subject: [PATCH 4.19 31/37] btrfs: transaction: Avoid deadlock due to bad initialization timing of fs_info::journal_info
Date: Mon,  4 May 2020 19:57:44 +0200	[thread overview]
Message-ID: <20200504165451.463599438@linuxfoundation.org> (raw)
In-Reply-To: <20200504165448.264746645@linuxfoundation.org>

From: Qu Wenruo <wqu@suse.com>

commit fcc99734d1d4ced30167eb02e17f656735cb9928 upstream.

[BUG]
One run of btrfs/063 triggered the following lockdep warning:
  ============================================
  WARNING: possible recursive locking detected
  5.6.0-rc7-custom+ #48 Not tainted
  --------------------------------------------
  kworker/u24:0/7 is trying to acquire lock:
  ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]

  but task is already holding lock:
  ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(sb_internal#2);
    lock(sb_internal#2);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  4 locks held by kworker/u24:0/7:
   #0: ffff88817b495948 ((wq_completion)btrfs-endio-write){+.+.}, at: process_one_work+0x557/0xb80
   #1: ffff888189ea7db8 ((work_completion)(&work->normal_work)){+.+.}, at: process_one_work+0x557/0xb80
   #2: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
   #3: ffff888174ca4da8 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x83/0xd0 [btrfs]

  stack backtrace:
  CPU: 0 PID: 7 Comm: kworker/u24:0 Not tainted 5.6.0-rc7-custom+ #48
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
  Call Trace:
   dump_stack+0xc2/0x11a
   __lock_acquire.cold+0xce/0x214
   lock_acquire+0xe6/0x210
   __sb_start_write+0x14e/0x290
   start_transaction+0x66c/0x890 [btrfs]
   btrfs_join_transaction+0x1d/0x20 [btrfs]
   find_free_extent+0x1504/0x1a50 [btrfs]
   btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
   btrfs_alloc_tree_block+0x1ac/0x570 [btrfs]
   btrfs_copy_root+0x213/0x580 [btrfs]
   create_reloc_root+0x3bd/0x470 [btrfs]
   btrfs_init_reloc_root+0x2d2/0x310 [btrfs]
   record_root_in_trans+0x191/0x1d0 [btrfs]
   btrfs_record_root_in_trans+0x90/0xd0 [btrfs]
   start_transaction+0x16e/0x890 [btrfs]
   btrfs_join_transaction+0x1d/0x20 [btrfs]
   btrfs_finish_ordered_io+0x55d/0xcd0 [btrfs]
   finish_ordered_fn+0x15/0x20 [btrfs]
   btrfs_work_helper+0x116/0x9a0 [btrfs]
   process_one_work+0x632/0xb80
   worker_thread+0x80/0x690
   kthread+0x1a3/0x1f0
   ret_from_fork+0x27/0x50

It's pretty hard to reproduce, only one hit so far.

[CAUSE]
This is because we're calling btrfs_join_transaction() without re-using
the current running one:

btrfs_finish_ordered_io()
|- btrfs_join_transaction()		<<< Call #1
   |- btrfs_record_root_in_trans()
      |- btrfs_reserve_extent()
	 |- btrfs_join_transaction()	<<< Call #2

Normally such btrfs_join_transaction() call should re-use the existing
one, without trying to re-start a transaction.

But the problem is, in btrfs_join_transaction() call #1, we call
btrfs_record_root_in_trans() before initializing current::journal_info.

And in btrfs_join_transaction() call #2, we're relying on
current::journal_info to avoid such deadlock.

[FIX]
Call btrfs_record_root_in_trans() after we have initialized
current::journal_info.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/transaction.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -572,10 +572,19 @@ again:
 	}
 
 got_it:
-	btrfs_record_root_in_trans(h, root);
-
 	if (!current->journal_info)
 		current->journal_info = h;
+
+	/*
+	 * btrfs_record_root_in_trans() needs to alloc new extents, and may
+	 * call btrfs_join_transaction() while we're also starting a
+	 * transaction.
+	 *
+	 * Thus it need to be called after current->journal_info initialized,
+	 * or we can deadlock.
+	 */
+	btrfs_record_root_in_trans(h, root);
+
 	return h;
 
 join_fail:



  parent reply	other threads:[~2020-05-04 18:02 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-04 17:57 [PATCH 4.19 00/37] 4.19.121-rc1 review Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 01/37] drm/edid: Fix off-by-one in DispID DTD pixel clock Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 02/37] drm/qxl: qxl_release leak in qxl_draw_dirty_fb() Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 03/37] drm/qxl: qxl_release leak in qxl_hw_surface_alloc() Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 04/37] drm/qxl: qxl_release use after free Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 05/37] btrfs: fix block group leak when removing fails Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 06/37] ALSA: hda/realtek - Two front mics on a Lenovo ThinkCenter Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 07/37] ALSA: usb-audio: Correct a typo of NuPrime DAC-10 USB ID Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 08/37] ALSA: hda/hdmi: fix without unlocked before return Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 09/37] ALSA: pcm: oss: Place the plugin buffer overflow checks correctly Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 10/37] PM: ACPI: Output correct message on target power state Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 11/37] PM: hibernate: Freeze kernel threads in software_resume() Greg Kroah-Hartman
2020-05-05 12:09   ` Pavel Machek
2020-05-05 16:57     ` Dexuan Cui
2020-05-04 17:57 ` [PATCH 4.19 12/37] dm verity fec: fix hash block number in verity_fec_decode Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 13/37] dm writecache: fix data corruption when reloading the target Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 14/37] dm multipath: use updated MPATHF_QUEUE_IO on mapping for bio-based mpath Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 15/37] scsi: qla2xxx: set UNLOADING before waiting for session deletion Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 16/37] scsi: qla2xxx: check UNLOADING before posting async work Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 17/37] RDMA/mlx5: Set GRH fields in query QP on RoCE Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 18/37] RDMA/mlx4: Initialize ib_spec on the stack Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 19/37] RDMA/core: Prevent mixed use of FDs between shared ufiles Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 20/37] RDMA/core: Fix race between destroy and release FD object Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 21/37] vfio: avoid possible overflow in vfio_iommu_type1_pin_pages Greg Kroah-Hartman
2020-05-05 12:17   ` Pavel Machek
2020-05-04 17:57 ` [PATCH 4.19 22/37] vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn() Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 23/37] iommu/qcom: Fix local_base status check Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 24/37] scsi: target/iblock: fix WRITE SAME zeroing Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 25/37] iommu/amd: Fix legacy interrupt remapping for x2APIC-enabled system Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 26/37] ALSA: opti9xx: shut up gcc-10 range warning Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 27/37] nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 28/37] dmaengine: dmatest: Fix iteration non-stop logic Greg Kroah-Hartman
2020-05-05 12:31   ` Pavel Machek
2020-05-05 12:51     ` Andy Shevchenko
2020-05-05 12:58       ` Pavel Machek
2020-05-05 13:19         ` Andy Shevchenko
2020-05-05 13:37           ` Pavel Machek
2020-05-05 14:05             ` Andy Shevchenko
2020-05-05 14:53               ` Pavel Machek
2020-05-05 15:32               ` Sasha Levin
2020-05-05 15:57                 ` Pavel Machek
2020-05-05 21:37                   ` Andy Shevchenko
2020-05-04 17:57 ` [PATCH 4.19 29/37] selinux: properly handle multiple messages in selinux_netlink_send() Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 30/37] btrfs: fix partial loss of prealloc extent past i_size after fsync Greg Kroah-Hartman
2020-05-04 17:57 ` Greg Kroah-Hartman [this message]
2020-05-04 17:57 ` [PATCH 4.19 32/37] mmc: cqhci: Avoid false "cqhci: CQE stuck on" by not open-coding timeout loop Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 33/37] mmc: sdhci-xenon: fix annoying 1.8V regulator warning Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 34/37] mmc: sdhci-pci: Fix eMMC driver strength for BYT-based controllers Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 35/37] mmc: sdhci-msm: Enable host capabilities pertains to R1b response Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 36/37] mmc: meson-mx-sdio: Set MMC_CAP_WAIT_WHILE_BUSY Greg Kroah-Hartman
2020-05-04 17:57 ` [PATCH 4.19 37/37] mmc: meson-mx-sdio: remove the broken ->card_busy() op Greg Kroah-Hartman
2020-05-05  7:42 ` [PATCH 4.19 00/37] 4.19.121-rc1 review Chris Paterson
2020-05-05  9:17   ` Greg Kroah-Hartman
2020-05-05  8:37 ` Jon Hunter
2020-05-05 15:24 ` Naresh Kamboju
2020-05-05 15:50 ` shuah
2020-05-05 16:02 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200504165451.463599438@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dsterba@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox