From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Qu Wenruo <wqu@suse.com>, Jeff Mahoney <jeffm@suse.com>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>,
	linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 5.6 65/68] btrfs: qgroup: ensure qgroup_rescan_running is only set when the worker is at least queued
Date: Thu,  9 Apr 2020 23:46:30 -0400
Message-ID: <20200410034634.7731-65-sashal@kernel.org>
In-Reply-To: <20200410034634.7731-1-sashal@kernel.org>

From: Qu Wenruo <wqu@suse.com>

[ Upstream commit d61acbbf54c612ea9bf67eed609494cda0857b3a ]

[BUG]
There are some reports of btrfs waiting forever during unmount, with
the following call trace:

  INFO: task umount:4631 blocked for more than 491 seconds.
        Tainted: G               X  5.3.8-2-default #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  umount          D    0  4631   3337 0x00000000
  Call Trace:
  ([<00000000174adf7a>] __schedule+0x342/0x748)
   [<00000000174ae3ca>] schedule+0x4a/0xd8
   [<00000000174b1f08>] schedule_timeout+0x218/0x420
   [<00000000174af10c>] wait_for_common+0x104/0x1d8
   [<000003ff804d6994>] btrfs_qgroup_wait_for_completion+0x84/0xb0 [btrfs]
   [<000003ff8044a616>] close_ctree+0x4e/0x380 [btrfs]
   [<0000000016fa3136>] generic_shutdown_super+0x8e/0x158
   [<0000000016fa34d6>] kill_anon_super+0x26/0x40
   [<000003ff8041ba88>] btrfs_kill_super+0x28/0xc8 [btrfs]
   [<0000000016fa39f8>] deactivate_locked_super+0x68/0x98
   [<0000000016fcb198>] cleanup_mnt+0xc0/0x140
   [<0000000016d6a846>] task_work_run+0xc6/0x110
   [<0000000016d04f76>] do_notify_resume+0xae/0xb8
   [<00000000174b30ae>] system_call+0xe2/0x2c8

[CAUSE]
The problem happens when we have called qgroup_rescan_init() but have
not queued the worker. This is mostly caused by error-handling paths.

	Qgroup ioctl thread		|	Unmount thread
----------------------------------------+-----------------------------------
					|
btrfs_qgroup_rescan()			|
|- qgroup_rescan_init()			|
|  |- qgroup_rescan_running = true;	|
|					|
|- trans = btrfs_join_transaction()	|
|  Some error happened			|
|					|
|- btrfs_qgroup_rescan() returns error	|
   But qgroup_rescan_running == true;	|
					| close_ctree()
					| |- btrfs_qgroup_wait_for_completion()
					|    |- running == true;
					|    |- wait_for_completion();

btrfs_qgroup_rescan_worker is never queued, thus no one is going to wake
up close_ctree() and we get a deadlock.

All involved qgroup_rescan_init() callers are:

- btrfs_qgroup_rescan()
  The example above. It's possible to trigger the deadlock when an
  error happens.

- btrfs_quota_enable()
  Not possible. Just after qgroup_rescan_init() we queue the work.

- btrfs_read_qgroup_config()
  It's possible to trigger the deadlock. It only initializes the work;
  the work is queued later in btrfs_qgroup_rescan_resume().
  Thus, if an error happens in between, a deadlock is possible.

We shouldn't set fs_info->qgroup_rescan_running in
qgroup_rescan_init(), as at that stage we haven't yet queued the qgroup
rescan worker to run.

[FIX]
Set qgroup_rescan_running right before queueing the work, so that the
rescan work is guaranteed to be queued whenever we wait for it.
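
The difference between the old and new flag placement can be sketched
with a small single-threaded model. This is only an illustration, not
kernel code: struct model_fs_info and every model_* function are
hypothetical names, the "fail_join" parameter stands in for any error
between qgroup_rescan_init() and btrfs_queue_work(), and locking is
left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of state that matter here. */
struct model_fs_info {
	bool rescan_running;	/* models fs_info->qgroup_rescan_running */
	bool work_queued;	/* models "the rescan worker was queued" */
};

/* Old behaviour: the init step already marks the rescan as running. */
static void model_init_old(struct model_fs_info *fs)
{
	fs->rescan_running = true;
}

/* New behaviour: the init step leaves the flag alone. */
static void model_init_new(struct model_fs_info *fs)
{
	(void)fs;
}

/* Old behaviour: queueing does not touch the flag. */
static void model_queue_old(struct model_fs_info *fs)
{
	fs->work_queued = true;
}

/* New behaviour: the flag is set together with queueing the work. */
static void model_queue_new(struct model_fs_info *fs)
{
	fs->rescan_running = true;
	fs->work_queued = true;
}

/*
 * Models a caller such as btrfs_qgroup_rescan(). When fail_join is true,
 * an error is injected between the init step and the queue step, and the
 * function returns -1 without queueing anything.
 */
static int model_rescan(struct model_fs_info *fs, bool fixed, bool fail_join)
{
	assert(fs);

	if (fixed)
		model_init_new(fs);
	else
		model_init_old(fs);

	if (fail_join)
		return -1;

	if (fixed)
		model_queue_new(fs);
	else
		model_queue_old(fs);
	return 0;
}

/*
 * Models the unmount side: it waits only if the flag is set, and that
 * wait can never finish if no worker was queued to complete it.
 */
static bool model_umount_hangs(const struct model_fs_info *fs)
{
	return fs->rescan_running && !fs->work_queued;
}
```

In this model, an injected error with the old placement leaves the
flag set with no worker queued, which is the state in which unmount
waits forever. With the new placement the flag is never set unless the
work is queued as well.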

Fixes: 8d9eddad1946 ("Btrfs: fix qgroup rescan worker initialization")
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
[ Change subject and cause analysis, use a smaller fix ]
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/qgroup.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index ff1870ff3474a..afc9752e984c3 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1030,6 +1030,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
 	ret = qgroup_rescan_init(fs_info, 0, 1);
 	if (!ret) {
 	        qgroup_rescan_zero_tracking(fs_info);
+		fs_info->qgroup_rescan_running = true;
 	        btrfs_queue_work(fs_info->qgroup_rescan_workers,
 	                         &fs_info->qgroup_rescan_work);
 	}
@@ -3263,7 +3264,6 @@ qgroup_rescan_init(struct btrfs_fs_info *fs_info, u64 progress_objectid,
 		sizeof(fs_info->qgroup_rescan_progress));
 	fs_info->qgroup_rescan_progress.objectid = progress_objectid;
 	init_completion(&fs_info->qgroup_rescan_completion);
-	fs_info->qgroup_rescan_running = true;
 
 	spin_unlock(&fs_info->qgroup_lock);
 	mutex_unlock(&fs_info->qgroup_rescan_lock);
@@ -3326,8 +3326,11 @@ btrfs_qgroup_rescan(struct btrfs_fs_info *fs_info)
 
 	qgroup_rescan_zero_tracking(fs_info);
 
+	mutex_lock(&fs_info->qgroup_rescan_lock);
+	fs_info->qgroup_rescan_running = true;
 	btrfs_queue_work(fs_info->qgroup_rescan_workers,
 			 &fs_info->qgroup_rescan_work);
+	mutex_unlock(&fs_info->qgroup_rescan_lock);
 
 	return 0;
 }
@@ -3363,9 +3366,13 @@ int btrfs_qgroup_wait_for_completion(struct btrfs_fs_info *fs_info,
 void
 btrfs_qgroup_rescan_resume(struct btrfs_fs_info *fs_info)
 {
-	if (fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_RESCAN)
+	if (fs_info->qgroup_flags & BTRFS_QGROUP_STATUS_FLAG_RESCAN) {
+		mutex_lock(&fs_info->qgroup_rescan_lock);
+		fs_info->qgroup_rescan_running = true;
 		btrfs_queue_work(fs_info->qgroup_rescan_workers,
 				 &fs_info->qgroup_rescan_work);
+		mutex_unlock(&fs_info->qgroup_rescan_lock);
+	}
 }
 
 /*
-- 
2.20.1

