stable.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Douglas Anderson <dianders@chromium.org>,
	Guenter Roeck <groeck@chromium.org>,
	Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.4 52/85] bdev: Reduce time holding bd_mutex in sync in blkdev_close()
Date: Tue, 29 Sep 2020 13:00:19 +0200
Message-ID: <20200929105930.829340482@linuxfoundation.org>
In-Reply-To: <20200929105928.198942536@linuxfoundation.org>

From: Douglas Anderson <dianders@chromium.org>

[ Upstream commit b849dd84b6ccfe32622988b79b7b073861fcf9f7 ]

While trying to "dd" to the block device for a USB stick, I
encountered a hung task warning (blocked for > 120 seconds).  I
managed to come up with an easy way to reproduce this on my system
(where /dev/sdb is the block device for my USB stick) with:

  while true; do dd if=/dev/zero of=/dev/sdb bs=4M; done

With my reproduction, here are the relevant bits from the hung task
detector:

 INFO: task udevd:294 blocked for more than 122 seconds.
 ...
 udevd           D    0   294      1 0x00400008
 Call trace:
  ...
  mutex_lock_nested+0x40/0x50
  __blkdev_get+0x7c/0x3d4
  blkdev_get+0x118/0x138
  blkdev_open+0x94/0xa8
  do_dentry_open+0x268/0x3a0
  vfs_open+0x34/0x40
  path_openat+0x39c/0xdf4
  do_filp_open+0x90/0x10c
  do_sys_open+0x150/0x3c8
  ...

 ...
 Showing all locks held in the system:
 ...
 1 lock held by dd/2798:
  #0: ffffff814ac1a3b8 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x50/0x204
 ...
 dd              D    0  2798   2764 0x00400208
 Call trace:
  ...
  schedule+0x8c/0xbc
  io_schedule+0x1c/0x40
  wait_on_page_bit_common+0x238/0x338
  __lock_page+0x5c/0x68
  write_cache_pages+0x194/0x500
  generic_writepages+0x64/0xa4
  blkdev_writepages+0x24/0x30
  do_writepages+0x48/0xa8
  __filemap_fdatawrite_range+0xac/0xd8
  filemap_write_and_wait+0x30/0x84
  __blkdev_put+0x88/0x204
  blkdev_put+0xc4/0xe4
  blkdev_close+0x28/0x38
  __fput+0xe0/0x238
  ____fput+0x1c/0x28
  task_work_run+0xb0/0xe4
  do_notify_resume+0xfc0/0x14bc
  work_pending+0x8/0x14

The problem appears related to the fact that my USB disk is terribly
slow and that I have a lot of RAM in my system to cache things.
Specifically my writes seem to be happening at ~15 MB/s and I've got
~4 GB of RAM in my system that can be used for buffering.  To write 4
GB of buffer to disk thus takes ~4000 MB / ~15 MB/s = ~267 seconds.

The 267 second number is a problem because in __blkdev_put() we call
sync_blockdev() while holding the bd_mutex.  Any other callers who
want the bd_mutex will be blocked for the whole time.

The problem is made worse because I believe blkdev_put() specifically
tells other tasks (namely udev) to go try to access the device at right
around the same time we're going to hold the mutex for a long time.

Putting some traces around this (after disabling the hung task detector),
I could confirm:
 dd:    437.608600: __blkdev_put() right before sync_blockdev() for sdb
 udevd: 437.623901: blkdev_open() right before blkdev_get() for sdb
 dd:    661.468451: __blkdev_put() right after sync_blockdev() for sdb
 udevd: 663.820426: blkdev_open() right after blkdev_get() for sdb
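
As a rough cross-check of the trace above, the sketch below subtracts the
timestamps to show how long dd spent in sync_blockdev() and how long udevd
waited in blkdev_get().  The timestamps are copied from the trace lines and
the 120 second threshold comes from the hung task warning quoted earlier;
the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Trace timestamps in microseconds, copied from the lines above. */
#define DD_BEFORE_SYNC_US	437608600LL
#define UDEVD_BEFORE_GET_US	437623901LL
#define DD_AFTER_SYNC_US	661468451LL
#define UDEVD_AFTER_GET_US	663820426LL

/* Threshold named in the hung task warning (120 seconds). */
#define HUNG_TASK_THRESHOLD_US	120000000LL

/* Elapsed time between two timestamps; the end must not precede the start. */
static long long elapsed_us(long long start_us, long long end_us)
{
	assert(end_us >= start_us);
	return end_us - start_us;
}

/* Print how long each task spent in the traced region. */
static void print_trace_summary(void)
{
	long long sync_us = elapsed_us(DD_BEFORE_SYNC_US, DD_AFTER_SYNC_US);
	long long wait_us = elapsed_us(UDEVD_BEFORE_GET_US, UDEVD_AFTER_GET_US);

	printf("dd in sync_blockdev(): %lld.%06lld s\n",
	       sync_us / 1000000, sync_us % 1000000);
	printf("udevd in blkdev_get(): %lld.%06lld s\n",
	       wait_us / 1000000, wait_us % 1000000);
	printf("udevd wait exceeds hung task threshold: %s\n",
	       wait_us > HUNG_TASK_THRESHOLD_US ? "yes" : "no");
}
```

Calling print_trace_summary() from a small main() reports roughly 224
seconds for dd and 226 seconds for udevd, which is why udevd trips the
hung task detector.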

A simple fix for this is to realize that sync_blockdev() works fine if
you're not holding the mutex.  Also, it's not the end of the world if
you sync a little early (though it can have performance impacts).
Thus we can make a guess that we're going to need to do the sync and
then do it without holding the mutex.  We still do one last sync with
the mutex but it should be much, much faster.
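
The idea in the paragraph above can be sketched in userspace C.  This is
only an analogy under stated assumptions, not the kernel code: struct
fake_bdev, fake_sync() and fake_put() are hypothetical stand-ins for the
block device, sync_blockdev() and __blkdev_put(), and a pthread mutex plays
the role of bd_mutex.  The opener count is an atomic here so that the
unlocked peek is well defined in portable C.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a block device. */
struct fake_bdev {
	pthread_mutex_t lock;		/* plays the role of bd_mutex */
	atomic_int openers;		/* plays the role of bd_openers */
	size_t dirty;			/* bytes still waiting for writeback */
	size_t flushed_unlocked;	/* bytes flushed without the lock */
	size_t flushed_locked;		/* bytes flushed with the lock held */
};

static void fake_bdev_init(struct fake_bdev *b, int openers, size_t dirty)
{
	pthread_mutex_init(&b->lock, NULL);
	atomic_init(&b->openers, openers);
	b->dirty = dirty;
	b->flushed_unlocked = 0;
	b->flushed_locked = 0;
}

/* Stand-in for sync_blockdev(): flush everything, return bytes flushed. */
static size_t fake_sync(struct fake_bdev *b)
{
	size_t n = b->dirty;

	b->dirty = 0;
	return n;
}

static void fake_put(struct fake_bdev *b)
{
	int prev;

	/*
	 * Guess, without the lock, that we are the last opener.  A wrong
	 * guess only costs an extra sync; it does not affect correctness.
	 */
	if (atomic_load(&b->openers) == 1)
		b->flushed_unlocked += fake_sync(b);

	pthread_mutex_lock(&b->lock);
	prev = atomic_fetch_sub(&b->openers, 1);
	assert(prev > 0);
	/* Final sync under the lock; little should be left to flush. */
	if (prev == 1)
		b->flushed_locked += fake_sync(b);
	pthread_mutex_unlock(&b->lock);
}
```

With a single opener and pending dirty data, fake_put() does the bulk of
the flush before taking the lock, so the locked section only has to flush
whatever was dirtied in between.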

With this, my hung task warnings for my test case are gone.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/block_dev.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index b2ebfd96785b7..a71d442ef7d0e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1515,6 +1515,16 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 	struct gendisk *disk = bdev->bd_disk;
 	struct block_device *victim = NULL;
 
+	/*
+	 * Sync early if it looks like we're the last one.  If someone else
+	 * opens the block device between now and the decrement of bd_openers
+	 * then we did a sync that we didn't need to, but that's not the end
+	 * of the world and we want to avoid long (could be several minute)
+	 * syncs while holding the mutex.
+	 */
+	if (bdev->bd_openers == 1)
+		sync_blockdev(bdev);
+
 	mutex_lock_nested(&bdev->bd_mutex, for_part);
 	if (for_part)
 		bdev->bd_part_count--;
-- 
2.25.1



