From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, David Rientjes <rientjes@google.com>,
Guenter Roeck <linux@roeck-us.net>,
Douglas Anderson <dianders@chromium.org>,
Mike Snitzer <snitzer@redhat.com>
Subject: [PATCH 4.4 39/47] dm bufio: avoid sleeping while holding the dm_bufio lock
Date: Tue, 10 Jul 2018 20:25:03 +0200 [thread overview]
Message-ID: <20180710182338.807810204@linuxfoundation.org> (raw)
In-Reply-To: <20180710182337.047502999@linuxfoundation.org>
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Douglas Anderson <dianders@chromium.org>
commit 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc upstream.
We've seen in-field reports showing _lots_ (18 in one case, 41 in
another) of tasks all sitting there blocked on:
mutex_lock+0x4c/0x68
dm_bufio_shrink_count+0x38/0x78
shrink_slab.part.54.constprop.65+0x100/0x464
shrink_zone+0xa8/0x198
In the two cases analyzed, we see one task that looks like this:
Workqueue: kverityd verity_prefetch_io
__switch_to+0x9c/0xa8
__schedule+0x440/0x6d8
schedule+0x94/0xb4
schedule_timeout+0x204/0x27c
schedule_timeout_uninterruptible+0x44/0x50
wait_iff_congested+0x9c/0x1f0
shrink_inactive_list+0x3a0/0x4cc
shrink_lruvec+0x418/0x5cc
shrink_zone+0x88/0x198
try_to_free_pages+0x51c/0x588
__alloc_pages_nodemask+0x648/0xa88
__get_free_pages+0x34/0x7c
alloc_buffer+0xa4/0x144
__bufio_new+0x84/0x278
dm_bufio_prefetch+0x9c/0x154
verity_prefetch_io+0xe8/0x10c
process_one_work+0x240/0x424
worker_thread+0x2fc/0x424
kthread+0x10c/0x114
...and that looks to be the one holding the mutex.
The problem has been reproduced on fairly easily:
0. Be running Chrome OS w/ verity enabled on the root filesystem
1. Pick test patch: http://crosreview.com/412360
2. Install launchBalloons.sh and balloon.arm from
http://crbug.com/468342
...that's just a memory stress test app.
3. On a 4GB rk3399 machine, run
nice ./launchBalloons.sh 4 900 100000
...that tries to eat 4 * 900 MB of memory and keep accessing.
4. Login to the Chrome web browser and restore many tabs
With that, I've seen printouts like:
DOUG: long bufio 90758 ms
...and stack trace always show's we're in dm_bufio_prefetch().
The problem is that we try to allocate memory with GFP_NOIO while
we're holding the dm_bufio lock. Instead we should be using
GFP_NOWAIT. Using GFP_NOIO can cause us to sleep while holding the
lock and that causes the above problems.
The current behavior explained by David Rientjes:
It will still try reclaim initially because __GFP_WAIT (or
__GFP_KSWAPD_RECLAIM) is set by GFP_NOIO. This is the cause of
contention on dm_bufio_lock() that the thread holds. You want to
pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a
mutex that can be contended by a concurrent slab shrinker (if
count_objects didn't use a trylock, this pattern would trivially
deadlock).
This change significantly increases responsiveness of the system while
in this state. It makes a real difference because it unblocks kswapd.
In the bug report analyzed, kswapd was hung:
kswapd0 D ffffffc000204fd8 0 72 2 0x00000000
Call trace:
[<ffffffc000204fd8>] __switch_to+0x9c/0xa8
[<ffffffc00090b794>] __schedule+0x440/0x6d8
[<ffffffc00090bac0>] schedule+0x94/0xb4
[<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
[<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
[<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
[<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
[<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
[<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
[<ffffffc00030e578>] balance_pgdat+0x328/0x508
[<ffffffc00030eb7c>] kswapd+0x424/0x51c
[<ffffffc00023f06c>] kthread+0x10c/0x114
[<ffffffc000203dd0>] ret_from_fork+0x10/0x40
By unblocking kswapd memory pressure should be reduced.
Suggested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/md/dm-bufio.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -818,7 +818,8 @@ static struct dm_buffer *__alloc_buffer_
* dm-bufio is resistant to allocation failures (it just keeps
* one buffer reserved in cases all the allocations fail).
* So set flags to not try too hard:
- * GFP_NOIO: don't recurse into the I/O layer
+ * GFP_NOWAIT: don't wait; if we need to sleep we'll release our
+ * mutex and wait ourselves.
* __GFP_NORETRY: don't retry and rather return failure
* __GFP_NOMEMALLOC: don't use emergency reserves
* __GFP_NOWARN: don't print a warning in case of failure
@@ -828,7 +829,7 @@ static struct dm_buffer *__alloc_buffer_
*/
while (1) {
if (dm_bufio_cache_size_latch != 1) {
- b = alloc_buffer(c, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
+ b = alloc_buffer(c, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
if (b)
return b;
}
next prev parent reply other threads:[~2018-07-10 18:28 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-10 18:24 [PATCH 4.4 00/47] 4.4.140-stable review Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 01/47] usb: cdc_acm: Add quirk for Uniden UBC125 scanner Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 02/47] USB: serial: cp210x: add CESINEL device ids Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 03/47] USB: serial: cp210x: add Silicon Labs IDs for Windows Update Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 04/47] n_tty: Fix stall at n_tty_receive_char_special() Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 05/47] staging: android: ion: Return an ERR_PTR in ion_map_kernel Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 06/47] n_tty: Access echo_* variables carefully Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 07/47] x86/boot: Fix early command-line parsing when matching at end Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 08/47] ath10k: fix rfc1042 header retrieval in QCA4019 with eth decap mode Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 09/47] i2c: rcar: fix resume by always initializing registers before transfer Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 10/47] ipv4: Fix error return value in fib_convert_metrics() Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 11/47] kprobes/x86: Do not modify singlestep buffer while resuming Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 12/47] nvme-pci: initialize queue memory before interrupts Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 13/47] netfilter: nf_tables: use WARN_ON_ONCE instead of BUG_ON in nft_do_chain() Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 14/47] ARM: dts: imx6q: Use correct SDMA script for SPI5 core Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 15/47] ubi: fastmap: Correctly handle interrupted erasures in EBA Greg Kroah-Hartman
2018-07-26 2:12 ` Ben Hutchings
2018-07-26 6:25 ` Richard Weinberger
2018-07-28 1:28 ` Ben Hutchings
2018-07-28 5:56 ` Richard Weinberger
2018-07-10 18:24 ` [PATCH 4.4 16/47] mm: hugetlb: yield when prepping struct pages Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 17/47] tracing: Fix missing return symbol in function_graph output Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 18/47] scsi: sg: mitigate read/write abuse Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 19/47] s390: Correct register corruption in critical section cleanup Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 20/47] drbd: fix access after free Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 21/47] cifs: Fix infinite loop when using hard mount option Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 22/47] jbd2: dont mark block as modified if the handle is out of credits Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 23/47] ext4: make sure bitmaps and the inode table dont overlap with bg descriptors Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 24/47] ext4: always check block group bounds in ext4_init_block_bitmap() Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 25/47] ext4: only look at the bg_flags field if it is valid Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 26/47] ext4: verify the depth of extent tree in ext4_find_extent() Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 27/47] ext4: include the illegal physical block in the bad map ext4_error msg Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 28/47] ext4: clear i_data in ext4_inode_info when removing inline data Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 29/47] ext4: add more inode number paranoia checks Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 30/47] ext4: add more mount time checks of the superblock Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 31/47] ext4: check superblock mapped prior to committing Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 32/47] HID: i2c-hid: Fix "incomplete report" noise Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 33/47] HID: hiddev: fix potential Spectre v1 Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 34/47] HID: debug: check length before copy_to_user() Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 37/47] media: cx25840: Use subdev host data for PLL override Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 38/47] mm, page_alloc: do not break __GFP_THISNODE by zonelist reset Greg Kroah-Hartman
2018-07-10 18:25 ` Greg Kroah-Hartman [this message]
2018-07-10 18:25 ` [PATCH 4.4 40/47] dm bufio: drop the lock when doing GFP_NOIO allocation Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 41/47] mtd: rawnand: mxc: set spare area size register explicitly Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 42/47] dm bufio: dont take the lock in dm_bufio_shrink_count Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 43/47] mtd: cfi_cmdset_0002: Change definition naming to retry write operation Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 44/47] mtd: cfi_cmdset_0002: Change erase functions to retry for error Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 45/47] mtd: cfi_cmdset_0002: Change erase functions to check chip good only Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 46/47] netfilter: nf_log: dont hold nf_log_mutex during user access Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 47/47] staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write() Greg Kroah-Hartman
2018-07-10 19:10 ` [PATCH 4.4 00/47] 4.4.140-stable review Nathan Chancellor
2018-07-11 11:21 ` Naresh Kamboju
2018-07-11 13:40 ` Guenter Roeck
2018-07-11 15:10 ` Shuah Khan
-- strict thread matches above, loose matches on Subject: below --
2018-07-10 18:24 [4.4,35/47] x86/mce: Detect local MCEs properly Greg Kroah-Hartman
2018-07-10 18:24 ` [PATCH 4.4 35/47] " Greg Kroah-Hartman
2018-07-10 18:25 [4.4,36/47] x86/mce: Fix incorrect "Machine check from unknown source" message Greg Kroah-Hartman
2018-07-10 18:25 ` [PATCH 4.4 36/47] " Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180710182338.807810204@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=dianders@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@roeck-us.net \
--cc=rientjes@google.com \
--cc=snitzer@redhat.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.