From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Chao Yu <yuchao0@huawei.com>,
Jaegeuk Kim <jaegeuk@kernel.org>
Subject: [PATCH 5.0 43/52] f2fs: fix to avoid deadlock of atomic file operations
Date: Tue, 26 Mar 2019 15:30:30 +0900
Message-ID: <20190326042703.269132852@linuxfoundation.org>
In-Reply-To: <20190326042700.963224437@linuxfoundation.org>
5.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu <yuchao0@huawei.com>
commit 48432984d718c95cf13e26d487c2d1b697c3c01f upstream.
Thread A                                Thread B
- __fput
 - f2fs_release_file
  - drop_inmem_pages
   - mutex_lock(&fi->inmem_lock)
   - __revoke_inmem_pages
    - lock_page(page)
                                        - open
                                         - f2fs_setattr
                                          - truncate_setsize
                                           - truncate_inode_pages_range
                                            - lock_page(page)
                                            - truncate_cleanup_page
                                             - f2fs_invalidate_page
                                              - drop_inmem_page
                                               - mutex_lock(&fi->inmem_lock);
We may encounter the above ABBA deadlock, as reported by Kyungtae Kim:
I'm reporting a bug in linux-4.17.19: "INFO: task hung in
drop_inmem_page" (no reproducer)
I think this might be somehow related to the following:
https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ
=========================================
INFO: task syz-executor7:10822 blocked for more than 120 seconds.
Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7 D27024 10822 6346 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2867 [inline]
__schedule+0x721/0x1e60 kernel/sched/core.c:3515
schedule+0x88/0x1c0 kernel/sched/core.c:3559
schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
__mutex_lock_common kernel/locking/mutex.c:833 [inline]
__mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
do_invalidatepage mm/truncate.c:165 [inline]
truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
truncate_inode_pages mm/truncate.c:478 [inline]
truncate_pagecache+0x6d/0x90 mm/truncate.c:801
truncate_setsize+0x81/0xa0 mm/truncate.c:826
f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
notify_change+0xa62/0xe80 fs/attr.c:313
do_truncate+0x12e/0x1e0 fs/open.c:63
do_last fs/namei.c:2955 [inline]
path_openat+0x2042/0x29f0 fs/namei.c:3505
do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
do_sys_open+0x35e/0x4e0 fs/open.c:1101
__do_sys_open fs/open.c:1119 [inline]
__se_sys_open fs/open.c:1114 [inline]
__x64_sys_open+0x89/0xc0 fs/open.c:1114
do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
INFO: task syz-executor7:10858 blocked for more than 120 seconds.
Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7 D28880 10858 6346 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2867 [inline]
__schedule+0x721/0x1e60 kernel/sched/core.c:3515
schedule+0x88/0x1c0 kernel/sched/core.c:3559
__rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
__down_write arch/x86/include/asm/rwsem.h:142 [inline]
down_write+0x58/0xa0 kernel/locking/rwsem.c:72
inode_lock include/linux/fs.h:713 [inline]
do_truncate+0x120/0x1e0 fs/open.c:61
do_last fs/namei.c:2955 [inline]
path_openat+0x2042/0x29f0 fs/namei.c:3505
do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
do_sys_open+0x35e/0x4e0 fs/open.c:1101
__do_sys_open fs/open.c:1119 [inline]
__se_sys_open fs/open.c:1114 [inline]
__x64_sys_open+0x89/0xc0 fs/open.c:1114
do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
INFO: task syz-executor5:10829 blocked for more than 120 seconds.
Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor5 D28760 10829 6308 0x80000002
Call Trace:
context_switch kernel/sched/core.c:2867 [inline]
__schedule+0x721/0x1e60 kernel/sched/core.c:3515
schedule+0x88/0x1c0 kernel/sched/core.c:3559
io_schedule+0x21/0x80 kernel/sched/core.c:5179
wait_on_page_bit_common mm/filemap.c:1100 [inline]
__lock_page+0x2b5/0x390 mm/filemap.c:1273
lock_page include/linux/pagemap.h:483 [inline]
__revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
__fput+0x2c7/0x780 fs/file_table.c:209
____fput+0x1a/0x20 fs/file_table.c:243
task_work_run+0x151/0x1d0 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x8ba/0x30a0 kernel/exit.c:865
do_group_exit+0x13b/0x3a0 kernel/exit.c:968
get_signal+0x6bb/0x1650 kernel/signal.c:2482
do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700
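This patch uses trylock_page() in the f2fs_drop_inmem_pages() path to
mitigate this deadlock: a page whose lock cannot be taken immediately is
skipped, and the loop retries it after releasing inmem_lock.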
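The retry-with-trylock idea can be sketched in user space. This is only an
analogy built on POSIX threads, not kernel code: list_lock stands in for
fi->inmem_lock, page_lock[] for the per-page locks, and every name below
(drop_all_pages, truncate_pages, run_demo, NPAGES) is hypothetical. The
sketch also checks the list under the mutex, which differs from the
patched kernel loop.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

#define NPAGES 4

/* Hypothetical stand-ins: list_lock ~ fi->inmem_lock, page_lock[i] ~ page lock. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t page_lock[NPAGES];
static bool on_list[NPAGES];	/* protected by list_lock */

/*
 * Drop path (lock order: list -> page). It never blocks on a page lock
 * while holding list_lock; a busy page is skipped and retried on a later
 * pass, after list_lock has been released.
 */
static int drop_all_pages(void)
{
	int passes = 0;
	bool remaining = true;

	while (remaining) {
		remaining = false;
		pthread_mutex_lock(&list_lock);
		for (int i = 0; i < NPAGES; i++) {
			if (!on_list[i])
				continue;
			if (pthread_mutex_trylock(&page_lock[i]) != 0) {
				remaining = true;	/* busy: retry next pass */
				continue;
			}
			on_list[i] = false;
			pthread_mutex_unlock(&page_lock[i]);
		}
		pthread_mutex_unlock(&list_lock);
		passes++;
		if (remaining)
			sched_yield();	/* let the page-lock holder make progress */
	}
	return passes;
}

/* Truncate path (lock order: page -> list), the opposite of the drop path. */
static void *truncate_pages(void *arg)
{
	(void)arg;
	for (int i = 0; i < NPAGES; i++) {
		pthread_mutex_lock(&page_lock[i]);
		pthread_mutex_lock(&list_lock);
		on_list[i] = false;
		pthread_mutex_unlock(&list_lock);
		pthread_mutex_unlock(&page_lock[i]);
	}
	return NULL;
}

/* Runs both paths concurrently; returns the number of drop passes, or -1. */
static int run_demo(void)
{
	pthread_t t;
	int left = 0;

	for (int i = 0; i < NPAGES; i++) {
		pthread_mutex_init(&page_lock[i], NULL);
		on_list[i] = true;
	}
	if (pthread_create(&t, NULL, truncate_pages, NULL) != 0)
		return -1;
	int passes = drop_all_pages();
	pthread_join(t, NULL);

	for (int i = 0; i < NPAGES; i++) {
		left += on_list[i];
		pthread_mutex_destroy(&page_lock[i]);
	}
	assert(passes >= 1);
	return left == 0 ? passes : -1;
}
```

Calling run_demo() from a small main() and building with -pthread should
always return; replacing pthread_mutex_trylock() with pthread_mutex_lock()
in drop_all_pages() reintroduces the ABBA ordering shown above.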
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/f2fs/segment.c | 43 +++++++++++++++++++++++++++++++------------
1 file changed, 31 insertions(+), 12 deletions(-)
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -215,7 +215,8 @@ void f2fs_register_inmem_page(struct ino
}
static int __revoke_inmem_pages(struct inode *inode,
- struct list_head *head, bool drop, bool recover)
+ struct list_head *head, bool drop, bool recover,
+ bool trylock)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct inmem_pages *cur, *tmp;
@@ -227,7 +228,16 @@ static int __revoke_inmem_pages(struct i
if (drop)
trace_f2fs_commit_inmem_page(page, INMEM_DROP);
- lock_page(page);
+ if (trylock) {
+ /*
+ * to avoid deadlock in between page lock and
+ * inmem_lock.
+ */
+ if (!trylock_page(page))
+ continue;
+ } else {
+ lock_page(page);
+ }
f2fs_wait_on_page_writeback(page, DATA, true, true);
@@ -318,13 +328,19 @@ void f2fs_drop_inmem_pages(struct inode
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct f2fs_inode_info *fi = F2FS_I(inode);
- mutex_lock(&fi->inmem_lock);
- __revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
- spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
- if (!list_empty(&fi->inmem_ilist))
- list_del_init(&fi->inmem_ilist);
- spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
- mutex_unlock(&fi->inmem_lock);
+ while (!list_empty(&fi->inmem_pages)) {
+ mutex_lock(&fi->inmem_lock);
+ __revoke_inmem_pages(inode, &fi->inmem_pages,
+ true, false, true);
+
+ if (list_empty(&fi->inmem_pages)) {
+ spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+ if (!list_empty(&fi->inmem_ilist))
+ list_del_init(&fi->inmem_ilist);
+ spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+ }
+ mutex_unlock(&fi->inmem_lock);
+ }
clear_inode_flag(inode, FI_ATOMIC_FILE);
fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
@@ -429,12 +445,15 @@ retry:
* recovery or rewrite & commit last transaction. For other
* error number, revoking was done by filesystem itself.
*/
- err = __revoke_inmem_pages(inode, &revoke_list, false, true);
+ err = __revoke_inmem_pages(inode, &revoke_list,
+ false, true, false);
/* drop all uncommitted pages */
- __revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
+ __revoke_inmem_pages(inode, &fi->inmem_pages,
+ true, false, false);
} else {
- __revoke_inmem_pages(inode, &revoke_list, false, false);
+ __revoke_inmem_pages(inode, &revoke_list,
+ false, false, false);
}
return err;