From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Ye Bin <yebin10@huawei.com>, Jan Kara <jack@suse.cz>,
Theodore Ts'o <tytso@mit.edu>, Sasha Levin <sashal@kernel.org>,
jack@suse.com, linux-ext4@vger.kernel.org
Subject: [PATCH AUTOSEL 5.10 5/8] jbd2: fix soft lockup in journal_finish_inode_data_buffers()
Date: Mon, 18 Dec 2023 07:46:26 -0500 [thread overview]
Message-ID: <20231218124635.1381482-5-sashal@kernel.org> (raw)
In-Reply-To: <20231218124635.1381482-1-sashal@kernel.org>
From: Ye Bin <yebin10@huawei.com>
[ Upstream commit 6c02757c936063f0631b4e43fe156f8c8f1f351f ]
There's issue when do io test:
WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G OE
Call trace:
dump_backtrace+0x0/0x1a0
show_stack+0x24/0x30
dump_stack+0xb0/0x100
watchdog_timer_fn+0x254/0x3f8
__hrtimer_run_queues+0x11c/0x380
hrtimer_interrupt+0xfc/0x2f8
arch_timer_handler_phys+0x38/0x58
handle_percpu_devid_irq+0x90/0x248
generic_handle_irq+0x3c/0x58
__handle_domain_irq+0x68/0xc0
gic_handle_irq+0x90/0x320
el1_irq+0xcc/0x180
queued_spin_lock_slowpath+0x1d8/0x320
jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
kjournald2+0xec/0x2f0 [jbd2]
kthread+0x134/0x138
ret_from_fork+0x10/0x18
Analyzed informations from vmcore as follows:
(1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list';
(2) Now is processing the 855th jbd2_inode;
(3) JBD2 task has TIF_NEED_RESCHED flag;
(4) There's no pags in address_space around the 855th jbd2_inode;
(5) There are some process is doing drop caches;
(6) Mounted with 'nodioread_nolock' option;
(7) 128 CPUs;
According to informations from vmcore we know 'journal->j_list_lock' spin lock
competition is fierce. So journal_finish_inode_data_buffers() maybe process
slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors().
However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK,
will not call cond_resched(). So may lead to soft lockup.
journal_finish_inode_data_buffers
filemap_fdatawait_range_keep_errors
__filemap_fdatawait_range
while (index <= end)
nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
if (!nr_pages)
break; --> If 'nr_pages' is equal zero will break, then will not call cond_resched()
for (i = 0; i < nr_pages; i++)
wait_on_page_writeback(page);
cond_resched();
To solve above issue, add scheduling point in the journal_finish_inode_data_buffers();
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/jbd2/commit.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index fa24b407a9dcb..db137671a41f1 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -300,6 +300,7 @@ static int journal_finish_inode_data_buffers(journal_t *journal,
if (!ret)
ret = err;
}
+ cond_resched();
spin_lock(&journal->j_list_lock);
jinode->i_flags &= ~JI_COMMIT_RUNNING;
smp_mb();
--
2.43.0
next prev parent reply other threads:[~2023-12-18 12:46 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-18 12:46 [PATCH AUTOSEL 5.10 1/8] clk: rockchip: rk3128: Fix HCLK_OTG gate register Sasha Levin
2023-12-18 12:46 ` Sasha Levin
2023-12-18 12:46 ` Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 2/8] jbd2: correct the printing of write_flags in jbd2_write_superblock() Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 3/8] drm/crtc: Fix uninit-value bug in drm_mode_setcrtc Sasha Levin
2023-12-18 12:46 ` Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 4/8] neighbour: Don't let neigh_forced_gc() disable preemption for long Sasha Levin
2023-12-18 12:46 ` Sasha Levin [this message]
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 6/8] tracing: Have large events show up as '[LINE TOO BIG]' instead of nothing Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 7/8] tracing: Add size check when printing trace_marker output Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 8/8] ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231218124635.1381482-5-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=jack@suse.com \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=yebin10@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.