From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Ye Bin <yebin10@huawei.com>, Jan Kara <jack@suse.cz>,
Theodore Ts'o <tytso@mit.edu>, Sasha Levin <sashal@kernel.org>,
jack@suse.com, linux-ext4@vger.kernel.org
Subject: [PATCH AUTOSEL 5.10 5/8] jbd2: fix soft lockup in journal_finish_inode_data_buffers()
Date: Mon, 18 Dec 2023 07:46:26 -0500 [thread overview]
Message-ID: <20231218124635.1381482-5-sashal@kernel.org> (raw)
In-Reply-To: <20231218124635.1381482-1-sashal@kernel.org>
From: Ye Bin <yebin10@huawei.com>
[ Upstream commit 6c02757c936063f0631b4e43fe156f8c8f1f351f ]
There's issue when do io test:
WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G OE
Call trace:
dump_backtrace+0x0/0x1a0
show_stack+0x24/0x30
dump_stack+0xb0/0x100
watchdog_timer_fn+0x254/0x3f8
__hrtimer_run_queues+0x11c/0x380
hrtimer_interrupt+0xfc/0x2f8
arch_timer_handler_phys+0x38/0x58
handle_percpu_devid_irq+0x90/0x248
generic_handle_irq+0x3c/0x58
__handle_domain_irq+0x68/0xc0
gic_handle_irq+0x90/0x320
el1_irq+0xcc/0x180
queued_spin_lock_slowpath+0x1d8/0x320
jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
kjournald2+0xec/0x2f0 [jbd2]
kthread+0x134/0x138
ret_from_fork+0x10/0x18
Analyzed informations from vmcore as follows:
(1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list';
(2) Now is processing the 855th jbd2_inode;
(3) JBD2 task has TIF_NEED_RESCHED flag;
(4) There's no pags in address_space around the 855th jbd2_inode;
(5) There are some process is doing drop caches;
(6) Mounted with 'nodioread_nolock' option;
(7) 128 CPUs;
According to informations from vmcore we know 'journal->j_list_lock' spin lock
competition is fierce. So journal_finish_inode_data_buffers() maybe process
slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors().
However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK,
will not call cond_resched(). So may lead to soft lockup.
journal_finish_inode_data_buffers
filemap_fdatawait_range_keep_errors
__filemap_fdatawait_range
while (index <= end)
nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
if (!nr_pages)
break; --> If 'nr_pages' is equal zero will break, then will not call cond_resched()
for (i = 0; i < nr_pages; i++)
wait_on_page_writeback(page);
cond_resched();
To solve above issue, add scheduling point in the journal_finish_inode_data_buffers();
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/jbd2/commit.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index fa24b407a9dcb..db137671a41f1 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -300,6 +300,7 @@ static int journal_finish_inode_data_buffers(journal_t *journal,
if (!ret)
ret = err;
}
+ cond_resched();
spin_lock(&journal->j_list_lock);
jinode->i_flags &= ~JI_COMMIT_RUNNING;
smp_mb();
--
2.43.0
next prev parent reply other threads:[~2023-12-18 12:46 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-18 12:46 [PATCH AUTOSEL 5.10 1/8] clk: rockchip: rk3128: Fix HCLK_OTG gate register Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 2/8] jbd2: correct the printing of write_flags in jbd2_write_superblock() Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 3/8] drm/crtc: Fix uninit-value bug in drm_mode_setcrtc Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 4/8] neighbour: Don't let neigh_forced_gc() disable preemption for long Sasha Levin
2023-12-18 12:46 ` Sasha Levin [this message]
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 6/8] tracing: Have large events show up as '[LINE TOO BIG]' instead of nothing Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 7/8] tracing: Add size check when printing trace_marker output Sasha Levin
2023-12-18 12:46 ` [PATCH AUTOSEL 5.10 8/8] ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231218124635.1381482-5-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=jack@suse.com \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=yebin10@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox