From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Chunhai Guo <guochunhai@vivo.com>, Jan Kara <jack@suse.cz>,
Christian Brauner <brauner@kernel.org>,
Sasha Levin <sashal@kernel.org>,
viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org
Subject: [PATCH AUTOSEL 5.15 02/10] fs-writeback: do not requeue a clean inode having skipped pages
Date: Sat, 7 Oct 2023 20:49:41 -0400 [thread overview]
Message-ID: <20231008004950.3768189-2-sashal@kernel.org> (raw)
In-Reply-To: <20231008004950.3768189-1-sashal@kernel.org>
From: Chunhai Guo <guochunhai@vivo.com>
[ Upstream commit be049c3a088d512187407b7fd036cecfab46d565 ]
When writing back an inode and performing an fsync on it concurrently, a
deadlock issue may arise as shown below. In each writeback iteration, a
clean inode is requeued to the wb->b_dirty queue due to non-zero
pages_skipped, without anything actually being written. This causes an
infinite loop and prevents the plug from being flushed, resulting in a
deadlock. We now avoid requeuing the clean inode to prevent this issue.
wb_writeback fsync (inode-Y)
blk_start_plug(&plug)
for (;;) {
iter i-1: some reqs with page-X added into plug->mq_list // f2fs node page-X with PG_writeback
filemap_fdatawrite
__filemap_fdatawrite_range // write inode-Y with sync_mode WB_SYNC_ALL
do_writepages
f2fs_write_data_pages
__f2fs_write_data_pages // wb_sync_req[DATA]++ for WB_SYNC_ALL
f2fs_write_cache_pages
f2fs_write_single_data_page
f2fs_do_write_data_page
f2fs_outplace_write_data
f2fs_update_data_blkaddr
f2fs_wait_on_page_writeback
wait_on_page_writeback // wait for f2fs node page-X
iter i:
progress = __writeback_inodes_wb(wb, work)
. writeback_sb_inodes
. __writeback_single_inode // write inode-Y with sync_mode WB_SYNC_NONE
. . do_writepages
. . f2fs_write_data_pages
. . . __f2fs_write_data_pages // skip writepages due to (wb_sync_req[DATA]>0)
. . . wbc->pages_skipped += get_dirty_pages(inode) // wbc->pages_skipped = 1
. if (!(inode->i_state & I_DIRTY_ALL)) // i_state = I_SYNC | I_SYNC_QUEUED
. total_wrote++; // total_wrote = 1
. requeue_inode // requeue inode-Y to wb->b_dirty queue due to non-zero pages_skipped
if (progress) // progress = 1
continue;
iter i+1:
queue_io
// similar process with iter i, infinite for-loop !
}
blk_finish_plug(&plug) // flush plug won't be called
Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230916045131.957929-1-guochunhai@vivo.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/fs-writeback.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index c76537a6826a7..5f0abea107e46 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1557,10 +1557,15 @@ static void requeue_inode(struct inode *inode, struct bdi_writeback *wb,
if (wbc->pages_skipped) {
/*
- * writeback is not making progress due to locked
- * buffers. Skip this inode for now.
+ * Writeback is not making progress due to locked buffers.
+ * Skip this inode for now. Although having skipped pages
+ * is odd for clean inodes, it can happen for some
+ * filesystems so handle that gracefully.
*/
- redirty_tail_locked(inode, wb);
+ if (inode->i_state & I_DIRTY_ALL)
+ redirty_tail_locked(inode, wb);
+ else
+ inode_cgwb_move_to_attached(inode, wb);
return;
}
--
2.40.1
next prev parent reply other threads:[~2023-10-08 0:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-08 0:49 [PATCH AUTOSEL 5.15 01/10] ARM: dts: ti: omap: Fix noisy serial with overrun-throttle-ms for mapphone Sasha Levin
2023-10-08 0:49 ` Sasha Levin [this message]
2023-10-08 0:49 ` [PATCH AUTOSEL 5.15 03/10] btrfs: return -EUCLEAN for delayed tree ref with a ref count not equals to 1 Sasha Levin
2023-10-08 0:49 ` [PATCH AUTOSEL 5.15 04/10] btrfs: initialize start_slot in btrfs_log_prealloc_extents Sasha Levin
2023-10-08 0:49 ` [PATCH AUTOSEL 5.15 05/10] i2c: mux: Avoid potential false error message in i2c_mux_add_adapter Sasha Levin
2023-10-08 0:49 ` [PATCH AUTOSEL 5.15 06/10] overlayfs: set ctime when setting mtime and atime Sasha Levin
2023-10-08 0:49 ` [PATCH AUTOSEL 5.15 07/10] gpio: timberdale: Fix potential deadlock on &tgpio->lock Sasha Levin
2023-10-08 0:49 ` [PATCH AUTOSEL 5.15 08/10] ata: libata-core: Fix compilation warning in ata_dev_config_ncq() Sasha Levin
2023-10-08 0:49 ` [PATCH AUTOSEL 5.15 09/10] ata: libata-eh: Fix compilation warning in ata_eh_link_report() Sasha Levin
2023-10-08 0:49 ` [PATCH AUTOSEL 5.15 10/10] tracing: relax trace_event_eval_update() execution with cond_resched() Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231008004950.3768189-2-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=brauner@kernel.org \
--cc=guochunhai@vivo.com \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox