From: "Luis Henriques (SUSE)" <luis.henriques@linux.dev>
To: Theodore Ts'o <tytso@mit.edu>, Andreas Dilger <adilger@dilger.ca>,
Jan Kara <jack@suse.cz>,
Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
"Luis Henriques (SUSE)" <luis.henriques@linux.dev>
Subject: [PATCH v2] ext4: fix fast commit inode enqueueing during a full journal commit
Date: Thu, 23 May 2024 12:16:18 +0100
Message-ID: <20240523111618.17012-1-luis.henriques@linux.dev>
When a full journal commit is ongoing, any fast commit has to be enqueued
into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
is done only once, i.e. if an inode is already queued in a previous fast
commit entry it won't be enqueued again. However, if a full commit starts
_after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
be done into FC_Q_STAGING, and ext4_fc_track_template() currently does not
do this.
This patch fixes the issue by flagging an inode that is already enqueued in
either queue. Later, during the fast commit clean-up callback, if the
inode has a tid that is bigger than the one being handled, that inode is
re-enqueued into STAGING and then spliced back into MAIN.
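To make the flag-and-re-enqueue idea easier to follow, here is a minimal
userspace sketch of it. It is a toy model, not kernel code: the toy_* names
are hypothetical, a single inode stands in for the list handling, and the
plain integer tid comparison simply mirrors the hunk in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model of the two fast commit queues; NOT kernel code. */
enum toy_queue { TOY_Q_NONE, TOY_Q_MAIN, TOY_Q_STAGING };

struct toy_inode {
	enum toy_queue queue;	/* queue the inode currently sits on */
	unsigned int fc_next;	/* stands in for i_fc_next; 0 means "not flagged" */
};

/* Models the enqueue step at the end of ext4_fc_track_template(). */
static void toy_track(struct toy_inode *ino, int commit_ongoing,
		      unsigned int tid)
{
	assert(ino != NULL);
	if (ino->queue == TOY_Q_NONE)
		/* first enqueue: pick the queue based on the journal state */
		ino->queue = commit_ongoing ? TOY_Q_STAGING : TOY_Q_MAIN;
	else
		/* already queued: flag the inode with the newer tid */
		ino->fc_next = tid;
}

/* Models the FC_Q_MAIN walk in the clean-up callback for commit 'tid'. */
static void toy_cleanup(struct toy_inode *ino, unsigned int tid)
{
	assert(ino != NULL);
	if (ino->queue == TOY_Q_MAIN) {
		ino->queue = TOY_Q_NONE;		/* dequeue from MAIN */
		if (ino->fc_next == tid)
			ino->fc_next = 0;		/* flag consumed */
		else if (ino->fc_next > tid)
			ino->queue = TOY_Q_STAGING;	/* keep for a later commit */
	}
	/* STAGING is spliced back into MAIN at the end of the clean-up */
	if (ino->queue == TOY_Q_STAGING)
		ino->queue = TOY_Q_MAIN;
}
```

In this model, tracking an inode at tid 5, tracking it again at tid 6 while
a commit is in progress, and then cleaning up tid 5 leaves the inode on MAIN
for the next fast commit instead of dropping it; cleaning up tid 6 then
clears the flag and dequeues it.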
This bug was found using fstest generic/047. This test creates several
32k-byte files, sync'ing each of them after its creation, and then shuts
down the filesystem. Some data may be lost in this operation; for example, a
file may have its size truncated to zero.
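The write-and-sync part of that workload can be approximated with the
sketch below. It is only an illustration under assumptions: the helper
names and the file naming are made up, and the filesystem shutdown and
remount that the real test performs between the two loops are left out, so
on its own it cannot reproduce the bug.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define FILE_SIZE (32 * 1024)

/* Create one FILE_SIZE-byte file and fsync() it; 0 on success, -1 on error. */
static int create_synced_file(const char *path)
{
	static char buf[FILE_SIZE];
	size_t done = 0;
	int fd, ret = -1;

	memset(buf, 0xab, sizeof(buf));
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	while (done < sizeof(buf)) {
		ssize_t n = write(fd, buf + done, sizeof(buf) - done);

		if (n <= 0)
			goto out;
		done += (size_t)n;
	}
	if (fsync(fd) == 0)
		ret = 0;
out:
	close(fd);
	return ret;
}

/*
 * Create 'count' synced files under 'dir', then return how many of them
 * still have the expected size, or -1 if a file could not be written.
 */
static int create_and_check(const char *dir, int count)
{
	char path[4096];
	struct stat st;
	int i, ok = 0;

	assert(dir != NULL && count > 0);
	for (i = 0; i < count; i++) {
		snprintf(path, sizeof(path), "%s/file-%d", dir, i);
		if (create_synced_file(path) != 0)
			return -1;
	}
	/* The real test shuts the filesystem down and remounts it here. */
	for (i = 0; i < count; i++) {
		snprintf(path, sizeof(path), "%s/file-%d", dir, i);
		if (stat(path, &st) == 0 && st.st_size == FILE_SIZE)
			ok++;
	}
	return ok;
}
```

With the bug present, a run that does include the shutdown step would be
expected to find fewer correctly sized files than were created, since some
of them come back truncated to zero.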
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
Hi!
(Now Cc'ing Harshad, as I should have done in the initial RFC.)
This v2 is a completely different solution, hinted at by Jan Kara. I hope my
understanding of his suggestion is correct. Also, I've dropped the second
patch as it didn't make sense, as Jan also pointed out.
Finally, I haven't yet done a review of Harshad's patchset [1] (hope to
get to it soon), but a quick test shows the issue is still present there.
The good news is that this patch can be trivially applied on top of it.
[1] https://lore.kernel.org/all/20240520055153.136091-1-harshadshirwadkar@gmail.com
Cheers,
--
Luis
fs/ext4/ext4.h | 11 ++++++++++-
fs/ext4/fast_commit.c | 11 +++++++++++
fs/ext4/super.c | 1 +
3 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 983dad8c07ec..4c308c18c3da 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1062,9 +1062,18 @@ struct ext4_inode_info {
/* Fast commit wait queue for this inode */
wait_queue_head_t i_fc_wait;
- /* Protect concurrent accesses on i_fc_lblk_start, i_fc_lblk_len */
+ /*
+ * Protect concurrent accesses on i_fc_lblk_start, i_fc_lblk_len,
+ * i_fc_next
+ */
struct mutex i_fc_lock;
+ /*
+ * Used to flag an inode as part of the next fast commit; will be
+ * reset during fast commit clean-up
+ */
+ tid_t i_fc_next;
+
/*
* i_disksize keeps track of what the inode size is ON DISK, not
* in memory. During truncate, i_size is set to the new size by
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 87c009e0c59a..bfdf249f0783 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -402,6 +402,8 @@ static int ext4_fc_track_template(
sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) ?
&sbi->s_fc_q[FC_Q_STAGING] :
&sbi->s_fc_q[FC_Q_MAIN]);
+ else
+ ei->i_fc_next = tid;
spin_unlock(&sbi->s_fc_lock);
return ret;
@@ -1280,6 +1282,15 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
i_fc_list) {
list_del_init(&iter->i_fc_list);
+ if (iter->i_fc_next == tid)
+ iter->i_fc_next = 0;
+ else if (iter->i_fc_next > tid)
+ /*
+ * re-enqueue inode into STAGING, which will later be
+ * spliced back into MAIN
+ */
+ list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
+ &sbi->s_fc_q[FC_Q_STAGING]);
ext4_clear_inode_state(&iter->vfs_inode,
EXT4_STATE_FC_COMMITTING);
if (iter->i_sync_tid <= tid)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 893ab80dafba..56f416656d96 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1437,6 +1437,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
INIT_WORK(&ei->i_rsv_conversion_work, ext4_end_io_rsv_work);
ext4_fc_init_inode(&ei->vfs_inode);
mutex_init(&ei->i_fc_lock);
+ ei->i_fc_next = 0;
return &ei->vfs_inode;
}