From: Luis Henriques <luis.henriques@linux.dev>
To: Jan Kara <jack@suse.cz>
Cc: "Luis Henriques (SUSE)" <luis.henriques@linux.dev>,
Theodore Ts'o <tytso@mit.edu>,
Andreas Dilger <adilger@dilger.ca>, Jan Kara <jack@suse.com>,
linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] ext4: fix fast commit inode enqueueing during a full journal commit
Date: Wed, 22 May 2024 14:21:47 +0100 [thread overview]
Message-ID: <87pltedlsk.fsf@brahms.olymp> (raw)
In-Reply-To: <20240522103545.ypmmoyxvls52i6yl@quack3> (Jan Kara's message of "Wed, 22 May 2024 12:35:45 +0200")
On Wed 22 May 2024 12:35:45 PM +02, Jan Kara wrote;
> On Tue 21-05-24 16:45:34, Luis Henriques (SUSE) wrote:
>> When a full journal commit is on-going, any fast commit has to be enqueued
>> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
>> is done only once, i.e. if an inode is already queued in a previous fast
>> commit entry it won't be enqueued again. However, if a full commit starts
>> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
>> be done into FC_Q_STAGING. And this is not being done in function
>> ext4_fc_track_template().
>
> Ah, good catch.
>
>> This patch fixes the issue by simply re-enqueuing the inode from the MAIN
>> into the STAGING queue.
>>
>> This bug was found using fstest generic/047. This test creates several 32k
>> bytes files, sync'ing each of them after it's creation, and then shutting
>> down the filesystem. Some data may be loss in this operation; for example a
>> file may have it's size truncated to zero.
>>
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>> ---
>> fs/ext4/fast_commit.c | 19 +++++++++++++------
>> 1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
>> index 87c009e0c59a..337b5289cf11 100644
>> --- a/fs/ext4/fast_commit.c
>> +++ b/fs/ext4/fast_commit.c
>> @@ -396,12 +396,19 @@ static int ext4_fc_track_template(
>> return ret;
>>
>> spin_lock(&sbi->s_fc_lock);
>> - if (list_empty(&EXT4_I(inode)->i_fc_list))
>> - list_add_tail(&EXT4_I(inode)->i_fc_list,
>> - (sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING ||
>> - sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) ?
>> - &sbi->s_fc_q[FC_Q_STAGING] :
>> - &sbi->s_fc_q[FC_Q_MAIN]);
>> + if (sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING ||
>> + sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) {
>> + if (list_empty(&EXT4_I(inode)->i_fc_list))
>> + list_add_tail(&EXT4_I(inode)->i_fc_list,
>> + &sbi->s_fc_q[FC_Q_STAGING]);
>> + else
>> + list_move_tail(&EXT4_I(inode)->i_fc_list,
>> + &sbi->s_fc_q[FC_Q_STAGING]);
>
> So I'm not sure this is actually safe. I'm concerned about the following
> race:
>
> Task1 Task2
>
> handle = ext4_journal_start(..)
> modify inode_X
> ext4_fc_track_inode(inode_X)
> ext4_fsync(inode_X)
> ext4_fc_commit()
> jbd2_fc_begin_commit()
> journal->j_flags |= JBD2_FAST_COMMIT_ONGOING;
> ...
> jbd2_journal_lock_updates()
> blocks waiting for handle of Task2
> modify inode_X
> ext4_fc_track_inode(inode_X)
> - moves inode out of FC_Q_MAIN
> ext4_journal_stop()
> fast commit proceeds but skips inode_X...
Hmm... I see, the problem is deeper that I thought.
> How we deal with a similar issue in jbd2 for ordinary buffers is that we
> just mark the buffer as *also* belonging to the next transaction (by
> setting jh->b_next_transaction) and during commit cleanup we move the bh to
> the appropriate list of the next transaction. Here, we could mark the inode
> as also being part of the next fast commit and during fastcommit cleanup we
> could move it to FC_Q_STAGING which is then spliced back to FC_Q_MAIN.
Yeah, I guess that would work. I'll need to add a new field to flag the
'next commit' in struct ext4_inode_info. I'll need to play a bit with it
and see what I can came up with. Thanks for the suggestion.
> Also Harshad has recently posted changes to fast commit code that modify
> how fast commits are serialized (in particular jbd2_journal_lock_updates()
> is gone). I didn't read them yet but your change definitely needs a careful
> verification against those changes to make sure we don't introduce new data
> integrity issues.
>
Right, I saw his patchset only after sending my RFC (and I should have
probably included him on the CC as well; probably get_maintainer.pl isn't
picking his email).
I'll need to look at those changes too, which will probably take me some
time as most of that code isn't familiar to me.
Thanks a lot for your review, Jan. Much appreciated.
Cheers,
--
Luis
>> + } else {
>> + if (list_empty(&EXT4_I(inode)->i_fc_list))
>> + list_add_tail(&EXT4_I(inode)->i_fc_list,
>> + &sbi->s_fc_q[FC_Q_MAIN]);
>> + }
>> spin_unlock(&sbi->s_fc_lock);
>>
>> return ret;
>
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
next prev parent reply other threads:[~2024-05-22 13:21 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-21 15:45 [RFC PATCH 0/2] ext4: two small fast commit fixes Luis Henriques (SUSE)
2024-05-21 15:45 ` [RFC PATCH 1/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques (SUSE)
2024-05-22 10:35 ` Jan Kara
2024-05-22 13:21 ` Luis Henriques [this message]
2024-05-21 15:45 ` [RFC PATCH 2/2] jbd2: reset fast commit offset only after fs cleanup is done Luis Henriques (SUSE)
2024-05-22 10:45 ` Jan Kara
2024-05-22 13:36 ` Luis Henriques
2024-05-23 7:44 ` Jan Kara
2024-05-23 8:52 ` Luis Henriques
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87pltedlsk.fsf@brahms.olymp \
--to=luis.henriques@linux.dev \
--cc=adilger@dilger.ca \
--cc=jack@suse.com \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.