public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Luis Henriques <luis.henriques@linux.dev>
To: Jan Kara <jack@suse.cz>
Cc: "Luis Henriques (SUSE)" <luis.henriques@linux.dev>,
	 Theodore Ts'o <tytso@mit.edu>,
	 Andreas Dilger <adilger@dilger.ca>,  Jan Kara <jack@suse.com>,
	 linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] ext4: fix fast commit inode enqueueing during a full journal commit
Date: Wed, 22 May 2024 14:21:47 +0100	[thread overview]
Message-ID: <87pltedlsk.fsf@brahms.olymp> (raw)
In-Reply-To: <20240522103545.ypmmoyxvls52i6yl@quack3> (Jan Kara's message of "Wed, 22 May 2024 12:35:45 +0200")

On Wed 22 May 2024 12:35:45 PM +02, Jan Kara wrote;

> On Tue 21-05-24 16:45:34, Luis Henriques (SUSE) wrote:
>> When a full journal commit is on-going, any fast commit has to be enqueued
>> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN.  This enqueueing
>> is done only once, i.e. if an inode is already queued in a previous fast
>> commit entry it won't be enqueued again.  However, if a full commit starts
>> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
>> be done into FC_Q_STAGING.  And this is not being done in function
>> ext4_fc_track_template().
>
> Ah, good catch.
>
>> This patch fixes the issue by simply re-enqueuing the inode from the MAIN
>> into the STAGING queue.
>> 
>> This bug was found using fstest generic/047.  This test creates several 32k
>> bytes files, sync'ing each of them after it's creation, and then shutting
>> down the filesystem.  Some data may be loss in this operation; for example a
>> file may have it's size truncated to zero.
>> 
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>> ---
>>  fs/ext4/fast_commit.c | 19 +++++++++++++------
>>  1 file changed, 13 insertions(+), 6 deletions(-)
>> 
>> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
>> index 87c009e0c59a..337b5289cf11 100644
>> --- a/fs/ext4/fast_commit.c
>> +++ b/fs/ext4/fast_commit.c
>> @@ -396,12 +396,19 @@ static int ext4_fc_track_template(
>>  		return ret;
>>  
>>  	spin_lock(&sbi->s_fc_lock);
>> -	if (list_empty(&EXT4_I(inode)->i_fc_list))
>> -		list_add_tail(&EXT4_I(inode)->i_fc_list,
>> -				(sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING ||
>> -				 sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) ?
>> -				&sbi->s_fc_q[FC_Q_STAGING] :
>> -				&sbi->s_fc_q[FC_Q_MAIN]);
>> +	if (sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING ||
>> +	    sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) {
>> +		if (list_empty(&EXT4_I(inode)->i_fc_list))
>> +			list_add_tail(&EXT4_I(inode)->i_fc_list,
>> +				      &sbi->s_fc_q[FC_Q_STAGING]);
>> +		else
>> +			list_move_tail(&EXT4_I(inode)->i_fc_list,
>> +				       &sbi->s_fc_q[FC_Q_STAGING]);
>
> So I'm not sure this is actually safe. I'm concerned about the following
> race:
>
> Task1					Task2
>
> 					handle = ext4_journal_start(..)
> modify inode_X
>   ext4_fc_track_inode(inode_X)
> ext4_fsync(inode_X)
>   ext4_fc_commit()
>     jbd2_fc_begin_commit()
>       journal->j_flags |= JBD2_FAST_COMMIT_ONGOING;
>       ...
>       jbd2_journal_lock_updates()
>         blocks waiting for handle of Task2
> 					modify inode_X
> 					  ext4_fc_track_inode(inode_X)
> 					    - moves inode out of FC_Q_MAIN
> 					ext4_journal_stop()
>     fast commit proceeds but skips inode_X...

Hmm... I see, the problem is deeper that I thought.

> How we deal with a similar issue in jbd2 for ordinary buffers is that we
> just mark the buffer as *also* belonging to the next transaction (by
> setting jh->b_next_transaction) and during commit cleanup we move the bh to
> the appropriate list of the next transaction. Here, we could mark the inode
> as also being part of the next fast commit and during fastcommit cleanup we
> could move it to FC_Q_STAGING which is then spliced back to FC_Q_MAIN.

Yeah, I guess that would work.  I'll need to add a new field to flag the
'next commit' in struct ext4_inode_info.  I'll need to play a bit with it
and see what I can came up with.  Thanks for the suggestion.

> Also Harshad has recently posted changes to fast commit code that modify
> how fast commits are serialized (in particular jbd2_journal_lock_updates()
> is gone). I didn't read them yet but your change definitely needs a careful
> verification against those changes to make sure we don't introduce new data
> integrity issues.
>

Right, I saw his patchset only after sending my RFC (and I should have
probably included him on the CC as well; probably get_maintainer.pl isn't
picking his email).

I'll need to look at those changes too, which will probably take me some
time as most of that code isn't familiar to me.

Thanks a lot for your review, Jan.  Much appreciated.

Cheers,
-- 
Luis

>> +	} else {
>> +		if (list_empty(&EXT4_I(inode)->i_fc_list))
>> +			list_add_tail(&EXT4_I(inode)->i_fc_list,
>> +				      &sbi->s_fc_q[FC_Q_MAIN]);
>> +	}
>>  	spin_unlock(&sbi->s_fc_lock);
>>  
>>  	return ret;
>
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

  reply	other threads:[~2024-05-22 13:21 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-21 15:45 [RFC PATCH 0/2] ext4: two small fast commit fixes Luis Henriques (SUSE)
2024-05-21 15:45 ` [RFC PATCH 1/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques (SUSE)
2024-05-22 10:35   ` Jan Kara
2024-05-22 13:21     ` Luis Henriques [this message]
2024-05-21 15:45 ` [RFC PATCH 2/2] jbd2: reset fast commit offset only after fs cleanup is done Luis Henriques (SUSE)
2024-05-22 10:45   ` Jan Kara
2024-05-22 13:36     ` Luis Henriques
2024-05-23  7:44       ` Jan Kara
2024-05-23  8:52         ` Luis Henriques

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pltedlsk.fsf@brahms.olymp \
    --to=luis.henriques@linux.dev \
    --cc=adilger@dilger.ca \
    --cc=jack@suse.com \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox