public inbox for linux-kernel@vger.kernel.org
From: Luis Henriques <luis.henriques@linux.dev>
To: Jan Kara <jack@suse.cz>
Cc: "Luis Henriques (SUSE)" <luis.henriques@linux.dev>,
	 Theodore Ts'o <tytso@mit.edu>,
	 Andreas Dilger <adilger@dilger.ca>,
	 Harshad Shirwadkar <harshadshirwadkar@gmail.com>,
	 linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] ext4: fix fast commit inode enqueueing during a full journal commit
Date: Mon, 27 May 2024 09:29:40 +0100
Message-ID: <87h6ej64jv.fsf@brahms.olymp>
In-Reply-To: <20240524162231.l5r4niz7awjgfju6@quack3> (Jan Kara's message of "Fri, 24 May 2024 18:22:31 +0200")

On Fri 24 May 2024 06:22:31 PM +02, Jan Kara wrote:

> On Thu 23-05-24 12:16:18, Luis Henriques (SUSE) wrote:
>> When a full journal commit is on-going, any fast commit has to be enqueued
>> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN.  This enqueueing
>> is done only once, i.e. if an inode is already queued in a previous fast
>> commit entry it won't be enqueued again.  However, if a full commit starts
>> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
>> be done into FC_Q_STAGING.  And this is not being done in function
>> ext4_fc_track_template().
>> 
>> This patch fixes the issue by flagging an inode that is already enqueued in
>> either queue.  Later, during the fast commit clean-up callback, if the
>> inode has a tid that is bigger than the one being handled, that inode is
>> re-enqueued into STAGING and then spliced back into MAIN.
>> 
>> This bug was found using fstest generic/047.  This test creates several
>> 32k-byte files, syncing each of them after its creation, and then shuts
>> down the filesystem.  Some data may be lost in this operation; for example, a
>> file may have its size truncated to zero.
>> 
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>
> Thanks for the fix. Some comments below:
>
>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>> index 983dad8c07ec..4c308c18c3da 100644
>> --- a/fs/ext4/ext4.h
>> +++ b/fs/ext4/ext4.h
>> @@ -1062,9 +1062,18 @@ struct ext4_inode_info {
>>  	/* Fast commit wait queue for this inode */
>>  	wait_queue_head_t i_fc_wait;
>>  
>> -	/* Protect concurrent accesses on i_fc_lblk_start, i_fc_lblk_len */
>> +	/*
>> +	 * Protect concurrent accesses on i_fc_lblk_start, i_fc_lblk_len,
>> +	 * i_fc_next
>> +	 */
>>  	struct mutex i_fc_lock;
>>  
>> +	/*
>> +	 * Used to flag an inode as part of the next fast commit; will be
>> +	 * reset during fast commit clean-up
>> +	 */
>> +	tid_t i_fc_next;
>> +
>
> Do we really need new tid in the inode? I'd be kind of hoping we could use
> EXT4_I(inode)->i_sync_tid for this - I can see we even already set it in
> ext4_fc_track_template() and used for similar comparisons in fast commit
> code.

Ah, true.  It looks like it could be used indeed.  We'll still need a flag
here, but a simple bool should be enough for that.
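
To make the idea concrete, here is a rough, untested user-space sketch of
how i_sync_tid plus a bool could replace the extra tid.  Everything in it
is an assumption for illustration only: struct fc_inode_model, the
i_fc_requeue flag and the model_*() helpers are hypothetical names and do
not exist in ext4, and the local tid_gt() only imitates the jbd2 helper of
the same name.  Whether the flag should be cleared on every clean-up pass
is also an open design choice, not something settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's tid_t (unsigned 32-bit counter). */
typedef uint32_t tid_t;

/* Hypothetical, stripped-down stand-in for struct ext4_inode_info. */
struct fc_inode_model {
	tid_t i_sync_tid;    /* tid of the last transaction touching the inode */
	bool  i_fc_requeue;  /* hypothetical: tracked again while already queued */
};

/* Wrap-safe "x is newer than y", imitating jbd2's tid_gt(). */
static bool tid_gt(tid_t x, tid_t y)
{
	return (int32_t)(x - y) > 0;
}

/* Tracking path: the inode is modified in transaction 'tid'. */
static void model_track(struct fc_inode_model *ei, tid_t tid,
			bool already_queued)
{
	assert(ei != NULL);
	ei->i_sync_tid = tid;
	/* Already on a queue: remember that a later commit must see it. */
	if (already_queued)
		ei->i_fc_requeue = true;
}

/*
 * Clean-up path for commit 'tid': returns true if the inode should be
 * re-enqueued into STAGING.  The flag is cleared in both cases here,
 * which is an assumption of this sketch.
 */
static bool model_cleanup_needs_requeue(struct fc_inode_model *ei, tid_t tid)
{
	bool requeue = ei->i_fc_requeue && tid_gt(ei->i_sync_tid, tid);

	ei->i_fc_requeue = false;
	return requeue;
}
```

In this model an inode tracked only in the transaction being cleaned up is
dropped from the queue, while one that was tracked again in a newer
transaction is reported as needing a re-enqueue.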

>
>> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
>> index 87c009e0c59a..bfdf249f0783 100644
>> --- a/fs/ext4/fast_commit.c
>> +++ b/fs/ext4/fast_commit.c
>> @@ -402,6 +402,8 @@ static int ext4_fc_track_template(
>>  				 sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) ?
>>  				&sbi->s_fc_q[FC_Q_STAGING] :
>>  				&sbi->s_fc_q[FC_Q_MAIN]);
>> +	else
>> +		ei->i_fc_next = tid;
>>  	spin_unlock(&sbi->s_fc_lock);
>>  
>>  	return ret;
>> @@ -1280,6 +1282,15 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
>>  	list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
>>  				 i_fc_list) {
>>  		list_del_init(&iter->i_fc_list);
>> +		if (iter->i_fc_next == tid)
>> +			iter->i_fc_next = 0;
>> +		else if (iter->i_fc_next > tid)
> 			 ^^^ careful here, TIDs do wrap so you need to use
> tid_geq() for comparison.
>

Yikes!  Thanks, I'll update the code to do that.
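
For reference, a minimal user-space sketch of why a plain '>' breaks once
TIDs wrap.  The tid_gt() and tid_geq() bodies below imitate the helpers of
the same name in include/linux/jbd2.h; the tid_t typedef and the
tid_wrap_demo() function are stand-ins written for this illustration, not
kernel code.  Note that the strict '>' in the hunk above would presumably
map to tid_gt() rather than tid_geq(); which one v3 ends up using is an
assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-in for the kernel's tid_t (unsigned 32-bit counter). */
typedef uint32_t tid_t;

/* Wrap-safe "x is strictly newer than y". */
static inline bool tid_gt(tid_t x, tid_t y)
{
	/* The unsigned difference, read as signed, survives wraparound. */
	int32_t difference = (int32_t)(x - y);

	return difference > 0;
}

/* Wrap-safe "x is newer than or equal to y". */
static inline bool tid_geq(tid_t x, tid_t y)
{
	int32_t difference = (int32_t)(x - y);

	return difference >= 0;
}

/* Illustration only: compare a TID taken just before and just after a wrap. */
static void tid_wrap_demo(void)
{
	tid_t before_wrap = 0xFFFFFFF0u;
	tid_t after_wrap = 5u;	/* issued later, but numerically smaller */

	assert(!(after_wrap > before_wrap));	 /* naive compare gets it wrong */
	assert(tid_gt(after_wrap, before_wrap)); /* wrap-safe compare is right */
	assert(tid_geq(after_wrap, after_wrap)); /* equal TIDs satisfy geq */
	assert(!tid_gt(after_wrap, after_wrap)); /* but not the strict form */
}
```

The same reasoning applies to the existing 'iter->i_sync_tid <= tid' test,
which in wrap-safe form would read tid_geq(tid, iter->i_sync_tid).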

>> +			/*
>> +			 * re-enqueue inode into STAGING, which will later be
>> +			 * spliced back into MAIN
>> +			 */
>> +			list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
>> +				      &sbi->s_fc_q[FC_Q_STAGING]);
>>  		ext4_clear_inode_state(&iter->vfs_inode,
>>  				       EXT4_STATE_FC_COMMITTING);
>>  		if (iter->i_sync_tid <= tid)
> 				     ^^^ and I can see this is buggy as
> well and needs tid_geq() (not your fault obviously).

Yeah, good point.  I can fix that too in v3.

Again, thanks a lot for your review!

Cheers,
-- 
Luís
