linux-fsdevel.vger.kernel.org archive mirror
From: Zhang Yi <yi.zhang@huaweicloud.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com,
	hch@infradead.org, david@fromorbit.com, zokeefe@google.com,
	yi.zhang@huawei.com, chengzhihao1@huawei.com, yukuai3@huawei.com,
	yangerkun@huawei.com
Subject: Re: [PATCH 04/27] ext4: refactor ext4_punch_hole()
Date: Wed, 20 Nov 2024 11:18:46 +0800
Message-ID: <c41a2dd8-de10-4f9e-9a5e-6927ebef2b3c@huaweicloud.com>
In-Reply-To: <20241118232712.GB9417@frogsfrogsfrogs>

On 2024/11/19 7:27, Darrick J. Wong wrote:
> On Tue, Oct 22, 2024 at 07:10:35PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> The current implementation of ext4_punch_hole() contains complex
>> position calculations and stale error tags. To improve clarity and
>> maintainability, clean up the code by: a) simplifying and renaming
>> variables; b) eliminating unnecessary position calculations;
>> c) writing back all data in data=journal mode, and dropping page cache
>> from the original offset to the end, rather than using aligned blocks;
>> d) renaming the stale error tags.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
>> ---
>>  fs/ext4/inode.c | 140 +++++++++++++++++++++---------------------------
>>  1 file changed, 62 insertions(+), 78 deletions(-)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 94b923afcd9c..1d128333bd06 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -3955,13 +3955,14 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
>>  {
>>  	struct inode *inode = file_inode(file);
>>  	struct super_block *sb = inode->i_sb;
>> -	ext4_lblk_t first_block, stop_block;
>> +	ext4_lblk_t start_lblk, end_lblk;
>>  	struct address_space *mapping = inode->i_mapping;
>> -	loff_t first_block_offset, last_block_offset, max_length;
>> -	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>> +	loff_t max_end = EXT4_SB(sb)->s_bitmap_maxbytes - sb->s_blocksize;
>> +	loff_t end = offset + length;
>> +	unsigned long blocksize = i_blocksize(inode);
>>  	handle_t *handle;
>>  	unsigned int credits;
>> -	int ret = 0, ret2 = 0;
>> +	int ret = 0;
>>  
>>  	trace_ext4_punch_hole(inode, offset, length, 0);
>>  
>> @@ -3969,36 +3970,27 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
>>  
>>  	/* No need to punch hole beyond i_size */
>>  	if (offset >= inode->i_size)
>> -		goto out_mutex;
>> +		goto out;
>>  
>>  	/*
>> -	 * If the hole extends beyond i_size, set the hole
>> -	 * to end after the page that contains i_size
>> +	 * If the hole extends beyond i_size, set the hole to end after
>> +	 * the page that contains i_size, and also make sure that the hole
>> +	 * within one block before last range.
>>  	 */
>> -	if (offset + length > inode->i_size) {
>> -		length = inode->i_size +
>> -		   PAGE_SIZE - (inode->i_size & (PAGE_SIZE - 1)) -
>> -		   offset;
>> -	}
>> +	if (end > inode->i_size)
>> +		end = round_up(inode->i_size, PAGE_SIZE);
>> +	if (end > max_end)
>> +		end = max_end;
>> +	length = end - offset;
>>  
>>  	/*
>> -	 * For punch hole the length + offset needs to be within one block
>> -	 * before last range. Adjust the length if it goes beyond that limit.
>> +	 * Attach jinode to inode for jbd2 if we do any zeroing of partial
>> +	 * block.
>>  	 */
>> -	max_length = sbi->s_bitmap_maxbytes - inode->i_sb->s_blocksize;
>> -	if (offset + length > max_length)
>> -		length = max_length - offset;
>> -
>> -	if (offset & (sb->s_blocksize - 1) ||
>> -	    (offset + length) & (sb->s_blocksize - 1)) {
>> -		/*
>> -		 * Attach jinode to inode for jbd2 if we do any zeroing of
>> -		 * partial block
>> -		 */
>> +	if (offset & (blocksize - 1) || end & (blocksize - 1)) {
> 
> !IS_ALIGNED(offset | end, blocksize) ?

Right, this helper looks better. Thanks for pointing this out.
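
For reference, a minimal user-space sketch of the equivalence under discussion. It assumes blocksize is a power of two and defines a simplified local IS_ALIGNED() as a stand-in for the kernel macro; the two function names are hypothetical. Note that the open-coded test in the patch is true when either value is unaligned, so the helper form needs a leading negation to match it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified local stand-in for the kernel's IS_ALIGNED(); 'a' must be a power of two. */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Open-coded form from the patch: true if either end of the range is unaligned. */
static bool unaligned_open_coded(int64_t offset, int64_t end,
				 unsigned long blocksize)
{
	return (offset & (blocksize - 1)) || (end & (blocksize - 1));
}

/*
 * Helper form: OR-ing the two values preserves any low bit that is set in
 * either of them, so a single alignment check covers both.
 */
static bool unaligned_with_helper(int64_t offset, int64_t end,
				  unsigned long blocksize)
{
	return !IS_ALIGNED(offset | end, blocksize);
}
```

Both functions should agree for any non-negative offset and end, which is why the single-check form can replace the two masks.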

> 
>>  		ret = ext4_inode_attach_jinode(inode);
>>  		if (ret < 0)
>> -			goto out_mutex;
>> -
>> +			goto out;
>>  	}
>>  
>>  	/* Wait all existing dio workers, newcomers will block on i_rwsem */
>> @@ -4006,7 +3998,7 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
>>  
>>  	ret = file_modified(file);
>>  	if (ret)
>> -		goto out_mutex;
>> +		goto out;
>>  
>>  	/*
>>  	 * Prevent page faults from reinstantiating pages we have released from
>> @@ -4016,34 +4008,24 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
>>  
>>  	ret = ext4_break_layouts(inode);
>>  	if (ret)
>> -		goto out_dio;
>> -
>> -	first_block_offset = round_up(offset, sb->s_blocksize);
>> -	last_block_offset = round_down((offset + length), sb->s_blocksize) - 1;
>> +		goto out_invalidate_lock;
>>  
>> -	/* Now release the pages and zero block aligned part of pages*/
>> -	if (last_block_offset > first_block_offset) {
>> +	/*
>> +	 * For journalled data we need to write (and checkpoint) pages
>> +	 * before discarding page cache to avoid inconsitent data on
> 
> inconsistent

Yeah.

> 
>> +	 * disk in case of crash before punching trans is committed.
>> +	 */
>> +	if (ext4_should_journal_data(inode)) {
>> +		ret = filemap_write_and_wait_range(mapping, offset, end - 1);
>> +	} else {
>>  		ret = ext4_update_disksize_before_punch(inode, offset, length);
>> -		if (ret)
>> -			goto out_dio;
>> -
>> -		/*
>> -		 * For journalled data we need to write (and checkpoint) pages
>> -		 * before discarding page cache to avoid inconsitent data on
>> -		 * disk in case of crash before punching trans is committed.
>> -		 */
>> -		if (ext4_should_journal_data(inode)) {
>> -			ret = filemap_write_and_wait_range(mapping,
>> -					first_block_offset, last_block_offset);
>> -			if (ret)
>> -				goto out_dio;
>> -		}
>> -
>> -		ext4_truncate_folios_range(inode, first_block_offset,
>> -					   last_block_offset + 1);
>> -		truncate_pagecache_range(inode, first_block_offset,
>> -					 last_block_offset);
>> +		ext4_truncate_folios_range(inode, offset, end);
>>  	}
>> +	if (ret)
>> +		goto out_invalidate_lock;
>> +
>> +	/* Now release the pages and zero block aligned part of pages*/
>> +	truncate_pagecache_range(inode, offset, end - 1);
>>  
>>  	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
>>  		credits = ext4_writepage_trans_blocks(inode);
>> @@ -4053,52 +4035,54 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
>>  	if (IS_ERR(handle)) {
>>  		ret = PTR_ERR(handle);
>>  		ext4_std_error(sb, ret);
>> -		goto out_dio;
>> +		goto out_invalidate_lock;
>>  	}
>>  
>> -	ret = ext4_zero_partial_blocks(handle, inode, offset,
>> -				       length);
>> +	ret = ext4_zero_partial_blocks(handle, inode, offset, length);
>>  	if (ret)
>> -		goto out_stop;
>> -
>> -	first_block = (offset + sb->s_blocksize - 1) >>
>> -		EXT4_BLOCK_SIZE_BITS(sb);
>> -	stop_block = (offset + length) >> EXT4_BLOCK_SIZE_BITS(sb);
>> +		goto out_handle;
>>  
>>  	/* If there are blocks to remove, do it */
>> -	if (stop_block > first_block) {
>> -		ext4_lblk_t hole_len = stop_block - first_block;
>> +	start_lblk = round_up(offset, blocksize) >> inode->i_blkbits;
> 
> egad I wish ext4 had nicer unit conversion helpers.
> 
> static inline ext4_lblk_t
> EXT4_B_TO_LBLK(struct ext4_sb_info *sbi, ..., loff_t offset)
> {
> 	return round_up(offset, blocksize) >> inode->i_blkbits;
> }
> 
> 	start_lblk = EXT4_B_TO_LBLK(sbi, offset);
> 
> ah well.
> 

Sure, it looks clearer.
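
To make the conversion concrete, here is a user-space sketch of what such helpers might look like. The names bytes_to_lblk_round_up(), bytes_to_lblk_round_down() and whole_blocks_in_range() are hypothetical and are not existing ext4 helpers; lblk_t stands in for ext4_lblk_t, and blkbits plays the role of inode->i_blkbits.

```c
#include <stdint.h>

typedef uint32_t lblk_t;	/* stand-in for ext4_lblk_t */

/* First block that starts at or after byte 'offset' (rounds up). */
static inline lblk_t bytes_to_lblk_round_up(int64_t offset, unsigned int blkbits)
{
	int64_t blocksize = (int64_t)1 << blkbits;

	return (lblk_t)((offset + blocksize - 1) >> blkbits);
}

/* Block that contains byte 'offset' (rounds down). */
static inline lblk_t bytes_to_lblk_round_down(int64_t offset, unsigned int blkbits)
{
	return (lblk_t)(offset >> blkbits);
}

/* Number of whole blocks fully covered by [offset, end); 0 if there are none. */
static inline lblk_t whole_blocks_in_range(int64_t offset, int64_t end,
					   unsigned int blkbits)
{
	lblk_t start = bytes_to_lblk_round_up(offset, blkbits);
	lblk_t stop = bytes_to_lblk_round_down(end, blkbits);

	return stop > start ? stop - start : 0;
}
```

With 4096-byte blocks (blkbits = 12), a hole over bytes [1000, 10000) fully covers only block 1, which matches the start_lblk/end_lblk computation in the patch: the partial head and tail are left to the partial-block zeroing.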

>> +	end_lblk = end >> inode->i_blkbits;
>> +
>> +	if (end_lblk > start_lblk) {
>> +		ext4_lblk_t hole_len = end_lblk - start_lblk;
>>  
>>  		down_write(&EXT4_I(inode)->i_data_sem);
>>  		ext4_discard_preallocations(inode);
>>  
>> -		ext4_es_remove_extent(inode, first_block, hole_len);
>> +		ext4_es_remove_extent(inode, start_lblk, hole_len);
>>  
>>  		if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
>> -			ret = ext4_ext_remove_space(inode, first_block,
>> -						    stop_block - 1);
>> +			ret = ext4_ext_remove_space(inode, start_lblk,
>> +						    end_lblk - 1);
>>  		else
>> -			ret = ext4_ind_remove_space(handle, inode, first_block,
>> -						    stop_block);
>> +			ret = ext4_ind_remove_space(handle, inode, start_lblk,
>> +						    end_lblk);
>> +		if (ret) {
>> +			up_write(&EXT4_I(inode)->i_data_sem);
>> +			goto out_handle;
>> +		}
>>  
>> -		ext4_es_insert_extent(inode, first_block, hole_len, ~0,
>> +		ext4_es_insert_extent(inode, start_lblk, hole_len, ~0,
>>  				      EXTENT_STATUS_HOLE, 0);
>>  		up_write(&EXT4_I(inode)->i_data_sem);
>>  	}
>> -	ext4_fc_track_range(handle, inode, first_block, stop_block);
>> +	ext4_fc_track_range(handle, inode, start_lblk, end_lblk);
>> +
>> +	ret = ext4_mark_inode_dirty(handle, inode);
>> +	if (unlikely(ret))
>> +		goto out_handle;
>> +
>> +	ext4_update_inode_fsync_trans(handle, inode, 1);
>>  	if (IS_SYNC(inode))
>>  		ext4_handle_sync(handle);
>> -
>> -	ret2 = ext4_mark_inode_dirty(handle, inode);
>> -	if (unlikely(ret2))
>> -		ret = ret2;
>> -	if (ret >= 0)
>> -		ext4_update_inode_fsync_trans(handle, inode, 1);
>> -out_stop:
>> +out_handle:
>>  	ext4_journal_stop(handle);
>> -out_dio:
>> +out_invalidate_lock:
>>  	filemap_invalidate_unlock(mapping);
>> -out_mutex:
>> +out:
> 
> Why drop "_mutex"?  You're unlocking *something* on the way out.
> 

"_mutex" is no longer accurate, since the inode lock is now an rwsem rather than
a mutex. But never mind, this "out" tag is also removed in patch 9.

Thanks,
Yi.

> 
>>  	inode_unlock(inode);
>>  	return ret;
>>  }
>> -- 
>> 2.46.1
>>
>>
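
As an aside on the range clamping near the top of the quoted patch, the following user-space sketch walks through the same arithmetic. It is illustrative only: SKETCH_PAGE_SIZE, round_up_pow2() and clamp_hole_end() are hypothetical names, the page size is assumed to be 4096, and max_end is passed in as a parameter rather than derived from s_bitmap_maxbytes as the patch does. It also assumes offset < i_size, since the patch bails out earlier otherwise.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, for illustration only */

/* Round x up to a multiple of a; 'a' must be a power of two. */
static int64_t round_up_pow2(int64_t x, int64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Compute the exclusive end of the hole:
 *  - if the request runs past i_size, stop at the end of the page that
 *    contains i_size;
 *  - never go past max_end.
 * The caller can then recompute length as end - offset.
 */
static int64_t clamp_hole_end(int64_t offset, int64_t length,
			      int64_t i_size, int64_t max_end)
{
	int64_t end = offset + length;

	if (end > i_size)
		end = round_up_pow2(i_size, SKETCH_PAGE_SIZE);
	if (end > max_end)
		end = max_end;
	return end;
}
```

For example, a request for [0, 10000) on a 5000-byte file is trimmed to end at 8192, the end of the page containing i_size.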


