linux-ext4.vger.kernel.org archive mirror
From: Zhang Yi <yi.zhang@huaweicloud.com>
To: sunyongjian1@huawei.com, linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, tytso@mit.edu, jack@suse.cz,
	yangerkun@huawei.com, libaokun1@huawei.com,
	chengzhihao1@huawei.com
Subject: Re: [PATCH v4] ext4: increase i_disksize to offset + len in ext4_update_disksize_before_punch()
Date: Thu, 11 Sep 2025 14:16:57 +0800
Message-ID: <8d1ee18e-6bf4-423b-b046-16a5d55a7030@huaweicloud.com>
In-Reply-To: <20250911025412.186872-1-sunyongjian@huaweicloud.com>

On 9/11/2025 10:54 AM, Yongjian Sun wrote:
> From: Yongjian Sun <sunyongjian1@huawei.com>
> 
> After running a stress test combined with fault injection,
> we performed fsck -a followed by fsck -fn on the filesystem
> image. During the second pass, fsck -fn reported:
> 
> Inode 131512, end of extent exceeds allowed value
> 	(logical block 405, physical block 1180540, len 2)
> 
> This inode was not in the orphan list. Analysis revealed the
> following call chain that leads to the inconsistency:
> 
>                              ext4_da_write_end()
>                               //does not update i_disksize
>                              ext4_punch_hole()
>                               //truncate folio, keep size
> ext4_page_mkwrite()
>  ext4_block_page_mkwrite()
>   ext4_block_write_begin()
>     ext4_get_block()
>      //insert written extent without updating i_disksize
> journal commit
> echo 1 > /sys/block/xxx/device/delete
> 
> The da-write path updates i_size but does not update i_disksize. Then
> ext4_punch_hole() truncates the da-folio yet still leaves i_disksize
> unchanged (in ext4_update_disksize_before_punch(), the condition
> offset + len < size is met). Then ext4_page_mkwrite() sees
> ext4_nonda_switch() return 1 and takes the nodioread_nolock path. The
> folio about to be written has just been punched out, and its offset
> sits beyond the current i_disksize. This may result in a written
> extent being inserted, again without updating i_disksize. If the
> journal gets committed and the block device is then yanked, we may end
> up with the inconsistency reported above. It should be noted that
> replacing ext4_punch_hole() with ext4_zero_range() in the call sequence
> may also trigger this issue, as neither updates i_disksize under these
> circumstances.
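
To make the sequence above concrete, here is a minimal userspace sketch of
the size bookkeeping only. It is not kernel code: struct toy_inode and the
toy_*() helpers are hypothetical names, the byte values are made up, and
only the pre-patch early-return condition quoted in the diff is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the two sizes discussed above; not the kernel's inode. */
struct toy_inode {
	int64_t i_size;      /* in-memory file size */
	int64_t i_disksize;  /* file size as recorded on disk */
};

/* Delayed-allocation write: i_size grows, i_disksize is left alone. */
static void toy_da_write(struct toy_inode *ino, int64_t pos, int64_t len)
{
	if (pos + len > ino->i_size)
		ino->i_size = pos + len;
}

/* Pre-patch check: a range ending below i_size never touches i_disksize. */
static void toy_punch_prepatch(struct toy_inode *ino, int64_t offset,
			       int64_t len)
{
	int64_t size = ino->i_size;

	if (offset > size || offset + len < size)
		return;
	if (ino->i_disksize < size)
		ino->i_disksize = size;
}

/* True if a written extent [pos, pos + len) ends beyond i_disksize. */
static bool toy_extent_beyond_disksize(const struct toy_inode *ino,
				       int64_t pos, int64_t len)
{
	return pos + len > ino->i_disksize;
}

/* Walk through the sequence with made-up numbers. */
static void toy_demo(void)
{
	struct toy_inode ino = { .i_size = 0, .i_disksize = 0 };

	toy_da_write(&ino, 0, 1048576);        /* i_size = 1 MiB */
	toy_punch_prepatch(&ino, 4096, 4096);  /* hole lies inside i_size */
	assert(ino.i_disksize == 0);           /* still stale */
	/* A block allocated over the hole now lies beyond i_disksize. */
	assert(toy_extent_beyond_disksize(&ino, 4096, 4096));
}
```

Calling toy_demo() passes both assertions, which is the toy equivalent of
the state that fsck complains about after the device is removed.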
> 
> To fix this, we can modify ext4_update_disksize_before_punch() to
> increase i_disksize to min(i_size, offset + len) when both i_size and
> (offset + len) are greater than i_disksize.
> 
> Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com>

Looks good to me, and feel free to add:

Reviewed-by: Zhang Yi <yi.zhang@huawei.com>

BTW, since Jan has no other review comments and has allowed you to
add his review tag after improving the language, you can also add his
review tag when sending this version.

Thanks,
Yi.

> ---
> Changes in v4:
> - Make the comments simpler and clearer.
> - Link to v3: https://lore.kernel.org/all/20250910042516.3947590-1-sunyongjian@huaweicloud.com/
> Changes in v3:
> - Add a condition to avoid increasing i_disksize and include some comments.
> - Link to v2: https://lore.kernel.org/all/20250908063355.3149491-1-sunyongjian@huaweicloud.com/
> Changes in v2:
> - The modification of i_disksize should be moved into ext4_update_disksize_before_punch,
>   rather than being done in ext4_page_mkwrite.
> - Link to v1: https://lore.kernel.org/all/20250731140528.1554917-1-sunyongjian@huaweicloud.com/
> ---
>  fs/ext4/inode.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 5b7a15db4953..f82f7fb84e17 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4287,7 +4287,11 @@ int ext4_can_truncate(struct inode *inode)
>   * We have to make sure i_disksize gets properly updated before we truncate
>   * page cache due to hole punching or zero range. Otherwise i_disksize update
>   * can get lost as it may have been postponed to submission of writeback but
> - * that will never happen after we truncate page cache.
> + * that will never happen if we remove the folio containing i_size from the
> + * page cache. Also if we punch hole within i_size but above i_disksize,
> + * following ext4_page_mkwrite() may mistakenly allocate written blocks over
> + * the hole and thus introduce allocated blocks beyond i_disksize which is
> + * not allowed (e2fsck would complain in case of crash).
>   */
>  int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset,
>  				      loff_t len)
> @@ -4298,9 +4302,11 @@ int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset,
>  	loff_t size = i_size_read(inode);
>  
>  	WARN_ON(!inode_is_locked(inode));
> -	if (offset > size || offset + len < size)
> +	if (offset > size)
>  		return 0;
>  
> +	if (offset + len < size)
> +		size = offset + len;
>  	if (EXT4_I(inode)->i_disksize >= size)
>  		return 0;
>  
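
For anyone reading along, the size selection after this hunk can be
modelled in userspace as below. This is only a sketch of the three checks
visible in the diff: toy_disksize_after_punch() is a hypothetical name, it
returns the value i_disksize would end up with, and whatever the real
function does after these checks lies outside the hunk and is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the checks in the patched helper.
 * Returns the i_disksize value expected after the call.
 */
static int64_t toy_disksize_after_punch(int64_t i_size, int64_t i_disksize,
					int64_t offset, int64_t len)
{
	int64_t size = i_size;

	if (offset > size)
		return i_disksize;   /* range starts beyond i_size */
	if (offset + len < size)
		size = offset + len; /* clamp to the end of the range */
	if (i_disksize >= size)
		return i_disksize;   /* i_disksize is never decreased */
	return size;                 /* min(i_size, offset + len) */
}

/* A few made-up cases covering each branch. */
static void toy_check_cases(void)
{
	/* Hole inside i_size, above i_disksize: raised to offset + len. */
	assert(toy_disksize_after_punch(1048576, 0, 4096, 4096) == 8192);
	/* Range reaches past i_size: raised to i_size, as before the patch. */
	assert(toy_disksize_after_punch(1048576, 0, 4096, 2097152) == 1048576);
	/* Range starts beyond i_size: unchanged. */
	assert(toy_disksize_after_punch(1048576, 0, 2097152, 4096) == 0);
	/* i_disksize already covers the range: unchanged. */
	assert(toy_disksize_after_punch(1048576, 16384, 4096, 4096) == 16384);
}
```

The first case is the one the old condition skipped: the range ends below
i_size, so the pre-patch code returned early and i_disksize stayed at 0.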

