All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhang Yi <yizhang089@gmail.com>
To: Jan Kara <jack@suse.cz>, Zhang Yi <yi.zhang@huaweicloud.com>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, libaokun@linux.alibaba.com,
	ojaswin@linux.ibm.com, ritesh.list@gmail.com, djwong@kernel.org,
	hch@infradead.org, yi.zhang@huawei.com, yangerkun@huawei.com,
	yukuai@fnnas.com
Subject: Re: [PATCH v4 06/23] ext4: pass out extent seq counter when mapping da blocks
Date: Tue, 16 Jun 2026 20:37:00 +0800	[thread overview]
Message-ID: <58ee3009-8a0e-4657-a473-475ea998316e@gmail.com> (raw)
In-Reply-To: <ghe4izo4od2b4fcgmiayopkaombyjjopao266sseuqim72t5sj@h2meez5lalsl>

On 6/16/2026 6:04 PM, Jan Kara wrote:
> On Mon 11-05-26 15:23:26, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> The iomap buffered write path does not hold any locks between querying
>> inode extent mapping information and performing buffered writes. It
>> relies on the sequence counter saved in the inode to detect stale
>> mappings.
> 
> Now that I'm looking at it again, I've got a bit confused here. Buffered
> write path is holding i_rwsem between mapping blocks and using them so
> there shouldn't be races.  Perhaps you mean buffered *writeback* path? But
> then ext4_da_map_blocks() should not ever get called in the writeback path
> because it is allocating delayed blocks... So this change looks unnecessary
> to me now. Am I missing something?
> 
> 								Honza
> 

Hi Jan,

Thanks for coming back to this series. Sorry for the inaccurate
description in the commit message. However, this change is still needed.

As mentioned in the comment before the ->iomap_valid() callback in
iomap_write_begin(), consider the following scenario — a buffered write
to a dirty unwritten extent, with this concurrent race:

   Buffer write (holds i_rwsem)    Writeback (no i_rwsem)
   ext4_da_map_blocks()
     // ext4_es_lookup_extent()
     // finds UNWRITTEN extent
   ext4_set_iomap()
     // type = IOMAP_UNWRITTEN
     // validity_cookie = m_seq
                                   ext4_iomap_writepages()
                                     // writeback for unwritten extent
                                     // ext4_convert_unwritten_extents()
                                     // extent tree: UNWRITTEN → WRITTEN
                                     // advancing i_es_seq
   __iomap_write_begin()
     // ext4_iomap_valid(): cookie != i_es_seq
     // iomap invalidated, re-maps
     // gets type = IOMAP_MAPPED (WRITTEN)
     // iomap_block_needs_zeroing(): FALSE

Without passing out m_seq, the stale IOMAP_UNWRITTEN type from the iomap
would cause __iomap_write_begin()->iomap_block_needs_zeroing() to zero
out blocks that have already been written, corrupting data on partial
writes.

Thanks,
Yi

>>
>> Commit 07c440e8da8f ("ext4: pass out extent seq counter when mapping
>> blocks") added the m_seq field to ext4_map_blocks to pass out extent
>> sequence numbers, but it missed two callsites within
>> ext4_da_map_blocks(). These callsites are on the delayed allocation
>> path, which is also used in the iomap buffered write path. Pass out the
>> sequence counter to ensure stale mappings can be detected.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
>> Reviewed-by: Jan Kara <jack@suse.cz>
>> ---
>>   fs/ext4/inode.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 6c4d9137b279..39577a6b65b9 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -1929,7 +1929,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
>>   	ext4_check_map_extents_env(inode);
>>   
>>   	/* Lookup extent status tree firstly */
>> -	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
>> +	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, &map->m_seq)) {
>>   		map->m_len = min_t(unsigned int, map->m_len,
>>   				   es.es_len - (map->m_lblk - es.es_lblk));
>>   
>> @@ -1982,7 +1982,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
>>   	 * is held in write mode, before inserting a new da entry in
>>   	 * the extent status tree.
>>   	 */
>> -	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
>> +	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, &map->m_seq)) {
>>   		map->m_len = min_t(unsigned int, map->m_len,
>>   				   es.es_len - (map->m_lblk - es.es_lblk));
>>   
>> -- 
>> 2.52.0
>>


  reply	other threads:[~2026-06-16 12:37 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11  7:23 [PATCH v4 00/23] ext4: use iomap for regular file's buffered I/O path Zhang Yi
2026-05-11  7:23 ` [PATCH v4 01/23] ext4: simplify size updating in ext4_setattr() Zhang Yi
2026-05-19  5:24   ` Ojaswin Mujoo
2026-05-11  7:23 ` [PATCH v4 02/23] ext4: factor out ext4_truncate_[up|down]() Zhang Yi
2026-05-19  6:05   ` Ojaswin Mujoo
2026-06-16  9:31   ` Jan Kara
2026-05-11  7:23 ` [PATCH v4 03/23] ext4: simplify error handling in ext4_setattr() Zhang Yi
2026-05-19  6:08   ` Ojaswin Mujoo
2026-06-16  9:36   ` Jan Kara
2026-05-11  7:23 ` [PATCH v4 04/23] ext4: add iomap address space operations for buffered I/O Zhang Yi
2026-05-19  9:28   ` Ojaswin Mujoo
2026-05-19 12:35     ` Zhang Yi
2026-05-19 16:53       ` Ojaswin Mujoo
2026-05-20  2:49         ` Zhang Yi
2026-05-26 17:11           ` Ojaswin Mujoo
2026-05-11  7:23 ` [PATCH v4 05/23] ext4: implement buffered read path using iomap Zhang Yi
2026-05-19 10:00   ` Ojaswin Mujoo
2026-05-11  7:23 ` [PATCH v4 06/23] ext4: pass out extent seq counter when mapping da blocks Zhang Yi
2026-05-19 10:02   ` Ojaswin Mujoo
2026-06-16 10:04   ` Jan Kara
2026-06-16 12:37     ` Zhang Yi [this message]
2026-06-17 20:29       ` Jan Kara
2026-05-11  7:23 ` [PATCH v4 07/23] ext4: do not use data=ordered mode for inodes using buffered iomap path Zhang Yi
2026-05-19 10:41   ` Ojaswin Mujoo
2026-05-19 13:31     ` Ojaswin Mujoo
2026-05-20  8:18       ` Zhang Yi
2026-05-20 13:17         ` Ojaswin Mujoo
2026-06-16 10:01   ` Jan Kara
2026-05-11  7:23 ` [PATCH v4 08/23] ext4: implement buffered write path using iomap Zhang Yi
2026-05-26 17:10   ` Ojaswin Mujoo
2026-05-28 15:40     ` Darrick J. Wong
2026-06-02  7:02       ` Ojaswin Mujoo
2026-05-29  9:13     ` Zhang Yi
2026-06-02 10:05       ` Ojaswin Mujoo
2026-06-03  1:44         ` Zhang Yi
2026-06-02 10:26   ` Ojaswin Mujoo
2026-06-03  2:56     ` Zhang Yi
2026-06-03 11:08       ` Ojaswin Mujoo
2026-06-16 10:45   ` Jan Kara
2026-06-16 14:42     ` Zhang Yi
2026-05-11  7:23 ` [PATCH v4 09/23] ext4: implement writeback " Zhang Yi
2026-05-27 12:49   ` Ojaswin Mujoo
2026-05-29 14:12     ` Zhang Yi
2026-05-29 15:32       ` Ojaswin Mujoo
2026-05-30  1:21         ` Zhang Yi
2026-06-01  6:26           ` Ojaswin Mujoo
2026-06-16 11:47   ` Jan Kara
2026-05-11  7:23 ` [PATCH v4 10/23] ext4: implement mmap " Zhang Yi
2026-05-27 12:56   ` Ojaswin Mujoo
2026-06-16 11:56   ` Jan Kara
2026-05-11  7:23 ` [PATCH v4 11/23] iomap: correct the range of a partial dirty clear Zhang Yi
2026-05-11  7:46   ` Christoph Hellwig
2026-05-11  8:57     ` Zhang Yi
2026-05-11  7:23 ` [PATCH v4 12/23] iomap: support invalidating partial folios Zhang Yi
2026-05-11  7:23 ` [PATCH v4 13/23] iomap: fix incorrect did_zero setting in iomap_zero_iter() Zhang Yi
2026-05-11  7:23 ` [PATCH v4 14/23] ext4: implement partial block zero range path using iomap Zhang Yi
2026-05-27 13:13   ` Ojaswin Mujoo
2026-06-16 12:28   ` Jan Kara
2026-06-17  8:14     ` Zhang Yi
2026-06-17 10:50       ` Jan Kara
2026-06-17 13:00         ` Zhang Yi
2026-06-17 10:56       ` Brian Foster
2026-06-17 13:22         ` Zhang Yi
2026-05-11  7:23 ` [PATCH v4 15/23] ext4: add block mapping tracepoints for iomap buffered I/O path Zhang Yi
2026-05-27 13:14   ` Ojaswin Mujoo
2026-06-16 12:29   ` Jan Kara
2026-05-11  7:23 ` [PATCH v4 16/23] ext4: disable online defrag when inode using " Zhang Yi
2026-05-27 13:14   ` Ojaswin Mujoo
2026-06-16 12:30   ` Jan Kara
2026-05-11  7:23 ` [PATCH v4 17/23] ext4: submit zeroed post-EOF data immediately in the " Zhang Yi
2026-05-27 13:41   ` Ojaswin Mujoo
2026-05-30  2:53     ` Zhang Yi
2026-06-01  9:08       ` Ojaswin Mujoo
2026-06-01 12:22         ` Zhang Yi
2026-06-01 18:15           ` Ojaswin Mujoo
2026-06-02  3:36             ` Zhang Yi
2026-05-11  7:23 ` [PATCH v4 18/23] ext4: wait for ordered I/O " Zhang Yi
2026-05-27 15:58   ` Ojaswin Mujoo
2026-05-28 13:34     ` Ojaswin Mujoo
2026-05-30  9:32       ` Zhang Yi
2026-06-02  5:56         ` Ojaswin Mujoo
2026-05-30  7:22     ` Zhang Yi
2026-05-30  8:24       ` Zhang Yi
2026-06-01 18:33         ` Ojaswin Mujoo
2026-06-02  3:22           ` Zhang Yi
2026-06-02  5:35             ` Ojaswin Mujoo
2026-05-11  7:23 ` [PATCH v4 19/23] ext4: update i_disksize to i_size on ordered I/O completion Zhang Yi
2026-05-11  7:23 ` [PATCH v4 20/23] ext4: wait for ordered I/O to complete during insert and collapse range Zhang Yi
2026-05-11  7:23 ` [PATCH v4 21/23] ext4: add tracepoints for ordered I/O in the iomap buffered I/O path Zhang Yi
2026-05-11  7:23 ` [PATCH v4 22/23] ext4: partially enable iomap for the buffered I/O path of regular files Zhang Yi
2026-05-11  7:23 ` [PATCH v4 23/23] ext4: introduce a mount option for iomap buffered I/O path Zhang Yi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58ee3009-8a0e-4657-a473-475ea998316e@gmail.com \
    --to=yizhang089@gmail.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=libaokun@linux.alibaba.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojaswin@linux.ibm.com \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    --cc=yi.zhang@huaweicloud.com \
    --cc=yukuai@fnnas.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.