linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Ritesh Harjani <ritesh.list@gmail.com>
Cc: linux-ext4@vger.kernel.org, Theodore Ts'o <tytso@mit.edu>,
	Jan Kara <jack@suse.cz>, John Garry <john.g.garry@oracle.com>,
	Ojaswin Mujoo <ojaswin@linux.ibm.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v3 0/7] ext4: Add multi-fsblock atomic write support with bigalloc
Date: Wed, 14 May 2025 09:40:50 -0700	[thread overview]
Message-ID: <20250514164050.GN25655@frogsfrogsfrogs> (raw)
In-Reply-To: <87h61t65pl.fsf@gmail.com>

On Fri, May 09, 2025 at 11:12:46PM +0530, Ritesh Harjani wrote:
> "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> writes:
> 
> > This is v3 of multi-fsblock atomic write support using bigalloc. This has
> > started looking into much better shape now. The major chunk of the design
> > changes has been kept in Patch-4 & 5.
> >
> > This series can now be carefully reviewed, as all the error handling related
> > code paths should be properly taken care of.
> >
> 
> We spotted that multi-fsblock changes might need to force a journal
> commit if there were mixed mappings in the underlying region e.g. say WUWUWUW...
> 
> The issue arises when, during block allocation, the unwritten ranges are
> first zeroed out, followed by the unwritten-to-written extent
> conversion. This conversion is part of a journaled metadata transaction
> that has not yet been committed, as the transaction is still running.
> If an iomap write then modifies the data on those multi-fsblocks and a
> sudden power loss occurs before the transaction commits, the
> unwritten-to-written conversion will not be replayed during journal
> recovery. As a result, we end up with new data written over mapped
> blocks, while the alternate unwritten blocks will read zeroes. This
> could cause a torn write behavior for atomic writes.
> 
> So we were thinking we might need something like this. Hopefully this
> should still be ok, as mixed mapping case mostly is a non-performance
> critical path. Thoughts?

I agree the journal has to be written out before the atomic write is
sent to the device.

--D

> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 2642e1ef128f..59b59d609976 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3517,7 +3517,8 @@ static int ext4_map_blocks_atomic_write_slow(handle_t *handle,
>   * underlying short holes/unwritten extents within the requested range.
>   */
>  static int ext4_map_blocks_atomic_write(handle_t *handle, struct inode *inode,
> -                               struct ext4_map_blocks *map, int m_flags)
> +                               struct ext4_map_blocks *map, int m_flags,
> +                               bool *force_commit)
>  {
>         ext4_lblk_t m_lblk = map->m_lblk;
>         unsigned int m_len = map->m_len;
> @@ -3537,6 +3538,11 @@ static int ext4_map_blocks_atomic_write(handle_t *handle, struct inode *inode,
>         map->m_len = m_len;
>         map->m_flags = 0;
> 
> +       /*
> +        * slow path means we have mixed mapping, that means we will need
> +        * to force txn commit.
> +        */
> +       *force_commit = true;
>         return ext4_map_blocks_atomic_write_slow(handle, inode, map);
>  out:
>         return ret;
> @@ -3548,6 +3554,7 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
>         handle_t *handle;
>         u8 blkbits = inode->i_blkbits;
>         int ret, dio_credits, m_flags = 0, retries = 0;
> +       bool force_commit = false;
> 
>         /*
>          * Trim the mapping request to the maximum value that we can map at
> @@ -3610,7 +3617,8 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
>                 m_flags = EXT4_GET_BLOCKS_IO_CREATE_EXT;
> 
>         if (flags & IOMAP_ATOMIC)
> -               ret = ext4_map_blocks_atomic_write(handle, inode, map, m_flags);
> +               ret = ext4_map_blocks_atomic_write(handle, inode, map, m_flags,
> +                                                  &force_commit);
>         else
>                 ret = ext4_map_blocks(handle, inode, map, m_flags);
> 
> @@ -3626,6 +3634,9 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
>         if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
>                 goto retry;
> 
> +       if (ret > 0 && force_commit)
> +               ext4_force_commit(inode->i_sb);
> +
>         return ret;
>  }
> 
> 
> -ritesh
> 

  reply	other threads:[~2025-05-14 16:40 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-08 20:50 [PATCH v3 0/7] ext4: Add multi-fsblock atomic write support with bigalloc Ritesh Harjani (IBM)
2025-05-08 20:50 ` [PATCH v3 1/7] ext4: Document an edge case for overwrites Ritesh Harjani (IBM)
2025-05-09  5:19   ` Ojaswin Mujoo
2025-05-14 16:23   ` Darrick J. Wong
2025-05-08 20:50 ` [PATCH v3 2/7] ext4: Check if inode uses extents in ext4_inode_can_atomic_write() Ritesh Harjani (IBM)
2025-05-09  5:20   ` Ojaswin Mujoo
2025-05-14 16:24   ` Darrick J. Wong
2025-05-08 20:50 ` [PATCH v3 3/7] ext4: Make ext4_meta_trans_blocks() non-static for later use Ritesh Harjani (IBM)
2025-05-09  5:21   ` Ojaswin Mujoo
2025-05-14 16:24   ` Darrick J. Wong
2025-05-08 20:50 ` [PATCH v3 4/7] ext4: Add support for EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS Ritesh Harjani (IBM)
2025-05-14 16:16   ` Darrick J. Wong
2025-05-14 18:47     ` Ritesh Harjani
2025-05-08 20:50 ` [PATCH v3 5/7] ext4: Add multi-fsblock atomic write support with bigalloc Ritesh Harjani (IBM)
2025-05-14 16:19   ` Darrick J. Wong
2025-05-14 19:04     ` Ritesh Harjani
2025-05-08 20:50 ` [PATCH v3 6/7] ext4: Enable support for ext4 multi-fsblock atomic write using bigalloc Ritesh Harjani (IBM)
2025-05-14 16:21   ` Darrick J. Wong
2025-05-08 20:50 ` [PATCH v3 7/7] ext4: Add atomic block write documentation Ritesh Harjani (IBM)
2025-05-09  7:34   ` Ojaswin Mujoo
2025-05-14 16:38     ` Darrick J. Wong
2025-05-15  2:15       ` Ritesh Harjani
2025-05-15  2:18     ` Ritesh Harjani
2025-05-09 17:42 ` [PATCH v3 0/7] ext4: Add multi-fsblock atomic write support with bigalloc Ritesh Harjani
2025-05-14 16:40   ` Darrick J. Wong [this message]
2025-05-14 18:55     ` Ritesh Harjani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250514164050.GN25655@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=jack@suse.cz \
    --cc=john.g.garry@oracle.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=ojaswin@linux.ibm.com \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).