From: Brian Foster <bfoster@redhat.com>
To: Long Li <leo.lilong@huawei.com>
Cc: brauner@kernel.org, djwong@kernel.org, cem@kernel.org,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	yi.zhang@huawei.com, houtao1@huawei.com, yangerkun@huawei.com
Subject: Re: [PATCH v6 1/3] iomap: pass byte granular end position to iomap_add_to_ioend
Date: Mon, 9 Dec 2024 09:06:14 -0500
Message-ID: <Z1b5Vr96Aysa_JCG@bfoster>
In-Reply-To: <20241209114241.3725722-2-leo.lilong@huawei.com>

On Mon, Dec 09, 2024 at 07:42:39PM +0800, Long Li wrote:
> This is a preparatory patch for fixing zero padding issues in concurrent
> append write scenarios. In the following patches, we need to obtain
> byte-granular writeback end position for io_size trimming after EOF
> handling.
> 
> Due to concurrent writeback and truncate operations, inode size may
> shrink. Resampling inode size would force writeback code to handle the
> newly appeared post-EOF blocks, which is undesirable. As Dave
> explained in [1]:
> 
> "Really, the issue is that writeback mappings have to be able to
> handle the range being mapped suddenly appear to be beyond EOF.
> This behaviour is a longstanding writeback constraint, and is what
> iomap_writepage_handle_eof() is attempting to handle.
> 
> We handle this by only sampling i_size_read() whilst we have the
> folio locked and can determine the action we should take with that
> folio (i.e. nothing, partial zeroing, or skip altogether). Once
> we've made the decision that the folio is within EOF and taken
> action on it (i.e. moved the folio to writeback state), we cannot
> then resample the inode size because a truncate may have started
> and changed the inode size."
> 
> To avoid resampling inode size after EOF handling, we convert end_pos
> to byte-granular writeback position and return it from EOF handling
> function.
> 
> Since iomap_set_range_dirty() can handle unaligned lengths, this
> conversion has no impact on it. However, iomap_find_dirty_range()
> requires aligned start and end range to find dirty blocks within the
> given range, so the end position needs to be rounded up when passed
> to it.
> 
> LINK [1]: https://lore.kernel.org/linux-xfs/Z1Gg0pAa54MoeYME@localhost.localdomain/
> Signed-off-by: Long Li <leo.lilong@huawei.com>
> ---
>  fs/iomap/buffered-io.c | 21 ++++++++++++---------
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 955f19e27e47..bcc7831d03af 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
...
> @@ -1914,6 +1915,7 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  	struct inode *inode = folio->mapping->host;
>  	u64 pos = folio_pos(folio);
>  	u64 end_pos = pos + folio_size(folio);
> +	u64 end_aligned = 0;
>  	unsigned count = 0;
>  	int error = 0;
>  	u32 rlen;
> @@ -1955,9 +1957,10 @@ static int iomap_writepage_map(struct iomap_writepage_ctx *wpc,
>  	/*
>  	 * Walk through the folio to find dirty areas to write back.
>  	 */
> -	while ((rlen = iomap_find_dirty_range(folio, &pos, end_pos))) {
> +	end_aligned = round_up(end_pos, i_blocksize(inode));

So do I follow correctly that the set_range_dirty() path doesn't need
the alignment because it uses inclusive first_blk/last_blk logic,
whereas this find_dirty_range() path treats the end as exclusive and
thus does require the round_up? If so, presumably that means if we fixed
up the find path we wouldn't need end_aligned at all anymore?
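
A userspace sketch of the distinction being asked about, for
illustration only. This is not the kernel code: the helper names are
hypothetical, and the "exclusive" form is an assumption about how the
find path derives its last block index from the end position.

```c
#include <stdint.h>

/* Hypothetical helper: round x up to a multiple of align (power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Inclusive style (assumed to mirror the set_range_dirty() logic):
 * the last block is the one containing the final byte of the range,
 * so an unaligned length still covers the partial tail block.
 */
static unsigned int last_blk_inclusive(uint64_t off, uint64_t len,
				       unsigned int blkbits)
{
	return (unsigned int)((off + len - 1) >> blkbits);
}

/*
 * Exclusive style (assumed to mirror the find_dirty_range() logic):
 * the end is shifted down to a block index and then decremented, so
 * an unaligned end silently drops the partial tail block.
 * Precondition for this sketch: end is at least one block.
 */
static unsigned int last_blk_exclusive(uint64_t end, unsigned int blkbits)
{
	return (unsigned int)(end >> blkbits) - 1;
}
```

With 4096-byte blocks (blkbits = 12) and a byte-granular end of 5000,
the inclusive form yields block 1, the exclusive form yields block 0,
and the exclusive form applied to round_up_pow2(5000, 4096) yields
block 1 again, which is why the caller rounds up in this patch.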

If I follow the reasoning correctly, then this looks Ok to me:

Reviewed-by: Brian Foster <bfoster@redhat.com>

... but as a followup exercise it might be nice to clean up the
iomap_find_dirty_range() path to either do the rounding itself or be
more consistent with set_range_dirty().

Brian

> +	while ((rlen = iomap_find_dirty_range(folio, &pos, end_aligned))) {
>  		error = iomap_writepage_map_blocks(wpc, wbc, folio, inode,
> -				pos, rlen, &count);
> +				pos, end_pos, rlen, &count);
>  		if (error)
>  			break;
>  		pos += rlen;
> -- 
> 2.39.2
> 
> 

