From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Christoph Hellwig <hch@lst.de>, Christian Brauner <brauner@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
Chandan Babu R <chandan.babu@oracle.com>,
Zhang Yi <yi.zhang@huaweicloud.com>,
linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 05/13] iomap: factor out a iomap_writepage_handle_eof helper
Date: Mon, 27 Nov 2023 12:27:22 +0530 [thread overview]
Message-ID: <8734wrsmy5.fsf@doe.com> (raw)
In-Reply-To: <20231126124720.1249310-6-hch@lst.de>
Christoph Hellwig <hch@lst.de> writes:
> Most of iomap_do_writepage is dedidcated to handling a folio crossing or
> beyond i_size. Split this is into a separate helper and update the
> commens to deal with folios instead of pages and make them more readable.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/iomap/buffered-io.c | 128 ++++++++++++++++++++---------------------
> 1 file changed, 62 insertions(+), 66 deletions(-)
Just a small nit below.
But otherwise looks good to me. Please feel free to add -
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 8148e4c9765dac..4a5a21809b0182 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1768,6 +1768,64 @@ iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
> wbc_account_cgroup_owner(wbc, &folio->page, len);
> }
>
> +/*
> + * Check interaction of the folio with the file end.
> + *
> + * If the folio is entirely beyond i_size, return false. If it straddles
> + * i_size, adjust end_pos and zero all data beyond i_size.
> + */
> +static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
> + u64 *end_pos)
> +{
> + u64 isize = i_size_read(inode);
i_size_read(inode) returns loff_t type. Can we make end_pos also as
loff_t type. We anyway use loff_t for
folio_pos(folio) + folio_size(folio), at many places in fs/iomap. It
would be more consistent with the data type then.
Thoughts?
-ritesh
> +
> + if (*end_pos > isize) {
> + size_t poff = offset_in_folio(folio, isize);
> + pgoff_t end_index = isize >> PAGE_SHIFT;
> +
> + /*
> + * If the folio is entirely ouside of i_size, skip it.
> + *
> + * This can happen due to a truncate operation that is in
> + * progress and in that case truncate will finish it off once
> + * we've dropped the folio lock.
> + *
> + * Note that the pgoff_t used for end_index is an unsigned long.
> + * If the given offset is greater than 16TB on a 32-bit system,
> + * then if we checked if the folio is fully outside i_size with
> + * "if (folio->index >= end_index + 1)", "end_index + 1" would
> + * overflow and evaluate to 0. Hence this folio would be
> + * redirtied and written out repeatedly, which would result in
> + * an infinite loop; the user program performing this operation
> + * would hang. Instead, we can detect this situation by
> + * checking if the folio is totally beyond i_size or if its
> + * offset is just equal to the EOF.
> + */
> + if (folio->index > end_index ||
> + (folio->index == end_index && poff == 0))
> + return false;
> +
> + /*
> + * The folio straddles i_size.
> + *
> + * It must be zeroed out on each and every writepage invocation
> + * because it may be mmapped:
> + *
> + * A file is mapped in multiples of the page size. For a
> + * file that is not a multiple of the page size, the
> + * remaining memory is zeroed when mapped, and writes to that
> + * region are not written out to the file.
> + *
> + * Also adjust the writeback range to skip all blocks entirely
> + * beyond i_size.
> + */
> + folio_zero_segment(folio, poff, folio_size(folio));
> + *end_pos = isize;
> + }
> +
> + return true;
> +}
> +
> /*
> * We implement an immediate ioend submission policy here to avoid needing to
> * chain multiple ioends and hence nest mempool allocations which can violate
> @@ -1906,78 +1964,16 @@ static int iomap_do_writepage(struct folio *folio,
> {
> struct iomap_writepage_ctx *wpc = data;
> struct inode *inode = folio->mapping->host;
> - u64 end_pos, isize;
> + u64 end_pos = folio_pos(folio) + folio_size(folio);
>
> trace_iomap_writepage(inode, folio_pos(folio), folio_size(folio));
>
> - /*
> - * Is this folio beyond the end of the file?
> - *
> - * The folio index is less than the end_index, adjust the end_pos
> - * to the highest offset that this folio should represent.
> - * -----------------------------------------------------
> - * | file mapping | <EOF> |
> - * -----------------------------------------------------
> - * | Page ... | Page N-2 | Page N-1 | Page N | |
> - * ^--------------------------------^----------|--------
> - * | desired writeback range | see else |
> - * ---------------------------------^------------------|
> - */
> - isize = i_size_read(inode);
> - end_pos = folio_pos(folio) + folio_size(folio);
> - if (end_pos > isize) {
> - /*
> - * Check whether the page to write out is beyond or straddles
> - * i_size or not.
> - * -------------------------------------------------------
> - * | file mapping | <EOF> |
> - * -------------------------------------------------------
> - * | Page ... | Page N-2 | Page N-1 | Page N | Beyond |
> - * ^--------------------------------^-----------|---------
> - * | | Straddles |
> - * ---------------------------------^-----------|--------|
> - */
> - size_t poff = offset_in_folio(folio, isize);
> - pgoff_t end_index = isize >> PAGE_SHIFT;
> -
> - /*
> - * Skip the page if it's fully outside i_size, e.g.
> - * due to a truncate operation that's in progress. We've
> - * cleaned this page and truncate will finish things off for
> - * us.
> - *
> - * Note that the end_index is unsigned long. If the given
> - * offset is greater than 16TB on a 32-bit system then if we
> - * checked if the page is fully outside i_size with
> - * "if (page->index >= end_index + 1)", "end_index + 1" would
> - * overflow and evaluate to 0. Hence this page would be
> - * redirtied and written out repeatedly, which would result in
> - * an infinite loop; the user program performing this operation
> - * would hang. Instead, we can detect this situation by
> - * checking if the page is totally beyond i_size or if its
> - * offset is just equal to the EOF.
> - */
> - if (folio->index > end_index ||
> - (folio->index == end_index && poff == 0))
> - goto unlock;
> -
> - /*
> - * The page straddles i_size. It must be zeroed out on each
> - * and every writepage invocation because it may be mmapped.
> - * "A file is mapped in multiples of the page size. For a file
> - * that is not a multiple of the page size, the remaining
> - * memory is zeroed when mapped, and writes to that region are
> - * not written out to the file."
> - */
> - folio_zero_segment(folio, poff, folio_size(folio));
> - end_pos = isize;
> + if (!iomap_writepage_handle_eof(folio, inode, &end_pos)) {
> + folio_unlock(folio);
> + return 0;
> }
>
> return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
> -
> -unlock:
> - folio_unlock(folio);
> - return 0;
> }
>
> int
> --
> 2.39.2
next prev parent reply other threads:[~2023-11-27 6:57 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-26 12:47 RFC: map multiple blocks per ->map_blocks in iomap writeback Christoph Hellwig
2023-11-26 12:47 ` [PATCH 01/13] iomap: clear the per-folio dirty bits on all writeback failures Christoph Hellwig
2023-11-27 3:47 ` Ritesh Harjani
2023-11-27 6:29 ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 02/13] iomap: treat inline data in iomap_writepage_map as an I/O error Christoph Hellwig
2023-11-27 5:01 ` Ritesh Harjani
2023-11-27 6:33 ` Christoph Hellwig
2023-11-29 4:40 ` Darrick J. Wong
2023-11-29 5:39 ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 03/13] iomap: move the io_folios field out of struct iomap_ioend Christoph Hellwig
2023-11-27 5:33 ` Ritesh Harjani
2023-11-29 4:41 ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 04/13] iomap: drop the obsolete PF_MEMALLOC check in iomap_do_writepage Christoph Hellwig
2023-11-27 6:39 ` Ritesh Harjani
2023-11-27 6:41 ` Christoph Hellwig
2023-11-29 4:45 ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 05/13] iomap: factor out a iomap_writepage_handle_eof helper Christoph Hellwig
2023-11-27 6:57 ` Ritesh Harjani [this message]
2023-11-27 7:02 ` Ritesh Harjani
2023-11-27 7:12 ` Christoph Hellwig
2023-11-29 4:48 ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 06/13] iomap: move all remaining per-folio logic into xfs_writepage_map Christoph Hellwig
2023-11-27 7:36 ` Ritesh Harjani
2023-11-27 19:20 ` Josef Bacik
2023-11-29 4:50 ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 07/13] iomap: clean up the iomap_new_ioend calling convention Christoph Hellwig
2023-11-27 7:43 ` Ritesh Harjani
2023-11-27 8:13 ` Christoph Hellwig
2023-11-29 4:51 ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 08/13] iomap: move the iomap_sector sector calculation out of iomap_add_to_ioend Christoph Hellwig
2023-11-27 9:54 ` Ritesh Harjani
2023-11-27 13:54 ` Christoph Hellwig
2023-11-29 4:53 ` Darrick J. Wong
2023-11-29 5:42 ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 09/13] iomap: don't chain bios Christoph Hellwig
2023-11-27 12:53 ` Zhang Yi
2023-11-27 13:51 ` Christoph Hellwig
2023-11-29 4:59 ` Darrick J. Wong
2023-11-29 5:40 ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 10/13] iomap: only call mapping_set_error once for each failed bio Christoph Hellwig
2023-11-29 5:03 ` Darrick J. Wong
2023-11-29 5:41 ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 11/13] iomap: factor out a iomap_writepage_map_block helper Christoph Hellwig
2023-11-29 5:06 ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 12/13] iomap: submit ioends immediately Christoph Hellwig
2023-11-29 5:14 ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 13/13] iomap: map multiple blocks at a time Christoph Hellwig
2023-11-29 5:25 ` Darrick J. Wong
2023-11-29 5:44 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8734wrsmy5.fsf@doe.com \
--to=ritesh.list@gmail.com \
--cc=brauner@kernel.org \
--cc=chandan.babu@oracle.com \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=yi.zhang@huaweicloud.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox