public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Christoph Hellwig <hch@lst.de>, Christian Brauner <brauner@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Chandan Babu R <chandan.babu@oracle.com>,
	Zhang Yi <yi.zhang@huaweicloud.com>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 05/13] iomap: factor out a iomap_writepage_handle_eof helper
Date: Mon, 27 Nov 2023 12:32:38 +0530	[thread overview]
Message-ID: <87zfyzr84x.fsf@doe.com> (raw)
In-Reply-To: <8734wrsmy5.fsf@doe.com>

Ritesh Harjani (IBM) <ritesh.list@gmail.com> writes:

> Christoph Hellwig <hch@lst.de> writes:
>
>> Most of iomap_do_writepage is dedidcated to handling a folio crossing or
>> beyond i_size.  Split this is into a separate helper and update the
>> commens to deal with folios instead of pages and make them more readable.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> ---
>>  fs/iomap/buffered-io.c | 128 ++++++++++++++++++++---------------------
>>  1 file changed, 62 insertions(+), 66 deletions(-)
>
> Just a small nit below.
>
> But otherwise looks good to me. Please feel free to add - 
>
> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>
>>
>> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
>> index 8148e4c9765dac..4a5a21809b0182 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -1768,6 +1768,64 @@ iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
>>  	wbc_account_cgroup_owner(wbc, &folio->page, len);
>>  }
>>  
>> +/*
>> + * Check interaction of the folio with the file end.
>> + *
>> + * If the folio is entirely beyond i_size, return false.  If it straddles
>> + * i_size, adjust end_pos and zero all data beyond i_size.
>> + */
>> +static bool iomap_writepage_handle_eof(struct folio *folio, struct inode *inode,
>> +		u64 *end_pos)
>> +{
>> +	u64 isize = i_size_read(inode);
>
> i_size_read(inode) returns loff_t type. Can we make end_pos also as
> loff_t type. We anyway use loff_t for
> folio_pos(folio) + folio_size(folio), at many places in fs/iomap. It
> would be more consistent with the data type then.
>
> Thoughts?

aah, that might also require to change the types in
iomap_writepage_map(). So I guess the data type consistency change
should be a follow up change as this patch does only the refactoring.

>
> -ritesh
>
>
>> +
>> +	if (*end_pos > isize) {
>> +		size_t poff = offset_in_folio(folio, isize);
>> +		pgoff_t end_index = isize >> PAGE_SHIFT;
>> +
>> +		/*
>> +		 * If the folio is entirely ouside of i_size, skip it.
>> +		 *
>> +		 * This can happen due to a truncate operation that is in
>> +		 * progress and in that case truncate will finish it off once
>> +		 * we've dropped the folio lock.
>> +		 *
>> +		 * Note that the pgoff_t used for end_index is an unsigned long.
>> +		 * If the given offset is greater than 16TB on a 32-bit system,
>> +		 * then if we checked if the folio is fully outside i_size with
>> +		 * "if (folio->index >= end_index + 1)", "end_index + 1" would
>> +		 * overflow and evaluate to 0.  Hence this folio would be
>> +		 * redirtied and written out repeatedly, which would result in
>> +		 * an infinite loop; the user program performing this operation
>> +		 * would hang.  Instead, we can detect this situation by
>> +		 * checking if the folio is totally beyond i_size or if its
>> +		 * offset is just equal to the EOF.
>> +		 */
>> +		if (folio->index > end_index ||
>> +		    (folio->index == end_index && poff == 0))
>> +			return false;
>> +
>> +		/*
>> +		 * The folio straddles i_size.
>> +		 *
>> +		 * It must be zeroed out on each and every writepage invocation
>> +		 * because it may be mmapped:
>> +		 *
>> +		 *    A file is mapped in multiples of the page size.  For a
>> +		 *    file that is not a multiple of the page size, the
>> +		 *    remaining memory is zeroed when mapped, and writes to that
>> +		 *    region are not written out to the file.
>> +		 *
>> +		 * Also adjust the writeback range to skip all blocks entirely
>> +		 * beyond i_size.
>> +		 */
>> +		folio_zero_segment(folio, poff, folio_size(folio));
>> +		*end_pos = isize;
>> +	}
>> +
>> +	return true;
>> +}
>> +
>>  /*
>>   * We implement an immediate ioend submission policy here to avoid needing to
>>   * chain multiple ioends and hence nest mempool allocations which can violate
>> @@ -1906,78 +1964,16 @@ static int iomap_do_writepage(struct folio *folio,
>>  {
>>  	struct iomap_writepage_ctx *wpc = data;
>>  	struct inode *inode = folio->mapping->host;
>> -	u64 end_pos, isize;
>> +	u64 end_pos = folio_pos(folio) + folio_size(folio);
>>  
>>  	trace_iomap_writepage(inode, folio_pos(folio), folio_size(folio));
>>  
>> -	/*
>> -	 * Is this folio beyond the end of the file?
>> -	 *
>> -	 * The folio index is less than the end_index, adjust the end_pos
>> -	 * to the highest offset that this folio should represent.
>> -	 * -----------------------------------------------------
>> -	 * |			file mapping	       | <EOF> |
>> -	 * -----------------------------------------------------
>> -	 * | Page ... | Page N-2 | Page N-1 |  Page N  |       |
>> -	 * ^--------------------------------^----------|--------
>> -	 * |     desired writeback range    |      see else    |
>> -	 * ---------------------------------^------------------|
>> -	 */
>> -	isize = i_size_read(inode);
>> -	end_pos = folio_pos(folio) + folio_size(folio);
>> -	if (end_pos > isize) {
>> -		/*
>> -		 * Check whether the page to write out is beyond or straddles
>> -		 * i_size or not.
>> -		 * -------------------------------------------------------
>> -		 * |		file mapping		        | <EOF>  |
>> -		 * -------------------------------------------------------
>> -		 * | Page ... | Page N-2 | Page N-1 |  Page N   | Beyond |
>> -		 * ^--------------------------------^-----------|---------
>> -		 * |				    |      Straddles     |
>> -		 * ---------------------------------^-----------|--------|
>> -		 */
>> -		size_t poff = offset_in_folio(folio, isize);
>> -		pgoff_t end_index = isize >> PAGE_SHIFT;
>> -
>> -		/*
>> -		 * Skip the page if it's fully outside i_size, e.g.
>> -		 * due to a truncate operation that's in progress.  We've
>> -		 * cleaned this page and truncate will finish things off for
>> -		 * us.
>> -		 *
>> -		 * Note that the end_index is unsigned long.  If the given
>> -		 * offset is greater than 16TB on a 32-bit system then if we
>> -		 * checked if the page is fully outside i_size with
>> -		 * "if (page->index >= end_index + 1)", "end_index + 1" would
>> -		 * overflow and evaluate to 0.  Hence this page would be
>> -		 * redirtied and written out repeatedly, which would result in
>> -		 * an infinite loop; the user program performing this operation
>> -		 * would hang.  Instead, we can detect this situation by
>> -		 * checking if the page is totally beyond i_size or if its
>> -		 * offset is just equal to the EOF.
>> -		 */
>> -		if (folio->index > end_index ||
>> -		    (folio->index == end_index && poff == 0))
>> -			goto unlock;
>> -
>> -		/*
>> -		 * The page straddles i_size.  It must be zeroed out on each
>> -		 * and every writepage invocation because it may be mmapped.
>> -		 * "A file is mapped in multiples of the page size.  For a file
>> -		 * that is not a multiple of the page size, the remaining
>> -		 * memory is zeroed when mapped, and writes to that region are
>> -		 * not written out to the file."
>> -		 */
>> -		folio_zero_segment(folio, poff, folio_size(folio));
>> -		end_pos = isize;
>> +	if (!iomap_writepage_handle_eof(folio, inode, &end_pos)) {
>> +		folio_unlock(folio);
>> +		return 0;
>>  	}
>>  
>>  	return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
>> -
>> -unlock:
>> -	folio_unlock(folio);
>> -	return 0;
>>  }
>>  
>>  int
>> -- 
>> 2.39.2

  reply	other threads:[~2023-11-27  7:02 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-26 12:47 RFC: map multiple blocks per ->map_blocks in iomap writeback Christoph Hellwig
2023-11-26 12:47 ` [PATCH 01/13] iomap: clear the per-folio dirty bits on all writeback failures Christoph Hellwig
2023-11-27  3:47   ` Ritesh Harjani
2023-11-27  6:29     ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 02/13] iomap: treat inline data in iomap_writepage_map as an I/O error Christoph Hellwig
2023-11-27  5:01   ` Ritesh Harjani
2023-11-27  6:33     ` Christoph Hellwig
2023-11-29  4:40       ` Darrick J. Wong
2023-11-29  5:39         ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 03/13] iomap: move the io_folios field out of struct iomap_ioend Christoph Hellwig
2023-11-27  5:33   ` Ritesh Harjani
2023-11-29  4:41   ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 04/13] iomap: drop the obsolete PF_MEMALLOC check in iomap_do_writepage Christoph Hellwig
2023-11-27  6:39   ` Ritesh Harjani
2023-11-27  6:41     ` Christoph Hellwig
2023-11-29  4:45       ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 05/13] iomap: factor out a iomap_writepage_handle_eof helper Christoph Hellwig
2023-11-27  6:57   ` Ritesh Harjani
2023-11-27  7:02     ` Ritesh Harjani [this message]
2023-11-27  7:12       ` Christoph Hellwig
2023-11-29  4:48         ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 06/13] iomap: move all remaining per-folio logic into xfs_writepage_map Christoph Hellwig
2023-11-27  7:36   ` Ritesh Harjani
2023-11-27 19:20   ` Josef Bacik
2023-11-29  4:50   ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 07/13] iomap: clean up the iomap_new_ioend calling convention Christoph Hellwig
2023-11-27  7:43   ` Ritesh Harjani
2023-11-27  8:13     ` Christoph Hellwig
2023-11-29  4:51       ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 08/13] iomap: move the iomap_sector sector calculation out of iomap_add_to_ioend Christoph Hellwig
2023-11-27  9:54   ` Ritesh Harjani
2023-11-27 13:54     ` Christoph Hellwig
2023-11-29  4:53       ` Darrick J. Wong
2023-11-29  5:42         ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 09/13] iomap: don't chain bios Christoph Hellwig
2023-11-27 12:53   ` Zhang Yi
2023-11-27 13:51     ` Christoph Hellwig
2023-11-29  4:59   ` Darrick J. Wong
2023-11-29  5:40     ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 10/13] iomap: only call mapping_set_error once for each failed bio Christoph Hellwig
2023-11-29  5:03   ` Darrick J. Wong
2023-11-29  5:41     ` Christoph Hellwig
2023-11-26 12:47 ` [PATCH 11/13] iomap: factor out a iomap_writepage_map_block helper Christoph Hellwig
2023-11-29  5:06   ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 12/13] iomap: submit ioends immediately Christoph Hellwig
2023-11-29  5:14   ` Darrick J. Wong
2023-11-26 12:47 ` [PATCH 13/13] iomap: map multiple blocks at a time Christoph Hellwig
2023-11-29  5:25   ` Darrick J. Wong
2023-11-29  5:44     ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zfyzr84x.fsf@doe.com \
    --to=ritesh.list@gmail.com \
    --cc=brauner@kernel.org \
    --cc=chandan.babu@oracle.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=yi.zhang@huaweicloud.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox