Re: [PATCH 04/10] xfs: factor out a xfs_file_write_zero_eof helper

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Chandan Babu R <chandan.babu@oracle.com>,
	Christian Brauner <brauner@kernel.org>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 04/10] xfs: factor out a xfs_file_write_zero_eof helper
Date: Mon, 23 Sep 2024 09:20:34 -0700	[thread overview]
Message-ID: <20240923162034.GG21877@frogsfrogsfrogs> (raw)
In-Reply-To: <20240923152904.1747117-5-hch@lst.de>

On Mon, Sep 23, 2024 at 05:28:18PM +0200, Christoph Hellwig wrote:
> Split a helper from xfs_file_write_checks that just deal with the
> post-EOF zeroing to keep the code readable.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_file.c | 140 +++++++++++++++++++++++++++-------------------
>  1 file changed, 82 insertions(+), 58 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 412b1d71b52b7d..3efb0da2a910d6 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -347,10 +347,77 @@ xfs_file_splice_read(
>  	return ret;
>  }
>  
> +/*
> + * Take care of zeroing post-EOF blocks when they might exist.
> + *
> + * Returns 0 if successfully, a negative error for a failure, or 1 if this
> + * function dropped the iolock and reacquired it exclusively and the caller
> + * needs to restart the write sanity checks.

Thanks for documentating the calling conventions,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> + */
> +static ssize_t
> +xfs_file_write_zero_eof(
> +	struct kiocb		*iocb,
> +	struct iov_iter		*from,
> +	unsigned int		*iolock,
> +	size_t			count,
> +	bool			*drained_dio)
> +{
> +	struct xfs_inode	*ip = XFS_I(iocb->ki_filp->f_mapping->host);
> +	loff_t			isize;
> +
> +	/*
> +	 * We need to serialise against EOF updates that occur in IO completions
> +	 * here. We want to make sure that nobody is changing the size while
> +	 * we do this check until we have placed an IO barrier (i.e. hold
> +	 * XFS_IOLOCK_EXCL) that prevents new IO from being dispatched.  The
> +	 * spinlock effectively forms a memory barrier once we have
> +	 * XFS_IOLOCK_EXCL so we are guaranteed to see the latest EOF value and
> +	 * hence be able to correctly determine if we need to run zeroing.
> +	 */
> +	spin_lock(&ip->i_flags_lock);
> +	isize = i_size_read(VFS_I(ip));
> +	if (iocb->ki_pos <= isize) {
> +		spin_unlock(&ip->i_flags_lock);
> +		return 0;
> +	}
> +	spin_unlock(&ip->i_flags_lock);
> +
> +	if (iocb->ki_flags & IOCB_NOWAIT)
> +		return -EAGAIN;
> +
> +	if (!*drained_dio) {
> +		/*
> +		 * If zeroing is needed and we are currently holding the iolock
> +		 * shared, we need to update it to exclusive which implies
> +		 * having to redo all checks before.
> +		 */
> +		if (*iolock == XFS_IOLOCK_SHARED) {
> +			xfs_iunlock(ip, *iolock);
> +			*iolock = XFS_IOLOCK_EXCL;
> +			xfs_ilock(ip, *iolock);
> +			iov_iter_reexpand(from, count);
> +		}
> +
> +		/*
> +		 * We now have an IO submission barrier in place, but AIO can do
> +		 * EOF updates during IO completion and hence we now need to
> +		 * wait for all of them to drain.  Non-AIO DIO will have drained
> +		 * before we are given the XFS_IOLOCK_EXCL, and so for most
> +		 * cases this wait is a no-op.
> +		 */
> +		inode_dio_wait(VFS_I(ip));
> +		*drained_dio = true;
> +		return 1;
> +	}
> +
> +	trace_xfs_zero_eof(ip, isize, iocb->ki_pos - isize);
> +	return xfs_zero_range(ip, isize, iocb->ki_pos - isize, NULL);
> +}
> +
>  /*
>   * Common pre-write limit and setup checks.
>   *
> - * Called with the iolocked held either shared and exclusive according to
> + * Called with the iolock held either shared and exclusive according to
>   * @iolock, and returns with it held.  Might upgrade the iolock to exclusive
>   * if called for a direct write beyond i_size.
>   */
> @@ -360,13 +427,10 @@ xfs_file_write_checks(
>  	struct iov_iter		*from,
>  	unsigned int		*iolock)
>  {
> -	struct file		*file = iocb->ki_filp;
> -	struct inode		*inode = file->f_mapping->host;
> -	struct xfs_inode	*ip = XFS_I(inode);
> -	ssize_t			error = 0;
> +	struct inode		*inode = iocb->ki_filp->f_mapping->host;
>  	size_t			count = iov_iter_count(from);
>  	bool			drained_dio = false;
> -	loff_t			isize;
> +	ssize_t			error;
>  
>  restart:
>  	error = generic_write_checks(iocb, from);
> @@ -389,7 +453,7 @@ xfs_file_write_checks(
>  	 * exclusively.
>  	 */
>  	if (*iolock == XFS_IOLOCK_SHARED && !IS_NOSEC(inode)) {
> -		xfs_iunlock(ip, *iolock);
> +		xfs_iunlock(XFS_I(inode), *iolock);
>  		*iolock = XFS_IOLOCK_EXCL;
>  		error = xfs_ilock_iocb(iocb, *iolock);
>  		if (error) {
> @@ -400,64 +464,24 @@ xfs_file_write_checks(
>  	}
>  
>  	/*
> -	 * If the offset is beyond the size of the file, we need to zero any
> +	 * If the offset is beyond the size of the file, we need to zero all
>  	 * blocks that fall between the existing EOF and the start of this
> -	 * write.  If zeroing is needed and we are currently holding the iolock
> -	 * shared, we need to update it to exclusive which implies having to
> -	 * redo all checks before.
> -	 *
> -	 * We need to serialise against EOF updates that occur in IO completions
> -	 * here. We want to make sure that nobody is changing the size while we
> -	 * do this check until we have placed an IO barrier (i.e.  hold the
> -	 * XFS_IOLOCK_EXCL) that prevents new IO from being dispatched.  The
> -	 * spinlock effectively forms a memory barrier once we have the
> -	 * XFS_IOLOCK_EXCL so we are guaranteed to see the latest EOF value and
> -	 * hence be able to correctly determine if we need to run zeroing.
> +	 * write.
>  	 *
> -	 * We can do an unlocked check here safely as IO completion can only
> -	 * extend EOF. Truncate is locked out at this point, so the EOF can
> -	 * not move backwards, only forwards. Hence we only need to take the
> -	 * slow path and spin locks when we are at or beyond the current EOF.
> +	 * We can do an unlocked check for i_size here safely as I/O completion
> +	 * can only extend EOF.  Truncate is locked out at this point, so the
> +	 * EOF can not move backwards, only forwards. Hence we only need to take
> +	 * the slow path when we are at or beyond the current EOF.
>  	 */
> -	if (iocb->ki_pos <= i_size_read(inode))
> -		goto out;
> -
> -	spin_lock(&ip->i_flags_lock);
> -	isize = i_size_read(inode);
> -	if (iocb->ki_pos > isize) {
> -		spin_unlock(&ip->i_flags_lock);
> -
> -		if (iocb->ki_flags & IOCB_NOWAIT)
> -			return -EAGAIN;
> -
> -		if (!drained_dio) {
> -			if (*iolock == XFS_IOLOCK_SHARED) {
> -				xfs_iunlock(ip, *iolock);
> -				*iolock = XFS_IOLOCK_EXCL;
> -				xfs_ilock(ip, *iolock);
> -				iov_iter_reexpand(from, count);
> -			}
> -			/*
> -			 * We now have an IO submission barrier in place, but
> -			 * AIO can do EOF updates during IO completion and hence
> -			 * we now need to wait for all of them to drain. Non-AIO
> -			 * DIO will have drained before we are given the
> -			 * XFS_IOLOCK_EXCL, and so for most cases this wait is a
> -			 * no-op.
> -			 */
> -			inode_dio_wait(inode);
> -			drained_dio = true;
> +	if (iocb->ki_pos > i_size_read(inode)) {
> +		error = xfs_file_write_zero_eof(iocb, from, iolock, count,
> +				&drained_dio);
> +		if (error == 1)
>  			goto restart;
> -		}
> -
> -		trace_xfs_zero_eof(ip, isize, iocb->ki_pos - isize);
> -		error = xfs_zero_range(ip, isize, iocb->ki_pos - isize, NULL);
>  		if (error)
>  			return error;
> -	} else
> -		spin_unlock(&ip->i_flags_lock);
> +	}
>  
> -out:
>  	return kiocb_modified(iocb);
>  }
>  
> -- 
> 2.45.2
>

next prev parent reply	other threads:[~2024-09-23 16:20 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-23 15:28 fix stale delalloc punching for COW I/O v3 Christoph Hellwig
2024-09-23 15:28 ` [PATCH 01/10] iomap: factor out a iomap_last_written_block helper Christoph Hellwig
2024-09-23 15:53   ` Darrick J. Wong
2024-09-24  5:45     ` Christoph Hellwig
2024-09-23 15:28 ` [PATCH 02/10] iomap: remove iomap_file_buffered_write_punch_delalloc Christoph Hellwig
2024-09-23 16:18   ` Darrick J. Wong
2024-09-23 22:43     ` Dave Chinner
2024-09-24  5:55     ` Christoph Hellwig
2024-09-24  6:05       ` Darrick J. Wong
2024-09-24  6:10         ` Christoph Hellwig
2024-09-23 15:28 ` [PATCH 03/10] iomap: move locking out of iomap_write_delalloc_release Christoph Hellwig
2024-09-23 16:19   ` Darrick J. Wong
2024-09-23 15:28 ` [PATCH 04/10] xfs: factor out a xfs_file_write_zero_eof helper Christoph Hellwig
2024-09-23 16:20   ` Darrick J. Wong [this message]
2024-09-23 15:28 ` [PATCH 05/10] xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof Christoph Hellwig
2024-09-23 15:28 ` [PATCH 06/10] xfs: zeroing already holds invalidate_lock Christoph Hellwig
2024-09-23 16:22   ` Darrick J. Wong
2024-09-24  5:44     ` Christoph Hellwig
2024-09-23 15:28 ` [PATCH 07/10] xfs: support the COW fork in xfs_bmap_punch_delalloc_range Christoph Hellwig
2024-09-23 15:28 ` [PATCH 08/10] xfs: share more code in xfs_buffered_write_iomap_begin Christoph Hellwig
2024-09-23 15:28 ` [PATCH 09/10] xfs: set IOMAP_F_SHARED for all COW fork allocations Christoph Hellwig
2024-09-23 15:28 ` [PATCH 10/10] xfs: punch delalloc extents from the COW fork for COW writes Christoph Hellwig
2024-09-25  9:19 ` fix stale delalloc punching for COW I/O v3 Christian Brauner
2024-09-25  9:24   ` fix stale delalloc punching for COW I/O v4 Christian Brauner
  -- strict thread matches above, loose matches on Subject: below --
2024-09-24  7:40 Christoph Hellwig
2024-09-24  7:40 ` [PATCH 01/10] iomap: factor out a iomap_last_written_block helper Christoph Hellwig
2024-09-24 14:58   ` Darrick J. Wong
2024-09-24  7:40 ` [PATCH 02/10] iomap: remove iomap_file_buffered_write_punch_delalloc Christoph Hellwig
2024-09-24  7:40 ` [PATCH 03/10] iomap: move locking out of iomap_write_delalloc_release Christoph Hellwig
2024-09-24  7:40 ` [PATCH 04/10] xfs: factor out a xfs_file_write_zero_eof helper Christoph Hellwig
2024-09-24  7:40 ` [PATCH 05/10] xfs: take XFS_MMAPLOCK_EXCL xfs_file_write_zero_eof Christoph Hellwig
2024-09-24  7:40 ` [PATCH 06/10] xfs: zeroing already holds invalidate_lock Christoph Hellwig
2024-09-24  7:40 ` [PATCH 07/10] xfs: support the COW fork in xfs_bmap_punch_delalloc_range Christoph Hellwig
2024-09-24  7:40 ` [PATCH 08/10] xfs: share more code in xfs_buffered_write_iomap_begin Christoph Hellwig
2024-09-24  7:40 ` [PATCH 09/10] xfs: set IOMAP_F_SHARED for all COW fork allocations Christoph Hellwig
2024-09-24  7:40 ` [PATCH 10/10] xfs: punch delalloc extents from the COW fork for COW writes Christoph Hellwig
2024-10-05 15:53 ` fix stale delalloc punching for COW I/O v4 Darrick J. Wong
2024-10-07  5:41   ` Christoph Hellwig
2024-10-07  6:28     ` Darrick J. Wong
2024-10-07  6:46       ` Christoph Hellwig
2024-10-07 15:20         ` Darrick J. Wong
2024-10-08  8:59 fix stale delalloc punching for COW I/O v5 Christoph Hellwig
2024-10-08  8:59 ` [PATCH 04/10] xfs: factor out a xfs_file_write_zero_eof helper Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240923162034.GG21877@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=brauner@kernel.org \
    --cc=chandan.babu@oracle.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).