public inbox for linux-xfs@vger.kernel.org
* [PATCH v22 08/31] splice: Make splice from a DAX file use copy_splice_read()
       [not found] <20230522135018.2742245-1-dhowells@redhat.com>
@ 2023-05-22 13:49 ` David Howells
  2023-05-22 13:50 ` [PATCH v22 24/31] xfs: Provide a splice-read wrapper David Howells
  2023-05-22 13:50 ` [PATCH v22 25/31] zonefs: " David Howells
  2 siblings, 0 replies; 6+ messages in thread
From: David Howells @ 2023-05-22 13:49 UTC
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
	Christian Brauner, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Theodore Ts'o,
	Gao Xiang, linux-erofs, linux-ext4, linux-xfs

Make a read splice from a DAX file go directly to copy_splice_read() to do
the reading as filemap_splice_read() is unlikely to find any pagecache to
splice.

I think this affects only erofs, Ext2, Ext4, fuse and XFS.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: linux-erofs@lists.ozlabs.org
cc: linux-ext4@vger.kernel.org
cc: linux-xfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---

Notes:
    ver #21)
     - Don't need #ifdef CONFIG_FS_DAX as IS_DAX() is false if !CONFIG_FS_DAX.
     - Needs to be in vfs_splice_read(), not generic_file_splice_read().
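
The dispatch this patch adds can be modelled in userspace as a minimal
sketch. The enum and pick_splice_path() are hypothetical names used only for
illustration; they are not kernel API, and the two booleans stand in for
(in->f_flags & O_DIRECT) and IS_DAX(in->f_mapping->host).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two read paths; not kernel API. */
enum splice_path {
	SPLICE_PATH_COPY,	/* models copy_splice_read() */
	SPLICE_PATH_FILE_OP,	/* models in->f_op->splice_read() */
};

/*
 * Models the check in vfs_splice_read(): O_DIRECT and DAX files do not
 * go through the pagecache, so both take the copying path.
 */
static enum splice_path pick_splice_path(bool o_direct, bool is_dax)
{
	if (o_direct || is_dax)
		return SPLICE_PATH_COPY;
	return SPLICE_PATH_FILE_OP;
}
```

Under this model only a buffered, non-DAX file reaches the filesystem's own
->splice_read() method.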

 fs/splice.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 76126b1aafcb..8268248df3a9 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -908,10 +908,10 @@ long vfs_splice_read(struct file *in, loff_t *ppos,
 	if (unlikely(!in->f_op->splice_read))
 		return warn_unsupported(in, "read");
 	/*
-	 * O_DIRECT doesn't deal with the pagecache, so we allocate a buffer,
-	 * copy into it and splice that into the pipe.
+	 * O_DIRECT and DAX don't deal with the pagecache, so we allocate a
+	 * buffer, copy into it and splice that into the pipe.
 	 */
-	if ((in->f_flags & O_DIRECT))
+	if ((in->f_flags & O_DIRECT) || IS_DAX(in->f_mapping->host))
 		return copy_splice_read(in, ppos, pipe, len, flags);
 	return in->f_op->splice_read(in, ppos, pipe, len, flags);
 }



* [PATCH v22 24/31] xfs: Provide a splice-read wrapper
       [not found] <20230522135018.2742245-1-dhowells@redhat.com>
  2023-05-22 13:49 ` [PATCH v22 08/31] splice: Make splice from a DAX file use copy_splice_read() David Howells
@ 2023-05-22 13:50 ` David Howells
  2023-05-22 13:50 ` [PATCH v22 25/31] zonefs: " David Howells
  2 siblings, 0 replies; 6+ messages in thread
From: David Howells @ 2023-05-22 13:50 UTC
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
	Christian Brauner, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Darrick J . Wong,
	linux-xfs

Provide a splice_read wrapper for XFS.  It bumps the read-call stat and
checks for shutdown before proceeding, emits a new trace line, holds the
inode's IOLOCK shared across the call to filemap_splice_read(), and adds the
bytes read to the stats afterwards.  Splicing from direct I/O or DAX is
handled by the caller.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Darrick J. Wong <djwong@kernel.org>
cc: linux-xfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/xfs/xfs_file.c  | 30 +++++++++++++++++++++++++++++-
 fs/xfs/xfs_trace.h |  2 +-
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index aede746541f8..08d632668e94 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -306,6 +306,34 @@ xfs_file_read_iter(
 	return ret;
 }
 
+STATIC ssize_t
+xfs_file_splice_read(
+	struct file		*in,
+	loff_t			*ppos,
+	struct pipe_inode_info	*pipe,
+	size_t			len,
+	unsigned int		flags)
+{
+	struct inode		*inode = file_inode(in);
+	struct xfs_inode	*ip = XFS_I(inode);
+	struct xfs_mount	*mp = ip->i_mount;
+	ssize_t			ret = 0;
+
+	XFS_STATS_INC(mp, xs_read_calls);
+
+	if (xfs_is_shutdown(mp))
+		return -EIO;
+
+	trace_xfs_file_splice_read(ip, *ppos, len);
+
+	xfs_ilock(ip, XFS_IOLOCK_SHARED);
+	ret = filemap_splice_read(in, ppos, pipe, len, flags);
+	xfs_iunlock(ip, XFS_IOLOCK_SHARED);
+	if (ret > 0)
+		XFS_STATS_ADD(mp, xs_read_bytes, ret);
+	return ret;
+}
+
 /*
  * Common pre-write limit and setup checks.
  *
@@ -1423,7 +1451,7 @@ const struct file_operations xfs_file_operations = {
 	.llseek		= xfs_file_llseek,
 	.read_iter	= xfs_file_read_iter,
 	.write_iter	= xfs_file_write_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= xfs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.iopoll		= iocb_bio_iopoll,
 	.unlocked_ioctl	= xfs_file_ioctl,
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index cd4ca5b1fcb0..4db669203149 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1445,7 +1445,6 @@ DEFINE_RW_EVENT(xfs_file_direct_write);
 DEFINE_RW_EVENT(xfs_file_dax_write);
 DEFINE_RW_EVENT(xfs_reflink_bounce_dio_write);
 
-
 DECLARE_EVENT_CLASS(xfs_imap_class,
 	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count,
 		 int whichfork, struct xfs_bmbt_irec *irec),
@@ -1535,6 +1534,7 @@ DEFINE_SIMPLE_IO_EVENT(xfs_zero_eof);
 DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write);
 DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write_unwritten);
 DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write_append);
+DEFINE_SIMPLE_IO_EVENT(xfs_file_splice_read);
 
 DECLARE_EVENT_CLASS(xfs_itrunc_class,
 	TP_PROTO(struct xfs_inode *ip, xfs_fsize_t new_size),



* [PATCH v22 25/31] zonefs: Provide a splice-read wrapper
       [not found] <20230522135018.2742245-1-dhowells@redhat.com>
  2023-05-22 13:49 ` [PATCH v22 08/31] splice: Make splice from a DAX file use copy_splice_read() David Howells
  2023-05-22 13:50 ` [PATCH v22 24/31] xfs: Provide a splice-read wrapper David Howells
@ 2023-05-22 13:50 ` David Howells
  2023-05-23  2:48   ` Damien Le Moal
  2 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2023-05-22 13:50 UTC
  To: Jens Axboe, Al Viro, Christoph Hellwig
  Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
	David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
	Christian Brauner, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel, linux-mm, Christoph Hellwig, Darrick J . Wong,
	linux-xfs

Provide a splice_read wrapper for zonefs.  This does some checks before
proceeding and locks the inode across the call to filemap_splice_read() and
a size check in case of truncation.  Splicing from direct I/O is handled by
the caller.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Darrick J. Wong <djwong@kernel.org>
cc: linux-xfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
 fs/zonefs/file.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)
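
The "limit read operations to written data" step in the wrapper below can be
sketched in userspace as follows. clamp_read_len() is a hypothetical helper
for illustration only; int64_t stands in for loff_t, and the locking that the
real wrapper does around i_size_read() is left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model of the length clamp: returns how many bytes may be
 * spliced starting at pos when only isize bytes have been written.
 * Note: len is compared as an unsigned value here, whereas the patch
 * uses min_t(loff_t, ...).
 */
static size_t clamp_read_len(int64_t pos, int64_t isize, size_t len)
{
	int64_t avail;

	/* Nothing to read at or beyond the written size. */
	if (pos >= isize)
		return 0;

	avail = isize - pos;
	if ((uint64_t)avail < len)
		return (size_t)avail;
	return len;
}
```

For instance, with 100 bytes written, a 50-byte request at offset 80 is cut
down to 20 bytes, and any request at offset 100 or later becomes 0.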

diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 132f01d3461f..65d4c4fe6364 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -752,6 +752,44 @@ static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return ret;
 }
 
+static ssize_t zonefs_file_splice_read(struct file *in, loff_t *ppos,
+				       struct pipe_inode_info *pipe,
+				       size_t len, unsigned int flags)
+{
+	struct inode *inode = file_inode(in);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	struct zonefs_zone *z = zonefs_inode_zone(inode);
+	loff_t isize;
+	ssize_t ret = 0;
+
+	/* Offline zones cannot be read */
+	if (unlikely(IS_IMMUTABLE(inode) && !(inode->i_mode & 0777)))
+		return -EPERM;
+
+	if (*ppos >= z->z_capacity)
+		return 0;
+
+	inode_lock_shared(inode);
+
+	/* Limit read operations to written data */
+	mutex_lock(&zi->i_truncate_mutex);
+	isize = i_size_read(inode);
+	if (*ppos >= isize)
+		len = 0;
+	else
+		len = min_t(loff_t, len, isize - *ppos);
+	mutex_unlock(&zi->i_truncate_mutex);
+
+	if (len > 0) {
+		ret = filemap_splice_read(in, ppos, pipe, len, flags);
+		if (ret == -EIO)
+			zonefs_io_error(inode, false);
+	}
+
+	inode_unlock_shared(inode);
+	return ret;
+}
+
 /*
  * Write open accounting is done only for sequential files.
  */
@@ -896,7 +934,7 @@ const struct file_operations zonefs_file_operations = {
 	.llseek		= zonefs_file_llseek,
 	.read_iter	= zonefs_file_read_iter,
 	.write_iter	= zonefs_file_write_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= zonefs_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.iopoll		= iocb_bio_iopoll,
 };



* Re: [PATCH v22 25/31] zonefs: Provide a splice-read wrapper
  2023-05-22 13:50 ` [PATCH v22 25/31] zonefs: " David Howells
@ 2023-05-23  2:48   ` Damien Le Moal
  2023-05-23 20:43     ` David Howells
  0 siblings, 1 reply; 6+ messages in thread
From: Damien Le Moal @ 2023-05-23  2:48 UTC
  To: David Howells, Jens Axboe, Al Viro, Christoph Hellwig
  Cc: Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand,
	Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Christian Brauner,
	Linus Torvalds, linux-fsdevel, linux-block, linux-kernel,
	linux-mm, Christoph Hellwig, Darrick J . Wong, linux-xfs

On 5/22/23 22:50, David Howells wrote:
> Provide a splice_read wrapper for zonefs.  This does some checks before
> proceeding and locks the inode across the call to filemap_splice_read() and
> a size check in case of truncation.  Splicing from direct I/O is handled by
> the caller.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Darrick J. Wong <djwong@kernel.org>
> cc: linux-xfs@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org

One comment below but otherwise looks OK.

Acked-by: Damien Le Moal <dlemoal@kernel.org>

> ---
>  fs/zonefs/file.c | 40 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
> index 132f01d3461f..65d4c4fe6364 100644
> --- a/fs/zonefs/file.c
> +++ b/fs/zonefs/file.c
> @@ -752,6 +752,44 @@ static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  	return ret;
>  }
>  
> +static ssize_t zonefs_file_splice_read(struct file *in, loff_t *ppos,
> +				       struct pipe_inode_info *pipe,
> +				       size_t len, unsigned int flags)
> +{
> +	struct inode *inode = file_inode(in);
> +	struct zonefs_inode_info *zi = ZONEFS_I(inode);
> +	struct zonefs_zone *z = zonefs_inode_zone(inode);
> +	loff_t isize;
> +	ssize_t ret = 0;
> +
> +	/* Offline zones cannot be read */
> +	if (unlikely(IS_IMMUTABLE(inode) && !(inode->i_mode & 0777)))
> +		return -EPERM;
> +
> +	if (*ppos >= z->z_capacity)
> +		return 0;
> +
> +	inode_lock_shared(inode);
> +
> +	/* Limit read operations to written data */
> +	mutex_lock(&zi->i_truncate_mutex);
> +	isize = i_size_read(inode);
> +	if (*ppos >= isize)
> +		len = 0;
> +	else
> +		len = min_t(loff_t, len, isize - *ppos);
> +	mutex_unlock(&zi->i_truncate_mutex);
> +
> +	if (len > 0) {
> +		ret = filemap_splice_read(in, ppos, pipe, len, flags);
> +		if (ret == -EIO)

Is -EIO the only error that filemap_splice_read() may return? There are other
I/O error codes that we could get from the block layer, e.g. -ETIMEDOUT. So
"if (ret < 0)" may be better here?

> +			zonefs_io_error(inode, false);
> +	}
> +
> +	inode_unlock_shared(inode);
> +	return ret;
> +}
> +
>  /*
>   * Write open accounting is done only for sequential files.
>   */
> @@ -896,7 +934,7 @@ const struct file_operations zonefs_file_operations = {
>  	.llseek		= zonefs_file_llseek,
>  	.read_iter	= zonefs_file_read_iter,
>  	.write_iter	= zonefs_file_write_iter,
> -	.splice_read	= generic_file_splice_read,
> +	.splice_read	= zonefs_file_splice_read,
>  	.splice_write	= iter_file_splice_write,
>  	.iopoll		= iocb_bio_iopoll,
>  };
> 

-- 
Damien Le Moal
Western Digital Research



* Re: [PATCH v22 25/31] zonefs: Provide a splice-read wrapper
  2023-05-23  2:48   ` Damien Le Moal
@ 2023-05-23 20:43     ` David Howells
  2023-05-24 23:13       ` Damien Le Moal
  0 siblings, 1 reply; 6+ messages in thread
From: David Howells @ 2023-05-23 20:43 UTC
  To: Damien Le Moal
  Cc: dhowells, Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox,
	Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe,
	Logan Gunthorpe, Hillf Danton, Christian Brauner, Linus Torvalds,
	linux-fsdevel, linux-block, linux-kernel, linux-mm,
	Christoph Hellwig, Darrick J . Wong, linux-xfs

Damien Le Moal <dlemoal@kernel.org> wrote:

> > +	if (len > 0) {
> > +		ret = filemap_splice_read(in, ppos, pipe, len, flags);
> > +		if (ret == -EIO)
> 
> Is -EIO the only error that filemap_splice_read() may return ? There are other
> IO error codes that we could get from the block layer, e.g. -ETIMEDOUT etc. So
> "if (ret < 0)" may be better here ?

It can return -ENOMEM, -EINTR and -EAGAIN at least, none of which really count
as I/O errors.  I based the splice function on what zonefs_file_read_iter()
does:

	} else {
		ret = generic_file_read_iter(iocb, to);
		if (ret == -EIO)
			zonefs_io_error(inode, false);
	}

David



* Re: [PATCH v22 25/31] zonefs: Provide a splice-read wrapper
  2023-05-23 20:43     ` David Howells
@ 2023-05-24 23:13       ` Damien Le Moal
  0 siblings, 0 replies; 6+ messages in thread
From: Damien Le Moal @ 2023-05-24 23:13 UTC
  To: David Howells
  Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
	Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
	Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel,
	linux-block, linux-kernel, linux-mm, Christoph Hellwig,
	Darrick J . Wong, linux-xfs

On 5/24/23 05:43, David Howells wrote:
> Damien Le Moal <dlemoal@kernel.org> wrote:
> 
>>> +	if (len > 0) {
>>> +		ret = filemap_splice_read(in, ppos, pipe, len, flags);
>>> +		if (ret == -EIO)
>>
>> Is -EIO the only error that filemap_splice_read() may return ? There are other
>> IO error codes that we could get from the block layer, e.g. -ETIMEDOUT etc. So
>> "if (ret < 0)" may be better here ?
> 
> It can return -ENOMEM, -EINTR and -EAGAIN at least, none of which really count
> as I/O errors.  I based the splice function on what zonefs_file_read_iter()
> does:
> 
> 	} else {
> 		ret = generic_file_read_iter(iocb, to);
> 		if (ret == -EIO)
> 			zonefs_io_error(inode, false);
> 	}

Fair point. But checking zonefs_io_error() again, it will do nothing if nothing
bad is detected for the zone that was used for the failed I/O. So calling
zonefs_io_error() for all error codes is actually fine, and likely much safer. I
will change that in zonefs_file_read_iter(). Please use "if (ret < 0)" in your
patch.
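
The difference between the two conditions can be sketched in userspace. Both
predicates below are hypothetical helpers written for illustration; neither
is zonefs code, and they only report whether the wrapper would go on to call
zonefs_io_error() for a given return value.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* True when the wrapper as posted would call zonefs_io_error(). */
static bool checks_zone_as_posted(ssize_t ret)
{
	return ret == -EIO;
}

/* True when the wrapper would call zonefs_io_error() with "ret < 0". */
static bool checks_zone_as_requested(ssize_t ret)
{
	return ret < 0;
}
```

With the posted condition an error such as -ETIMEDOUT skips the zone check,
while "ret < 0" covers it. Errors such as -EAGAIN also trigger the check in
that case, which is harmless if zonefs_io_error() finds nothing wrong with
the zone.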

-- 
Damien Le Moal
Western Digital Research


