linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: David Howells <dhowells@redhat.com>, Steve French <smfrench@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Marc Dionne <marc.dionne@auristor.com>,
	 Paulo Alcantara <pc@manguebit.com>,
	Shyam Prasad N <sprasad@microsoft.com>,
	Tom Talpey <tom@talpey.com>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Eric Van Hensbergen <ericvh@kernel.org>,
	Ilya Dryomov <idryomov@gmail.com>,
	Christian Brauner <christian@brauner.io>,
	linux-cachefs@redhat.com, linux-afs@lists.infradead.org,
	 linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org,
	 ceph-devel@vger.kernel.org, v9fs@lists.linux.dev,
	linux-fsdevel@vger.kernel.org,  linux-mm@kvack.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 37/39] netfs: Optimise away reads above the point at which there can be no data
Date: Thu, 14 Dec 2023 09:07:28 -0500	[thread overview]
Message-ID: <8b9413cc37a231a97059c7d028d404ab35363764.camel@kernel.org> (raw)
In-Reply-To: <20231213152350.431591-38-dhowells@redhat.com>

On Wed, 2023-12-13 at 15:23 +0000, David Howells wrote:
> Track the file position above which the server is not expected to have any
> data (the "zero point") and preemptively assume that we can satisfy
> requests by filling them with zeroes locally rather than attempting to
> download them if they're over that line - even if we've written data back
> to the server.  Assume that any data that was written back above that
> position is held in the local cache.  Note that we have to split requests
> that straddle the line.
> 
> Make use of this to optimise away some reads from the server.  We need to
> set the zero point in the following circumstances:
> 
>  (1) When we see an extant remote inode and have no cache for it, we set
>      the zero_point to i_size.
> 
>  (2) On local inode creation, we set zero_point to 0.
> 
>  (3) On local truncation down, we reduce zero_point to the new i_size if
>      the new i_size is lower.
> 
>  (4) On local truncation up, we don't change zero_point.
> 

The above seems odd, but I guess the assumption is that if there are any
writes by a 3rd party above the old zero point, that that would cause an
invalidation?

>  (5) On local modification, we don't change zero_point.
> 
>  (6) On remote invalidation, we set zero_point to the new i_size.
> 
>  (7) If stored data is discarded from the pagecache or culled from fscache,
>      we must set zero_point above that if the data also got written to the
>      server.
> 
>  (8) If dirty data is written back to the server, but not fscache, we must
>      set zero_point above that.
> 
>  (9) If a direct I/O write is made, set zero_point above that.
> 
> Assuming the above, any read from the server at or above the zero_point
> position will return all zeroes.
> 
> The zero_point value can be stored in the cache, provided the above rules
> are applied to it by any code that culls part of the local cache.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: linux-cachefs@redhat.com
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>  fs/afs/inode.c            | 22 +++++++++++++---------
>  fs/netfs/buffered_write.c |  2 +-
>  fs/netfs/direct_write.c   |  4 ++++
>  fs/netfs/io.c             | 10 ++++++++++
>  fs/netfs/misc.c           |  5 +++++
>  fs/smb/client/cifsfs.c    |  4 ++--
>  include/linux/netfs.h     | 14 ++++++++++++--
>  7 files changed, 47 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/afs/inode.c b/fs/afs/inode.c
> index c43112dcbbbb..dfd940a64e0f 100644
> --- a/fs/afs/inode.c
> +++ b/fs/afs/inode.c
> @@ -166,6 +166,7 @@ static void afs_apply_status(struct afs_operation *op,
>  	struct inode *inode = &vnode->netfs.inode;
>  	struct timespec64 t;
>  	umode_t mode;
> +	bool unexpected_jump = false;
>  	bool data_changed = false;
>  	bool change_size = vp->set_size;
>  
> @@ -230,6 +231,7 @@ static void afs_apply_status(struct afs_operation *op,
>  		}
>  		change_size = true;
>  		data_changed = true;
> +		unexpected_jump = true;
>  	} else if (vnode->status.type == AFS_FTYPE_DIR) {
>  		/* Expected directory change is handled elsewhere so
>  		 * that we can locally edit the directory and save on a
> @@ -251,6 +253,8 @@ static void afs_apply_status(struct afs_operation *op,
>  		vnode->netfs.remote_i_size = status->size;
>  		if (change_size || status->size > i_size_read(inode)) {
>  			afs_set_i_size(vnode, status->size);
> +			if (unexpected_jump)
> +				vnode->netfs.zero_point = status->size;
>  			inode_set_ctime_to_ts(inode, t);
>  			inode_set_atime_to_ts(inode, t);
>  		}
> @@ -689,17 +693,17 @@ static void afs_setattr_success(struct afs_operation *op)
>  static void afs_setattr_edit_file(struct afs_operation *op)
>  {
>  	struct afs_vnode_param *vp = &op->file[0];
> -	struct inode *inode = &vp->vnode->netfs.inode;
> +	struct afs_vnode *vnode = vp->vnode;
>  
>  	if (op->setattr.attr->ia_valid & ATTR_SIZE) {
>  		loff_t size = op->setattr.attr->ia_size;
>  		loff_t i_size = op->setattr.old_i_size;
>  
> -		if (size < i_size)
> -			truncate_pagecache(inode, size);
> -		if (size != i_size)
> -			fscache_resize_cookie(afs_vnode_cache(vp->vnode),
> -					      vp->scb.status.size);
> +		if (size != i_size) {
> +			truncate_setsize(&vnode->netfs.inode, size);
> +			netfs_resize_file(&vnode->netfs, size, true);
> +			fscache_resize_cookie(afs_vnode_cache(vnode), size);
> +		}
>  	}
>  }
>  
> @@ -767,11 +771,11 @@ int afs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
>  		 */
>  		if (!(attr->ia_valid & (supported & ~ATTR_SIZE & ~ATTR_MTIME)) &&
>  		    attr->ia_size < i_size &&
> -		    attr->ia_size > vnode->status.size) {
> -			truncate_pagecache(inode, attr->ia_size);
> +		    attr->ia_size > vnode->netfs.remote_i_size) {
> +			truncate_setsize(inode, attr->ia_size);
> +			netfs_resize_file(&vnode->netfs, size, false);
>  			fscache_resize_cookie(afs_vnode_cache(vnode),
>  					      attr->ia_size);
> -			i_size_write(inode, attr->ia_size);
>  			ret = 0;
>  			goto out_unlock;
>  		}
> diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
> index dce6995fb644..d7ce424b9188 100644
> --- a/fs/netfs/buffered_write.c
> +++ b/fs/netfs/buffered_write.c
> @@ -73,7 +73,7 @@ static enum netfs_how_to_modify netfs_how_to_modify(struct netfs_inode *ctx,
>  	if (folio_test_uptodate(folio))
>  		return NETFS_FOLIO_IS_UPTODATE;
>  
> -	if (pos >= ctx->remote_i_size)
> +	if (pos >= ctx->zero_point)
>  		return NETFS_MODIFY_AND_CLEAR;
>  
>  	if (!maybe_trouble && offset == 0 && len >= flen)
> diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
> index bb0c2718f57b..aad05f2349a4 100644
> --- a/fs/netfs/direct_write.c
> +++ b/fs/netfs/direct_write.c
> @@ -134,6 +134,7 @@ ssize_t netfs_unbuffered_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  	struct file *file = iocb->ki_filp;
>  	struct inode *inode = file->f_mapping->host;
>  	struct netfs_inode *ictx = netfs_inode(inode);
> +	unsigned long long end;
>  	ssize_t ret;
>  
>  	_enter("%llx,%zx,%llx", iocb->ki_pos, iov_iter_count(from), i_size_read(inode));
> @@ -155,6 +156,9 @@ ssize_t netfs_unbuffered_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  	ret = kiocb_invalidate_pages(iocb, iov_iter_count(from));
>  	if (ret < 0)
>  		goto out;
> +	end = iocb->ki_pos + iov_iter_count(from);
> +	if (end > ictx->zero_point)
> +		ictx->zero_point = end;
>  
>  	fscache_invalidate(netfs_i_cookie(ictx), NULL, i_size_read(inode),
>  			   FSCACHE_INVAL_DIO_WRITE);
> diff --git a/fs/netfs/io.c b/fs/netfs/io.c
> index 5d9098db815a..41a6113aa7fa 100644
> --- a/fs/netfs/io.c
> +++ b/fs/netfs/io.c
> @@ -569,6 +569,7 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq,
>  			struct iov_iter *io_iter)
>  {
>  	enum netfs_io_source source = NETFS_DOWNLOAD_FROM_SERVER;
> +	struct netfs_inode *ictx = netfs_inode(rreq->inode);
>  	size_t lsize;
>  
>  	_enter("%llx-%llx,%llx", subreq->start, subreq->start + subreq->len, rreq->i_size);
> @@ -586,6 +587,14 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq,
>  		 * to make serial calls, it can indicate a short read and then
>  		 * we will call it again.
>  		 */
> +		if (rreq->origin != NETFS_DIO_READ) {
> +			if (subreq->start >= ictx->zero_point) {
> +				source = NETFS_FILL_WITH_ZEROES;
> +				goto set;
> +			}
> +			if (subreq->len > ictx->zero_point - subreq->start)
> +				subreq->len = ictx->zero_point - subreq->start;
> +		}
>  		if (subreq->len > rreq->i_size - subreq->start)
>  			subreq->len = rreq->i_size - subreq->start;
>  		if (rreq->rsize && subreq->len > rreq->rsize)
> @@ -607,6 +616,7 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq,
>  		}
>  	}
>  
> +set:
>  	if (subreq->len > rreq->len)
>  		pr_warn("R=%08x[%u] SREQ>RREQ %zx > %zx\n",
>  			rreq->debug_id, subreq->debug_index,
> diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
> index 40421ced4cd3..31e45dfad5b0 100644
> --- a/fs/netfs/misc.c
> +++ b/fs/netfs/misc.c
> @@ -240,6 +240,11 @@ EXPORT_SYMBOL(netfs_invalidate_folio);
>  bool netfs_release_folio(struct folio *folio, gfp_t gfp)
>  {
>  	struct netfs_inode *ctx = netfs_inode(folio_inode(folio));
> +	unsigned long long end;
> +
> +	end = folio_pos(folio) + folio_size(folio);
> +	if (end > ctx->zero_point)
> +		ctx->zero_point = end;
>  
>  	if (folio_test_private(folio))
>  		return false;
> diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
> index 96a65cf9b5ec..07cd88897c33 100644
> --- a/fs/smb/client/cifsfs.c
> +++ b/fs/smb/client/cifsfs.c
> @@ -1220,7 +1220,7 @@ static int cifs_precopy_set_eof(struct inode *src_inode, struct cifsInodeInfo *s
>  	if (rc < 0)
>  		goto set_failed;
>  
> -	netfs_resize_file(&src_cifsi->netfs, src_end);
> +	netfs_resize_file(&src_cifsi->netfs, src_end, true);
>  	fscache_resize_cookie(cifs_inode_cookie(src_inode), src_end);
>  	return 0;
>  
> @@ -1351,7 +1351,7 @@ static loff_t cifs_remap_file_range(struct file *src_file, loff_t off,
>  			smb_file_src, smb_file_target, off, len, destoff);
>  		if (rc == 0 && new_size > i_size_read(target_inode)) {
>  			truncate_setsize(target_inode, new_size);
> -			netfs_resize_file(&target_cifsi->netfs, new_size);
> +			netfs_resize_file(&target_cifsi->netfs, new_size, true);
>  			fscache_resize_cookie(cifs_inode_cookie(target_inode),
>  					      new_size);
>  		}
> diff --git a/include/linux/netfs.h b/include/linux/netfs.h
> index fc77f7be220a..2005ad3b0e25 100644
> --- a/include/linux/netfs.h
> +++ b/include/linux/netfs.h
> @@ -136,6 +136,8 @@ struct netfs_inode {
>  	struct fscache_cookie	*cache;
>  #endif
>  	loff_t			remote_i_size;	/* Size of the remote file */
> +	loff_t			zero_point;	/* Size after which we assume there's no data
> +						 * on the server */
>  	unsigned long		flags;
>  #define NETFS_ICTX_ODIRECT	0		/* The file has DIO in progress */
>  #define NETFS_ICTX_UNBUFFERED	1		/* I/O should not use the pagecache */
> @@ -465,22 +467,30 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
>  {
>  	ctx->ops = ops;
>  	ctx->remote_i_size = i_size_read(&ctx->inode);
> +	ctx->zero_point = ctx->remote_i_size;
>  	ctx->flags = 0;
>  #if IS_ENABLED(CONFIG_FSCACHE)
>  	ctx->cache = NULL;
>  #endif
> +	/* ->releasepage() drives zero_point */
> +	mapping_set_release_always(ctx->inode.i_mapping);
>  }
>  
>  /**
>   * netfs_resize_file - Note that a file got resized
>   * @ctx: The netfs inode being resized
>   * @new_i_size: The new file size
> + * @changed_on_server: The change was applied to the server
>   *
>   * Inform the netfs lib that a file got resized so that it can adjust its state.
>   */
> -static inline void netfs_resize_file(struct netfs_inode *ctx, loff_t new_i_size)
> +static inline void netfs_resize_file(struct netfs_inode *ctx, loff_t new_i_size,
> +				     bool changed_on_server)
>  {
> -	ctx->remote_i_size = new_i_size;
> +	if (changed_on_server)
> +		ctx->remote_i_size = new_i_size;
> +	if (new_i_size < ctx->zero_point)
> +		ctx->zero_point = new_i_size;
>  }
>  
>  /**
> 

-- 
Jeff Layton <jlayton@kernel.org>

  reply	other threads:[~2023-12-14 14:07 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-13 15:23 [PATCH v4 00/39] netfs, afs, 9p: Delegate high-level I/O to netfslib David Howells
2023-12-13 15:23 ` [PATCH v4 01/39] netfs, fscache: Move fs/fscache/* into fs/netfs/ David Howells
2023-12-13 15:23 ` [PATCH v4 02/39] netfs, fscache: Combine fscache with netfs David Howells
2023-12-13 15:23 ` [PATCH v4 03/39] netfs, fscache: Remove ->begin_cache_operation David Howells
2023-12-13 15:23 ` [PATCH v4 04/39] netfs, fscache: Move /proc/fs/fscache to /proc/fs/netfs and put in a symlink David Howells
2023-12-13 15:23 ` [PATCH v4 05/39] netfs: Move pinning-for-writeback from fscache to netfs David Howells
2023-12-13 15:23 ` [PATCH v4 06/39] netfs: Add a procfile to list in-progress requests David Howells
2023-12-13 15:59   ` Jeff Layton
2023-12-13 15:23 ` [PATCH v4 07/39] netfs: Allow the netfs to make the io (sub)request alloc larger David Howells
2023-12-13 15:23 ` [PATCH v4 08/39] netfs: Add a ->free_subrequest() op David Howells
2023-12-13 15:23 ` [PATCH v4 09/39] afs: Don't use folio->private to record partial modification David Howells
2023-12-13 15:23 ` [PATCH v4 10/39] netfs: Provide invalidate_folio and release_folio calls David Howells
2023-12-13 16:05   ` Jeff Layton
2023-12-13 15:23 ` [PATCH v4 11/39] netfs: Implement unbuffered/DIO vs buffered I/O locking David Howells
2023-12-13 16:08   ` Jeff Layton
2023-12-13 16:30     ` Jeff Layton
2023-12-13 15:23 ` [PATCH v4 12/39] netfs: Add iov_iters to (sub)requests to describe various buffers David Howells
2023-12-13 16:37   ` Jeff Layton
2023-12-19 14:31   ` David Howells
2023-12-19 14:40   ` David Howells
2023-12-13 15:23 ` [PATCH v4 13/39] netfs: Add support for DIO buffering David Howells
2023-12-13 15:23 ` [PATCH v4 14/39] netfs: Provide tools to create a buffer in an xarray David Howells
2023-12-13 15:23 ` [PATCH v4 15/39] netfs: Add bounce buffering support David Howells
2023-12-13 15:23 ` [PATCH v4 16/39] netfs: Add func to calculate pagecount/size-limited span of an iterator David Howells
2023-12-13 15:23 ` [PATCH v4 17/39] netfs: Limit subrequest by size or number of segments David Howells
2023-12-13 15:23 ` [PATCH v4 18/39] netfs: Export netfs_put_subrequest() and some tracepoints David Howells
2023-12-13 18:01   ` Jeff Layton
2023-12-19 14:42   ` David Howells
2023-12-19 14:48   ` David Howells
2023-12-13 15:23 ` [PATCH v4 19/39] netfs: Extend the netfs_io_*request structs to handle writes David Howells
2023-12-13 15:23 ` [PATCH v4 20/39] netfs: Add a hook to allow tell the netfs to update its i_size David Howells
2023-12-13 15:23 ` [PATCH v4 21/39] netfs: Make netfs_put_request() handle a NULL pointer David Howells
2023-12-13 15:23 ` [PATCH v4 22/39] netfs: Make the refcounting of netfs_begin_read() easier to use David Howells
2023-12-13 15:23 ` [PATCH v4 23/39] netfs: Prep to use folio->private for write grouping and streaming write David Howells
2023-12-13 15:23 ` [PATCH v4 24/39] netfs: Dispatch write requests to process a writeback slice David Howells
2023-12-13 15:23 ` [PATCH v4 25/39] netfs: Provide func to copy data to pagecache for buffered write David Howells
2023-12-13 15:23 ` [PATCH v4 26/39] netfs: Make netfs_read_folio() handle streaming-write pages David Howells
2023-12-13 15:23 ` [PATCH v4 27/39] netfs: Allocate multipage folios in the writepath David Howells
2023-12-13 15:23 ` [PATCH v4 28/39] netfs: Implement support for unbuffered/DIO read David Howells
2023-12-14 12:43   ` Jeff Layton
2023-12-19 15:46   ` David Howells
2023-12-13 15:23 ` [PATCH v4 29/39] netfs: Implement unbuffered/DIO write support David Howells
2023-12-13 15:23 ` [PATCH v4 30/39] netfs: Implement buffered write API David Howells
2023-12-13 15:23 ` [PATCH v4 31/39] netfs: Allow buffered shared-writeable mmap through netfs_page_mkwrite() David Howells
2023-12-13 15:23 ` [PATCH v4 32/39] netfs: Provide netfs_file_read_iter() David Howells
2023-12-13 15:23 ` [PATCH v4 33/39] netfs, cachefiles: Pass upper bound length to allow expansion David Howells
2023-12-13 15:23 ` [PATCH v4 34/39] netfs: Provide a writepages implementation David Howells
2023-12-13 15:23 ` [PATCH v4 35/39] netfs: Provide a launder_folio implementation David Howells
2023-12-13 15:23 ` [PATCH v4 36/39] netfs: Implement a write-through caching option David Howells
2023-12-14 13:49   ` Jeff Layton
2023-12-19 16:51   ` David Howells
2023-12-19 17:19     ` Jeff Layton
2023-12-13 15:23 ` [PATCH v4 37/39] netfs: Optimise away reads above the point at which there can be no data David Howells
2023-12-14 14:07   ` Jeff Layton [this message]
2023-12-19 16:56   ` David Howells
2023-12-13 15:23 ` [PATCH v4 38/39] afs: Use the netfs write helpers David Howells
2023-12-13 15:23 ` [PATCH v4 39/39] 9p: Use netfslib read/write_iter David Howells
2023-12-13 15:39   ` Christian Schoenebeck
2023-12-14 14:11 ` [PATCH v4 00/39] netfs, afs, 9p: Delegate high-level I/O to netfslib Jeff Layton
2023-12-15 12:03 ` Christian Brauner
2023-12-15 13:29   ` Dominique Martinet
2023-12-18 11:05     ` Christian Brauner
2023-12-20 10:04   ` David Howells
2023-12-20 13:26     ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8b9413cc37a231a97059c7d028d404ab35363764.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=asmadeus@codewreck.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=christian@brauner.io \
    --cc=dhowells@redhat.com \
    --cc=ericvh@kernel.org \
    --cc=idryomov@gmail.com \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-cachefs@redhat.com \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=marc.dionne@auristor.com \
    --cc=netdev@vger.kernel.org \
    --cc=pc@manguebit.com \
    --cc=smfrench@gmail.com \
    --cc=sprasad@microsoft.com \
    --cc=tom@talpey.com \
    --cc=v9fs@lists.linux.dev \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).