All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@kernel.org>
To: Anna Schumaker <anna@kernel.org>
Cc: Trond Myklebust <trondmy@kernel.org>,
	Tom Haynes <loghyr@hammerspace.com>, Chuck Lever <cel@kernel.org>,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH 2/4] nfs4.2: add UNCACHEABLE_FILE_DATA attribute support
Date: Thu, 25 Jun 2026 13:51:06 -0400	[thread overview]
Message-ID: <aj1qil-3Bc7ppLlY@kernel.org> (raw)
In-Reply-To: <24b3d9cd-d06a-4c35-b316-f7ec88f48316@app.fastmail.com>

On Thu, Jun 25, 2026 at 10:56:02AM -0400, Anna Schumaker wrote:
> Hi Mike,
> 
> On Wed, Jun 24, 2026, at 3:17 PM, Mike Snitzer wrote:
> > From: Tom Haynes <loghyr@hammerspace.com>
> >
> > Recognize the NFSv4.2 per-file UNCACHEABLE_FILE_DATA attribute (attr 87,
> > draft-ietf-nfsv4-uncacheable-files): decode it via GETATTR, track per-
> > exported-filesystem support, and record on the inode whether a regular
> > file's data must not be cached.  Acting on the attribute (opening such
> > files O_DIRECT) is done by a subsequent change.
> >
> > If the NFSv4 server reports a regular file's UNCACHEABLE_FILE_DATA as
> > true, it indicates the file's data must not be cached; the client records
> > this in NFS_I(inode)->uncacheable_file_data for use by the I/O paths.
> >
> > The UNCACHEABLE_FILE_DATA attribute applies only to regular files
> > (NF4REG); per the draft a server MUST reject a query of it on any other
> > object type with NFS4ERR_INVAL.  A subsequent commit gates the client
> > accordingly.
> >
> > See: https://datatracker.ietf.org/doc/draft-ietf-nfsv4-uncacheable-files/
> >
> > Signed-off-by: Tom Haynes <loghyr@hammerspace.com>
> > [snitzer: adapt Tom's original code focused on metadata for ABE]
> > Co-developed-by: Mike Snitzer <snitzer@hammerspace.com>
> > Signed-off-by: Mike Snitzer <snitzer@hammerspace.com>
> > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > Assisted-by: Claude:claude-opus-4-8
> > ---
> >  fs/nfs/inode.c          | 22 +++++++++++++++++++---
> >  fs/nfs/nfs4proc.c       | 14 ++++++++++++--
> >  fs/nfs/nfs4trace.h      |  4 +++-
> >  fs/nfs/nfs4xdr.c        | 35 ++++++++++++++++++++++++++++++++++-
> >  fs/nfs/nfstrace.h       |  3 ++-
> >  include/linux/nfs4.h    |  1 +
> >  include/linux/nfs_fs.h  |  3 +++
> >  include/linux/nfs_xdr.h |  8 +++++++-
> >  8 files changed, 81 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > index 5bcd4027d203..c1227b7c5545 100644
> > --- a/fs/nfs/inode.c
> > +++ b/fs/nfs/inode.c
> > @@ -507,6 +507,7 @@ nfs_fhget(struct super_block *sb, struct nfs_fh 
> > *fh, struct nfs_fattr *fattr)
> >  		inode->i_blocks = 0;
> >  		nfsi->write_io = 0;
> >  		nfsi->read_io = 0;
> > +		nfsi->uncacheable_file_data = 0;
> > 
> >  		nfsi->read_cache_jiffies = fattr->time_start;
> >  		nfsi->attr_gencount = fattr->gencount;
> > @@ -561,6 +562,11 @@ nfs_fhget(struct super_block *sb, struct nfs_fh 
> > *fh, struct nfs_fattr *fattr)
> >  		} else if (fattr_supported & NFS_ATTR_FATTR_SPACE_USED &&
> >  			   fattr->size != 0)
> >  			nfs_set_cache_invalid(inode, NFS_INO_INVALID_BLOCKS);
> > +		if (fattr->valid & NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA)
> > +			nfsi->uncacheable_file_data =
> > +				!!(fattr->aux_flags & NFS_AUX_UNCACHEABLE_FILE_DATA);
> > +		else if (fattr_supported & NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA)
> > +			nfs_set_cache_invalid(inode, NFS_INO_INVALID_UNCACHEABLE_FILE_DATA);
> > 
> >  		nfs_setsecurity(inode, fattr);
> > 
> > @@ -1975,7 +1981,8 @@ static int 
> > nfs_inode_finish_partial_attr_update(const struct nfs_fattr *fattr,
> >  		NFS_INO_INVALID_ATIME | NFS_INO_INVALID_CTIME |
> >  		NFS_INO_INVALID_MTIME | NFS_INO_INVALID_SIZE |
> >  		NFS_INO_INVALID_BLOCKS | NFS_INO_INVALID_OTHER |
> > -		NFS_INO_INVALID_NLINK | NFS_INO_INVALID_BTIME;
> > +		NFS_INO_INVALID_NLINK | NFS_INO_INVALID_BTIME |
> > +		NFS_INO_INVALID_UNCACHEABLE_FILE_DATA;
> >  	unsigned long cache_validity = NFS_I(inode)->cache_validity;
> >  	enum nfs4_change_attr_type ctype = 
> > NFS_SERVER(inode)->change_attr_type;
> > 
> > @@ -2297,7 +2304,8 @@ static int nfs_update_inode(struct inode *inode, 
> > struct nfs_fattr *fattr)
> >  	nfsi->cache_validity &= ~(NFS_INO_INVALID_ATTR
> >  			| NFS_INO_INVALID_ATIME
> >  			| NFS_INO_REVAL_FORCED
> > -			| NFS_INO_INVALID_BLOCKS);
> > +			| NFS_INO_INVALID_BLOCKS
> > +			| NFS_INO_INVALID_UNCACHEABLE_FILE_DATA);
> > 
> >  	/* Do atomic weak cache consistency updates */
> >  	nfs_wcc_update_inode(inode, fattr);
> > @@ -2337,7 +2345,8 @@ static int nfs_update_inode(struct inode *inode, 
> > struct nfs_fattr *fattr)
> >  					| NFS_INO_INVALID_NLINK
> >  					| NFS_INO_INVALID_MODE
> >  					| NFS_INO_INVALID_OTHER
> > -					| NFS_INO_INVALID_BTIME;
> > +					| NFS_INO_INVALID_BTIME
> > +					| NFS_INO_INVALID_UNCACHEABLE_FILE_DATA;
> >  				if (S_ISDIR(inode->i_mode))
> >  					nfs_force_lookup_revalidate(inode);
> >  				attr_changed = true;
> > @@ -2461,6 +2470,13 @@ static int nfs_update_inode(struct inode *inode, 
> > struct nfs_fattr *fattr)
> >  		nfsi->cache_validity |=
> >  			save_cache_validity & NFS_INO_INVALID_BLOCKS;
> > 
> > +	if (fattr->valid & NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA)
> > +		nfsi->uncacheable_file_data =
> > +				!!(fattr->aux_flags & NFS_AUX_UNCACHEABLE_FILE_DATA);
> > +	else if (fattr_supported & NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA)
> > +		nfsi->cache_validity |=
> > +			save_cache_validity & NFS_INO_INVALID_UNCACHEABLE_FILE_DATA;
> > +
> >  	/* Update attrtimeo value if we're out of the unstable period */
> >  	if (attr_changed) {
> >  		nfs_inc_stats(inode, NFSIOS_ATTRINVALIDATE);
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 1360409d8de9..d237abca4793 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -228,6 +228,7 @@ const u32 nfs4_fattr_bitmap[3] = {
> >  #ifdef CONFIG_NFS_V4_SECURITY_LABEL
> >  	FATTR4_WORD2_SECURITY_LABEL
> >  #endif
> > +	| FATTR4_WORD2_UNCACHEABLE_FILE_DATA
> >  };
> > 
> >  static const u32 nfs4_pnfs_open_bitmap[3] = {
> > @@ -250,6 +251,7 @@ static const u32 nfs4_pnfs_open_bitmap[3] = {
> >  #ifdef CONFIG_NFS_V4_SECURITY_LABEL
> >  	| FATTR4_WORD2_SECURITY_LABEL
> >  #endif
> > +	| FATTR4_WORD2_UNCACHEABLE_FILE_DATA
> >  };
> > 
> >  static const u32 nfs4_open_noattr_bitmap[3] = {
> > @@ -327,6 +329,9 @@ static void nfs4_bitmap_copy_adjust(__u32 *dst, 
> > const __u32 *src,
> >  	if (!(cache_validity & NFS_INO_INVALID_BTIME))
> >  		dst[1] &= ~FATTR4_WORD1_TIME_CREATE;
> > 
> > +	if (!(cache_validity & NFS_INO_INVALID_UNCACHEABLE_FILE_DATA))
> > +		dst[2] &= ~FATTR4_WORD2_UNCACHEABLE_FILE_DATA;
> > +
> >  	if (nfs_have_delegated_mtime(inode)) {
> >  		if (!(cache_validity & NFS_INO_INVALID_ATIME))
> >  			dst[1] &= ~(FATTR4_WORD1_TIME_ACCESS|FATTR4_WORD1_TIME_ACCESS_SET);
> > @@ -1238,7 +1243,7 @@ nfs4_update_changeattr_locked(struct inode *inode,
> >  				NFS_INO_INVALID_SIZE | NFS_INO_INVALID_OTHER |
> >  				NFS_INO_INVALID_BLOCKS | NFS_INO_INVALID_NLINK |
> >  				NFS_INO_INVALID_MODE | NFS_INO_INVALID_BTIME |
> > -				NFS_INO_INVALID_XATTR;
> > +				NFS_INO_INVALID_XATTR | NFS_INO_INVALID_UNCACHEABLE_FILE_DATA;
> >  		nfsi->attrtimeo = NFS_MINATTRTIMEO(inode);
> >  	}
> >  	nfsi->attrtimeo_timestamp = jiffies;
> > @@ -3839,6 +3844,7 @@ nfs4_atomic_open(struct inode *dir, struct 
> > nfs_open_context *ctx,
> > 
> >  	if (IS_ERR(state))
> >  		return ERR_CAST(state);
> > +
> 
> I think this unrelated whitespace change accidentally snuck in

Yeap, will clean up if v2 needed.
 
> >  	return state->inode;
> >  }
> > 
> > @@ -3857,7 +3863,7 @@ static void nfs4_close_context(struct 
> > nfs_open_context *ctx, int is_sync)
> > 
> >  #define FATTR4_WORD1_NFS40_MASK (2*FATTR4_WORD1_MOUNTED_ON_FILEID - 1UL)
> >  #define FATTR4_WORD2_NFS41_MASK (2*FATTR4_WORD2_SUPPATTR_EXCLCREAT - 1UL)
> > -#define FATTR4_WORD2_NFS42_MASK (2*FATTR4_WORD2_OPEN_ARGUMENTS - 1UL)
> > +#define FATTR4_WORD2_NFS42_MASK (2*FATTR4_WORD2_UNCACHEABLE_FILE_DATA - 1UL)
> > 
> >  #define FATTR4_WORD2_NFS42_TIME_DELEG_MASK \
> >  	(FATTR4_WORD2_TIME_DELEG_MODIFY|FATTR4_WORD2_TIME_DELEG_ACCESS)
> > @@ -3981,6 +3987,8 @@ static int _nfs4_server_capabilities(struct 
> > nfs_server *server, struct nfs_fh *f
> >  		memcpy(server->attr_bitmask_nl, res.attr_bitmask,
> >  				sizeof(server->attr_bitmask));
> >  		server->attr_bitmask_nl[2] &= ~FATTR4_WORD2_SECURITY_LABEL;
> > +		if (!(res.attr_bitmask[2] & FATTR4_WORD2_UNCACHEABLE_FILE_DATA))
> > +			server->fattr_valid &= ~NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA;
> > 
> >  		if (res.open_caps.oa_share_access_want[0] &
> >  		    NFS4_SHARE_WANT_OPEN_XOR_DELEGATION)
> > @@ -5809,6 +5817,8 @@ void nfs4_bitmask_set(__u32 bitmask[], const __u32 src[],
> >  		bitmask[1] |= FATTR4_WORD1_SPACE_USED;
> >  	if (cache_validity & NFS_INO_INVALID_BTIME)
> >  		bitmask[1] |= FATTR4_WORD1_TIME_CREATE;
> > +	if (cache_validity & NFS_INO_INVALID_UNCACHEABLE_FILE_DATA)
> > +		bitmask[2] |= FATTR4_WORD2_UNCACHEABLE_FILE_DATA;
> > 
> >  	if (cache_validity & NFS_INO_INVALID_SIZE)
> >  		bitmask[0] |= FATTR4_WORD0_SIZE;
> > diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
> > index 1ed677810d9d..27748a979e12 100644
> > --- a/fs/nfs/nfs4trace.h
> > +++ b/fs/nfs/nfs4trace.h
> > @@ -33,7 +33,9 @@
> >  		{ NFS_ATTR_FATTR_CHANGE, "CHANGE" }, \
> >  		{ NFS_ATTR_FATTR_OWNER_NAME, "OWNER_NAME" }, \
> >  		{ NFS_ATTR_FATTR_GROUP_NAME, "GROUP_NAME" }, \
> > -		{ NFS_ATTR_FATTR_BTIME, "BTIME" })
> > +		{ NFS_ATTR_FATTR_BTIME, "BTIME" }, \
> > +		{ NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA, "UNCACHEABLE_FILE_DATA" })
> > +
> > 
> >  DECLARE_EVENT_CLASS(nfs4_clientid_event,
> >  		TP_PROTO(
> > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> > index c23c2eee1b5c..5020ac86b977 100644
> > --- a/fs/nfs/nfs4xdr.c
> > +++ b/fs/nfs/nfs4xdr.c
> > @@ -120,7 +120,8 @@ static int decode_layoutget(struct xdr_stream *xdr, 
> > struct rpc_rqst *req,
> >  				3*nfstime4_maxsz + \
> >  				nfs4_owner_maxsz + \
> >  				nfs4_group_maxsz + nfs4_label_maxsz + \
> > -				 decode_mdsthreshold_maxsz))
> > +				 decode_mdsthreshold_maxsz) + \
> > +				 NFS4_fattr4_uncacheable_file_data_sz)
> 
> Can this simply be a '1' like many of the other values here? I haven't
> looked into why the double-parentheses are here yet, but it might be
> styilisticly better to put the new value inside them right after
> decode_mdsthreshold_maxsz.

I used the xdrgen generated variable because over time the reason for
each unnamed +1 gets lost.  So no, I purposely made a point to put a
reason to the additional space usage.

As for the parens, I think maybe just looking at the diff isn't
adequate, if you look at the change applied to the code it will make
sense (I think).

> >  #define nfs4_fattr_maxsz	(nfs4_fattr_bitmap_maxsz + \
> >  				nfs4_fattr_value_maxsz)
> >  #define decode_getattr_maxsz    (op_decode_hdr_maxsz + 
> > nfs4_fattr_maxsz)
> > @@ -4380,6 +4381,30 @@ static int decode_attr_open_arguments(struct 
> > xdr_stream *xdr, uint32_t *bitmap,
> >  	return 0;
> >  }
> > 
> > +static int decode_attr_uncacheable_file_data(struct xdr_stream *xdr, 
> > uint32_t *bitmap,
> > +				   uint32_t *res, uint64_t *flags)
> > +{
> > +	int status = 0;
> > +	__be32 *p;
> > +
> > +	if (unlikely(bitmap[2] & (FATTR4_WORD2_UNCACHEABLE_FILE_DATA - 1U)))
> > +		return -EIO;
> > +	if (likely(bitmap[2] & FATTR4_WORD2_UNCACHEABLE_FILE_DATA)) {
> > +		p = xdr_inline_decode(xdr, 4);
> > +		if (unlikely(!p))
> > +			return -EIO;
> > +		if (be32_to_cpup(p))
> > +			*res |= NFS_AUX_UNCACHEABLE_FILE_DATA;
> > +		else
> > +			*res &= ~NFS_AUX_UNCACHEABLE_FILE_DATA;
> > +		bitmap[2] &= ~FATTR4_WORD2_UNCACHEABLE_FILE_DATA;
> > +		*flags |= NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA;
> > +	}
> > +	dprintk("%s: uncacheable_file_data: =%s\n", __func__,
> > +		(*res & NFS_AUX_UNCACHEABLE_FILE_DATA) == 0 ? "false" : "true");
> > +	return status;
> > +}
> > +
> >  static int verify_attr_len(struct xdr_stream *xdr, unsigned int savep, 
> > uint32_t attrlen)
> >  {
> >  	unsigned int attrwords = XDR_QUADLEN(attrlen);
> > @@ -4725,6 +4750,8 @@ static int decode_getfattr_attrs(struct 
> > xdr_stream *xdr, uint32_t *bitmap,
> >  	uint32_t type;
> >  	int32_t err;
> > 
> > +	fattr->aux_flags = 0;
> > +
> >  	status = decode_attr_type(xdr, bitmap, &type);
> >  	if (status < 0)
> >  		goto xdr_error;
> > @@ -4843,6 +4870,12 @@ static int decode_getfattr_attrs(struct 
> > xdr_stream *xdr, uint32_t *bitmap,
> >  		goto xdr_error;
> >  	fattr->valid |= status;
> > 
> > +	status = decode_attr_uncacheable_file_data(xdr, bitmap, &fattr->aux_flags,
> > +					 &fattr->valid);
> > +	if (status < 0)
> > +		goto xdr_error;
> > +
> > +	status = 0;
> >  xdr_error:
> >  	dprintk("%s: xdr returned %d\n", __func__, -status);
> >  	return status;
> > diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
> > index 4ada21f4eebd..b15c1732c869 100644
> > --- a/fs/nfs/nfstrace.h
> > +++ b/fs/nfs/nfstrace.h
> > @@ -33,7 +33,8 @@
> >  			{ NFS_INO_INVALID_XATTR, "INVALID_XATTR" }, \
> >  			{ NFS_INO_INVALID_NLINK, "INVALID_NLINK" }, \
> >  			{ NFS_INO_INVALID_MODE, "INVALID_MODE" }, \
> > -			{ NFS_INO_INVALID_BTIME, "INVALID_BTIME" })
> > +			{ NFS_INO_INVALID_BTIME, "INVALID_BTIME" }, \
> > +			{ NFS_INO_INVALID_UNCACHEABLE_FILE_DATA, "INVALID_UNCACHEABLE_FILE_DATA" })
> > 
> >  #define nfs_show_nfsi_flags(v) \
> >  	__print_flags(v, "|", \
> > diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> > index 34aa303354bc..af402373d0e7 100644
> > --- a/include/linux/nfs4.h
> > +++ b/include/linux/nfs4.h
> > @@ -476,6 +476,7 @@ enum {
> >  #define FATTR4_WORD2_ACL_TRUEFORM_SCOPE	BIT(FATTR4_ACL_TRUEFORM_SCOPE 
> > - 64)
> >  #define FATTR4_WORD2_POSIX_DEFAULT_ACL	BIT(FATTR4_POSIX_DEFAULT_ACL - 
> > 64)
> >  #define FATTR4_WORD2_POSIX_ACCESS_ACL	BIT(FATTR4_POSIX_ACCESS_ACL - 64)
> > +#define 
> > FATTR4_WORD2_UNCACHEABLE_FILE_DATA	BIT(FATTR4_UNCACHEABLE_FILE_DATA - 
> > 64)
> > 
> >  /* MDS threshold bitmap bits */
> >  #define THRESHOLD_RD                    (1UL << 0)
> > diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> > index ec17e602c979..b9228086a1df 100644
> > --- a/include/linux/nfs_fs.h
> > +++ b/include/linux/nfs_fs.h
> > @@ -162,6 +162,8 @@ struct nfs_inode {
> > 
> >  	struct timespec64	btime;
> > 
> > +	unsigned char		uncacheable_file_data : 1;
> > +
> 
> Since this is a boolean value, could we store it in a bool?

Sure, we can use a bitfield with bool as well. I don't have a
preference, but seeing "bool" would express the boolean flag nature of
this struct member (and any other flags that might follow).

Mike

  reply	other threads:[~2026-06-25 17:51 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-24 19:17 [PATCH 0/4] nfs: NFSv4.2 client support for UNCACHEABLE_FILE_DATA Mike Snitzer
2026-06-24 19:17 ` [PATCH 1/4] nfs4.2: add nfs4_2.x to generate the UNCACHEABLE_FILE_DATA attribute Mike Snitzer
2026-06-25 14:26   ` Anna Schumaker
2026-06-25 17:31     ` Mike Snitzer
2026-06-25 18:08       ` Chuck Lever
2026-06-24 19:17 ` [PATCH 2/4] nfs4.2: add UNCACHEABLE_FILE_DATA attribute support Mike Snitzer
2026-06-25 14:56   ` Anna Schumaker
2026-06-25 17:51     ` Mike Snitzer [this message]
2026-06-24 19:17 ` [PATCH 3/4] nfs4.2: request UNCACHEABLE_FILE_DATA only for regular files Mike Snitzer
2026-06-24 19:17 ` [PATCH 4/4] nfs4.2: open UNCACHEABLE_FILE_DATA files with O_DIRECT Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aj1qil-3Bc7ppLlY@kernel.org \
    --to=snitzer@kernel.org \
    --cc=anna@kernel.org \
    --cc=cel@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=loghyr@hammerspace.com \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.