Linux NFS development
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@kernel.org>
To: Anna Schumaker <anna@kernel.org>
Cc: Trond Myklebust <trondmy@kernel.org>,
	Tom Haynes <loghyr@hammerspace.com>, Chuck Lever <cel@kernel.org>,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH 2/4] nfs4.2: add UNCACHEABLE_FILE_DATA attribute support
Date: Thu, 25 Jun 2026 13:51:06 -0400	[thread overview]
Message-ID: <aj1qil-3Bc7ppLlY@kernel.org> (raw)
In-Reply-To: <24b3d9cd-d06a-4c35-b316-f7ec88f48316@app.fastmail.com>

On Thu, Jun 25, 2026 at 10:56:02AM -0400, Anna Schumaker wrote:
> Hi Mike,
> 
> On Wed, Jun 24, 2026, at 3:17 PM, Mike Snitzer wrote:
> > From: Tom Haynes <loghyr@hammerspace.com>
> >
> > Recognize the NFSv4.2 per-file UNCACHEABLE_FILE_DATA attribute (attr 87,
> > draft-ietf-nfsv4-uncacheable-files): decode it via GETATTR, track per-
> > exported-filesystem support, and record on the inode whether a regular
> > file's data must not be cached.  Acting on the attribute (opening such
> > files O_DIRECT) is done by a subsequent change.
> >
> > If the NFSv4 server reports a regular file's UNCACHEABLE_FILE_DATA as
> > true, it indicates the file's data must not be cached; the client records
> > this in NFS_I(inode)->uncacheable_file_data for use by the I/O paths.
> >
> > The UNCACHEABLE_FILE_DATA attribute applies only to regular files
> > (NF4REG); per the draft a server MUST reject a query of it on any other
> > object type with NFS4ERR_INVAL.  A subsequent commit gates the client
> > accordingly.
> >
> > See: https://datatracker.ietf.org/doc/draft-ietf-nfsv4-uncacheable-files/
> >
> > Signed-off-by: Tom Haynes <loghyr@hammerspace.com>
> > [snitzer: adapt Tom's original code focused on metadata for ABE]
> > Co-developed-by: Mike Snitzer <snitzer@hammerspace.com>
> > Signed-off-by: Mike Snitzer <snitzer@hammerspace.com>
> > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > Assisted-by: Claude:claude-opus-4-8
> > ---
> >  fs/nfs/inode.c          | 22 +++++++++++++++++++---
> >  fs/nfs/nfs4proc.c       | 14 ++++++++++++--
> >  fs/nfs/nfs4trace.h      |  4 +++-
> >  fs/nfs/nfs4xdr.c        | 35 ++++++++++++++++++++++++++++++++++-
> >  fs/nfs/nfstrace.h       |  3 ++-
> >  include/linux/nfs4.h    |  1 +
> >  include/linux/nfs_fs.h  |  3 +++
> >  include/linux/nfs_xdr.h |  8 +++++++-
> >  8 files changed, 81 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > index 5bcd4027d203..c1227b7c5545 100644
> > --- a/fs/nfs/inode.c
> > +++ b/fs/nfs/inode.c
> > @@ -507,6 +507,7 @@ nfs_fhget(struct super_block *sb, struct nfs_fh 
> > *fh, struct nfs_fattr *fattr)
> >  		inode->i_blocks = 0;
> >  		nfsi->write_io = 0;
> >  		nfsi->read_io = 0;
> > +		nfsi->uncacheable_file_data = 0;
> > 
> >  		nfsi->read_cache_jiffies = fattr->time_start;
> >  		nfsi->attr_gencount = fattr->gencount;
> > @@ -561,6 +562,11 @@ nfs_fhget(struct super_block *sb, struct nfs_fh 
> > *fh, struct nfs_fattr *fattr)
> >  		} else if (fattr_supported & NFS_ATTR_FATTR_SPACE_USED &&
> >  			   fattr->size != 0)
> >  			nfs_set_cache_invalid(inode, NFS_INO_INVALID_BLOCKS);
> > +		if (fattr->valid & NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA)
> > +			nfsi->uncacheable_file_data =
> > +				!!(fattr->aux_flags & NFS_AUX_UNCACHEABLE_FILE_DATA);
> > +		else if (fattr_supported & NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA)
> > +			nfs_set_cache_invalid(inode, NFS_INO_INVALID_UNCACHEABLE_FILE_DATA);
> > 
> >  		nfs_setsecurity(inode, fattr);
> > 
> > @@ -1975,7 +1981,8 @@ static int 
> > nfs_inode_finish_partial_attr_update(const struct nfs_fattr *fattr,
> >  		NFS_INO_INVALID_ATIME | NFS_INO_INVALID_CTIME |
> >  		NFS_INO_INVALID_MTIME | NFS_INO_INVALID_SIZE |
> >  		NFS_INO_INVALID_BLOCKS | NFS_INO_INVALID_OTHER |
> > -		NFS_INO_INVALID_NLINK | NFS_INO_INVALID_BTIME;
> > +		NFS_INO_INVALID_NLINK | NFS_INO_INVALID_BTIME |
> > +		NFS_INO_INVALID_UNCACHEABLE_FILE_DATA;
> >  	unsigned long cache_validity = NFS_I(inode)->cache_validity;
> >  	enum nfs4_change_attr_type ctype = 
> > NFS_SERVER(inode)->change_attr_type;
> > 
> > @@ -2297,7 +2304,8 @@ static int nfs_update_inode(struct inode *inode, 
> > struct nfs_fattr *fattr)
> >  	nfsi->cache_validity &= ~(NFS_INO_INVALID_ATTR
> >  			| NFS_INO_INVALID_ATIME
> >  			| NFS_INO_REVAL_FORCED
> > -			| NFS_INO_INVALID_BLOCKS);
> > +			| NFS_INO_INVALID_BLOCKS
> > +			| NFS_INO_INVALID_UNCACHEABLE_FILE_DATA);
> > 
> >  	/* Do atomic weak cache consistency updates */
> >  	nfs_wcc_update_inode(inode, fattr);
> > @@ -2337,7 +2345,8 @@ static int nfs_update_inode(struct inode *inode, 
> > struct nfs_fattr *fattr)
> >  					| NFS_INO_INVALID_NLINK
> >  					| NFS_INO_INVALID_MODE
> >  					| NFS_INO_INVALID_OTHER
> > -					| NFS_INO_INVALID_BTIME;
> > +					| NFS_INO_INVALID_BTIME
> > +					| NFS_INO_INVALID_UNCACHEABLE_FILE_DATA;
> >  				if (S_ISDIR(inode->i_mode))
> >  					nfs_force_lookup_revalidate(inode);
> >  				attr_changed = true;
> > @@ -2461,6 +2470,13 @@ static int nfs_update_inode(struct inode *inode, 
> > struct nfs_fattr *fattr)
> >  		nfsi->cache_validity |=
> >  			save_cache_validity & NFS_INO_INVALID_BLOCKS;
> > 
> > +	if (fattr->valid & NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA)
> > +		nfsi->uncacheable_file_data =
> > +				!!(fattr->aux_flags & NFS_AUX_UNCACHEABLE_FILE_DATA);
> > +	else if (fattr_supported & NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA)
> > +		nfsi->cache_validity |=
> > +			save_cache_validity & NFS_INO_INVALID_UNCACHEABLE_FILE_DATA;
> > +
> >  	/* Update attrtimeo value if we're out of the unstable period */
> >  	if (attr_changed) {
> >  		nfs_inc_stats(inode, NFSIOS_ATTRINVALIDATE);
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 1360409d8de9..d237abca4793 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -228,6 +228,7 @@ const u32 nfs4_fattr_bitmap[3] = {
> >  #ifdef CONFIG_NFS_V4_SECURITY_LABEL
> >  	FATTR4_WORD2_SECURITY_LABEL
> >  #endif
> > +	| FATTR4_WORD2_UNCACHEABLE_FILE_DATA
> >  };
> > 
> >  static const u32 nfs4_pnfs_open_bitmap[3] = {
> > @@ -250,6 +251,7 @@ static const u32 nfs4_pnfs_open_bitmap[3] = {
> >  #ifdef CONFIG_NFS_V4_SECURITY_LABEL
> >  	| FATTR4_WORD2_SECURITY_LABEL
> >  #endif
> > +	| FATTR4_WORD2_UNCACHEABLE_FILE_DATA
> >  };
> > 
> >  static const u32 nfs4_open_noattr_bitmap[3] = {
> > @@ -327,6 +329,9 @@ static void nfs4_bitmap_copy_adjust(__u32 *dst, 
> > const __u32 *src,
> >  	if (!(cache_validity & NFS_INO_INVALID_BTIME))
> >  		dst[1] &= ~FATTR4_WORD1_TIME_CREATE;
> > 
> > +	if (!(cache_validity & NFS_INO_INVALID_UNCACHEABLE_FILE_DATA))
> > +		dst[2] &= ~FATTR4_WORD2_UNCACHEABLE_FILE_DATA;
> > +
> >  	if (nfs_have_delegated_mtime(inode)) {
> >  		if (!(cache_validity & NFS_INO_INVALID_ATIME))
> >  			dst[1] &= ~(FATTR4_WORD1_TIME_ACCESS|FATTR4_WORD1_TIME_ACCESS_SET);
> > @@ -1238,7 +1243,7 @@ nfs4_update_changeattr_locked(struct inode *inode,
> >  				NFS_INO_INVALID_SIZE | NFS_INO_INVALID_OTHER |
> >  				NFS_INO_INVALID_BLOCKS | NFS_INO_INVALID_NLINK |
> >  				NFS_INO_INVALID_MODE | NFS_INO_INVALID_BTIME |
> > -				NFS_INO_INVALID_XATTR;
> > +				NFS_INO_INVALID_XATTR | NFS_INO_INVALID_UNCACHEABLE_FILE_DATA;
> >  		nfsi->attrtimeo = NFS_MINATTRTIMEO(inode);
> >  	}
> >  	nfsi->attrtimeo_timestamp = jiffies;
> > @@ -3839,6 +3844,7 @@ nfs4_atomic_open(struct inode *dir, struct 
> > nfs_open_context *ctx,
> > 
> >  	if (IS_ERR(state))
> >  		return ERR_CAST(state);
> > +
> 
> I think this unrelated whitespace change accidentally snuck in

Yeap, will clean up if v2 needed.
 
> >  	return state->inode;
> >  }
> > 
> > @@ -3857,7 +3863,7 @@ static void nfs4_close_context(struct 
> > nfs_open_context *ctx, int is_sync)
> > 
> >  #define FATTR4_WORD1_NFS40_MASK (2*FATTR4_WORD1_MOUNTED_ON_FILEID - 1UL)
> >  #define FATTR4_WORD2_NFS41_MASK (2*FATTR4_WORD2_SUPPATTR_EXCLCREAT - 1UL)
> > -#define FATTR4_WORD2_NFS42_MASK (2*FATTR4_WORD2_OPEN_ARGUMENTS - 1UL)
> > +#define FATTR4_WORD2_NFS42_MASK (2*FATTR4_WORD2_UNCACHEABLE_FILE_DATA - 1UL)
> > 
> >  #define FATTR4_WORD2_NFS42_TIME_DELEG_MASK \
> >  	(FATTR4_WORD2_TIME_DELEG_MODIFY|FATTR4_WORD2_TIME_DELEG_ACCESS)
> > @@ -3981,6 +3987,8 @@ static int _nfs4_server_capabilities(struct 
> > nfs_server *server, struct nfs_fh *f
> >  		memcpy(server->attr_bitmask_nl, res.attr_bitmask,
> >  				sizeof(server->attr_bitmask));
> >  		server->attr_bitmask_nl[2] &= ~FATTR4_WORD2_SECURITY_LABEL;
> > +		if (!(res.attr_bitmask[2] & FATTR4_WORD2_UNCACHEABLE_FILE_DATA))
> > +			server->fattr_valid &= ~NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA;
> > 
> >  		if (res.open_caps.oa_share_access_want[0] &
> >  		    NFS4_SHARE_WANT_OPEN_XOR_DELEGATION)
> > @@ -5809,6 +5817,8 @@ void nfs4_bitmask_set(__u32 bitmask[], const __u32 src[],
> >  		bitmask[1] |= FATTR4_WORD1_SPACE_USED;
> >  	if (cache_validity & NFS_INO_INVALID_BTIME)
> >  		bitmask[1] |= FATTR4_WORD1_TIME_CREATE;
> > +	if (cache_validity & NFS_INO_INVALID_UNCACHEABLE_FILE_DATA)
> > +		bitmask[2] |= FATTR4_WORD2_UNCACHEABLE_FILE_DATA;
> > 
> >  	if (cache_validity & NFS_INO_INVALID_SIZE)
> >  		bitmask[0] |= FATTR4_WORD0_SIZE;
> > diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
> > index 1ed677810d9d..27748a979e12 100644
> > --- a/fs/nfs/nfs4trace.h
> > +++ b/fs/nfs/nfs4trace.h
> > @@ -33,7 +33,9 @@
> >  		{ NFS_ATTR_FATTR_CHANGE, "CHANGE" }, \
> >  		{ NFS_ATTR_FATTR_OWNER_NAME, "OWNER_NAME" }, \
> >  		{ NFS_ATTR_FATTR_GROUP_NAME, "GROUP_NAME" }, \
> > -		{ NFS_ATTR_FATTR_BTIME, "BTIME" })
> > +		{ NFS_ATTR_FATTR_BTIME, "BTIME" }, \
> > +		{ NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA, "UNCACHEABLE_FILE_DATA" })
> > +
> > 
> >  DECLARE_EVENT_CLASS(nfs4_clientid_event,
> >  		TP_PROTO(
> > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> > index c23c2eee1b5c..5020ac86b977 100644
> > --- a/fs/nfs/nfs4xdr.c
> > +++ b/fs/nfs/nfs4xdr.c
> > @@ -120,7 +120,8 @@ static int decode_layoutget(struct xdr_stream *xdr, 
> > struct rpc_rqst *req,
> >  				3*nfstime4_maxsz + \
> >  				nfs4_owner_maxsz + \
> >  				nfs4_group_maxsz + nfs4_label_maxsz + \
> > -				 decode_mdsthreshold_maxsz))
> > +				 decode_mdsthreshold_maxsz) + \
> > +				 NFS4_fattr4_uncacheable_file_data_sz)
> 
> Can this simply be a '1' like many of the other values here? I haven't
> looked into why the double-parentheses are here yet, but it might be
> styilisticly better to put the new value inside them right after
> decode_mdsthreshold_maxsz.

I used the xdrgen generated variable because over time the reason for
each unnamed +1 gets lost.  So no, I purposely made a point to put a
reason to the additional space usage.

As for the parens, I think maybe just looking at the diff isn't
adequate, if you look at the change applied to the code it will make
sense (I think).

> >  #define nfs4_fattr_maxsz	(nfs4_fattr_bitmap_maxsz + \
> >  				nfs4_fattr_value_maxsz)
> >  #define decode_getattr_maxsz    (op_decode_hdr_maxsz + 
> > nfs4_fattr_maxsz)
> > @@ -4380,6 +4381,30 @@ static int decode_attr_open_arguments(struct 
> > xdr_stream *xdr, uint32_t *bitmap,
> >  	return 0;
> >  }
> > 
> > +static int decode_attr_uncacheable_file_data(struct xdr_stream *xdr, 
> > uint32_t *bitmap,
> > +				   uint32_t *res, uint64_t *flags)
> > +{
> > +	int status = 0;
> > +	__be32 *p;
> > +
> > +	if (unlikely(bitmap[2] & (FATTR4_WORD2_UNCACHEABLE_FILE_DATA - 1U)))
> > +		return -EIO;
> > +	if (likely(bitmap[2] & FATTR4_WORD2_UNCACHEABLE_FILE_DATA)) {
> > +		p = xdr_inline_decode(xdr, 4);
> > +		if (unlikely(!p))
> > +			return -EIO;
> > +		if (be32_to_cpup(p))
> > +			*res |= NFS_AUX_UNCACHEABLE_FILE_DATA;
> > +		else
> > +			*res &= ~NFS_AUX_UNCACHEABLE_FILE_DATA;
> > +		bitmap[2] &= ~FATTR4_WORD2_UNCACHEABLE_FILE_DATA;
> > +		*flags |= NFS_ATTR_FATTR_UNCACHEABLE_FILE_DATA;
> > +	}
> > +	dprintk("%s: uncacheable_file_data: =%s\n", __func__,
> > +		(*res & NFS_AUX_UNCACHEABLE_FILE_DATA) == 0 ? "false" : "true");
> > +	return status;
> > +}
> > +
> >  static int verify_attr_len(struct xdr_stream *xdr, unsigned int savep, 
> > uint32_t attrlen)
> >  {
> >  	unsigned int attrwords = XDR_QUADLEN(attrlen);
> > @@ -4725,6 +4750,8 @@ static int decode_getfattr_attrs(struct 
> > xdr_stream *xdr, uint32_t *bitmap,
> >  	uint32_t type;
> >  	int32_t err;
> > 
> > +	fattr->aux_flags = 0;
> > +
> >  	status = decode_attr_type(xdr, bitmap, &type);
> >  	if (status < 0)
> >  		goto xdr_error;
> > @@ -4843,6 +4870,12 @@ static int decode_getfattr_attrs(struct 
> > xdr_stream *xdr, uint32_t *bitmap,
> >  		goto xdr_error;
> >  	fattr->valid |= status;
> > 
> > +	status = decode_attr_uncacheable_file_data(xdr, bitmap, &fattr->aux_flags,
> > +					 &fattr->valid);
> > +	if (status < 0)
> > +		goto xdr_error;
> > +
> > +	status = 0;
> >  xdr_error:
> >  	dprintk("%s: xdr returned %d\n", __func__, -status);
> >  	return status;
> > diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
> > index 4ada21f4eebd..b15c1732c869 100644
> > --- a/fs/nfs/nfstrace.h
> > +++ b/fs/nfs/nfstrace.h
> > @@ -33,7 +33,8 @@
> >  			{ NFS_INO_INVALID_XATTR, "INVALID_XATTR" }, \
> >  			{ NFS_INO_INVALID_NLINK, "INVALID_NLINK" }, \
> >  			{ NFS_INO_INVALID_MODE, "INVALID_MODE" }, \
> > -			{ NFS_INO_INVALID_BTIME, "INVALID_BTIME" })
> > +			{ NFS_INO_INVALID_BTIME, "INVALID_BTIME" }, \
> > +			{ NFS_INO_INVALID_UNCACHEABLE_FILE_DATA, "INVALID_UNCACHEABLE_FILE_DATA" })
> > 
> >  #define nfs_show_nfsi_flags(v) \
> >  	__print_flags(v, "|", \
> > diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> > index 34aa303354bc..af402373d0e7 100644
> > --- a/include/linux/nfs4.h
> > +++ b/include/linux/nfs4.h
> > @@ -476,6 +476,7 @@ enum {
> >  #define FATTR4_WORD2_ACL_TRUEFORM_SCOPE	BIT(FATTR4_ACL_TRUEFORM_SCOPE 
> > - 64)
> >  #define FATTR4_WORD2_POSIX_DEFAULT_ACL	BIT(FATTR4_POSIX_DEFAULT_ACL - 
> > 64)
> >  #define FATTR4_WORD2_POSIX_ACCESS_ACL	BIT(FATTR4_POSIX_ACCESS_ACL - 64)
> > +#define 
> > FATTR4_WORD2_UNCACHEABLE_FILE_DATA	BIT(FATTR4_UNCACHEABLE_FILE_DATA - 
> > 64)
> > 
> >  /* MDS threshold bitmap bits */
> >  #define THRESHOLD_RD                    (1UL << 0)
> > diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> > index ec17e602c979..b9228086a1df 100644
> > --- a/include/linux/nfs_fs.h
> > +++ b/include/linux/nfs_fs.h
> > @@ -162,6 +162,8 @@ struct nfs_inode {
> > 
> >  	struct timespec64	btime;
> > 
> > +	unsigned char		uncacheable_file_data : 1;
> > +
> 
> Since this is a boolean value, could we store it in a bool?

Sure, we can use a bitfield with bool as well. I don't have a
preference, but seeing "bool" would express the boolean flag nature of
this struct member (and any other flags that might follow).

Mike

  reply	other threads:[~2026-06-25 17:51 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-24 19:17 [PATCH 0/4] nfs: NFSv4.2 client support for UNCACHEABLE_FILE_DATA Mike Snitzer
2026-06-24 19:17 ` [PATCH 1/4] nfs4.2: add nfs4_2.x to generate the UNCACHEABLE_FILE_DATA attribute Mike Snitzer
2026-06-25 14:26   ` Anna Schumaker
2026-06-25 17:31     ` Mike Snitzer
2026-06-25 18:08       ` Chuck Lever
2026-06-24 19:17 ` [PATCH 2/4] nfs4.2: add UNCACHEABLE_FILE_DATA attribute support Mike Snitzer
2026-06-25 14:56   ` Anna Schumaker
2026-06-25 17:51     ` Mike Snitzer [this message]
2026-06-24 19:17 ` [PATCH 3/4] nfs4.2: request UNCACHEABLE_FILE_DATA only for regular files Mike Snitzer
2026-06-24 19:17 ` [PATCH 4/4] nfs4.2: open UNCACHEABLE_FILE_DATA files with O_DIRECT Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aj1qil-3Bc7ppLlY@kernel.org \
    --to=snitzer@kernel.org \
    --cc=anna@kernel.org \
    --cc=cel@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=loghyr@hammerspace.com \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox