From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever <cel@kernel.org>, Scott Mayhew <smayhew@redhat.com>
Cc: Chuck Lever <chuck.lever@oracle.com>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2] nfsd: fix file change detection in CB_GETATTR
Date: Mon, 06 Apr 2026 10:38:24 -0400 [thread overview]
Message-ID: <e812ca8b2caaf2cafaf94b7b5be487346fe07285.camel@kernel.org> (raw)
In-Reply-To: <fb00b2d1-e7be-4567-a077-9ec26a938a5c@app.fastmail.com>
On Mon, 2026-04-06 at 10:12 -0400, Chuck Lever wrote:
>
> On Sun, Apr 5, 2026, at 6:13 PM, Scott Mayhew wrote:
> > On Sat, 04 Apr 2026, Chuck Lever wrote:
> > > On Fri, Apr 3, 2026, at 8:54 PM, Scott Mayhew wrote:
>
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index fa657badf5f8..53d8e7e7d60b 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > >
> > > > @@ -9459,17 +9461,18 @@ static int cb_getattr_update_times(struct
> > > > dentry *dentry, struct nfs4_delegation
> > > > * caller must put the reference.
> > > > */
> > > > __be32
> > > > -nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct dentry
> > > > *dentry,
> > > > - struct nfs4_delegation **pdp)
> > > > +nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct path *path,
> > > > + struct kstat *stat, struct nfs4_delegation **pdp)
> > >
> > > Passing the kstat struct in saves some stack just as I suggested,
> > > but it is an ugly API. The nfsd4_encode_fattr4() call stack is tall,
> > > though -- did you happen to measure how deep it gets after this patch
> > > is applied?
> > >
> >
> > I tried using the stack tracer:
> >
> > # echo 1 >/proc/sys/kernel/stack_tracer_enabled
> > # echo vfs_getattr >/sys/kernel/debug/tracing/stack_trace_filter
> > # cat /sys/kernel/debug/tracing/stack_trace
> > Depth Size Location (18 entries)
> > ----- ---- --------
> > 0) 1936 48 vfs_getattr+0x9/0x50
> > 1) 1888 552 nfsd4_encode_fattr4+0x1b2/0x7a0 [nfsd]
> > 2) 1336 80 nfsd4_encode_entry4_fattr+0xf8/0x210 [nfsd]
> > 3) 1256 96 nfsd4_encode_entry4+0x10b/0x2a0 [nfsd]
> > 4) 1160 144 nfsd_buffered_readdir+0x139/0x310 [nfsd]
> > 5) 1016 80 nfsd_readdir+0x9f/0x180 [nfsd]
> > 6) 936 80 nfsd4_encode_readdir+0xdf/0x1e0 [nfsd]
> > 7) 856 80 nfsd4_encode_operation+0xcf/0x3d0 [nfsd]
> > 8) 776 80 nfsd4_proc_compound+0x1d6/0x7a0 [nfsd]
> > 9) 696 80 nfsd_dispatch+0xd9/0x240 [nfsd]
> > 10) 616 80 svc_process_common+0x4cb/0x6b0 [sunrpc]
> > 11) 536 40 svc_process+0x150/0x240 [sunrpc]
> > 12) 496 72 svc_handle_xprt+0x4b0/0x5f0 [sunrpc]
> > 13) 424 56 svc_recv+0x1b2/0x3a0 [sunrpc]
> > 14) 368 80 nfsd+0x11c/0x3d0 [nfsd]
> > 15) 288 56 kthread+0xe3/0x120
> > 16) 232 40 ret_from_fork+0x1a1/0x270
> > 17) 192 192 ret_from_fork_asm+0x1a/0x30
> >
> > But that's capturing a vfs_getattr() call from encoding a readdir reply,
> > rather than the vfs_getattr() call I added to nfsd4_deleg_getattr_conflict().
> >
> > Here's the stack depth for nfsd4_deleg_getattr_conflict():
> >
> > # echo nfsd4_deleg_getattr_conflict
> > > /sys/kernel/debug/tracing/stack_trace_filter
> > # echo 0 >/sys/kernel/debug/tracing/stack_max_size
> > # cat /sys/kernel/debug/tracing/stack_trace
> > Depth Size Location (14 entries)
> > ----- ---- --------
> > 0) 1472 48 nfsd4_deleg_getattr_conflict+0x9/0x410 [nfsd]
> > 1) 1424 552 nfsd4_encode_fattr4+0x191/0x7a0 [nfsd]
> > 2) 872 16 nfsd4_encode_getattr+0x2c/0x40 [nfsd]
> > 3) 856 80 nfsd4_encode_operation+0xcf/0x3d0 [nfsd]
> > 4) 776 80 nfsd4_proc_compound+0x1d6/0x7a0 [nfsd]
> > 5) 696 80 nfsd_dispatch+0xd9/0x240 [nfsd]
> > 6) 616 80 svc_process_common+0x4cb/0x6b0 [sunrpc]
> > 7) 536 40 svc_process+0x150/0x240 [sunrpc]
> > 8) 496 72 svc_handle_xprt+0x4b0/0x5f0 [sunrpc]
> > 9) 424 56 svc_recv+0x1b2/0x3a0 [sunrpc]
> > 10) 368 80 nfsd+0x11c/0x3d0 [nfsd]
> > 11) 288 56 kthread+0xe3/0x120
> > 12) 232 40 ret_from_fork+0x1a1/0x270
> > 13) 192 192 ret_from_fork_asm+0x1a/0x30
> >
> > Manually inspecting function graphs of vfs_getattr(), it looks like the deepest
> > function (that we can trace) is avc_lookup(), so here's a bpftrace script that
> > prints the stack depth from avc_lookup() via nfsd4_deleg_getattr_conflict()
> > (I mostly punted to Gemini for this):
> >
> > # cat peak-usage.bt
> > kprobe:nfsd4_deleg_getattr_conflict {
> > @in_deleg_conflict[tid] = 1;
> > }
> >
> > kprobe:avc_lookup /@in_deleg_conflict[tid]/ {
> > $stack_size = (uint64)16384;
> > $sp = reg("sp");
> > $stack_base = $sp & ~($stack_size - 1);
> > $total_used = $stack_base + $stack_size - $sp;
> >
> > if ($total_used > @max_depth_bytes) {
> > @max_depth_bytes = $total_used;
> > @deepest_stack = kstack;
> > }
> > }
> >
> > kretprobe:nfsd4_deleg_getattr_conflict { delete(@in_deleg_conflict[tid]); }
> >
> > And finally the result:
> >
> > # bpftrace peak-usage.bt
> > Attached 3 probes
> > ^C
> >
> > @deepest_stack:
> > avc_lookup+1
> > avc_has_perm_noaudit+60
> > avc_has_perm+89
> > selinux_inode_getattr+203
> > security_inode_getattr+70
> > vfs_getattr+35
> > nfsd4_deleg_getattr_conflict+958
> > nfsd4_encode_fattr4+401
> > nfsd4_encode_getattr+44
> > nfsd4_encode_operation+207
> > nfsd4_proc_compound+470
> > nfsd_dispatch+217
> > svc_process_common+1227
> > svc_process+336
> > svc_handle_xprt+1200
> > svc_recv+434
> > nfsd+284
> > kthread+227
> > ret_from_fork+417
> > ret_from_fork_asm+26
> >
> > @max_depth_bytes: 1792
>
> Since the new code only needs the file's size, perhaps you can get
> away with
>
> if (i_size_read(inode) != ncf->ncf_cb_fsize)
>
> rather than
>
> err = vfs_getattr(path, stat, STATX_SIZE, AT_STATX_SYNC_AS_STAT);
> if (err) {
> status = nfserrno(err);
> goto out_status;
> }
> if (stat->size != ncf->ncf_cb_fsize)
>
> Then there's no longer a need for a struct kstat at all. The client is
> holding a delegation, so I would expect the file size to be stable.
>
I hate relying on the size for this, since it's not a reliable
indicator of change, especially on something that has a fixed size (VM
image, for instance).
In fact, I wonder if this ought to be looking at the mtime too. If the
mtime that the client reported is later than the mtime the server has,
then we can probably infer that there are buffered writes out there.
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2026-04-06 14:38 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-04 0:54 [PATCH v2] nfsd: fix file change detection in CB_GETATTR Scott Mayhew
2026-04-04 15:07 ` Chuck Lever
2026-04-05 22:13 ` Scott Mayhew
2026-04-06 14:12 ` Chuck Lever
2026-04-06 14:38 ` Jeff Layton [this message]
2026-04-06 17:47 ` Scott Mayhew
2026-04-06 18:13 ` Jeff Layton
2026-04-06 18:48 ` Jeff Layton
2026-04-06 22:14 ` Scott Mayhew
2026-04-06 17:20 ` Scott Mayhew
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e812ca8b2caaf2cafaf94b7b5be487346fe07285.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=Dai.Ngo@oracle.com \
--cc=cel@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
--cc=okorniev@redhat.com \
--cc=smayhew@redhat.com \
--cc=tom@talpey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox