From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever <cel@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna@kernel.org>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Alexander Aring <alex.aring@gmail.com>,
Amir Goldstein <amir73il@gmail.com>, Jan Kara <jack@suse.cz>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>,
Calum Mackay <calum.mackay@oracle.com>,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH v6 10/20] nfsd: add notification handlers for dir events
Date: Fri, 12 Jun 2026 14:36:06 -0400 [thread overview]
Message-ID: <d2a53fc34760fc986315df31ca6887bb3a54f47d.camel@kernel.org> (raw)
In-Reply-To: <b94c3e40-0520-4e83-9b4f-53a9325cecfe@app.fastmail.com>
On Fri, 2026-06-12 at 13:51 -0400, Chuck Lever wrote:
> On Thu, Jun 11, 2026, at 1:50 PM, Jeff Layton wrote:
> > Add the necessary parts to accept a fsnotify callback for directory
> > change event and create a CB_NOTIFY request for it. When a dir nfsd_file
> > is created set a handle_event callback to handle the notification.
> >
> > Use that to allocate a nfsd_notify_event object and then hand off a
> > reference to each delegation's CB_NOTIFY. If anything fails along the
> > way, recall any affected delegations.
> >
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
>
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index ca4dd2f969eb..59378751d596 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
>
> > @@ -904,13 +908,45 @@ static void nfs4_xdr_enc_cb_notify(struct rpc_rqst *req,
> > encode_cb_sequence4args(xdr, cb, &hdr);
> >
> > /*
> > - * FIXME: get stateid and fh from delegation. Inline the cna_changes
> > - * buffer, and zero it.
> > + * nfsd4_cb_notify_prepare() sized the payload against a single page,
> > + * but did not account for the compound, sequence, stateid, and
> > + * filehandle encoded here. If the variable-length encode overflows the
> > + * backchannel send buffer, roll back to before the operation so that a
> > + * truncated CB_NOTIFY is never placed on the wire.
> > */
> > - xdrgen_encode_CB_NOTIFY4args(xdr, &args);
> > + start = xdr_stream_pos(xdr);
> > +
> > + p = xdr_reserve_space(xdr, 4);
> > + if (!p)
> > + goto out_err;
> > + *p = cpu_to_be32(OP_CB_NOTIFY);
>
> Please use xdr_stream_encode_u32 for this purpose.
>
Ok
>
> > +
> > + args.cna_stateid.seqid = dp->dl_stid.sc_stateid.si_generation;
> > + memcpy(&args.cna_stateid.other, &dp->dl_stid.sc_stateid.si_opaque,
> > + ARRAY_SIZE(args.cna_stateid.other));
> > + args.cna_fh.len = dp->dl_stid.sc_file->fi_fhandle.fh_size;
> > + args.cna_fh.data = dp->dl_stid.sc_file->fi_fhandle.fh_raw;
> > + args.cna_changes.count = ncn->ncn_nf_cnt;
> > + args.cna_changes.element = ncn->ncn_nf;
> > + if (!xdrgen_encode_CB_NOTIFY4args(xdr, &args))
> > + goto out_err;
> >
> > hdr.nops++;
> > encode_cb_nops(&hdr);
> > + return;
> > +
> > +out_err:
> > + /*
> > + * Drop the CB_NOTIFY op and emit a valid CB_SEQUENCE-only compound so
> > + * the client still advances its slot. Flag the failure so the done
> > + * handler recalls the delegation and the missed notification is not
> > + * silently lost. The flag is written here in the transmit path and read
> > + * in the done handler; the two are serialized phases of the same
> > + * rpc_task, so no additional barrier is needed.
> > + */
> > + ncn->ncn_encode_err = true;
>
> This flag is zeroed only once, at allocation time in alloc_init_dir_deleg().
> It is never cleared in nfsd4_cb_notify_prepare().
>
> Since nfsd4_cb_notify_release() can requeue the callback (via
> nfsd4_run_cb_notify) when events arrive while a callback is in flight,
> ->prepare may encode cleanly and return true, but nfsd4_cb_notify_done()
> still observes the stale ncn_encode_err == true and calls
> nfsd_break_one_deleg() -- discarding a good notification and recalling
> the delegation unnecessarily.
>
Ok, so we need to reset this in ->prepare.
>
> > + xdr_truncate_encode(xdr, start);
> > + encode_cb_nops(&hdr);
> > }
> >
> > static int nfs4_xdr_dec_cb_notify(struct rpc_rqst *rqstp,
>
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 0a15d7f3b543..513cbc1a583f 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
>
> > @@ -3471,19 +3472,146 @@ nfsd4_cb_getattr_release(struct nfsd4_callback *cb)
> > nfs4_put_stid(&dp->dl_stid);
> > }
> >
> > +static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> > +{
> > + bool queued;
> > +
> > + if (test_and_set_bit(NFSD4_CALLBACK_RUNNING, &dp->dl_recall.cb_flags))
> > + return;
> > +
> > + /*
> > + * We're assuming the state code never drops its reference
> > + * without first removing the lease. Since we're in this lease
> > + * callback (and since the lease code is serialized by the
> > + * flc_lock) we know the server hasn't removed the lease yet, and
> > + * we know it's safe to take a reference.
> > + */
> > + refcount_inc(&dp->dl_stid.sc_count);
> > + queued = nfsd4_run_cb(&dp->dl_recall);
> > + WARN_ON_ONCE(!queued);
> > + if (!queued) {
> > + refcount_dec(&dp->dl_stid.sc_count);
> > + clear_bit(NFSD4_CALLBACK_RUNNING, &dp->dl_recall.cb_flags);
> > + }
> > +}
>
> nfsd_break_one_deleg() does an unconditional
> refcount_inc(&dp->dl_stid.sc_count), and its comment justifies this
> with "the lease code is serialized by the flc_lock." That invariant
> holds when called from nfsd_break_deleg_cb() under flc_lock, but
> nfsd4_cb_notify_prepare() runs on a workqueue WITHOUT flc_lock. Its
> out_recall: path calls nfsd_break_one_deleg(dp)
> directly. The delegation can be concurrently destroyed with sc_count
> already at zero, making this an inc-from-zero.
>
> The dispatch path nfsd4_run_cb_notify already does this correctly with
> refcount_inc_not_zero. The out_recall path needs the same guard (skip
> the recall / bail if the refcount is already zero).
>
> I notice that the last unapplied patch ("nfsd: add
> support to CB_NOTIFY for dir attribute changes") rewrites the guard
> "if (count > NOTIFY4_EVENT_QUEUE_SIZE)" into "if (count > limit)" with
> limit = NOTIFY4_EVENT_QUEUE_SIZE - 1 when NOTIFY4_CHANGE_DIR_ATTRS is
> requested. That turns the previously-dead overflow branch into a live,
> routine path to out_recall, which adds another normal-operation route
> into this unlocked recall.
>
This wart has been there a long time, and we just papered over it with
the lock.
I think we need to do a refcount_inc_not_zero() in
nfsd_break_one_deleg() and just return without queuing the callback if
it's already at 0. That means that the recall is racing with the lease
teardown, so I think the right thing to do is to not send the recall in
that case.
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2026-06-12 18:36 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-11 17:50 [PATCH v6 00/20] nfsd: add support for CB_NOTIFY callbacks in directory delegations Jeff Layton
2026-06-11 17:50 ` [PATCH v6 01/20] nfsd: check fl_lmops in nfsd_breaker_owns_lease() Jeff Layton
2026-06-11 17:50 ` [PATCH v6 02/20] nfsd: add protocol support for CB_NOTIFY Jeff Layton
2026-06-11 21:33 ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 03/20] nfs_common: add new NOTIFY4_* flags proposed in RFC8881bis Jeff Layton
2026-06-11 17:50 ` [PATCH v6 04/20] nfsd: allow nfsd to get a dir lease with an ignore mask Jeff Layton
2026-06-11 17:50 ` [PATCH v6 05/20] nfsd: update the fsnotify mark when setting or removing a dir delegation Jeff Layton
2026-06-11 17:50 ` [PATCH v6 06/20] nfsd: make nfsd4_callback_ops->prepare operation bool return Jeff Layton
2026-06-11 17:50 ` [PATCH v6 07/20] nfsd: add callback encoding and decoding linkages for CB_NOTIFY Jeff Layton
2026-06-11 17:50 ` [PATCH v6 08/20] nfsd: use RCU to protect fi_deleg_file Jeff Layton
2026-06-11 17:50 ` [PATCH v6 09/20] nfsd: add data structures for handling CB_NOTIFY Jeff Layton
2026-06-11 17:50 ` [PATCH v6 10/20] nfsd: add notification handlers for dir events Jeff Layton
2026-06-12 17:51 ` Chuck Lever
2026-06-12 18:36 ` Jeff Layton [this message]
2026-06-11 17:50 ` [PATCH v6 11/20] nfsd: apply the notify mask to the delegation when requested Jeff Layton
2026-06-12 17:57 ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 12/20] nfsd: add helper to marshal a fattr4 from completed args Jeff Layton
2026-06-11 17:50 ` [PATCH v6 13/20] nfsd: allow nfsd4_encode_fattr4_change() to work with no export Jeff Layton
2026-06-11 17:50 ` [PATCH v6 14/20] nfsd: send basic file attributes in CB_NOTIFY Jeff Layton
2026-06-11 17:50 ` [PATCH v6 15/20] nfsd: allow encoding a filehandle into fattr4 without a svc_fh Jeff Layton
2026-06-12 18:03 ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 16/20] nfsd: add a fi_connectable flag to struct nfs4_file Jeff Layton
2026-06-12 18:06 ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 17/20] nfsd: add the filehandle to returned attributes in CB_NOTIFY Jeff Layton
2026-06-12 18:08 ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 18/20] nfsd: properly track requested child attributes Jeff Layton
2026-06-12 18:10 ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 19/20] nfsd: track requested dir attributes Jeff Layton
2026-06-12 18:13 ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 20/20] nfsd: add support to CB_NOTIFY for dir attribute changes Jeff Layton
2026-06-12 18:21 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d2a53fc34760fc986315df31ca6887bb3a54f47d.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=Dai.Ngo@oracle.com \
--cc=alex.aring@gmail.com \
--cc=amir73il@gmail.com \
--cc=anna@kernel.org \
--cc=brauner@kernel.org \
--cc=calum.mackay@oracle.com \
--cc=cel@kernel.org \
--cc=corbet@lwn.net \
--cc=jack@suse.cz \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
--cc=okorniev@redhat.com \
--cc=rostedt@goodmis.org \
--cc=skhan@linuxfoundation.org \
--cc=tom@talpey.com \
--cc=trondmy@kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox