Linux Documentation
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever <cel@kernel.org>, NeilBrown <neil@brown.name>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	Trond Myklebust <trondmy@kernel.org>,
	Anna Schumaker <anna@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	 Shuah Khan <skhan@linuxfoundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Alexander Aring <alex.aring@gmail.com>,
	 Amir Goldstein <amir73il@gmail.com>, Jan Kara <jack@suse.cz>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	 Christian Brauner	 <brauner@kernel.org>,
	Calum Mackay <calum.mackay@oracle.com>,
	 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	 linux-nfs@vger.kernel.org
Subject: Re: [PATCH v6 10/20] nfsd: add notification handlers for dir events
Date: Fri, 12 Jun 2026 14:36:06 -0400	[thread overview]
Message-ID: <d2a53fc34760fc986315df31ca6887bb3a54f47d.camel@kernel.org> (raw)
In-Reply-To: <b94c3e40-0520-4e83-9b4f-53a9325cecfe@app.fastmail.com>

On Fri, 2026-06-12 at 13:51 -0400, Chuck Lever wrote:
> On Thu, Jun 11, 2026, at 1:50 PM, Jeff Layton wrote:
> > Add the necessary parts to accept a fsnotify callback for directory
> > change event and create a CB_NOTIFY request for it. When a dir nfsd_file
> > is created set a handle_event callback to handle the notification.
> > 
> > Use that to allocate a nfsd_notify_event object and then hand off a
> > reference to each delegation's CB_NOTIFY. If anything fails along the
> > way, recall any affected delegations.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index ca4dd2f969eb..59378751d596 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> 
> > @@ -904,13 +908,45 @@ static void nfs4_xdr_enc_cb_notify(struct rpc_rqst *req,
> >  	encode_cb_sequence4args(xdr, cb, &hdr);
> > 
> >  	/*
> > -	 * FIXME: get stateid and fh from delegation. Inline the cna_changes
> > -	 * buffer, and zero it.
> > +	 * nfsd4_cb_notify_prepare() sized the payload against a single page,
> > +	 * but did not account for the compound, sequence, stateid, and
> > +	 * filehandle encoded here. If the variable-length encode overflows the
> > +	 * backchannel send buffer, roll back to before the operation so that a
> > +	 * truncated CB_NOTIFY is never placed on the wire.
> >  	 */
> > -	xdrgen_encode_CB_NOTIFY4args(xdr, &args);
> > +	start = xdr_stream_pos(xdr);
> > +
> > +	p = xdr_reserve_space(xdr, 4);
> > +	if (!p)
> > +		goto out_err;
> > +	*p = cpu_to_be32(OP_CB_NOTIFY);
> 
> Please use xdr_stream_encode_u32 for this purpose.
> 

Ok

> 
> > +
> > +	args.cna_stateid.seqid = dp->dl_stid.sc_stateid.si_generation;
> > +	memcpy(&args.cna_stateid.other, &dp->dl_stid.sc_stateid.si_opaque,
> > +	       ARRAY_SIZE(args.cna_stateid.other));
> > +	args.cna_fh.len = dp->dl_stid.sc_file->fi_fhandle.fh_size;
> > +	args.cna_fh.data = dp->dl_stid.sc_file->fi_fhandle.fh_raw;
> > +	args.cna_changes.count = ncn->ncn_nf_cnt;
> > +	args.cna_changes.element = ncn->ncn_nf;
> > +	if (!xdrgen_encode_CB_NOTIFY4args(xdr, &args))
> > +		goto out_err;
> > 
> >  	hdr.nops++;
> >  	encode_cb_nops(&hdr);
> > +	return;
> > +
> > +out_err:
> > +	/*
> > +	 * Drop the CB_NOTIFY op and emit a valid CB_SEQUENCE-only compound so
> > +	 * the client still advances its slot. Flag the failure so the done
> > +	 * handler recalls the delegation and the missed notification is not
> > +	 * silently lost. The flag is written here in the transmit path and read
> > +	 * in the done handler; the two are serialized phases of the same
> > +	 * rpc_task, so no additional barrier is needed.
> > +	 */
> > +	ncn->ncn_encode_err = true;
> 
> This flag is zeroed only once, at allocation time in alloc_init_dir_deleg().
> It is never cleared in nfsd4_cb_notify_prepare().
> 
> Since nfsd4_cb_notify_release() can requeue the callback (via
> nfsd4_run_cb_notify) when events arrive while a callback is in flight,
> ->prepare may encode cleanly and return true, but nfsd4_cb_notify_done()
> still observes the stale ncn_encode_err == true and calls
> nfsd_break_one_deleg() -- discarding a good notification and recalling
> the delegation unnecessarily.
> 

Ok, so we need to reset this in ->prepare.

> 
> > +	xdr_truncate_encode(xdr, start);
> > +	encode_cb_nops(&hdr);
> >  }
> > 
> >  static int nfs4_xdr_dec_cb_notify(struct rpc_rqst *rqstp,
> 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 0a15d7f3b543..513cbc1a583f 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> 
> > @@ -3471,19 +3472,146 @@ nfsd4_cb_getattr_release(struct nfsd4_callback *cb)
> >  	nfs4_put_stid(&dp->dl_stid);
> >  }
> > 
> > +static void nfsd_break_one_deleg(struct nfs4_delegation *dp)
> > +{
> > +	bool queued;
> > +
> > +	if (test_and_set_bit(NFSD4_CALLBACK_RUNNING, &dp->dl_recall.cb_flags))
> > +		return;
> > +
> > +	/*
> > +	 * We're assuming the state code never drops its reference
> > +	 * without first removing the lease.  Since we're in this lease
> > +	 * callback (and since the lease code is serialized by the
> > +	 * flc_lock) we know the server hasn't removed the lease yet, and
> > +	 * we know it's safe to take a reference.
> > +	 */
> > +	refcount_inc(&dp->dl_stid.sc_count);
> > +	queued = nfsd4_run_cb(&dp->dl_recall);
> > +	WARN_ON_ONCE(!queued);
> > +	if (!queued) {
> > +		refcount_dec(&dp->dl_stid.sc_count);
> > +		clear_bit(NFSD4_CALLBACK_RUNNING, &dp->dl_recall.cb_flags);
> > +	}
> > +}
> 
> nfsd_break_one_deleg() does an unconditional
> refcount_inc(&dp->dl_stid.sc_count), and its comment justifies this
> with "the lease code is serialized by the flc_lock." That invariant
> holds when called from nfsd_break_deleg_cb() under flc_lock, but
> nfsd4_cb_notify_prepare() runs on a workqueue WITHOUT flc_lock. Its
> out_recall: path calls nfsd_break_one_deleg(dp)
> directly. The delegation can be concurrently destroyed with sc_count
> already at zero, making this an inc-from-zero.
> 
> The dispatch path nfsd4_run_cb_notify already does this correctly with
> refcount_inc_not_zero. The out_recall path needs the same guard (skip
> the recall / bail if the refcount is already zero).
> 
> I notice that the last unapplied patch ("nfsd: add
> support to CB_NOTIFY for dir attribute changes") rewrites the guard
> "if (count > NOTIFY4_EVENT_QUEUE_SIZE)" into "if (count > limit)" with
> limit = NOTIFY4_EVENT_QUEUE_SIZE - 1 when NOTIFY4_CHANGE_DIR_ATTRS is
> requested. That turns the previously-dead overflow branch into a live,
> routine path to out_recall, which adds another normal-operation route
> into this unlocked recall.
> 

This wart has been there a long time, and we just papered over it with
the lock.

I think we need to do a refcount_inc_not_zero() in
nfsd_break_one_deleg() and just return without queuing the callback if
it's already at 0. That means that the recall is racing with the lease
teardown, so I think the right thing to do is to not send the recall in
that case.
-- 
Jeff Layton <jlayton@kernel.org>

  reply	other threads:[~2026-06-12 18:36 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-11 17:50 [PATCH v6 00/20] nfsd: add support for CB_NOTIFY callbacks in directory delegations Jeff Layton
2026-06-11 17:50 ` [PATCH v6 01/20] nfsd: check fl_lmops in nfsd_breaker_owns_lease() Jeff Layton
2026-06-11 17:50 ` [PATCH v6 02/20] nfsd: add protocol support for CB_NOTIFY Jeff Layton
2026-06-11 21:33   ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 03/20] nfs_common: add new NOTIFY4_* flags proposed in RFC8881bis Jeff Layton
2026-06-11 17:50 ` [PATCH v6 04/20] nfsd: allow nfsd to get a dir lease with an ignore mask Jeff Layton
2026-06-11 17:50 ` [PATCH v6 05/20] nfsd: update the fsnotify mark when setting or removing a dir delegation Jeff Layton
2026-06-11 17:50 ` [PATCH v6 06/20] nfsd: make nfsd4_callback_ops->prepare operation bool return Jeff Layton
2026-06-11 17:50 ` [PATCH v6 07/20] nfsd: add callback encoding and decoding linkages for CB_NOTIFY Jeff Layton
2026-06-11 17:50 ` [PATCH v6 08/20] nfsd: use RCU to protect fi_deleg_file Jeff Layton
2026-06-11 17:50 ` [PATCH v6 09/20] nfsd: add data structures for handling CB_NOTIFY Jeff Layton
2026-06-11 17:50 ` [PATCH v6 10/20] nfsd: add notification handlers for dir events Jeff Layton
2026-06-12 17:51   ` Chuck Lever
2026-06-12 18:36     ` Jeff Layton [this message]
2026-06-11 17:50 ` [PATCH v6 11/20] nfsd: apply the notify mask to the delegation when requested Jeff Layton
2026-06-12 17:57   ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 12/20] nfsd: add helper to marshal a fattr4 from completed args Jeff Layton
2026-06-11 17:50 ` [PATCH v6 13/20] nfsd: allow nfsd4_encode_fattr4_change() to work with no export Jeff Layton
2026-06-11 17:50 ` [PATCH v6 14/20] nfsd: send basic file attributes in CB_NOTIFY Jeff Layton
2026-06-11 17:50 ` [PATCH v6 15/20] nfsd: allow encoding a filehandle into fattr4 without a svc_fh Jeff Layton
2026-06-12 18:03   ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 16/20] nfsd: add a fi_connectable flag to struct nfs4_file Jeff Layton
2026-06-12 18:06   ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 17/20] nfsd: add the filehandle to returned attributes in CB_NOTIFY Jeff Layton
2026-06-12 18:08   ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 18/20] nfsd: properly track requested child attributes Jeff Layton
2026-06-12 18:10   ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 19/20] nfsd: track requested dir attributes Jeff Layton
2026-06-12 18:13   ` Chuck Lever
2026-06-11 17:50 ` [PATCH v6 20/20] nfsd: add support to CB_NOTIFY for dir attribute changes Jeff Layton
2026-06-12 18:21   ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d2a53fc34760fc986315df31ca6887bb3a54f47d.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=Dai.Ngo@oracle.com \
    --cc=alex.aring@gmail.com \
    --cc=amir73il@gmail.com \
    --cc=anna@kernel.org \
    --cc=brauner@kernel.org \
    --cc=calum.mackay@oracle.com \
    --cc=cel@kernel.org \
    --cc=corbet@lwn.net \
    --cc=jack@suse.cz \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neil@brown.name \
    --cc=okorniev@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=skhan@linuxfoundation.org \
    --cc=tom@talpey.com \
    --cc=trondmy@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox