Linux NFS development
From: Jeff Layton <jlayton@kernel.org>
To: NeilBrown <neil@brown.name>
Cc: Chuck Lever <chuck.lever@oracle.com>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Scott Mayhew <smayhew@redhat.com>,
	Trond Myklebust <Trond.Myklebust@netapp.com>,
	Andreas Gruenbacher <agruen@suse.de>,
	Mike Snitzer <snitzer@kernel.org>,
	Rick Macklem <rmacklem@uoguelph.ca>, Chris Mason <clm@meta.com>,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/10] nfsd: fix BUG_ON in nfsd4_alloc_layout_stateid on racing delegation revoke
Date: Fri, 29 May 2026 10:44:06 -0400
Message-ID: <c06960228dc36d25b5a7d0e3640c352bc0fd99d5.camel@kernel.org>
In-Reply-To: <178001164345.3379282.16920221845834080481@noble.neil.brown.name>

On Fri, 2026-05-29 at 09:40 +1000, NeilBrown wrote:
> On Fri, 29 May 2026, Jeff Layton wrote:
> > nfsd4_alloc_layout_stateid reads fp->fi_deleg_file without holding
> > fi_lock when the parent stateid is a delegation. A concurrent delegation
> > revoke via the laundromat can clear fi_deleg_file under fi_lock, causing
> > nfsd_file_get() to return NULL and triggering the BUG_ON.
> > 
> > This race is client-reachable: two NFS clients can trigger it by having
> > one hold a delegation while another opens the same file to force a
> > recall. When the first client doesn't respond to the recall, the
> > laundromat revokes it. A concurrent LAYOUTGET from any client using the
> > delegation stateid hits the race window.
> > 
> > Fix this by taking fi_lock around the fi_deleg_file read in the
> > SC_TYPE_DELEG path, and replacing the BUG_ON with a graceful error
> > return that cleans up the partially-initialized layout stateid.
> 
> Replacing the BUG_ON with a graceful error is certainly sensible and
> probably all that is needed to fix the problem.
> 
> I cannot see how the spinlock achieves anything.  If ->fi_deleg_file
> could become NULL at this point, it can become NULL just before we take
> the spinlock.
> 
> We do need to be sure the file (if there is one) doesn't get freed while
> nfsd_file_get() is incrementing the refcount, but rcu_read_lock() is the
> normal tool for that.
> In this case we have
> 
>   		ls->ls_file = nfsd_file_get(rcu_dereference_protected(fp->fi_deleg_file, 1));
> 
> rcu_dereference_protected(...., 1)
> which for me is a warning sign.  What does the '1' mean here?
> Presumably something that we cannot easily assert with a C condition, in
> which case a comment is called for.
> 
> Based on the recent commit which added this I'm guessing 
>    fi_delegees > 0 guarantees stability
> 
> If that is right, then why do we also need a spinlock to guarantee
> stability?
> 
> Confused.
> 
> NeilBrown
> 

Good point. I think the right thing to do here is to use RCU to access
this pointer; I don't think it's protected by anything at the moment.
I'll do some more analysis and put together a v2.
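
To make the shape of the discussion concrete, here is a rough userspace
model of the "load the pointer, try to take a reference, fail gracefully"
pattern. It is only a sketch and not the v2 patch: struct file_ref,
struct file_state, file_ref_tryget(), layout_pick_file() and
revoke_delegation() are all hypothetical stand-ins, C11 atomics take the
place of the kernel primitives, and the object is assumed to outlive every
caller, so the lifetime guarantee that rcu_read_lock() would provide in the
kernel is not modeled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted file object. */
struct file_ref {
	atomic_int refcount;
};

/* Hypothetical stand-in for per-file state whose delegation file
 * pointer may be cleared by a concurrent revoke. */
struct file_state {
	_Atomic(struct file_ref *) deleg_file;
};

/* Take a reference only if the count is still non-zero.
 * Returns the object on success, NULL if it is absent or dying. */
static struct file_ref *file_ref_tryget(struct file_ref *f)
{
	int old;

	if (!f)
		return NULL;
	old = atomic_load(&f->refcount);
	while (old > 0) {
		/* On failure, 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return f;
	}
	return NULL;
}

/* Models the revoke side: the pointer simply becomes NULL. */
static void revoke_delegation(struct file_state *fs)
{
	atomic_store(&fs->deleg_file, NULL);
}

/* Models the allocation path: load the pointer once and report an
 * error (-1) instead of asserting when no reference can be taken. */
static int layout_pick_file(struct file_state *fs, struct file_ref **out)
{
	struct file_ref *f = file_ref_tryget(atomic_load(&fs->deleg_file));

	if (!f)
		return -1;
	*out = f;
	return 0;
}
```

In this model a caller that loses the race sees -1 from
layout_pick_file() and can unwind, which is the part of the original patch
that Neil agreed with. Whether the kernel side needs anything beyond RCU
for the pointer read is exactly what still needs analysis.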

Thanks for the review!
-- 
Jeff Layton <jlayton@kernel.org>
