From: Jeff Layton <jlayton@kernel.org>
To: NeilBrown <neil@brown.name>
Cc: linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: try_break_deleg() and atomic_open()
Date: Tue, 12 May 2026 06:52:22 -0400
Message-ID: <a278be1fa50f3c52af869233ce34d3139c33b653.camel@kernel.org>
In-Reply-To: <177855793113.2788210.10945921479429705266@noble.neil.brown.name>

On Tue, 2026-05-12 at 13:52 +1000, NeilBrown wrote:
> On Tue, 12 May 2026, Jeff Layton wrote:
> > On Tue, 2026-05-12 at 08:21 +1000, NeilBrown wrote:
> > > Hi Jeff,
> > > quick question (I hope).
> > > Should atomic_open() call try_break_deleg() on the directory
> > > when a create is pending?
> > >
> > > This seems a bit iffy because the VFS doesn't necessarily know if a
> > > create will happen before it calls ->atomic_open, so it cannot know
> > > if it needs to break the deleg or not.
> >
> > Agreed, so I'm thinking no to doing that in generic code.
> >
> > > So maybe the individual ->atomic_open functions should do it?
> > >
> >
> > I think that's probably what has to happen:
> >
> > atomic_open() is there to handle the non-trivial open cases (mostly
> > network and clustered filesystems). Those, in general, also require
> > non-trivial delegation/lease handling. I think we sort of need to leave
> > it to the underlying fs in those cases since the kernel doesn't have
> > enough info to do it.
>
> I had a look and could only find gfs2/nolocks and NFSv4 as filesystems
> which support leases on directories and use ->atomic_open.
>
Maybe also smb/client? Ceph also supports clustered leases, but the
kernel client never got support for them.

> I wonder if gfs2/nolocks should not advertise ->atomic_open. The
> implication of nolocks is (I assume) that there is only the one active
> client, and in that case no special handling is needed for exclusive
> create.
>
Yeah, I don't see the point of GFS2's ->atomic_open in the case of a
single-client filesystem.

> NFSv4 uses delegations to provide leases. So the ->atomic_open
> handler does have work to do to cancel any lease while keeping the
> delegation.
> We would need to either allow ->atomic_open to return the deleg_inode
> somehow, or have ->atomic_open drop the parent lock so that it can
> safely wait.
>
> Or we could just ignore the issue until I manage to land my changes to
> push locking down into the filesystem, and then locking/waiting becomes
> much easier.
>
To make sure I understand the scenario: the NFS client has a directory
delegation on the parent directory and an application has taken out a
lease on that directory. We now want to issue an open(..., O_CREAT) on
a file in that directory for which we don't yet have a dentry?

FWIW, the current NFSv4 client code won't hand out leases on a
directory, as struct nfs_dir_operations doesn't set the ->setlease
operation, so this situation shouldn't arise. Ditto for CIFS.

But, let's pretend that it's possible: Ideally we'd just leave it up to
the server to recall the deleg if a create happens, but most servers
won't recall the deleg from the client making the change. So I think if
we ever did want to support this, then the NFSv4 client would need to
break the (local application's) lease on its own.
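The claim above that the client won't hand out directory leases can be
checked from userspace. A minimal sketch, assuming a Linux system with
glibc; probe_dir_lease() is a hypothetical helper name, and which errno
comes back on refusal depends on the kernel and filesystem:

```c
#define _GNU_SOURCE             /* F_SETLEASE needs this with glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to take a read lease on a directory.
 * Returns 0 if the lease was granted (it is released again before
 * returning), otherwise the errno reported by open() or fcntl().
 */
int probe_dir_lease(const char *dirpath)
{
    assert(dirpath != NULL);

    int fd = open(dirpath, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return errno;

    int err = 0;
    if (fcntl(fd, F_SETLEASE, F_RDLCK) < 0)
        err = errno;                     /* kernel or filesystem refused */
    else
        fcntl(fd, F_SETLEASE, F_UNLCK);  /* granted: drop it again */

    close(fd);
    return err;
}
```

Calling this on a directory inside an NFSv4 or CIFS mount and getting a
nonzero result would be consistent with ->setlease not being available
there. A zero result would mean the lease was granted and the scenario
above can actually occur.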
>
> >
> > > I'm looking at dentry_create() which calls atomic_open() in quite a
> > > different way from how lookup_open() calls it. I'd like to change
> > > nfsd4 so it calls something a lot more like lookup_open() and in
> > > looking at what I would need to change, delegated_inode stood out.
> > >
> >
> > Understood. I wish that were a bit less klunky, but I don't see a great
> > way to make it so.
>
> We could check for a lease and if one is present then do the lookup
> separately from ->atomic_open. If that finds a match then no create is
> needed. If it doesn't then there is justification to break the lease
> before calling ->atomic_open.
>
> This means that when there is a lease on an NFS directory, other apps
> have to do a LOOKUP for uncached names before sending a creating OPEN.
> Maybe that is an acceptable cost.
>
> Should an O_CREAT open *always* break a directory lease, even if the
> name happens to exist?
> I note that man/man2const/F_GETLEASE.2const in man-pages.git doesn't
> mention directories.
>
There is a coming update to the manpage, but it may not have trickled
out to the distros yet. My thinking at this point is that this would
have to be handled inside of the NFS (or CIFS) client.

Now though I'm wondering if the NFSv4 client-side lease implementation
is actually broken:

Suppose an application takes out a read lease on fd1 for a file and
then another application on the same client opens fd2 on the file for
write. I don't think a lease break will happen today since the activity
comes from the same client.

OTOH, maybe it does work since the v4 client does set a local lease on
the file? I think we'll need to test this to see how that works.
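One way such a test could look, as a rough sketch rather than a
definitive reproducer: take a read lease on fd1, then let a forked
child stand in for the second application and open the same file for
write. The helper and enum names are made up for illustration. It
assumes Linux with glibc, a file owned by the caller (or CAP_LEASE),
and no other writers having the file open.

```c
#define _GNU_SOURCE             /* F_SETLEASE needs this with glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum lease_probe {
    LEASE_BREAK_SEEN,   /* writer's open ran into the lease (EWOULDBLOCK) */
    LEASE_NO_BREAK,     /* writer's open succeeded immediately */
    LEASE_NOT_GRANTED,  /* F_SETLEASE itself failed */
    LEASE_PROBE_ERROR   /* open/fork/wait failed or unexpected errno */
};

/* Hypothetical helper: does a same-host open for write break a read lease? */
enum lease_probe probe_same_client_break(const char *path)
{
    assert(path != NULL);

    int fd1 = open(path, O_RDONLY);     /* read leases need a read-only fd */
    if (fd1 < 0)
        return LEASE_PROBE_ERROR;

    /* The default lease-break signal is SIGIO, whose default action
     * terminates the process, so ignore it while the lease is held. */
    void (*old)(int) = signal(SIGIO, SIG_IGN);

    enum lease_probe res = LEASE_PROBE_ERROR;
    if (fcntl(fd1, F_SETLEASE, F_RDLCK) < 0) {
        res = LEASE_NOT_GRANTED;
    } else {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child plays the second application. With O_NONBLOCK a
             * conflicting open fails with EWOULDBLOCK instead of
             * waiting for the lease holder. */
            int fd2 = open(path, O_WRONLY | O_NONBLOCK);
            if (fd2 >= 0)
                _exit(1);
            _exit(errno == EWOULDBLOCK ? 0 : 2);
        }
        int status;
        if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
            if (WEXITSTATUS(status) == 0)
                res = LEASE_BREAK_SEEN;
            else if (WEXITSTATUS(status) == 1)
                res = LEASE_NO_BREAK;
        }
        fcntl(fd1, F_SETLEASE, F_UNLCK);    /* release the lease */
    }

    close(fd1);
    if (old != SIG_ERR)
        signal(SIGIO, old);                 /* restore previous disposition */
    return res;
}
```

Run against a file on an NFSv4 mount, LEASE_BREAK_SEEN would suggest
the local lease does get broken by same-client activity, while
LEASE_NO_BREAK would point at the problem described above. This only
shows what one client observes; it says nothing about what the server
does with the delegation.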
--
Jeff Layton <jlayton@kernel.org>