* [RFC] fl_owner_t and use of filp_close() in nfs4_free_lock_stateid()
From: Al Viro @ 2022-10-11 18:03 UTC
To: linux-fsdevel; +Cc: Miklos Szeredi, Trond Myklebust, Jeff Layton
In the original commit that introduced this, we have a somewhat strange
note in the commit message:
        Use filp_close instead of open coding. filp_close does a bit more than
        just release the locks and put the filp. It also calls ->flush and
        dnotify_flush, both of which should be done here anyway.
How could dnotify_flush() possibly catch anything here? In the current
form the value we pass as id is
        (fl_owner_t)lockowner(stp->st_stateowner)
and lockowner() is container_of(so, struct nfs4_lockowner, lo_owner).
dnotify_flush() looks for matches on dn->dn_owner == id; anything
not matching is left alone. And dn->dn_owner is set only by attach_dn(),
which gets the value from
        fl_owner_t id = current->files;
If we ever see a match here, we are in deep trouble - the same address
being used as struct files_struct * and struct nfs4_lockowner * at
the same time...
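For reference, the two ends of that comparison, condensed from
fs/notify/dnotify/dnotify.c (details elided):

        /* the only place that records an owner, in fcntl_dirnotify(): */
        fl_owner_t id = current->files;   /* always a files_struct */
        ...
        error = attach_dn(dn, dn_mark, id, fd, filp, mask);

        /* the match loop in dnotify_flush(): */
        while ((dn = *prev) != NULL) {
                if ((dn->dn_owner == id) && (dn->dn_filp == filp)) {
                        /* detach and free the matching dnotify entry */
                        ...
                        break;
                }
                prev = &dn->dn_next;
        }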
Another interesting question is about FUSE ->flush() - how is the
server supposed to use the value it gets from
        inarg.lock_owner = fuse_lock_owner_id(fm->fc, id);
in fuse_flush()? Note that e.g. an async write might be followed by
close() before it completes. Moreover, it's possible to start an
async write and then do unshare(CLONE_FILES); if the descriptor table
used to be shared and all other threads exit after our unshare,
it's possible to get:
        async write begins, fuse_send_write() called with current->files as owner
        flush happens, with current->files as id
        what used to be current->files gets freed and memory reused
        async write completes
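In userspace terms, that sequence has roughly this shape (a hypothetical
sketch, not a tested reproducer; assumes a FUSE mount at /mnt/fuse and
libaio, with error handling elided):

        #define _GNU_SOURCE
        #include <fcntl.h>
        #include <libaio.h>
        #include <pthread.h>
        #include <sched.h>
        #include <stdlib.h>
        #include <unistd.h>

        static void *submitter(void *arg)
        {
                int fd = *(int *)arg;
                io_context_t ctx = 0;
                struct iocb cb, *cbs[1] = { &cb };
                void *buf;

                posix_memalign(&buf, 4096, 4096);
                io_setup(1, &ctx);
                io_prep_pwrite(&cb, fd, buf, 4096, 0);
                io_submit(ctx, 1, cbs);   /* owner recorded as current->files */

                unshare(CLONE_FILES);     /* this thread gets a private copy */
                sleep(10);                /* the write may still be in flight */
                return NULL;
        }

        int main(void)
        {
                int fd = open("/mnt/fuse/file", O_WRONLY | O_DIRECT);
                pthread_t t;

                pthread_create(&t, NULL, submitter, &fd);
                /* exit only this thread: it was the last other sharer, so
                   the old files_struct is freed while the write and its
                   recorded owner are still outstanding */
                pthread_exit(NULL);
        }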
Miklos, could you give some braindump on that?
* Re: [RFC] fl_owner_t and use of filp_close() in nfs4_free_lock_stateid()
From: Miklos Szeredi @ 2022-10-11 19:02 UTC
To: Al Viro; +Cc: linux-fsdevel, Miklos Szeredi, Trond Myklebust, Jeff Layton
On Tue, 11 Oct 2022 at 20:04, Al Viro <viro@zeniv.linux.org.uk> wrote:
> Another interesting question is about FUSE ->flush() - how is the
> server supposed to use the value it gets from
>         inarg.lock_owner = fuse_lock_owner_id(fm->fc, id);
> in fuse_flush()? Note that e.g. an async write might be followed by
> close() before it completes. Moreover, it's possible to start an
> async write and then do unshare(CLONE_FILES); if the descriptor table
> used to be shared and all other threads exit after our unshare,
> it's possible to get:
>         async write begins, fuse_send_write() called with current->files as owner
>         flush happens, with current->files as id
>         what used to be current->files gets freed and memory reused
>         async write completes
>
> Miklos, could you give some braindump on that?
The lock_owner in flush is supposed to be used for remote posix lock
release [1]. I don't like posix lock semantics the least bit, and in
hindsight it would have been better to just not try to support remote
posix locks (nfs doesn't, so why would anyone care for it in fuse?)
Anyway, it's probably too late to get rid of this wart now.
The lock_owner field in read/write/setattr was added for mandatory
locking [2]. Now that support for mandatory locking has been removed
this is dead code, I guess. Will clean up in fuse as well.
Thanks,
Miklos
[1] v2.6.18: 7142125937e1 ("[PATCH] fuse: add POSIX file locking support")
[2] v2.6.24: f33321141b27 ("fuse: add support for mandatory locking")
* Re: [RFC] fl_owner_t and use of filp_close() in nfs4_free_lock_stateid()
From: Jeff Layton @ 2022-10-11 20:45 UTC
To: Miklos Szeredi, Al Viro; +Cc: linux-fsdevel, Miklos Szeredi, Trond Myklebust
On Tue, 2022-10-11 at 21:02 +0200, Miklos Szeredi wrote:
> On Tue, 11 Oct 2022 at 20:04, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> > Another interesting question is about FUSE ->flush() - how is the
> > server supposed to use the value it gets from
> >         inarg.lock_owner = fuse_lock_owner_id(fm->fc, id);
> > in fuse_flush()? Note that e.g. an async write might be followed by
> > close() before it completes. Moreover, it's possible to start an
> > async write and then do unshare(CLONE_FILES); if the descriptor table
> > used to be shared and all other threads exit after our unshare,
> > it's possible to get:
> >         async write begins, fuse_send_write() called with current->files as owner
> >         flush happens, with current->files as id
> >         what used to be current->files gets freed and memory reused
> >         async write completes
> >
> > Miklos, could you give some braindump on that?
>
> The lock_owner in flush is supposed to be used for remote posix lock
> release [1]. I don't like posix lock semantics the least bit, and in
> hindsight it would have been better to just not try to support remote
> posix locks (nfs doesn't, so why would anyone care for it in fuse?)
> Anyway, it's probably too late to get rid of this wart now.
>
The NFS client maintains lock records in the local VFS. When a file is
closed, the VFS issues a whole file unlock. You're probably getting
bitten by this in locks_remove_posix:
        ctx = smp_load_acquire(&inode->i_flctx);
        if (!ctx || list_empty(&ctx->flc_posix))
                return;
Because FUSE doesn't set any locks in the local kernel, that final
unlock never occurs.
There are a couple of options here: You could have FUSE start setting
local lock records, or you could look at pushing the above check down
into the individual ->lock ops.
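One hypothetical shape for the latter (untested; shown as a tweak to
locks_remove_posix() itself rather than to each ->lock instance) would
be to skip the short-circuit whenever the filesystem defines its own
->lock op, so the final unlock still reaches it:

        void locks_remove_posix(struct file *filp, fl_owner_t owner)
        {
                struct inode *inode = file_inode(filp);
                struct file_lock lock;
                struct file_lock_context *ctx;

                /* hypothetical: only short-circuit for purely local locking */
                ctx = smp_load_acquire(&inode->i_flctx);
                if ((!ctx || list_empty(&ctx->flc_posix)) && !filp->f_op->lock)
                        return;

                locks_init_lock(&lock);
                lock.fl_type = F_UNLCK;
                lock.fl_flags = FL_POSIX | FL_CLOSE;
                lock.fl_start = 0;
                lock.fl_end = OFFSET_MAX;
                lock.fl_owner = owner;
                lock.fl_pid = current->tgid;
                lock.fl_file = filp;

                vfs_lock_file(filp, F_SETLK, &lock, NULL);
        }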
> The lock_owner field in read/write/setattr was added for mandatory
> locking [2]. Now that support for mandatory locking has been removed
> this is dead code, I guess. Will clean up in fuse as well.
>
That sounds like a good plan.
>
> [1] v2.6.18: 7142125937e1 ("[PATCH] fuse: add POSIX file locking support")
> [2] v2.6.24: f33321141b27 ("fuse: add support for mandatory locking")
>
* Re: [RFC] fl_owner_t and use of filp_close() in nfs4_free_lock_stateid()
From: Miklos Szeredi @ 2022-10-12 7:08 UTC
To: Jeff Layton; +Cc: Al Viro, linux-fsdevel, Miklos Szeredi, Trond Myklebust
On Tue, 11 Oct 2022 at 22:45, Jeff Layton <jlayton@redhat.com> wrote:
> The NFS client maintains lock records in the local VFS.
This is what fuse does, unless the server explicitly requests remote posix
locking with the FUSE_POSIX_LOCKS init flag. Not sure how many real
fuse filesystems out there actually implement this, probably not a lot.
> When a file is
> closed, the VFS issues a whole file unlock. You're probably getting
> bitten by this in locks_remove_posix:
>
>         ctx = smp_load_acquire(&inode->i_flctx);
>         if (!ctx || list_empty(&ctx->flc_posix))
>                 return;
>
> Because FUSE doesn't set any locks in the local kernel, that final
> unlock never occurs.
Fuse doesn't care, because if remote posix locks are enabled, then
unlocking should be done in the FUSE_FLUSH request.
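For reference, the owner id the server gets to act on is carried in the
flush request body (struct fuse_flush_in in include/uapi/linux/fuse.h):

        struct fuse_flush_in {
                uint64_t        fh;
                uint32_t        unused;
                uint32_t        padding;
                uint64_t        lock_owner;
        };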
> There are a couple of options here: You could have FUSE start setting
> local lock records, or you could look at pushing the above check down
> into the individual ->lock ops.
I don't remember the details, but setting local locks wasn't a good option.
Thanks,
Miklos