* [PATCHES][RFC] rework of struct fd handling
@ 2024-06-07 1:56 Al Viro
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
2024-06-07 15:30 ` [PATCHES][RFC] rework of struct fd handling Christian Brauner
0 siblings, 2 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:56 UTC
To: linux-fsdevel; +Cc: Linus Torvalds, Christian Brauner
Experimental series trying to sanitize the handling
of struct fd. Lightly tested, in serious need of review.
It's 6.10-rc1-based, lives in
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.fd
Individual patches in followups, descriptions below.
Shortlog:
Al Viro (19):
powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
lirc: rc_dev_get_from_fd(): fix file leak
introduce fd_file(), convert all accessors to it.
struct fd: representation change
add struct fd constructors, get rid of __to_fd()
net/socket.c: switch to CLASS(fd)
introduce struct fderr, convert overlayfs uses to that
fdget_raw() users: switch to CLASS(fd_raw, ...)
css_set_fork(): switch to CLASS(fd_raw, ...)
introduce "fd_pos" class
switch simple users of fdget() to CLASS(fd, ...)
bpf: switch to CLASS(fd, ...)
convert vmsplice() to CLASS(fd, ...)
finit_module(): convert to CLASS(fd, ...)
timerfd: switch to CLASS(fd, ...)
do_mq_notify(): switch to CLASS(fd, ...)
simplify xfs_find_handle() a bit
convert kernel/events/core.c
deal with the last remaing boolean uses of fd_file()
Diffstat:
arch/alpha/kernel/osf_sys.c | 7 +-
arch/arm/kernel/sys_oabi-compat.c | 18 +-
arch/powerpc/kvm/book3s_64_vio.c | 9 +-
arch/powerpc/kvm/powerpc.c | 26 +--
arch/powerpc/platforms/cell/spu_syscalls.c | 17 +-
arch/x86/kernel/cpu/sgx/main.c | 10 +-
arch/x86/kvm/svm/sev.c | 43 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 27 +--
drivers/gpu/drm/drm_syncobj.c | 11 +-
drivers/infiniband/core/ucma.c | 21 +-
drivers/infiniband/core/uverbs_cmd.c | 12 +-
drivers/media/mc/mc-request.c | 22 +-
drivers/media/rc/lirc_dev.c | 13 +-
drivers/vfio/group.c | 10 +-
drivers/vfio/virqfd.c | 20 +-
drivers/virt/acrn/irqfd.c | 14 +-
drivers/xen/privcmd.c | 35 ++--
fs/btrfs/ioctl.c | 7 +-
fs/coda/inode.c | 13 +-
fs/eventfd.c | 9 +-
fs/eventpoll.c | 62 ++----
fs/ext4/ioctl.c | 23 +-
fs/f2fs/file.c | 17 +-
fs/fcntl.c | 74 +++----
fs/fhandle.c | 7 +-
fs/file.c | 26 +--
fs/fsopen.c | 23 +-
fs/fuse/dev.c | 10 +-
fs/ioctl.c | 47 ++---
fs/kernel_read_file.c | 12 +-
fs/locks.c | 27 +--
fs/namei.c | 19 +-
fs/namespace.c | 53 ++---
fs/notify/fanotify/fanotify_user.c | 50 ++---
fs/notify/inotify/inotify_user.c | 44 ++--
fs/ocfs2/cluster/heartbeat.c | 17 +-
fs/open.c | 67 +++---
fs/overlayfs/file.c | 187 +++++++----------
fs/quota/quota.c | 18 +-
fs/read_write.c | 227 +++++++++-----------
fs/readdir.c | 38 ++--
fs/remap_range.c | 11 +-
fs/select.c | 17 +-
fs/signalfd.c | 11 +-
fs/smb/client/ioctl.c | 17 +-
fs/splice.c | 82 +++-----
fs/stat.c | 10 +-
fs/statfs.c | 12 +-
fs/sync.c | 33 ++-
fs/timerfd.c | 42 ++--
fs/utimes.c | 11 +-
fs/xattr.c | 64 +++---
fs/xfs/xfs_exchrange.c | 12 +-
fs/xfs/xfs_handle.c | 16 +-
fs/xfs/xfs_ioctl.c | 85 +++-----
include/linux/bpf.h | 9 +-
include/linux/cleanup.h | 2 +-
include/linux/file.h | 89 +++++---
io_uring/sqpoll.c | 31 +--
ipc/mqueue.c | 126 +++++------
kernel/bpf/bpf_inode_storage.c | 29 +--
kernel/bpf/btf.c | 13 +-
kernel/bpf/map_in_map.c | 37 +---
kernel/bpf/syscall.c | 197 ++++++-----------
kernel/bpf/token.c | 19 +-
kernel/bpf/verifier.c | 20 +-
kernel/cgroup/cgroup.c | 21 +-
kernel/events/core.c | 67 +++---
kernel/module/main.c | 15 +-
kernel/nsproxy.c | 15 +-
kernel/pid.c | 26 +--
kernel/signal.c | 33 ++-
kernel/sys.c | 21 +-
kernel/taskstats.c | 20 +-
kernel/watch_queue.c | 8 +-
mm/fadvise.c | 10 +-
mm/filemap.c | 19 +-
mm/memcontrol.c | 37 ++--
mm/readahead.c | 25 +--
net/core/net_namespace.c | 14 +-
net/core/sock_map.c | 23 +-
net/socket.c | 325 +++++++++++++----------------
security/integrity/ima/ima_main.c | 9 +-
security/landlock/syscalls.c | 57 ++---
security/loadpin/loadpin.c | 10 +-
sound/core/pcm_native.c | 6 +-
virt/kvm/eventfd.c | 19 +-
virt/kvm/vfio.c | 18 +-
88 files changed, 1234 insertions(+), 1951 deletions(-)
01/19) powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
02/19) lirc: rc_dev_get_from_fd(): fix file leak
First two patches are obvious leak fixes - missing fdput()
on one of the failure exits.
03/19) introduce fd_file(), convert all accessors to it.
For any changes of struct fd representation we need to
turn existing accesses to fields into calls of wrappers.
Accesses to struct fd::flags are very few (3 in linux/file.h,
1 in net/socket.c, 3 in fs/overlayfs/file.c and 3 more in
explicit initializers).
Those can be dealt with in the commit converting to
new layout; accesses to struct fd::file are too many for that.
This commit converts (almost) all of f.file to
fd_file(f). It's not entirely mechanical ('file' is used as
a member name more than just in struct fd) and it does not
even attempt to distinguish the uses in pointer context from
those in boolean context; the latter will be eventually turned
into a separate helper (fd_empty()).
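A userspace sketch of what the conversion buys, not the kernel code: struct file here is a stand-in, mode_of() is a hypothetical call site, and -9 stands in for -EBADF. The point is only that once call sites go through fd_file(), nothing outside the accessor depends on the layout of struct fd.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct file; only here so the sketch compiles. */
struct file { int f_mode; };

/* Two-field layout, as it is before the representation change. */
struct fd {
	struct file *file;
	unsigned int flags;
};

/* Accessor: call sites use this instead of touching f.file directly. */
#define fd_file(f) ((f).file)

/* Hypothetical converted call site; -9 stands in for -EBADF. */
static int mode_of(struct fd f)
{
	if (!fd_file(f))		/* boolean use, a candidate for fd_empty() later */
		return -9;
	return fd_file(f)->f_mode;	/* pointer use */
}
```

Both kinds of use (boolean and pointer) go through the same wrapper here, which is exactly why the commit does not try to tell them apart.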
NB: this commit is where I'd expect arseloads of conflicts
through the cycle, simply because of the breadth of area being
touched. The biggest one, as well (500 lines modified).
Might be worth splitting - not sure.
04/19) struct fd: representation change
The absolute majority of instances comes from fdget() and its
relatives; the underlying primitives actually return a struct file
reference and a couple of flags encoded into an unsigned long - the lower
two bits of file address are always zero, so we can stash the flags
into those. On the way out we use __to_fd() to unpack that unsigned
long into struct fd.
Let's use that representation for struct fd itself - make it
a structure with a single unsigned long member (.word), with the value
equal either to (unsigned long)p | flags, p being an address of some
struct file instance, or to 0 for an empty fd.
Note that we never used a struct fd instance with NULL ->file
and non-zero ->flags; the emptiness had been checked as (!f.file) and
we expected e.g. fdput(empty) to be a no-op. With new representation
we can use (!f.word) for emptiness check; that is enough for compiler
to figure out that (f.word & FDPUT_FPUT) will be false and that fdput(f)
will be a no-op in such case.
For now the new predicate (fd_empty(f)) has no users; all the
existing checks have form (!fd_file(f)). We will convert to fd_empty()
use later; here we only define it (and tell the compiler that it's
unlikely to return true).
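To make the encoding concrete, here is a userspace model of it, a sketch under assumptions rather than the actual patch: struct file is a stand-in with a refcount, make_fd() and FD_FLAG_MASK are hypothetical names, the refcount decrement stands in for fput(), and unsigned long is assumed to be as wide as a pointer (as it is in the kernel).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FDPUT_FPUT	1UL
#define FD_FLAG_MASK	3UL	/* hypothetical name for "the lower two bits" */

/* Stand-in for struct file; the long member keeps it at least 4-byte aligned. */
struct file { long refcount; };

/* Single-word representation: (unsigned long)p | flags, or 0 for empty. */
struct fd { unsigned long word; };

/* Hypothetical helper: pack a non-NULL file pointer and flags into one word. */
static struct fd make_fd(struct file *p, unsigned long flags)
{
	assert(p != NULL);
	assert(((uintptr_t)p & FD_FLAG_MASK) == 0);	/* low bits must be free */
	assert((flags & ~FD_FLAG_MASK) == 0);
	return (struct fd){ (unsigned long)(uintptr_t)p | flags };
}

static struct file *fd_file(struct fd f)
{
	return (struct file *)(uintptr_t)(f.word & ~FD_FLAG_MASK);
}

static bool fd_empty(struct fd f)
{
	return f.word == 0;
}

/* Model of fdput(): only drops a reference if FDPUT_FPUT is set. */
static void fdput(struct fd f)
{
	if (f.word & FDPUT_FPUT)
		fd_file(f)->refcount--;	/* stands in for fput() */
}
```

With an empty instance the word is 0, so the FDPUT_FPUT test in fdput() is trivially false; that is the property the compiler is expected to exploit.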
This commit only deals with representation change; there will
be followups.
NOTE: overlayfs part is _not_ in the final form - it will be
massaged shortly.
05/19) add struct fd constructors, get rid of __to_fd()
Make __fdget() et al. return struct fd directly.
New helpers: BORROWED_FD(file) and CLONED_FD(file), for
borrowed and cloned file references resp.
NOTE: this might need tuning; in particular, inline on
__fget_light() is there to keep the code generation same as
before - we probably want to keep it inlined in fdget() et al.
(especially so in fdget_pos()), but that needs profiling.
Next two commits deal with the worst irregularities in struct fd use:
in net/socket.c we have fdget() without matching fdput() - fdget() is
done in sockfd_lookup_light(), then the results are passed (in modified
form) to caller, which deals with conditional fput(). And in
overlayfs we have an almost-but-not-quite struct fd shoehorned into
struct fd, with ugly calling conventions as the result of that.
I'm not sure what order would be better for these two commits.
06/19) net/socket.c: switch to CLASS(fd)
I strongly suspect that the important part in sockfd_lookup_light() is
avoiding needless file refcount operations, not the marginal reduction of
the register pressure from not keeping a struct file pointer in the caller.
If that's true, we should get the same benefits from straight
fdget()/fdput(). And AFAICS with sane use of CLASS(fd) we can get better
code generation...
Would be nice if somebody tested it on networking test suites
(including benchmarks)...
sockfd_lookup_light() does fdget(), uses sock_from_file() to
get the associated socket and returns the struct socket reference to
the caller, along with "do we need to fput()" flag. No matching fdput(),
the caller does its equivalent manually, using the fact that sock->file
points to the struct file the socket has come from.
Get rid of that - have the callers do fdget()/fdput() and
use sock_from_file() directly. That kills sockfd_lookup_light()
and fput_light() (no users left).
What's more, we can get rid of explicit fdget()/fdput() by
switching to CLASS(fd, ...) - code generation does not suffer, since
now fdput() inserted on "descriptor is not opened" failure exit
is recognized to be a no-op by compiler.
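For readers who have not met the scope-based cleanup machinery: a userspace model of the idea, assuming GCC or clang (it relies on the cleanup variable attribute). SCOPED_FD() is a simplified stand-in for CLASS(fd, ...), not the kernel macro; fd_table and use_descriptor() are hypothetical, and this toy fdget() always takes a reference, which the real one avoids when it can borrow.

```c
#include <assert.h>
#include <stddef.h>

struct file { int refcount; };
struct fd { struct file *file; };	/* layout is irrelevant for this sketch */

/* Toy descriptor table with a single slot (descriptor 0). */
static struct file *fd_table[1];

/* Toy fdget(): empty result for anything not in the table. */
static struct fd fdget(int fd)
{
	struct fd f = { NULL };

	if (fd == 0 && fd_table[0]) {
		f.file = fd_table[0];
		f.file->refcount++;
	}
	return f;
}

/* Toy fdput(): a no-op for an empty instance. */
static void fdput(struct fd f)
{
	if (f.file)
		f.file->refcount--;
}

/* Destructor, called automatically when the variable leaves scope. */
static void fd_cleanup(struct fd *f)
{
	fdput(*f);
}

/* Simplified stand-in for CLASS(fd, name)(arg). */
#define SCOPED_FD(name, arg) \
	struct fd name __attribute__((cleanup(fd_cleanup))) = fdget(arg)

/* Hypothetical user: no explicit fdput() on any exit path. */
static int use_descriptor(int fd)
{
	SCOPED_FD(f, fd);

	if (!f.file)
		return -9;		/* -EBADF; the cleanup runs and does nothing */
	return f.file->refcount;	/* value is computed before the cleanup runs */
}
```

The early return is the "descriptor is not opened" exit mentioned above: the destructor still runs there, but on an empty instance it has nothing to do.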
We could split that commit in two (getting rid of sockfd_lookup_light()
and switch to CLASS(fd, ...)), but AFAICS it ends up being harder to read
that way.
07/19) introduce struct fderr, convert overlayfs uses to that
Similar to struct fd; unlike struct fd, it can represent
error values.
Accessors:
* fd_empty(f): true if f represents an error
* fd_file(f): just as for struct fd it yields a pointer to
struct file if fd_empty(f) is false. If
fd_empty(f) is true, fd_file(f) is guaranteed
_not_ to be an address of any object (IS_ERR()
will be true in that case)
* fd_error(f): if f represents an error, returns that error,
otherwise the return value is junk.
Constructors:
* ERR_FD(-E...): an instance encoding given error [ERR_FDERR, perhaps?]
* BORROWED_FDERR(file): if file points to a struct file instance,
return a struct fderr representing that file
reference with no flags set.
if file is an ERR_PTR(-E...), return a struct
fderr representing that error.
file MUST NOT be NULL.
* CLONED_FDERR(file): similar, but in case when file points to
a struct file instance, set FDPUT_FPUT in flags.
fdput_err() serves as a destructor.
See fs/overlayfs/file.c for example of use.
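A userspace model of those semantics, as a sketch only: constructor names follow the list above but the bodies are guesses, not the patch; the accessors are given hypothetical fderr_* names since plain C has no overloading; is_err_value() imitates the kernel's error-pointer convention (the top 4095 values of the address space), and the refcount decrement stands in for fput().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO	4095UL
#define FDPUT_FPUT	1UL
#define FD_FLAG_MASK	3UL	/* hypothetical name */

struct file { long refcount; };		/* stand-in for the kernel structure */
struct fderr { unsigned long word; };

/* Userspace imitation of the kernel's error-value range check. */
static bool is_err_value(unsigned long x)
{
	return x >= (unsigned long)-MAX_ERRNO;
}

static struct fderr ERR_FD(long err)
{
	assert(is_err_value((unsigned long)err));
	return (struct fderr){ (unsigned long)err };
}

/* file is either a real pointer or an encoded error; never NULL. */
static struct fderr BORROWED_FDERR(struct file *file)
{
	assert(file != NULL);
	return (struct fderr){ (unsigned long)(uintptr_t)file };
}

static struct fderr CLONED_FDERR(struct file *file)
{
	unsigned long w = (unsigned long)(uintptr_t)file;

	assert(file != NULL);
	return (struct fderr){ is_err_value(w) ? w : (w | FDPUT_FPUT) };
}

/* Hypothetical names for fd_empty(), fd_file() and fd_error() on struct fderr. */
static bool fderr_empty(struct fderr f)
{
	return is_err_value(f.word);
}

static struct file *fderr_file(struct fderr f)
{
	if (fderr_empty(f))
		return (struct file *)(uintptr_t)f.word;	/* not a valid address */
	return (struct file *)(uintptr_t)(f.word & ~FD_FLAG_MASK);
}

static long fderr_error(struct fderr f)
{
	return (long)f.word;	/* junk unless fderr_empty(f) */
}

/* Destructor: drops a reference only for a cloned, non-error instance. */
static void fdput_err(struct fderr f)
{
	if (!fderr_empty(f) && (f.word & FDPUT_FPUT))
		fderr_file(f)->refcount--;	/* stands in for fput() */
}
```

Note that "empty" and "error" coincide in this model, matching the accessor list: there is no separate all-zero state, which is why the constructors insist on a non-NULL argument.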
08/19) fdget_raw() users: switch to CLASS(fd_raw, ...)
all convert trivially
09/19) css_set_fork(): switch to CLASS(fd_raw, ...)
reference acquired there by fget_raw() is not stashed anywhere -
we could as well borrow instead.
10/19) introduce "fd_pos" class
fdget_pos() for constructor, fdput_pos() for cleanup, users of
fd..._pos() converted.
11/19) switch simple users of fdget() to CLASS(fd, ...)
low-hanging fruit; that's another likely source of conflicts
over the cycle and it might make a lot of sense to split; fortunately,
it can be split pretty much on per-function basis - chunks are independent
from each other.
12/19) bpf: switch to CLASS(fd, ...)
Calling conventions for __bpf_map_get() would be more convenient
if it left fdput() on failure to callers. Makes for simpler logic
in the callers.
Among other things, the proof of memory safety no longer has to
rely upon file->private_data never being ERR_PTR(...) for bpffs files.
Original calling conventions made it impossible for the caller to tell
whether __bpf_map_get() has returned ERR_PTR(-EINVAL) because it has found
the file not to be a bpf map one (in which case it would've done fdput())
or because it found that ERR_PTR(-EINVAL) in file->private_data of a
bpf map file (in which case fdput() would _not_ have been done).
With that calling conventions change it's easy to switch all
bpffs users to CLASS(fd, ...).
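A userspace sketch of the proposed convention, not the bpf code: map_get() is a hypothetical stand-in for __bpf_map_get() that never drops the reference, map_id() is a hypothetical caller, err_ptr()/is_err() imitate ERR_PTR()/IS_ERR(), and -9 and -22 stand in for -EBADF and -EINVAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct file { int refcount; int is_map; void *private_data; };
struct fd { struct file *file; };

/* Userspace imitations of ERR_PTR() and IS_ERR(). */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}

static void fdput(struct fd f)
{
	if (f.file)
		f.file->refcount--;	/* stands in for fput() */
}

/* Lookup that leaves the reference alone on success and on failure. */
static void *map_get(struct fd f)
{
	if (!f.file)
		return err_ptr(-9);	/* -EBADF */
	if (!f.file->is_map)
		return err_ptr(-22);	/* -EINVAL */
	return f.file->private_data;
}

/* Hypothetical caller: one unconditional release, whatever map_get() said. */
static long map_id(struct fd f)
{
	int *map = map_get(f);
	long ret = is_err(map) ? (long)(intptr_t)map : *map;

	fdput(f);
	return ret;
}
```

Since the caller releases in exactly one place, it no longer matters where an error pointer came from, and that single release is what a scoped CLASS(fd, ...) variable can take over.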
13/19) convert vmsplice() to CLASS(fd, ...)
Irregularity here is fdput() not in the same scope as fdget();
we could just lift it out of vmsplice_type() in vmsplice(2), but there's
not much point keeping vmsplice_type() separate after that...
14/19) finit_module(): convert to CLASS(fd, ...)
Slightly unidiomatic emptiness check; just lift it out of
idempotent_init_module() and into finit_module(2) and that's it.
15/19) timerfd: switch to CLASS(fd, ...)
Fold timerfd_fget() into both callers to have fdget() and fdput()
in the same scope. Could be done in different ways, but this is probably
the smallest solution.
16/19) do_mq_notify(): switch to CLASS(fd, ...)
a minor twist is the reuse of struct fd instance in there
17/19) simplify xfs_find_handle() a bit
XFS_IOC_FD_TO_HANDLE can grab a reference to copied ->f_path and
let the file go; results in simpler control flow - cleanup is
the same for both "by descriptor" and "by pathname" cases.
NOTE: grabbing f->f_path to pin file_inode(f) is valid, since
we are dealing with XFS files here - no reassignments of file_inode().
18/19) convert kernel/events/core.c
a questionable trick in perf_event_open(2) - deliberate call of
fdget(-1), expecting it to yield empty
19/19) deal with the last remaing boolean uses of fd_file()
... replacing them with uses of fd_empty()
* [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
2024-06-07 1:56 [PATCHES][RFC] rework of struct fd handling Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 02/19] lirc: rc_dev_get_from_fd(): fix file leak Al Viro
` (18 more replies)
2024-06-07 15:30 ` [PATCHES][RFC] rework of struct fd handling Christian Brauner
1 sibling, 19 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC
To: linux-fsdevel; +Cc: brauner, torvalds
missing fdput() on one of the failure exits
Fixes: eacc56bb9de3e # v5.2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
arch/powerpc/kvm/powerpc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index d32abe7fe6ab..d11767208bfc 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1984,8 +1984,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
break;
r = -ENXIO;
- if (!xive_enabled())
+ if (!xive_enabled()) {
+ fdput(f);
break;
+ }
r = -EPERM;
dev = kvm_device_from_filp(f.file);
--
2.39.2
* [PATCH 02/19] lirc: rc_dev_get_from_fd(): fix file leak
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 15:17 ` Christian Brauner
2024-06-07 1:59 ` [PATCH 03/19] introduce fd_file(), convert all accessors to it Al Viro
` (17 subsequent siblings)
18 siblings, 1 reply; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC
To: linux-fsdevel; +Cc: brauner, torvalds
missing fdput() on a failure exit
Fixes: 6a9d552483d50 "media: rc: bpf attach/detach requires write permission" # v6.9
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
drivers/media/rc/lirc_dev.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/media/rc/lirc_dev.c b/drivers/media/rc/lirc_dev.c
index 52aea4167718..717c441b4a86 100644
--- a/drivers/media/rc/lirc_dev.c
+++ b/drivers/media/rc/lirc_dev.c
@@ -828,8 +828,10 @@ struct rc_dev *rc_dev_get_from_fd(int fd, bool write)
return ERR_PTR(-EINVAL);
}
- if (write && !(f.file->f_mode & FMODE_WRITE))
+ if (write && !(f.file->f_mode & FMODE_WRITE)) {
+ fdput(f);
return ERR_PTR(-EPERM);
+ }
fh = f.file->private_data;
dev = fh->rc;
--
2.39.2
* [PATCH 03/19] introduce fd_file(), convert all accessors to it.
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
2024-06-07 1:59 ` [PATCH 02/19] lirc: rc_dev_get_from_fd(): fix file leak Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 04/19] struct fd: representation change Al Viro
` (16 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC
To: linux-fsdevel; +Cc: brauner, torvalds
For any changes of struct fd representation we need to
turn existing accesses to fields into calls of wrappers.
Accesses to struct fd::flags are very few (3 in linux/file.h,
1 in net/socket.c, 3 in fs/overlayfs/file.c and 3 more in
explicit initializers).
Those can be dealt with in the commit converting to
new layout; accesses to struct fd::file are too many for that.
This commit converts (almost) all of f.file to
fd_file(f). It's not entirely mechanical ('file' is used as
a member name more than just in struct fd) and it does not
even attempt to distinguish the uses in pointer context from
those in boolean context; the latter will be eventually turned
into a separate helper (fd_empty()).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
arch/alpha/kernel/osf_sys.c | 4 +-
arch/arm/kernel/sys_oabi-compat.c | 10 +-
arch/powerpc/kvm/book3s_64_vio.c | 4 +-
arch/powerpc/kvm/powerpc.c | 12 +--
arch/powerpc/platforms/cell/spu_syscalls.c | 8 +-
arch/x86/kernel/cpu/sgx/main.c | 4 +-
arch/x86/kvm/svm/sev.c | 16 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 8 +-
drivers/gpu/drm/drm_syncobj.c | 6 +-
drivers/infiniband/core/ucma.c | 6 +-
drivers/infiniband/core/uverbs_cmd.c | 8 +-
drivers/media/mc/mc-request.c | 6 +-
drivers/media/rc/lirc_dev.c | 8 +-
drivers/vfio/group.c | 6 +-
drivers/vfio/virqfd.c | 6 +-
drivers/virt/acrn/irqfd.c | 6 +-
drivers/xen/privcmd.c | 10 +-
fs/btrfs/ioctl.c | 4 +-
fs/coda/inode.c | 4 +-
fs/eventfd.c | 4 +-
fs/eventpoll.c | 30 +++---
fs/ext4/ioctl.c | 6 +-
fs/f2fs/file.c | 6 +-
fs/fcntl.c | 38 +++----
fs/fhandle.c | 4 +-
fs/fsopen.c | 6 +-
fs/fuse/dev.c | 6 +-
fs/ioctl.c | 30 +++---
fs/kernel_read_file.c | 4 +-
fs/locks.c | 14 +--
fs/namei.c | 10 +-
fs/namespace.c | 12 +--
fs/notify/fanotify/fanotify_user.c | 12 +--
fs/notify/inotify/inotify_user.c | 12 +--
fs/ocfs2/cluster/heartbeat.c | 6 +-
fs/open.c | 24 ++---
fs/overlayfs/file.c | 40 +++----
fs/quota/quota.c | 8 +-
fs/read_write.c | 118 ++++++++++-----------
fs/readdir.c | 20 ++--
fs/remap_range.c | 2 +-
fs/select.c | 8 +-
fs/signalfd.c | 6 +-
fs/smb/client/ioctl.c | 8 +-
fs/splice.c | 22 ++--
fs/stat.c | 4 +-
fs/statfs.c | 4 +-
fs/sync.c | 14 +--
fs/timerfd.c | 8 +-
fs/utimes.c | 4 +-
fs/xattr.c | 36 +++----
fs/xfs/xfs_exchrange.c | 4 +-
fs/xfs/xfs_handle.c | 4 +-
fs/xfs/xfs_ioctl.c | 28 ++---
include/linux/cleanup.h | 2 +-
include/linux/file.h | 6 +-
io_uring/sqpoll.c | 10 +-
ipc/mqueue.c | 50 ++++-----
kernel/bpf/bpf_inode_storage.c | 14 +--
kernel/bpf/btf.c | 6 +-
kernel/bpf/syscall.c | 42 ++++----
kernel/bpf/token.c | 10 +-
kernel/cgroup/cgroup.c | 4 +-
kernel/events/core.c | 12 +--
kernel/module/main.c | 2 +-
kernel/nsproxy.c | 12 +--
kernel/pid.c | 10 +-
kernel/signal.c | 6 +-
kernel/sys.c | 10 +-
kernel/taskstats.c | 4 +-
kernel/watch_queue.c | 4 +-
mm/fadvise.c | 4 +-
mm/filemap.c | 6 +-
mm/memcontrol.c | 12 +--
mm/readahead.c | 10 +-
net/core/net_namespace.c | 6 +-
net/socket.c | 12 +--
security/integrity/ima/ima_main.c | 4 +-
security/landlock/syscalls.c | 22 ++--
security/loadpin/loadpin.c | 4 +-
sound/core/pcm_native.c | 6 +-
virt/kvm/eventfd.c | 6 +-
virt/kvm/vfio.c | 8 +-
83 files changed, 502 insertions(+), 500 deletions(-)
diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index e5f881bc8288..56fea57f9642 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -160,10 +160,10 @@ SYSCALL_DEFINE4(osf_getdirentries, unsigned int, fd,
.count = count
};
- if (!arg.file)
+ if (!fd_file(arg))
return -EBADF;
- error = iterate_dir(arg.file, &buf.ctx);
+ error = iterate_dir(fd_file(arg), &buf.ctx);
if (error >= 0)
error = buf.error;
if (count != buf.count)
diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c
index d00f4040a9f5..f5781ff54a5c 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -239,19 +239,19 @@ asmlinkage long sys_oabi_fcntl64(unsigned int fd, unsigned int cmd,
struct flock64 flock;
long err = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
goto out;
switch (cmd) {
case F_GETLK64:
case F_OFD_GETLK:
- err = security_file_fcntl(f.file, cmd, arg);
+ err = security_file_fcntl(fd_file(f), cmd, arg);
if (err)
break;
err = get_oabi_flock(&flock, argp);
if (err)
break;
- err = fcntl_getlk64(f.file, cmd, &flock);
+ err = fcntl_getlk64(fd_file(f), cmd, &flock);
if (!err)
err = put_oabi_flock(&flock, argp);
break;
@@ -259,13 +259,13 @@ asmlinkage long sys_oabi_fcntl64(unsigned int fd, unsigned int cmd,
case F_SETLKW64:
case F_OFD_SETLK:
case F_OFD_SETLKW:
- err = security_file_fcntl(f.file, cmd, arg);
+ err = security_file_fcntl(fd_file(f), cmd, arg);
if (err)
break;
err = get_oabi_flock(&flock, argp);
if (err)
break;
- err = fcntl_setlk64(fd, f.file, cmd, &flock);
+ err = fcntl_setlk64(fd, fd_file(f), cmd, &flock);
break;
default:
err = sys_fcntl64(fd, cmd, arg);
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index b569ebaa590e..ce8f9539af2b 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -118,12 +118,12 @@ long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
struct fd f;
f = fdget(tablefd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
rcu_read_lock();
list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
- if (stt == f.file->private_data) {
+ if (stt == fd_file(f)->private_data) {
found = true;
break;
}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index d11767208bfc..fd62144e497e 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1938,11 +1938,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
r = -EBADF;
f = fdget(cap->args[0]);
- if (!f.file)
+ if (!fd_file(f))
break;
r = -EPERM;
- dev = kvm_device_from_filp(f.file);
+ dev = kvm_device_from_filp(fd_file(f));
if (dev)
r = kvmppc_mpic_connect_vcpu(dev, vcpu, cap->args[1]);
@@ -1957,11 +1957,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
r = -EBADF;
f = fdget(cap->args[0]);
- if (!f.file)
+ if (!fd_file(f))
break;
r = -EPERM;
- dev = kvm_device_from_filp(f.file);
+ dev = kvm_device_from_filp(fd_file(f));
if (dev) {
if (xics_on_xive())
r = kvmppc_xive_connect_vcpu(dev, vcpu, cap->args[1]);
@@ -1980,7 +1980,7 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
r = -EBADF;
f = fdget(cap->args[0]);
- if (!f.file)
+ if (!fd_file(f))
break;
r = -ENXIO;
@@ -1990,7 +1990,7 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
}
r = -EPERM;
- dev = kvm_device_from_filp(f.file);
+ dev = kvm_device_from_filp(fd_file(f));
if (dev)
r = kvmppc_xive_native_connect_vcpu(dev, vcpu,
cap->args[1]);
diff --git a/arch/powerpc/platforms/cell/spu_syscalls.c b/arch/powerpc/platforms/cell/spu_syscalls.c
index 87ad7d563cfa..cd7d42fc12a6 100644
--- a/arch/powerpc/platforms/cell/spu_syscalls.c
+++ b/arch/powerpc/platforms/cell/spu_syscalls.c
@@ -66,8 +66,8 @@ SYSCALL_DEFINE4(spu_create, const char __user *, name, unsigned int, flags,
if (flags & SPU_CREATE_AFFINITY_SPU) {
struct fd neighbor = fdget(neighbor_fd);
ret = -EBADF;
- if (neighbor.file) {
- ret = calls->create_thread(name, flags, mode, neighbor.file);
+ if (fd_file(neighbor)) {
+ ret = calls->create_thread(name, flags, mode, fd_file(neighbor));
fdput(neighbor);
}
} else
@@ -89,8 +89,8 @@ SYSCALL_DEFINE3(spu_run,int, fd, __u32 __user *, unpc, __u32 __user *, ustatus)
ret = -EBADF;
arg = fdget(fd);
- if (arg.file) {
- ret = calls->spu_run(arg.file, unpc, ustatus);
+ if (fd_file(arg)) {
+ ret = calls->spu_run(fd_file(arg), unpc, ustatus);
fdput(arg);
}
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 27892e57c4ef..d01deb386395 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -895,10 +895,10 @@ int sgx_set_attribute(unsigned long *allowed_attributes,
{
struct fd f = fdget(attribute_fd);
- if (!f.file)
+ if (!fd_file(f))
return -EINVAL;
- if (f.file->f_op != &sgx_provision_fops) {
+ if (fd_file(f)->f_op != &sgx_provision_fops) {
fdput(f);
return -EINVAL;
}
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 0623cfaa7bb0..6a8154d6935a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -377,10 +377,10 @@ static int __sev_issue_cmd(int fd, int id, void *data, int *error)
int ret;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- ret = sev_issue_cmd_external_user(f.file, id, data, error);
+ ret = sev_issue_cmd_external_user(fd_file(f), id, data, error);
fdput(f);
return ret;
@@ -1913,15 +1913,15 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
bool charged = false;
int ret;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (!file_is_kvm(f.file)) {
+ if (!file_is_kvm(fd_file(f))) {
ret = -EBADF;
goto out_fput;
}
- source_kvm = f.file->private_data;
+ source_kvm = fd_file(f)->private_data;
ret = sev_lock_two_vms(kvm, source_kvm);
if (ret)
goto out_fput;
@@ -2214,15 +2214,15 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
struct kvm_sev_info *source_sev, *mirror_sev;
int ret;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (!file_is_kvm(f.file)) {
+ if (!file_is_kvm(fd_file(f))) {
ret = -EBADF;
goto e_source_fput;
}
- source_kvm = f.file->private_data;
+ source_kvm = fd_file(f)->private_data;
ret = sev_lock_two_vms(kvm, source_kvm);
if (ret)
goto e_source_fput;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
index 863b2a34b2d6..a9298cb8d19a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
@@ -43,10 +43,10 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
uint32_t id;
int r;
- if (!f.file)
+ if (!fd_file(f))
return -EINVAL;
- r = amdgpu_file_to_fpriv(f.file, &fpriv);
+ r = amdgpu_file_to_fpriv(fd_file(f), &fpriv);
if (r) {
fdput(f);
return r;
@@ -72,10 +72,10 @@ static int amdgpu_sched_context_priority_override(struct amdgpu_device *adev,
struct amdgpu_ctx *ctx;
int r;
- if (!f.file)
+ if (!fd_file(f))
return -EINVAL;
- r = amdgpu_file_to_fpriv(f.file, &fpriv);
+ r = amdgpu_file_to_fpriv(fd_file(f), &fpriv);
if (r) {
fdput(f);
return r;
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index a0e94217b511..7fb31ca3b5fc 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -715,16 +715,16 @@ static int drm_syncobj_fd_to_handle(struct drm_file *file_private,
struct fd f = fdget(fd);
int ret;
- if (!f.file)
+ if (!fd_file(f))
return -EINVAL;
- if (f.file->f_op != &drm_syncobj_file_fops) {
+ if (fd_file(f)->f_op != &drm_syncobj_file_fops) {
fdput(f);
return -EINVAL;
}
/* take a reference to put in the idr */
- syncobj = f.file->private_data;
+ syncobj = fd_file(f)->private_data;
drm_syncobj_get(syncobj);
idr_preload(GFP_KERNEL);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 5f5ad8faf86e..dc57d07a1f45 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1624,13 +1624,13 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
/* Get current fd to protect against it being closed */
f = fdget(cmd.fd);
- if (!f.file)
+ if (!fd_file(f))
return -ENOENT;
- if (f.file->f_op != &ucma_fops) {
+ if (fd_file(f)->f_op != &ucma_fops) {
ret = -EINVAL;
goto file_put;
}
- cur_file = f.file->private_data;
+ cur_file = fd_file(f)->private_data;
/* Validate current fd and prevent destruction of id. */
ctx = ucma_get_ctx(cur_file, cmd.id);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 3d3ee3eca983..03ea3afcb31f 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -584,12 +584,12 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
if (cmd.fd != -1) {
/* search for file descriptor */
f = fdget(cmd.fd);
- if (!f.file) {
+ if (!fd_file(f)) {
ret = -EBADF;
goto err_tree_mutex_unlock;
}
- inode = file_inode(f.file);
+ inode = file_inode(fd_file(f));
xrcd = find_xrcd(ibudev, inode);
if (!xrcd && !(cmd.oflags & O_CREAT)) {
/* no file descriptor. Need CREATE flag */
@@ -632,7 +632,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
atomic_inc(&xrcd->usecnt);
}
- if (f.file)
+ if (fd_file(f))
fdput(f);
mutex_unlock(&ibudev->xrcd_tree_mutex);
@@ -648,7 +648,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
uobj_alloc_abort(&obj->uobject, attrs);
err_tree_mutex_unlock:
- if (f.file)
+ if (fd_file(f))
fdput(f);
mutex_unlock(&ibudev->xrcd_tree_mutex);
diff --git a/drivers/media/mc/mc-request.c b/drivers/media/mc/mc-request.c
index addb8f2d8939..e064914c476e 100644
--- a/drivers/media/mc/mc-request.c
+++ b/drivers/media/mc/mc-request.c
@@ -254,12 +254,12 @@ media_request_get_by_fd(struct media_device *mdev, int request_fd)
return ERR_PTR(-EBADR);
f = fdget(request_fd);
- if (!f.file)
+ if (!fd_file(f))
goto err_no_req_fd;
- if (f.file->f_op != &request_fops)
+ if (fd_file(f)->f_op != &request_fops)
goto err_fput;
- req = f.file->private_data;
+ req = fd_file(f)->private_data;
if (req->mdev != mdev)
goto err_fput;
diff --git a/drivers/media/rc/lirc_dev.c b/drivers/media/rc/lirc_dev.c
index 717c441b4a86..b8dfd530fab7 100644
--- a/drivers/media/rc/lirc_dev.c
+++ b/drivers/media/rc/lirc_dev.c
@@ -820,20 +820,20 @@ struct rc_dev *rc_dev_get_from_fd(int fd, bool write)
struct lirc_fh *fh;
struct rc_dev *dev;
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- if (f.file->f_op != &lirc_fops) {
+ if (fd_file(f)->f_op != &lirc_fops) {
fdput(f);
return ERR_PTR(-EINVAL);
}
- if (write && !(f.file->f_mode & FMODE_WRITE)) {
+ if (write && !(fd_file(f)->f_mode & FMODE_WRITE)) {
fdput(f);
return ERR_PTR(-EPERM);
}
- fh = f.file->private_data;
+ fh = fd_file(f)->private_data;
dev = fh->rc;
get_device(&dev->dev);
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 610a429c6191..f0d77e3c7196 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -112,7 +112,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
return -EFAULT;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
mutex_lock(&group->group_lock);
@@ -125,13 +125,13 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
goto out_unlock;
}
- container = vfio_container_from_file(f.file);
+ container = vfio_container_from_file(fd_file(f));
if (container) {
ret = vfio_container_attach_group(container, group);
goto out_unlock;
}
- iommufd = iommufd_ctx_from_file(f.file);
+ iommufd = iommufd_ctx_from_file(fd_file(f));
if (!IS_ERR(iommufd)) {
if (IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
group->type == VFIO_NO_IOMMU)
diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
index 532269133801..d22881245e89 100644
--- a/drivers/vfio/virqfd.c
+++ b/drivers/vfio/virqfd.c
@@ -134,12 +134,12 @@ int vfio_virqfd_enable(void *opaque,
INIT_WORK(&virqfd->flush_inject, virqfd_flush_inject);
irqfd = fdget(fd);
- if (!irqfd.file) {
+ if (!fd_file(irqfd)) {
ret = -EBADF;
goto err_fd;
}
- ctx = eventfd_ctx_fileget(irqfd.file);
+ ctx = eventfd_ctx_fileget(fd_file(irqfd));
if (IS_ERR(ctx)) {
ret = PTR_ERR(ctx);
goto err_ctx;
@@ -171,7 +171,7 @@ int vfio_virqfd_enable(void *opaque,
init_waitqueue_func_entry(&virqfd->wait, virqfd_wakeup);
init_poll_funcptr(&virqfd->pt, virqfd_ptable_queue_proc);
- events = vfs_poll(irqfd.file, &virqfd->pt);
+ events = vfs_poll(fd_file(irqfd), &virqfd->pt);
/*
* Check if there was an event already pending on the eventfd
diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c
index d4ad211dce7a..9994d818bb7e 100644
--- a/drivers/virt/acrn/irqfd.c
+++ b/drivers/virt/acrn/irqfd.c
@@ -125,12 +125,12 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
INIT_WORK(&irqfd->shutdown, hsm_irqfd_shutdown_work);
f = fdget(args->fd);
- if (!f.file) {
+ if (!fd_file(f)) {
ret = -EBADF;
goto out;
}
- eventfd = eventfd_ctx_fileget(f.file);
+ eventfd = eventfd_ctx_fileget(fd_file(f));
if (IS_ERR(eventfd)) {
ret = PTR_ERR(eventfd);
goto fail;
@@ -157,7 +157,7 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
mutex_unlock(&vm->irqfds_lock);
/* Check the pending event in this stage */
- events = vfs_poll(f.file, &irqfd->pt);
+ events = vfs_poll(fd_file(f), &irqfd->pt);
if (events & EPOLLIN)
acrn_irqfd_inject(irqfd);
diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index 67dfa4778864..c35c2455aa61 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -950,12 +950,12 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
INIT_WORK(&kirqfd->shutdown, irqfd_shutdown);
f = fdget(irqfd->fd);
- if (!f.file) {
+ if (!fd_file(f)) {
ret = -EBADF;
goto error_kfree;
}
- kirqfd->eventfd = eventfd_ctx_fileget(f.file);
+ kirqfd->eventfd = eventfd_ctx_fileget(fd_file(f));
if (IS_ERR(kirqfd->eventfd)) {
ret = PTR_ERR(kirqfd->eventfd);
goto error_fd_put;
@@ -985,7 +985,7 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
* Check if there was an event already pending on the eventfd before we
* registered, and trigger it as if we didn't miss it.
*/
- events = vfs_poll(f.file, &kirqfd->pt);
+ events = vfs_poll(fd_file(f), &kirqfd->pt);
if (events & EPOLLIN)
irqfd_inject(kirqfd);
@@ -1331,12 +1331,12 @@ static int privcmd_ioeventfd_assign(struct privcmd_ioeventfd *ioeventfd)
return -ENOMEM;
f = fdget(ioeventfd->event_fd);
- if (!f.file) {
+ if (!fd_file(f)) {
ret = -EBADF;
goto error_kfree;
}
- kioeventfd->eventfd = eventfd_ctx_fileget(f.file);
+ kioeventfd->eventfd = eventfd_ctx_fileget(fd_file(f));
fdput(f);
if (IS_ERR(kioeventfd->eventfd)) {
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index efd5d6e9589e..2dccb7f90e4d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1311,12 +1311,12 @@ static noinline int __btrfs_ioctl_snap_create(struct file *file,
} else {
struct fd src = fdget(fd);
struct inode *src_inode;
- if (!src.file) {
+ if (!fd_file(src)) {
ret = -EINVAL;
goto out_drop_write;
}
- src_inode = file_inode(src.file);
+ src_inode = file_inode(fd_file(src));
if (src_inode->i_sb != file_inode(file)->i_sb) {
btrfs_info(BTRFS_I(file_inode(file))->root->fs_info,
"Snapshot src from another FS");
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index 6898dc621011..7d56b6d1e4c3 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -127,9 +127,9 @@ static int coda_parse_fd(struct fs_context *fc, int fd)
int idx;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- inode = file_inode(f.file);
+ inode = file_inode(fd_file(f));
if (!S_ISCHR(inode->i_mode) || imajor(inode) != CODA_PSDEV_MAJOR) {
fdput(f);
return invalf(fc, "code: Not coda psdev");
diff --git a/fs/eventfd.c b/fs/eventfd.c
index 9afdb722fa92..22c934f3a080 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -349,9 +349,9 @@ struct eventfd_ctx *eventfd_ctx_fdget(int fd)
{
struct eventfd_ctx *ctx;
struct fd f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- ctx = eventfd_ctx_fileget(f.file);
+ ctx = eventfd_ctx_fileget(fd_file(f));
fdput(f);
return ctx;
}
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index f53ca4f7fced..28d1a754cf33 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2266,17 +2266,17 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
error = -EBADF;
f = fdget(epfd);
- if (!f.file)
+ if (!fd_file(f))
goto error_return;
/* Get the "struct file *" for the target file */
tf = fdget(fd);
- if (!tf.file)
+ if (!fd_file(tf))
goto error_fput;
/* The target file descriptor must support poll */
error = -EPERM;
- if (!file_can_poll(tf.file))
+ if (!file_can_poll(fd_file(tf)))
goto error_tgt_fput;
/* Check if EPOLLWAKEUP is allowed */
@@ -2289,7 +2289,7 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
* adding an epoll file descriptor inside itself.
*/
error = -EINVAL;
- if (f.file == tf.file || !is_file_epoll(f.file))
+ if (fd_file(f) == fd_file(tf) || !is_file_epoll(fd_file(f)))
goto error_tgt_fput;
/*
@@ -2300,7 +2300,7 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
if (ep_op_has_event(op) && (epds->events & EPOLLEXCLUSIVE)) {
if (op == EPOLL_CTL_MOD)
goto error_tgt_fput;
- if (op == EPOLL_CTL_ADD && (is_file_epoll(tf.file) ||
+ if (op == EPOLL_CTL_ADD && (is_file_epoll(fd_file(tf)) ||
(epds->events & ~EPOLLEXCLUSIVE_OK_BITS)))
goto error_tgt_fput;
}
@@ -2309,7 +2309,7 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
* At this point it is safe to assume that the "private_data" contains
* our own data structure.
*/
- ep = f.file->private_data;
+ ep = fd_file(f)->private_data;
/*
* When we insert an epoll file descriptor inside another epoll file
@@ -2330,16 +2330,16 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
if (error)
goto error_tgt_fput;
if (op == EPOLL_CTL_ADD) {
- if (READ_ONCE(f.file->f_ep) || ep->gen == loop_check_gen ||
- is_file_epoll(tf.file)) {
+ if (READ_ONCE(fd_file(f)->f_ep) || ep->gen == loop_check_gen ||
+ is_file_epoll(fd_file(tf))) {
mutex_unlock(&ep->mtx);
error = epoll_mutex_lock(&epnested_mutex, 0, nonblock);
if (error)
goto error_tgt_fput;
loop_check_gen++;
full_check = 1;
- if (is_file_epoll(tf.file)) {
- tep = tf.file->private_data;
+ if (is_file_epoll(fd_file(tf))) {
+ tep = fd_file(tf)->private_data;
error = -ELOOP;
if (ep_loop_check(ep, tep) != 0)
goto error_tgt_fput;
@@ -2355,14 +2355,14 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
* above, we can be sure to be able to use the item looked up by
* ep_find() till we release the mutex.
*/
- epi = ep_find(ep, tf.file, fd);
+ epi = ep_find(ep, fd_file(tf), fd);
error = -EINVAL;
switch (op) {
case EPOLL_CTL_ADD:
if (!epi) {
epds->events |= EPOLLERR | EPOLLHUP;
- error = ep_insert(ep, epds, tf.file, fd, full_check);
+ error = ep_insert(ep, epds, fd_file(tf), fd, full_check);
} else
error = -EEXIST;
break;
@@ -2443,7 +2443,7 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
/* Get the "struct file *" for the eventpoll file */
f = fdget(epfd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
/*
@@ -2451,14 +2451,14 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
* the user passed to us _is_ an eventpoll file.
*/
error = -EINVAL;
- if (!is_file_epoll(f.file))
+ if (!is_file_epoll(fd_file(f)))
goto error_fput;
/*
* At this point it is safe to assume that the "private_data" contains
* our own data structure.
*/
- ep = f.file->private_data;
+ ep = fd_file(f)->private_data;
/* Time to fish for events ... */
error = ep_poll(ep, events, maxevents, to);
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index dab7acd49709..63eaa1fa2556 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -1343,10 +1343,10 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
me.moved_len = 0;
donor = fdget(me.donor_fd);
- if (!donor.file)
+ if (!fd_file(donor))
return -EBADF;
- if (!(donor.file->f_mode & FMODE_WRITE)) {
+ if (!(fd_file(donor)->f_mode & FMODE_WRITE)) {
err = -EBADF;
goto mext_out;
}
@@ -1367,7 +1367,7 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
if (err)
goto mext_out;
- err = ext4_move_extents(filp, donor.file, me.orig_start,
+ err = ext4_move_extents(filp, fd_file(donor), me.orig_start,
me.donor_start, me.len, &me.moved_len);
mnt_drop_write_file(filp);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 5c0b281a70f3..89db22f9488b 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2970,10 +2970,10 @@ static int __f2fs_ioc_move_range(struct file *filp,
return -EBADF;
dst = fdget(range->dst_fd);
- if (!dst.file)
+ if (!fd_file(dst))
return -EBADF;
- if (!(dst.file->f_mode & FMODE_WRITE)) {
+ if (!(fd_file(dst)->f_mode & FMODE_WRITE)) {
err = -EBADF;
goto err_out;
}
@@ -2982,7 +2982,7 @@ static int __f2fs_ioc_move_range(struct file *filp,
if (err)
goto err_out;
- err = f2fs_move_file_range(filp, range->pos_in, dst.file,
+ err = f2fs_move_file_range(filp, range->pos_in, fd_file(dst),
range->pos_out, range->len);
mnt_drop_write_file(filp);
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 300e5d9ad913..2b5616762354 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -340,7 +340,7 @@ static long f_dupfd_query(int fd, struct file *filp)
* overkill, but given our lockless file pointer lookup, the
* alternatives are complicated.
*/
- return f.file == filp;
+ return fd_file(f) == filp;
}
static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
@@ -479,17 +479,17 @@ SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
struct fd f = fdget_raw(fd);
long err = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
goto out;
- if (unlikely(f.file->f_mode & FMODE_PATH)) {
+ if (unlikely(fd_file(f)->f_mode & FMODE_PATH)) {
if (!check_fcntl_cmd(cmd))
goto out1;
}
- err = security_file_fcntl(f.file, cmd, arg);
+ err = security_file_fcntl(fd_file(f), cmd, arg);
if (!err)
- err = do_fcntl(fd, cmd, arg, f.file);
+ err = do_fcntl(fd, cmd, arg, fd_file(f));
out1:
fdput(f);
@@ -506,15 +506,15 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
struct flock64 flock;
long err = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
goto out;
- if (unlikely(f.file->f_mode & FMODE_PATH)) {
+ if (unlikely(fd_file(f)->f_mode & FMODE_PATH)) {
if (!check_fcntl_cmd(cmd))
goto out1;
}
- err = security_file_fcntl(f.file, cmd, arg);
+ err = security_file_fcntl(fd_file(f), cmd, arg);
if (err)
goto out1;
@@ -524,7 +524,7 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
err = -EFAULT;
if (copy_from_user(&flock, argp, sizeof(flock)))
break;
- err = fcntl_getlk64(f.file, cmd, &flock);
+ err = fcntl_getlk64(fd_file(f), cmd, &flock);
if (!err && copy_to_user(argp, &flock, sizeof(flock)))
err = -EFAULT;
break;
@@ -535,10 +535,10 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
err = -EFAULT;
if (copy_from_user(&flock, argp, sizeof(flock)))
break;
- err = fcntl_setlk64(fd, f.file, cmd, &flock);
+ err = fcntl_setlk64(fd, fd_file(f), cmd, &flock);
break;
default:
- err = do_fcntl(fd, cmd, arg, f.file);
+ err = do_fcntl(fd, cmd, arg, fd_file(f));
break;
}
out1:
@@ -643,15 +643,15 @@ static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
struct flock flock;
long err = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
return err;
- if (unlikely(f.file->f_mode & FMODE_PATH)) {
+ if (unlikely(fd_file(f)->f_mode & FMODE_PATH)) {
if (!check_fcntl_cmd(cmd))
goto out_put;
}
- err = security_file_fcntl(f.file, cmd, arg);
+ err = security_file_fcntl(fd_file(f), cmd, arg);
if (err)
goto out_put;
@@ -660,7 +660,7 @@ static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
err = get_compat_flock(&flock, compat_ptr(arg));
if (err)
break;
- err = fcntl_getlk(f.file, convert_fcntl_cmd(cmd), &flock);
+ err = fcntl_getlk(fd_file(f), convert_fcntl_cmd(cmd), &flock);
if (err)
break;
err = fixup_compat_flock(&flock);
@@ -672,7 +672,7 @@ static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
err = get_compat_flock64(&flock, compat_ptr(arg));
if (err)
break;
- err = fcntl_getlk(f.file, convert_fcntl_cmd(cmd), &flock);
+ err = fcntl_getlk(fd_file(f), convert_fcntl_cmd(cmd), &flock);
if (!err)
err = put_compat_flock64(&flock, compat_ptr(arg));
break;
@@ -681,7 +681,7 @@ static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
err = get_compat_flock(&flock, compat_ptr(arg));
if (err)
break;
- err = fcntl_setlk(fd, f.file, convert_fcntl_cmd(cmd), &flock);
+ err = fcntl_setlk(fd, fd_file(f), convert_fcntl_cmd(cmd), &flock);
break;
case F_SETLK64:
case F_SETLKW64:
@@ -690,10 +690,10 @@ static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
err = get_compat_flock64(&flock, compat_ptr(arg));
if (err)
break;
- err = fcntl_setlk(fd, f.file, convert_fcntl_cmd(cmd), &flock);
+ err = fcntl_setlk(fd, fd_file(f), convert_fcntl_cmd(cmd), &flock);
break;
default:
- err = do_fcntl(fd, cmd, arg, f.file);
+ err = do_fcntl(fd, cmd, arg, fd_file(f));
break;
}
out_put:
diff --git a/fs/fhandle.c b/fs/fhandle.c
index 8a7f86c2139a..94fc4126eaa4 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -126,9 +126,9 @@ static struct vfsmount *get_vfsmount_from_fd(int fd)
spin_unlock(&fs->lock);
} else {
struct fd f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- mnt = mntget(f.file->f_path.mnt);
+ mnt = mntget(fd_file(f)->f_path.mnt);
fdput(f);
}
return mnt;
diff --git a/fs/fsopen.c b/fs/fsopen.c
index 6593ae518115..cf9f37b2a5d4 100644
--- a/fs/fsopen.c
+++ b/fs/fsopen.c
@@ -398,13 +398,13 @@ SYSCALL_DEFINE5(fsconfig,
}
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
ret = -EINVAL;
- if (f.file->f_op != &fscontext_fops)
+ if (fd_file(f)->f_op != &fscontext_fops)
goto out_f;
- fc = f.file->private_data;
+ fc = fd_file(f)->private_data;
if (fc->ops == &legacy_fs_context_ops) {
switch (cmd) {
case FSCONFIG_SET_BINARY:
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 9eb191b5c4de..991b9ae8e7c9 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -2321,15 +2321,15 @@ static long fuse_dev_ioctl_clone(struct file *file, __u32 __user *argp)
return -EFAULT;
f = fdget(oldfd);
- if (!f.file)
+ if (!fd_file(f))
return -EINVAL;
/*
* Check against file->f_op because CUSE
* uses the same ioctl handler.
*/
- if (f.file->f_op == file->f_op)
- fud = fuse_get_dev(f.file);
+ if (fd_file(f)->f_op == file->f_op)
+ fud = fuse_get_dev(fd_file(f));
res = -EINVAL;
if (fud) {
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 64776891120c..6e0c954388d4 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -235,9 +235,9 @@ static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
loff_t cloned;
int ret;
- if (!src_file.file)
+ if (!fd_file(src_file))
return -EBADF;
- cloned = vfs_clone_file_range(src_file.file, off, dst_file, destoff,
+ cloned = vfs_clone_file_range(fd_file(src_file), off, dst_file, destoff,
olen, 0);
if (cloned < 0)
ret = cloned;
@@ -895,16 +895,16 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
struct fd f = fdget(fd);
int error;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = security_file_ioctl(f.file, cmd, arg);
+ error = security_file_ioctl(fd_file(f), cmd, arg);
if (error)
goto out;
- error = do_vfs_ioctl(f.file, fd, cmd, arg);
+ error = do_vfs_ioctl(fd_file(f), fd, cmd, arg);
if (error == -ENOIOCTLCMD)
- error = vfs_ioctl(f.file, cmd, arg);
+ error = vfs_ioctl(fd_file(f), cmd, arg);
out:
fdput(f);
@@ -953,32 +953,32 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
struct fd f = fdget(fd);
int error;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = security_file_ioctl_compat(f.file, cmd, arg);
+ error = security_file_ioctl_compat(fd_file(f), cmd, arg);
if (error)
goto out;
switch (cmd) {
/* FICLONE takes an int argument, so don't use compat_ptr() */
case FICLONE:
- error = ioctl_file_clone(f.file, arg, 0, 0, 0);
+ error = ioctl_file_clone(fd_file(f), arg, 0, 0, 0);
break;
#if defined(CONFIG_X86_64)
/* these get messy on amd64 due to alignment differences */
case FS_IOC_RESVSP_32:
case FS_IOC_RESVSP64_32:
- error = compat_ioctl_preallocate(f.file, 0, compat_ptr(arg));
+ error = compat_ioctl_preallocate(fd_file(f), 0, compat_ptr(arg));
break;
case FS_IOC_UNRESVSP_32:
case FS_IOC_UNRESVSP64_32:
- error = compat_ioctl_preallocate(f.file, FALLOC_FL_PUNCH_HOLE,
+ error = compat_ioctl_preallocate(fd_file(f), FALLOC_FL_PUNCH_HOLE,
compat_ptr(arg));
break;
case FS_IOC_ZERO_RANGE_32:
- error = compat_ioctl_preallocate(f.file, FALLOC_FL_ZERO_RANGE,
+ error = compat_ioctl_preallocate(fd_file(f), FALLOC_FL_ZERO_RANGE,
compat_ptr(arg));
break;
#endif
@@ -998,13 +998,13 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
* argument.
*/
default:
- error = do_vfs_ioctl(f.file, fd, cmd,
+ error = do_vfs_ioctl(fd_file(f), fd, cmd,
(unsigned long)compat_ptr(arg));
if (error != -ENOIOCTLCMD)
break;
- if (f.file->f_op->compat_ioctl)
- error = f.file->f_op->compat_ioctl(f.file, cmd, arg);
+ if (fd_file(f)->f_op->compat_ioctl)
+ error = fd_file(f)->f_op->compat_ioctl(fd_file(f), cmd, arg);
if (error == -ENOIOCTLCMD)
error = -ENOTTY;
break;
diff --git a/fs/kernel_read_file.c b/fs/kernel_read_file.c
index c429c42a6867..9ff37ae650ea 100644
--- a/fs/kernel_read_file.c
+++ b/fs/kernel_read_file.c
@@ -178,10 +178,10 @@ ssize_t kernel_read_file_from_fd(int fd, loff_t offset, void **buf,
struct fd f = fdget(fd);
ssize_t ret = -EBADF;
- if (!f.file || !(f.file->f_mode & FMODE_READ))
+ if (!fd_file(f) || !(fd_file(f)->f_mode & FMODE_READ))
goto out;
- ret = kernel_read_file(f.file, offset, buf, buf_size, file_size, id);
+ ret = kernel_read_file(fd_file(f), offset, buf, buf_size, file_size, id);
out:
fdput(f);
return ret;
diff --git a/fs/locks.c b/fs/locks.c
index 90c8746874de..ee8e1925dc42 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2153,15 +2153,15 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
error = -EBADF;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return error;
- if (type != F_UNLCK && !(f.file->f_mode & (FMODE_READ | FMODE_WRITE)))
+ if (type != F_UNLCK && !(fd_file(f)->f_mode & (FMODE_READ | FMODE_WRITE)))
goto out_putf;
- flock_make_lock(f.file, &fl, type);
+ flock_make_lock(fd_file(f), &fl, type);
- error = security_file_lock(f.file, fl.c.flc_type);
+ error = security_file_lock(fd_file(f), fl.c.flc_type);
if (error)
goto out_putf;
@@ -2169,12 +2169,12 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
if (can_sleep)
fl.c.flc_flags |= FL_SLEEP;
- if (f.file->f_op->flock)
- error = f.file->f_op->flock(f.file,
+ if (fd_file(f)->f_op->flock)
+ error = fd_file(f)->f_op->flock(fd_file(f),
(can_sleep) ? F_SETLKW : F_SETLK,
&fl);
else
- error = locks_lock_file_wait(f.file, &fl);
+ error = locks_lock_file_wait(fd_file(f), &fl);
locks_release_private(&fl);
out_putf:
diff --git a/fs/namei.c b/fs/namei.c
index 37fb0a8aa09a..72736b6328a6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2419,25 +2419,25 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
struct fd f = fdget_raw(nd->dfd);
struct dentry *dentry;
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
if (flags & LOOKUP_LINKAT_EMPTY) {
- if (f.file->f_cred != current_cred() &&
- !ns_capable(f.file->f_cred->user_ns, CAP_DAC_READ_SEARCH)) {
+ if (fd_file(f)->f_cred != current_cred() &&
+ !ns_capable(fd_file(f)->f_cred->user_ns, CAP_DAC_READ_SEARCH)) {
fdput(f);
return ERR_PTR(-ENOENT);
}
}
- dentry = f.file->f_path.dentry;
+ dentry = fd_file(f)->f_path.dentry;
if (*s && unlikely(!d_can_lookup(dentry))) {
fdput(f);
return ERR_PTR(-ENOTDIR);
}
- nd->path = f.file->f_path;
+ nd->path = fd_file(f)->f_path;
if (flags & LOOKUP_RCU) {
nd->inode = nd->path.dentry->d_inode;
nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
diff --git a/fs/namespace.c b/fs/namespace.c
index 5a51315c6678..7c8248aca8bd 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3977,14 +3977,14 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
}
f = fdget(fs_fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
ret = -EINVAL;
- if (f.file->f_op != &fscontext_fops)
+ if (fd_file(f)->f_op != &fscontext_fops)
goto err_fsfd;
- fc = f.file->private_data;
+ fc = fd_file(f)->private_data;
ret = mutex_lock_interruptible(&fc->uapi_mutex);
if (ret < 0)
@@ -4527,15 +4527,15 @@ static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
return -EINVAL;
f = fdget(attr->userns_fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (!proc_ns_file(f.file)) {
+ if (!proc_ns_file(fd_file(f))) {
err = -EINVAL;
goto out_fput;
}
- ns = get_proc_ns(file_inode(f.file));
+ ns = get_proc_ns(file_inode(fd_file(f)));
if (ns->ops->type != CLONE_NEWUSER) {
err = -EINVAL;
goto out_fput;
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 9ec313e9f6e1..13454e5fd3fb 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1006,17 +1006,17 @@ static int fanotify_find_path(int dfd, const char __user *filename,
struct fd f = fdget(dfd);
ret = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
goto out;
ret = -ENOTDIR;
if ((flags & FAN_MARK_ONLYDIR) &&
- !(S_ISDIR(file_inode(f.file)->i_mode))) {
+ !(S_ISDIR(file_inode(fd_file(f))->i_mode))) {
fdput(f);
goto out;
}
- *path = f.file->f_path;
+ *path = fd_file(f)->f_path;
path_get(path);
fdput(f);
} else {
@@ -1753,14 +1753,14 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
}
f = fdget(fanotify_fd);
- if (unlikely(!f.file))
+ if (unlikely(!fd_file(f)))
return -EBADF;
/* verify that this is indeed an fanotify instance */
ret = -EINVAL;
- if (unlikely(f.file->f_op != &fanotify_fops))
+ if (unlikely(fd_file(f)->f_op != &fanotify_fops))
goto fput_and_out;
- group = f.file->private_data;
+ group = fd_file(f)->private_data;
/*
* An unprivileged user is not allowed to setup mount nor filesystem
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 4ffc30606e0b..c7e451d5bd51 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -753,7 +753,7 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
return -EINVAL;
f = fdget(fd);
- if (unlikely(!f.file))
+ if (unlikely(!fd_file(f)))
return -EBADF;
/* IN_MASK_ADD and IN_MASK_CREATE don't make sense together */
@@ -763,7 +763,7 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
}
/* verify that this is indeed an inotify instance */
- if (unlikely(f.file->f_op != &inotify_fops)) {
+ if (unlikely(fd_file(f)->f_op != &inotify_fops)) {
ret = -EINVAL;
goto fput_and_out;
}
@@ -780,7 +780,7 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
/* inode held in place by reference to path; group by fget on fd */
inode = path.dentry->d_inode;
- group = f.file->private_data;
+ group = fd_file(f)->private_data;
/* create/update an inode mark */
ret = inotify_update_watch(group, inode, mask);
@@ -798,14 +798,14 @@ SYSCALL_DEFINE2(inotify_rm_watch, int, fd, __s32, wd)
int ret = -EINVAL;
f = fdget(fd);
- if (unlikely(!f.file))
+ if (unlikely(!fd_file(f)))
return -EBADF;
/* verify that this is indeed an inotify instance */
- if (unlikely(f.file->f_op != &inotify_fops))
+ if (unlikely(fd_file(f)->f_op != &inotify_fops))
goto out;
- group = f.file->private_data;
+ group = fd_file(f)->private_data;
i_mark = inotify_idr_find(group, wd);
if (unlikely(!i_mark))
diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 1bde1281d514..4b9f45d7049e 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -1785,17 +1785,17 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
goto out;
f = fdget(fd);
- if (f.file == NULL)
+ if (fd_file(f) == NULL)
goto out;
if (reg->hr_blocks == 0 || reg->hr_start_block == 0 ||
reg->hr_block_bytes == 0)
goto out2;
- if (!S_ISBLK(f.file->f_mapping->host->i_mode))
+ if (!S_ISBLK(fd_file(f)->f_mapping->host->i_mode))
goto out2;
- reg->hr_bdev_file = bdev_file_open_by_dev(f.file->f_mapping->host->i_rdev,
+ reg->hr_bdev_file = bdev_file_open_by_dev(fd_file(f)->f_mapping->host->i_rdev,
BLK_OPEN_WRITE | BLK_OPEN_READ, NULL, NULL);
if (IS_ERR(reg->hr_bdev_file)) {
ret = PTR_ERR(reg->hr_bdev_file);
diff --git a/fs/open.c b/fs/open.c
index 89cafb572061..29e0ec819bc4 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -193,10 +193,10 @@ long do_sys_ftruncate(unsigned int fd, loff_t length, int small)
if (length < 0)
return -EINVAL;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = do_ftruncate(f.file, length, small);
+ error = do_ftruncate(fd_file(f), length, small);
fdput(f);
return error;
@@ -349,8 +349,8 @@ int ksys_fallocate(int fd, int mode, loff_t offset, loff_t len)
struct fd f = fdget(fd);
int error = -EBADF;
- if (f.file) {
- error = vfs_fallocate(f.file, mode, offset, len);
+ if (fd_file(f)) {
+ error = vfs_fallocate(fd_file(f), mode, offset, len);
fdput(f);
}
return error;
@@ -581,16 +581,16 @@ SYSCALL_DEFINE1(fchdir, unsigned int, fd)
int error;
error = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
goto out;
error = -ENOTDIR;
- if (!d_can_lookup(f.file->f_path.dentry))
+ if (!d_can_lookup(fd_file(f)->f_path.dentry))
goto out_putf;
- error = file_permission(f.file, MAY_EXEC | MAY_CHDIR);
+ error = file_permission(fd_file(f), MAY_EXEC | MAY_CHDIR);
if (!error)
- set_fs_pwd(current->fs, &f.file->f_path);
+ set_fs_pwd(current->fs, &fd_file(f)->f_path);
out_putf:
fdput(f);
out:
@@ -671,8 +671,8 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
struct fd f = fdget(fd);
int err = -EBADF;
- if (f.file) {
- err = vfs_fchmod(f.file, mode);
+ if (fd_file(f)) {
+ err = vfs_fchmod(fd_file(f), mode);
fdput(f);
}
return err;
@@ -865,8 +865,8 @@ int ksys_fchown(unsigned int fd, uid_t user, gid_t group)
struct fd f = fdget(fd);
int error = -EBADF;
- if (f.file) {
- error = vfs_fchown(f.file, user, group);
+ if (fd_file(f)) {
+ error = vfs_fchown(fd_file(f), user, group);
fdput(f);
}
return error;
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 1a411cae57ed..c4963d0c5549 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -209,13 +209,13 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
* files, so we use the real file to perform seeks.
*/
ovl_inode_lock(inode);
- real.file->f_pos = file->f_pos;
+ fd_file(real)->f_pos = file->f_pos;
old_cred = ovl_override_creds(inode->i_sb);
- ret = vfs_llseek(real.file, offset, whence);
+ ret = vfs_llseek(fd_file(real), offset, whence);
revert_creds(old_cred);
- file->f_pos = real.file->f_pos;
+ file->f_pos = fd_file(real)->f_pos;
ovl_inode_unlock(inode);
fdput(real);
@@ -275,7 +275,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
if (ret)
return ret;
- ret = backing_file_read_iter(real.file, iter, iocb, iocb->ki_flags,
+ ret = backing_file_read_iter(fd_file(real), iter, iocb, iocb->ki_flags,
&ctx);
fdput(real);
@@ -314,7 +314,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
* this property in case it is set by the issuer.
*/
ifl &= ~IOCB_DIO_CALLER_COMP;
- ret = backing_file_write_iter(real.file, iter, iocb, ifl, &ctx);
+ ret = backing_file_write_iter(fd_file(real), iter, iocb, ifl, &ctx);
fdput(real);
out_unlock:
@@ -339,7 +339,7 @@ static ssize_t ovl_splice_read(struct file *in, loff_t *ppos,
if (ret)
return ret;
- ret = backing_file_splice_read(real.file, ppos, pipe, len, flags, &ctx);
+ ret = backing_file_splice_read(fd_file(real), ppos, pipe, len, flags, &ctx);
fdput(real);
return ret;
@@ -348,7 +348,7 @@ static ssize_t ovl_splice_read(struct file *in, loff_t *ppos,
/*
* Calling iter_file_splice_write() directly from overlay's f_op may deadlock
* due to lock order inversion between pipe->mutex in iter_file_splice_write()
- * and file_start_write(real.file) in ovl_write_iter().
+ * and file_start_write(fd_file(real)) in ovl_write_iter().
*
* So do everything ovl_write_iter() does and call iter_file_splice_write() on
* the real file.
@@ -373,7 +373,7 @@ static ssize_t ovl_splice_write(struct pipe_inode_info *pipe, struct file *out,
if (ret)
goto out_unlock;
- ret = backing_file_splice_write(pipe, real.file, ppos, len, flags, &ctx);
+ ret = backing_file_splice_write(pipe, fd_file(real), ppos, len, flags, &ctx);
fdput(real);
out_unlock:
@@ -397,9 +397,9 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
return ret;
/* Don't sync lower file for fear of receiving EROFS error */
- if (file_inode(real.file) == ovl_inode_upper(file_inode(file))) {
+ if (file_inode(fd_file(real)) == ovl_inode_upper(file_inode(file))) {
old_cred = ovl_override_creds(file_inode(file)->i_sb);
- ret = vfs_fsync_range(real.file, start, end, datasync);
+ ret = vfs_fsync_range(fd_file(real), start, end, datasync);
revert_creds(old_cred);
}
@@ -439,7 +439,7 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
goto out_unlock;
old_cred = ovl_override_creds(file_inode(file)->i_sb);
- ret = vfs_fallocate(real.file, mode, offset, len);
+ ret = vfs_fallocate(fd_file(real), mode, offset, len);
revert_creds(old_cred);
/* Update size */
@@ -464,7 +464,7 @@ static int ovl_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
return ret;
old_cred = ovl_override_creds(file_inode(file)->i_sb);
- ret = vfs_fadvise(real.file, offset, len, advice);
+ ret = vfs_fadvise(fd_file(real), offset, len, advice);
revert_creds(old_cred);
fdput(real);
@@ -509,18 +509,18 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
old_cred = ovl_override_creds(file_inode(file_out)->i_sb);
switch (op) {
case OVL_COPY:
- ret = vfs_copy_file_range(real_in.file, pos_in,
- real_out.file, pos_out, len, flags);
+ ret = vfs_copy_file_range(fd_file(real_in), pos_in,
+ fd_file(real_out), pos_out, len, flags);
break;
case OVL_CLONE:
- ret = vfs_clone_file_range(real_in.file, pos_in,
- real_out.file, pos_out, len, flags);
+ ret = vfs_clone_file_range(fd_file(real_in), pos_in,
+ fd_file(real_out), pos_out, len, flags);
break;
case OVL_DEDUPE:
- ret = vfs_dedupe_file_range_one(real_in.file, pos_in,
- real_out.file, pos_out, len,
+ ret = vfs_dedupe_file_range_one(fd_file(real_in), pos_in,
+ fd_file(real_out), pos_out, len,
flags);
break;
}
@@ -583,9 +583,9 @@ static int ovl_flush(struct file *file, fl_owner_t id)
if (err)
return err;
- if (real.file->f_op->flush) {
+ if (fd_file(real)->f_op->flush) {
old_cred = ovl_override_creds(file_inode(file)->i_sb);
- err = real.file->f_op->flush(real.file, id);
+ err = fd_file(real)->f_op->flush(fd_file(real), id);
revert_creds(old_cred);
}
fdput(real);
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 0e41fb84060f..290157bc7bec 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -980,7 +980,7 @@ SYSCALL_DEFINE4(quotactl_fd, unsigned int, fd, unsigned int, cmd,
int ret;
f = fdget_raw(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
ret = -EINVAL;
@@ -988,12 +988,12 @@ SYSCALL_DEFINE4(quotactl_fd, unsigned int, fd, unsigned int, cmd,
goto out;
if (quotactl_cmd_write(cmds)) {
- ret = mnt_want_write(f.file->f_path.mnt);
+ ret = mnt_want_write(fd_file(f)->f_path.mnt);
if (ret)
goto out;
}
- sb = f.file->f_path.mnt->mnt_sb;
+ sb = fd_file(f)->f_path.mnt->mnt_sb;
if (quotactl_cmd_onoff(cmds))
down_write(&sb->s_umount);
else
@@ -1007,7 +1007,7 @@ SYSCALL_DEFINE4(quotactl_fd, unsigned int, fd, unsigned int, cmd,
up_read(&sb->s_umount);
if (quotactl_cmd_write(cmds))
- mnt_drop_write(f.file->f_path.mnt);
+ mnt_drop_write(fd_file(f)->f_path.mnt);
out:
fdput(f);
return ret;
diff --git a/fs/read_write.c b/fs/read_write.c
index ef6339391351..415b509df359 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -294,12 +294,12 @@ static off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence)
{
off_t retval;
struct fd f = fdget_pos(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
retval = -EINVAL;
if (whence <= SEEK_MAX) {
- loff_t res = vfs_llseek(f.file, offset, whence);
+ loff_t res = vfs_llseek(fd_file(f), offset, whence);
retval = res;
if (res != (loff_t)retval)
retval = -EOVERFLOW; /* LFS: should only happen on 32 bit platforms */
@@ -330,14 +330,14 @@ SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
struct fd f = fdget_pos(fd);
loff_t offset;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
retval = -EINVAL;
if (whence > SEEK_MAX)
goto out_putf;
- offset = vfs_llseek(f.file, ((loff_t) offset_high << 32) | offset_low,
+ offset = vfs_llseek(fd_file(f), ((loff_t) offset_high << 32) | offset_low,
whence);
retval = (int)offset;
@@ -610,15 +610,15 @@ ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count)
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
- if (f.file) {
- loff_t pos, *ppos = file_ppos(f.file);
+ if (fd_file(f)) {
+ loff_t pos, *ppos = file_ppos(fd_file(f));
if (ppos) {
pos = *ppos;
ppos = &pos;
}
- ret = vfs_read(f.file, buf, count, ppos);
+ ret = vfs_read(fd_file(f), buf, count, ppos);
if (ret >= 0 && ppos)
- f.file->f_pos = pos;
+ fd_file(f)->f_pos = pos;
fdput_pos(f);
}
return ret;
@@ -634,15 +634,15 @@ ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count)
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
- if (f.file) {
- loff_t pos, *ppos = file_ppos(f.file);
+ if (fd_file(f)) {
+ loff_t pos, *ppos = file_ppos(fd_file(f));
if (ppos) {
pos = *ppos;
ppos = &pos;
}
- ret = vfs_write(f.file, buf, count, ppos);
+ ret = vfs_write(fd_file(f), buf, count, ppos);
if (ret >= 0 && ppos)
- f.file->f_pos = pos;
+ fd_file(f)->f_pos = pos;
fdput_pos(f);
}
@@ -665,10 +665,10 @@ ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
return -EINVAL;
f = fdget(fd);
- if (f.file) {
+ if (fd_file(f)) {
ret = -ESPIPE;
- if (f.file->f_mode & FMODE_PREAD)
- ret = vfs_read(f.file, buf, count, &pos);
+ if (fd_file(f)->f_mode & FMODE_PREAD)
+ ret = vfs_read(fd_file(f), buf, count, &pos);
fdput(f);
}
@@ -699,10 +699,10 @@ ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf,
return -EINVAL;
f = fdget(fd);
- if (f.file) {
+ if (fd_file(f)) {
ret = -ESPIPE;
- if (f.file->f_mode & FMODE_PWRITE)
- ret = vfs_write(f.file, buf, count, &pos);
+ if (fd_file(f)->f_mode & FMODE_PWRITE)
+ ret = vfs_write(fd_file(f), buf, count, &pos);
fdput(f);
}
@@ -985,15 +985,15 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
- if (f.file) {
- loff_t pos, *ppos = file_ppos(f.file);
+ if (fd_file(f)) {
+ loff_t pos, *ppos = file_ppos(fd_file(f));
if (ppos) {
pos = *ppos;
ppos = &pos;
}
- ret = vfs_readv(f.file, vec, vlen, ppos, flags);
+ ret = vfs_readv(fd_file(f), vec, vlen, ppos, flags);
if (ret >= 0 && ppos)
- f.file->f_pos = pos;
+ fd_file(f)->f_pos = pos;
fdput_pos(f);
}
@@ -1009,15 +1009,15 @@ static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
- if (f.file) {
- loff_t pos, *ppos = file_ppos(f.file);
+ if (fd_file(f)) {
+ loff_t pos, *ppos = file_ppos(fd_file(f));
if (ppos) {
pos = *ppos;
ppos = &pos;
}
- ret = vfs_writev(f.file, vec, vlen, ppos, flags);
+ ret = vfs_writev(fd_file(f), vec, vlen, ppos, flags);
if (ret >= 0 && ppos)
- f.file->f_pos = pos;
+ fd_file(f)->f_pos = pos;
fdput_pos(f);
}
@@ -1043,10 +1043,10 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
return -EINVAL;
f = fdget(fd);
- if (f.file) {
+ if (fd_file(f)) {
ret = -ESPIPE;
- if (f.file->f_mode & FMODE_PREAD)
- ret = vfs_readv(f.file, vec, vlen, &pos, flags);
+ if (fd_file(f)->f_mode & FMODE_PREAD)
+ ret = vfs_readv(fd_file(f), vec, vlen, &pos, flags);
fdput(f);
}
@@ -1066,10 +1066,10 @@ static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
return -EINVAL;
f = fdget(fd);
- if (f.file) {
+ if (fd_file(f)) {
ret = -ESPIPE;
- if (f.file->f_mode & FMODE_PWRITE)
- ret = vfs_writev(f.file, vec, vlen, &pos, flags);
+ if (fd_file(f)->f_mode & FMODE_PWRITE)
+ ret = vfs_writev(fd_file(f), vec, vlen, &pos, flags);
fdput(f);
}
@@ -1235,19 +1235,19 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
*/
retval = -EBADF;
in = fdget(in_fd);
- if (!in.file)
+ if (!fd_file(in))
goto out;
- if (!(in.file->f_mode & FMODE_READ))
+ if (!(fd_file(in)->f_mode & FMODE_READ))
goto fput_in;
retval = -ESPIPE;
if (!ppos) {
- pos = in.file->f_pos;
+ pos = fd_file(in)->f_pos;
} else {
pos = *ppos;
- if (!(in.file->f_mode & FMODE_PREAD))
+ if (!(fd_file(in)->f_mode & FMODE_PREAD))
goto fput_in;
}
- retval = rw_verify_area(READ, in.file, &pos, count);
+ retval = rw_verify_area(READ, fd_file(in), &pos, count);
if (retval < 0)
goto fput_in;
if (count > MAX_RW_COUNT)
@@ -1258,13 +1258,13 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
*/
retval = -EBADF;
out = fdget(out_fd);
- if (!out.file)
+ if (!fd_file(out))
goto fput_in;
- if (!(out.file->f_mode & FMODE_WRITE))
+ if (!(fd_file(out)->f_mode & FMODE_WRITE))
goto fput_out;
- in_inode = file_inode(in.file);
- out_inode = file_inode(out.file);
- out_pos = out.file->f_pos;
+ in_inode = file_inode(fd_file(in));
+ out_inode = file_inode(fd_file(out));
+ out_pos = fd_file(out)->f_pos;
if (!max)
max = min(in_inode->i_sb->s_maxbytes, out_inode->i_sb->s_maxbytes);
@@ -1284,33 +1284,33 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
* and the application is arguably buggy if it doesn't expect
* EAGAIN on a non-blocking file descriptor.
*/
- if (in.file->f_flags & O_NONBLOCK)
+ if (fd_file(in)->f_flags & O_NONBLOCK)
fl = SPLICE_F_NONBLOCK;
#endif
- opipe = get_pipe_info(out.file, true);
+ opipe = get_pipe_info(fd_file(out), true);
if (!opipe) {
- retval = rw_verify_area(WRITE, out.file, &out_pos, count);
+ retval = rw_verify_area(WRITE, fd_file(out), &out_pos, count);
if (retval < 0)
goto fput_out;
- retval = do_splice_direct(in.file, &pos, out.file, &out_pos,
+ retval = do_splice_direct(fd_file(in), &pos, fd_file(out), &out_pos,
count, fl);
} else {
- if (out.file->f_flags & O_NONBLOCK)
+ if (fd_file(out)->f_flags & O_NONBLOCK)
fl |= SPLICE_F_NONBLOCK;
- retval = splice_file_to_pipe(in.file, opipe, &pos, count, fl);
+ retval = splice_file_to_pipe(fd_file(in), opipe, &pos, count, fl);
}
if (retval > 0) {
add_rchar(current, retval);
add_wchar(current, retval);
- fsnotify_access(in.file);
- fsnotify_modify(out.file);
- out.file->f_pos = out_pos;
+ fsnotify_access(fd_file(in));
+ fsnotify_modify(fd_file(out));
+ fd_file(out)->f_pos = out_pos;
if (ppos)
*ppos = pos;
else
- in.file->f_pos = pos;
+ fd_file(in)->f_pos = pos;
}
inc_syscr(current);
@@ -1583,11 +1583,11 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
ssize_t ret = -EBADF;
f_in = fdget(fd_in);
- if (!f_in.file)
+ if (!fd_file(f_in))
goto out2;
f_out = fdget(fd_out);
- if (!f_out.file)
+ if (!fd_file(f_out))
goto out1;
ret = -EFAULT;
@@ -1595,21 +1595,21 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
if (copy_from_user(&pos_in, off_in, sizeof(loff_t)))
goto out;
} else {
- pos_in = f_in.file->f_pos;
+ pos_in = fd_file(f_in)->f_pos;
}
if (off_out) {
if (copy_from_user(&pos_out, off_out, sizeof(loff_t)))
goto out;
} else {
- pos_out = f_out.file->f_pos;
+ pos_out = fd_file(f_out)->f_pos;
}
ret = -EINVAL;
if (flags != 0)
goto out;
- ret = vfs_copy_file_range(f_in.file, pos_in, f_out.file, pos_out, len,
+ ret = vfs_copy_file_range(fd_file(f_in), pos_in, fd_file(f_out), pos_out, len,
flags);
if (ret > 0) {
pos_in += ret;
@@ -1619,14 +1619,14 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
if (copy_to_user(off_in, &pos_in, sizeof(loff_t)))
ret = -EFAULT;
} else {
- f_in.file->f_pos = pos_in;
+ fd_file(f_in)->f_pos = pos_in;
}
if (off_out) {
if (copy_to_user(off_out, &pos_out, sizeof(loff_t)))
ret = -EFAULT;
} else {
- f_out.file->f_pos = pos_out;
+ fd_file(f_out)->f_pos = pos_out;
}
}
diff --git a/fs/readdir.c b/fs/readdir.c
index 278bc0254732..eea0e6e9abcf 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -227,10 +227,10 @@ SYSCALL_DEFINE3(old_readdir, unsigned int, fd,
.dirent = dirent
};
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = iterate_dir(f.file, &buf.ctx);
+ error = iterate_dir(fd_file(f), &buf.ctx);
if (buf.result)
error = buf.result;
@@ -320,10 +320,10 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
int error;
f = fdget_pos(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = iterate_dir(f.file, &buf.ctx);
+ error = iterate_dir(fd_file(f), &buf.ctx);
if (error >= 0)
error = buf.error;
if (buf.prev_reclen) {
@@ -403,10 +403,10 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd,
int error;
f = fdget_pos(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = iterate_dir(f.file, &buf.ctx);
+ error = iterate_dir(fd_file(f), &buf.ctx);
if (error >= 0)
error = buf.error;
if (buf.prev_reclen) {
@@ -485,10 +485,10 @@ COMPAT_SYSCALL_DEFINE3(old_readdir, unsigned int, fd,
.dirent = dirent
};
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = iterate_dir(f.file, &buf.ctx);
+ error = iterate_dir(fd_file(f), &buf.ctx);
if (buf.result)
error = buf.result;
@@ -571,10 +571,10 @@ COMPAT_SYSCALL_DEFINE3(getdents, unsigned int, fd,
int error;
f = fdget_pos(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = iterate_dir(f.file, &buf.ctx);
+ error = iterate_dir(fd_file(f), &buf.ctx);
if (error >= 0)
error = buf.error;
if (buf.prev_reclen) {
diff --git a/fs/remap_range.c b/fs/remap_range.c
index 28246dfc8485..4403d5c68fcb 100644
--- a/fs/remap_range.c
+++ b/fs/remap_range.c
@@ -537,7 +537,7 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
for (i = 0, info = same->info; i < count; i++, info++) {
struct fd dst_fd = fdget(info->dest_fd);
- struct file *dst_file = dst_fd.file;
+ struct file *dst_file = fd_file(dst_fd);
if (!dst_file) {
info->status = -EBADF;
diff --git a/fs/select.c b/fs/select.c
index 9515c3fa1a03..97e1009dde00 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -532,10 +532,10 @@ static noinline_for_stack int do_select(int n, fd_set_bits *fds, struct timespec
continue;
mask = EPOLLNVAL;
f = fdget(i);
- if (f.file) {
+ if (fd_file(f)) {
wait_key_set(wait, in, out, bit,
busy_flag);
- mask = vfs_poll(f.file, wait);
+ mask = vfs_poll(fd_file(f), wait);
fdput(f);
}
@@ -864,13 +864,13 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
goto out;
mask = EPOLLNVAL;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
goto out;
/* userland u16 ->events contains POLL... bitmap */
filter = demangle_poll(pollfd->events) | EPOLLERR | EPOLLHUP;
pwait->_key = filter | busy_flag;
- mask = vfs_poll(f.file, pwait);
+ mask = vfs_poll(fd_file(f), pwait);
if (mask & busy_flag)
*can_busy_poll = true;
mask &= filter; /* Mask out unneeded events. */
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 4a5614442dbf..c39cf00ab28a 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -293,10 +293,10 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
fd_install(ufd, file);
} else {
struct fd f = fdget(ufd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- ctx = f.file->private_data;
- if (f.file->f_op != &signalfd_fops) {
+ ctx = fd_file(f)->private_data;
+ if (fd_file(f)->f_op != &signalfd_fops) {
fdput(f);
return -EINVAL;
}
diff --git a/fs/smb/client/ioctl.c b/fs/smb/client/ioctl.c
index 855ac5a62edf..94bf2e5014d9 100644
--- a/fs/smb/client/ioctl.c
+++ b/fs/smb/client/ioctl.c
@@ -90,23 +90,23 @@ static long cifs_ioctl_copychunk(unsigned int xid, struct file *dst_file,
}
src_file = fdget(srcfd);
- if (!src_file.file) {
+ if (!fd_file(src_file)) {
rc = -EBADF;
goto out_drop_write;
}
- if (src_file.file->f_op->unlocked_ioctl != cifs_ioctl) {
+ if (fd_file(src_file)->f_op->unlocked_ioctl != cifs_ioctl) {
rc = -EBADF;
cifs_dbg(VFS, "src file seems to be from a different filesystem type\n");
goto out_fput;
}
- src_inode = file_inode(src_file.file);
+ src_inode = file_inode(fd_file(src_file));
rc = -EINVAL;
if (S_ISDIR(src_inode->i_mode))
goto out_fput;
- rc = cifs_file_copychunk_range(xid, src_file.file, 0, dst_file, 0,
+ rc = cifs_file_copychunk_range(xid, fd_file(src_file), 0, dst_file, 0,
src_inode->i_size, 0);
if (rc > 0)
rc = 0;
diff --git a/fs/splice.c b/fs/splice.c
index 60aed8de21f8..06232d7e505f 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1566,11 +1566,11 @@ static ssize_t vmsplice_to_pipe(struct file *file, struct iov_iter *iter,
static int vmsplice_type(struct fd f, int *type)
{
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (f.file->f_mode & FMODE_WRITE) {
+ if (fd_file(f)->f_mode & FMODE_WRITE) {
*type = ITER_SOURCE;
- } else if (f.file->f_mode & FMODE_READ) {
+ } else if (fd_file(f)->f_mode & FMODE_READ) {
*type = ITER_DEST;
} else {
fdput(f);
@@ -1621,9 +1621,9 @@ SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
if (!iov_iter_count(&iter))
error = 0;
else if (type == ITER_SOURCE)
- error = vmsplice_to_pipe(f.file, &iter, flags);
+ error = vmsplice_to_pipe(fd_file(f), &iter, flags);
else
- error = vmsplice_to_user(f.file, &iter, flags);
+ error = vmsplice_to_user(fd_file(f), &iter, flags);
kfree(iov);
out_fdput:
@@ -1646,10 +1646,10 @@ SYSCALL_DEFINE6(splice, int, fd_in, loff_t __user *, off_in,
error = -EBADF;
in = fdget(fd_in);
- if (in.file) {
+ if (fd_file(in)) {
out = fdget(fd_out);
- if (out.file) {
- error = __do_splice(in.file, off_in, out.file, off_out,
+ if (fd_file(out)) {
+ error = __do_splice(fd_file(in), off_in, fd_file(out), off_out,
len, flags);
fdput(out);
}
@@ -2016,10 +2016,10 @@ SYSCALL_DEFINE4(tee, int, fdin, int, fdout, size_t, len, unsigned int, flags)
error = -EBADF;
in = fdget(fdin);
- if (in.file) {
+ if (fd_file(in)) {
out = fdget(fdout);
- if (out.file) {
- error = do_tee(in.file, out.file, len, flags);
+ if (fd_file(out)) {
+ error = do_tee(fd_file(in), fd_file(out), len, flags);
fdput(out);
}
fdput(in);
diff --git a/fs/stat.c b/fs/stat.c
index 70bd3e888cfa..740e5997da09 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -193,9 +193,9 @@ int vfs_fstat(int fd, struct kstat *stat)
int error;
f = fdget_raw(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = vfs_getattr(&f.file->f_path, stat, STATX_BASIC_STATS, 0);
+ error = vfs_getattr(&fd_file(f)->f_path, stat, STATX_BASIC_STATS, 0);
fdput(f);
return error;
}
diff --git a/fs/statfs.c b/fs/statfs.c
index 96d1c3edf289..9c7bb27e7932 100644
--- a/fs/statfs.c
+++ b/fs/statfs.c
@@ -116,8 +116,8 @@ int fd_statfs(int fd, struct kstatfs *st)
{
struct fd f = fdget_raw(fd);
int error = -EBADF;
- if (f.file) {
- error = vfs_statfs(&f.file->f_path, st);
+ if (fd_file(f)) {
+ error = vfs_statfs(&fd_file(f)->f_path, st);
fdput(f);
}
return error;
diff --git a/fs/sync.c b/fs/sync.c
index dc725914e1ed..67df255eb189 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -152,15 +152,15 @@ SYSCALL_DEFINE1(syncfs, int, fd)
struct super_block *sb;
int ret, ret2;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- sb = f.file->f_path.dentry->d_sb;
+ sb = fd_file(f)->f_path.dentry->d_sb;
down_read(&sb->s_umount);
ret = sync_filesystem(sb);
up_read(&sb->s_umount);
- ret2 = errseq_check_and_advance(&sb->s_wb_err, &f.file->f_sb_err);
+ ret2 = errseq_check_and_advance(&sb->s_wb_err, &fd_file(f)->f_sb_err);
fdput(f);
return ret ? ret : ret2;
@@ -208,8 +208,8 @@ static int do_fsync(unsigned int fd, int datasync)
struct fd f = fdget(fd);
int ret = -EBADF;
- if (f.file) {
- ret = vfs_fsync(f.file, datasync);
+ if (fd_file(f)) {
+ ret = vfs_fsync(fd_file(f), datasync);
fdput(f);
}
return ret;
@@ -360,8 +360,8 @@ int ksys_sync_file_range(int fd, loff_t offset, loff_t nbytes,
ret = -EBADF;
f = fdget(fd);
- if (f.file)
- ret = sync_file_range(f.file, offset, nbytes, flags);
+ if (fd_file(f))
+ ret = sync_file_range(fd_file(f), offset, nbytes, flags);
fdput(f);
return ret;
diff --git a/fs/timerfd.c b/fs/timerfd.c
index 4bf2f8bfec11..137523e0bb21 100644
--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -397,9 +397,9 @@ static const struct file_operations timerfd_fops = {
static int timerfd_fget(int fd, struct fd *p)
{
struct fd f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (f.file->f_op != &timerfd_fops) {
+ if (fd_file(f)->f_op != &timerfd_fops) {
fdput(f);
return -EINVAL;
}
@@ -482,7 +482,7 @@ static int do_timerfd_settime(int ufd, int flags,
ret = timerfd_fget(ufd, &f);
if (ret)
return ret;
- ctx = f.file->private_data;
+ ctx = fd_file(f)->private_data;
if (isalarm(ctx) && !capable(CAP_WAKE_ALARM)) {
fdput(f);
@@ -546,7 +546,7 @@ static int do_timerfd_gettime(int ufd, struct itimerspec64 *t)
int ret = timerfd_fget(ufd, &f);
if (ret)
return ret;
- ctx = f.file->private_data;
+ ctx = fd_file(f)->private_data;
spin_lock_irq(&ctx->wqh.lock);
if (ctx->expired && ctx->tintv) {
diff --git a/fs/utimes.c b/fs/utimes.c
index 3701b3946f88..99b26f792b89 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -115,9 +115,9 @@ static int do_utimes_fd(int fd, struct timespec64 *times, int flags)
return -EINVAL;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- error = vfs_utimes(&f.file->f_path, times);
+ error = vfs_utimes(&fd_file(f)->f_path, times);
fdput(f);
return error;
}
diff --git a/fs/xattr.c b/fs/xattr.c
index f8b643f91a98..d4f84f57e703 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -700,15 +700,15 @@ SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
struct fd f = fdget(fd);
int error = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
return error;
- audit_file(f.file);
- error = mnt_want_write_file(f.file);
+ audit_file(fd_file(f));
+ error = mnt_want_write_file(fd_file(f));
if (!error) {
- error = setxattr(file_mnt_idmap(f.file),
- f.file->f_path.dentry, name,
+ error = setxattr(file_mnt_idmap(fd_file(f)),
+ fd_file(f)->f_path.dentry, name,
value, size, flags);
- mnt_drop_write_file(f.file);
+ mnt_drop_write_file(fd_file(f));
}
fdput(f);
return error;
@@ -811,10 +811,10 @@ SYSCALL_DEFINE4(fgetxattr, int, fd, const char __user *, name,
struct fd f = fdget(fd);
ssize_t error = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
return error;
- audit_file(f.file);
- error = getxattr(file_mnt_idmap(f.file), f.file->f_path.dentry,
+ audit_file(fd_file(f));
+ error = getxattr(file_mnt_idmap(fd_file(f)), fd_file(f)->f_path.dentry,
name, value, size);
fdput(f);
return error;
@@ -887,10 +887,10 @@ SYSCALL_DEFINE3(flistxattr, int, fd, char __user *, list, size_t, size)
struct fd f = fdget(fd);
ssize_t error = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
return error;
- audit_file(f.file);
- error = listxattr(f.file->f_path.dentry, list, size);
+ audit_file(fd_file(f));
+ error = listxattr(fd_file(f)->f_path.dentry, list, size);
fdput(f);
return error;
}
@@ -956,14 +956,14 @@ SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
struct fd f = fdget(fd);
int error = -EBADF;
- if (!f.file)
+ if (!fd_file(f))
return error;
- audit_file(f.file);
- error = mnt_want_write_file(f.file);
+ audit_file(fd_file(f));
+ error = mnt_want_write_file(fd_file(f));
if (!error) {
- error = removexattr(file_mnt_idmap(f.file),
- f.file->f_path.dentry, name);
- mnt_drop_write_file(f.file);
+ error = removexattr(file_mnt_idmap(fd_file(f)),
+ fd_file(f)->f_path.dentry, name);
+ mnt_drop_write_file(fd_file(f));
}
fdput(f);
return error;
diff --git a/fs/xfs/xfs_exchrange.c b/fs/xfs/xfs_exchrange.c
index c8a655c92c92..9790e0f45d14 100644
--- a/fs/xfs/xfs_exchrange.c
+++ b/fs/xfs/xfs_exchrange.c
@@ -794,9 +794,9 @@ xfs_ioc_exchange_range(
fxr.flags = args.flags;
file1 = fdget(args.file1_fd);
- if (!file1.file)
+ if (!fd_file(file1))
return -EBADF;
- fxr.file1 = file1.file;
+ fxr.file1 = fd_file(file1);
error = xfs_exchange_range(&fxr);
fdput(file1);
diff --git a/fs/xfs/xfs_handle.c b/fs/xfs/xfs_handle.c
index c8785ed59543..445a2daff233 100644
--- a/fs/xfs/xfs_handle.c
+++ b/fs/xfs/xfs_handle.c
@@ -93,9 +93,9 @@ xfs_find_handle(
if (cmd == XFS_IOC_FD_TO_HANDLE) {
f = fdget(hreq->fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- inode = file_inode(f.file);
+ inode = file_inode(fd_file(f));
} else {
error = user_path_at(AT_FDCWD, hreq->path, 0, &path);
if (error)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index f0117188f302..c8b432fb7b40 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1065,33 +1065,33 @@ xfs_ioc_swapext(
/* Pull information for the target fd */
f = fdget((int)sxp->sx_fdtarget);
- if (!f.file) {
+ if (!fd_file(f)) {
error = -EINVAL;
goto out;
}
- if (!(f.file->f_mode & FMODE_WRITE) ||
- !(f.file->f_mode & FMODE_READ) ||
- (f.file->f_flags & O_APPEND)) {
+ if (!(fd_file(f)->f_mode & FMODE_WRITE) ||
+ !(fd_file(f)->f_mode & FMODE_READ) ||
+ (fd_file(f)->f_flags & O_APPEND)) {
error = -EBADF;
goto out_put_file;
}
tmp = fdget((int)sxp->sx_fdtmp);
- if (!tmp.file) {
+ if (!fd_file(tmp)) {
error = -EINVAL;
goto out_put_file;
}
- if (!(tmp.file->f_mode & FMODE_WRITE) ||
- !(tmp.file->f_mode & FMODE_READ) ||
- (tmp.file->f_flags & O_APPEND)) {
+ if (!(fd_file(tmp)->f_mode & FMODE_WRITE) ||
+ !(fd_file(tmp)->f_mode & FMODE_READ) ||
+ (fd_file(tmp)->f_flags & O_APPEND)) {
error = -EBADF;
goto out_put_tmp_file;
}
- if (IS_SWAPFILE(file_inode(f.file)) ||
- IS_SWAPFILE(file_inode(tmp.file))) {
+ if (IS_SWAPFILE(file_inode(fd_file(f))) ||
+ IS_SWAPFILE(file_inode(fd_file(tmp)))) {
error = -EINVAL;
goto out_put_tmp_file;
}
@@ -1101,14 +1101,14 @@ xfs_ioc_swapext(
* before we cast and access them as XFS structures as we have no
* control over what the user passes us here.
*/
- if (f.file->f_op != &xfs_file_operations ||
- tmp.file->f_op != &xfs_file_operations) {
+ if (fd_file(f)->f_op != &xfs_file_operations ||
+ fd_file(tmp)->f_op != &xfs_file_operations) {
error = -EINVAL;
goto out_put_tmp_file;
}
- ip = XFS_I(file_inode(f.file));
- tip = XFS_I(file_inode(tmp.file));
+ ip = XFS_I(file_inode(fd_file(f)));
+ tip = XFS_I(file_inode(fd_file(tmp)));
if (ip->i_mount != tip->i_mount) {
error = -EINVAL;
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index c2d09bc4f976..22cda00cf6a8 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -95,7 +95,7 @@ const volatile void * __must_check_fn(const volatile void *val)
* DEFINE_CLASS(fdget, struct fd, fdput(_T), fdget(fd), int fd)
*
* CLASS(fdget, f)(fd);
- * if (!f.file)
+ * if (!fd_file(f))
* return -EBADF;
*
* // use 'f' without concern
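Reader's note on the cleanup.h comment above: CLASS() gets its automatic fdput() from the compiler's scope-exit cleanup attribute. The following is a minimal userspace sketch of that mechanism, assuming GCC or Clang. Everything named mock_* or MOCK_*, the live_refs counter, and the table-based lookup are hypothetical stand-ins invented for illustration; none of it is the kernel's DEFINE_CLASS() machinery.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel definitions. */
struct file { int f_mode; };
struct fd { struct file *file; };

#define fd_file(f) ((f).file)

static int live_refs;	/* references currently held by mock_fdget() */

/* Hypothetical lookup: valid descriptors index into a caller-supplied table. */
static struct fd mock_fdget(struct file *table, int nfiles, int fd)
{
	struct fd f = { NULL };

	if (fd >= 0 && fd < nfiles) {
		f.file = &table[fd];
		live_refs++;
	}
	return f;
}

/* Cleanup handlers receive a pointer to the variable leaving scope. */
static void mock_fdput(struct fd *f)
{
	if (fd_file(*f))
		live_refs--;
}

/* Declare a struct fd that is released automatically at scope exit. */
#define MOCK_CLASS_FD(name, table, n, fd) \
	struct fd name __attribute__((cleanup(mock_fdput))) = \
		mock_fdget(table, n, fd)

static int mock_mode(struct file *table, int n, int fd)
{
	MOCK_CLASS_FD(f, table, n, fd);

	if (!fd_file(f))
		return -EBADF;
	/* mock_fdput(&f) runs after the return value is computed */
	return fd_file(f)->f_mode;
}
```

With a two-entry table, mock_mode(table, 2, 1) returns that entry's f_mode and mock_mode(table, 2, 7) returns -EBADF. live_refs is back to zero after either call, with no explicit put on any return path.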
diff --git a/include/linux/file.h b/include/linux/file.h
index 45d0f4800abd..0964408727a7 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -42,10 +42,12 @@ struct fd {
#define FDPUT_FPUT 1
#define FDPUT_POS_UNLOCK 2
+#define fd_file(f) ((f).file)
+
static inline void fdput(struct fd fd)
{
if (fd.flags & FDPUT_FPUT)
- fput(fd.file);
+ fput(fd_file(fd));
}
extern struct file *fget(unsigned int fd);
@@ -79,7 +81,7 @@ static inline struct fd fdget_pos(int fd)
static inline void fdput_pos(struct fd f)
{
if (f.flags & FDPUT_POS_UNLOCK)
- __f_unlock_pos(f.file);
+ __f_unlock_pos(fd_file(f));
fdput(f);
}
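Reader's note on the hunk above: fd_file() is a plain macro over the existing member, so every converted caller compiles to the same code as before. A minimal userspace sketch of the converted calling pattern follows. The struct layouts are simplified, and mock_fdget(), mock_fdput(), mock_fput(), mock_get_mode() and mock_put_count are hypothetical stand-ins, not kernel APIs; only the fd_file() definition and the FDPUT_FPUT value are copied from the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel definitions. */
struct file { int f_mode; };

#define FDPUT_FPUT 1

struct fd {
	struct file *file;
	unsigned int flags;
};

/* Same definition as the macro added in the hunk above. */
#define fd_file(f) ((f).file)

static int mock_put_count;	/* counts mock_fput() calls */

static void mock_fput(struct file *file)
{
	(void)file;
	mock_put_count++;
}

/* Hypothetical lookup: descriptor 0 is "open", everything else is not. */
static struct fd mock_fdget(int fd, struct file *backing)
{
	struct fd f = { NULL, 0 };

	if (fd == 0) {
		f.file = backing;
		f.flags = FDPUT_FPUT;
	}
	return f;
}

static void mock_fdput(struct fd f)
{
	if (f.flags & FDPUT_FPUT)
		mock_fput(fd_file(f));
}

/* A caller written in the converted style: no direct .file access. */
static int mock_get_mode(int fd, struct file *backing)
{
	struct fd f = mock_fdget(fd, backing);
	int mode;

	if (!fd_file(f))
		return -EBADF;
	mode = fd_file(f)->f_mode;
	mock_fdput(f);
	return mode;
}
```

Since callers touch the pointer only through fd_file(), a later change to how struct fd stores it would only need to touch the macro and the helpers, not each caller.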
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
index b3722e5275e7..ffa7d341bd95 100644
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -108,14 +108,14 @@ static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
struct fd f;
f = fdget(p->wq_fd);
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-ENXIO);
- if (!io_is_uring_fops(f.file)) {
+ if (!io_is_uring_fops(fd_file(f))) {
fdput(f);
return ERR_PTR(-EINVAL);
}
- ctx_attach = f.file->private_data;
+ ctx_attach = fd_file(f)->private_data;
sqd = ctx_attach->sq_data;
if (!sqd) {
fdput(f);
@@ -418,9 +418,9 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
struct fd f;
f = fdget(p->wq_fd);
- if (!f.file)
+ if (!fd_file(f))
return -ENXIO;
- if (!io_is_uring_fops(f.file)) {
+ if (!io_is_uring_fops(fd_file(f))) {
fdput(f);
return -EINVAL;
}
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 5eea4dc0509e..9133a52be69b 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1084,20 +1084,20 @@ static int do_mq_timedsend(mqd_t mqdes, const char __user *u_msg_ptr,
audit_mq_sendrecv(mqdes, msg_len, msg_prio, ts);
f = fdget(mqdes);
- if (unlikely(!f.file)) {
+ if (unlikely(!fd_file(f))) {
ret = -EBADF;
goto out;
}
- inode = file_inode(f.file);
- if (unlikely(f.file->f_op != &mqueue_file_operations)) {
+ inode = file_inode(fd_file(f));
+ if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
ret = -EBADF;
goto out_fput;
}
info = MQUEUE_I(inode);
- audit_file(f.file);
+ audit_file(fd_file(f));
- if (unlikely(!(f.file->f_mode & FMODE_WRITE))) {
+ if (unlikely(!(fd_file(f)->f_mode & FMODE_WRITE))) {
ret = -EBADF;
goto out_fput;
}
@@ -1137,7 +1137,7 @@ static int do_mq_timedsend(mqd_t mqdes, const char __user *u_msg_ptr,
}
if (info->attr.mq_curmsgs == info->attr.mq_maxmsg) {
- if (f.file->f_flags & O_NONBLOCK) {
+ if (fd_file(f)->f_flags & O_NONBLOCK) {
ret = -EAGAIN;
} else {
wait.task = current;
@@ -1198,20 +1198,20 @@ static int do_mq_timedreceive(mqd_t mqdes, char __user *u_msg_ptr,
audit_mq_sendrecv(mqdes, msg_len, 0, ts);
f = fdget(mqdes);
- if (unlikely(!f.file)) {
+ if (unlikely(!fd_file(f))) {
ret = -EBADF;
goto out;
}
- inode = file_inode(f.file);
- if (unlikely(f.file->f_op != &mqueue_file_operations)) {
+ inode = file_inode(fd_file(f));
+ if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
ret = -EBADF;
goto out_fput;
}
info = MQUEUE_I(inode);
- audit_file(f.file);
+ audit_file(fd_file(f));
- if (unlikely(!(f.file->f_mode & FMODE_READ))) {
+ if (unlikely(!(fd_file(f)->f_mode & FMODE_READ))) {
ret = -EBADF;
goto out_fput;
}
@@ -1241,7 +1241,7 @@ static int do_mq_timedreceive(mqd_t mqdes, char __user *u_msg_ptr,
}
if (info->attr.mq_curmsgs == 0) {
- if (f.file->f_flags & O_NONBLOCK) {
+ if (fd_file(f)->f_flags & O_NONBLOCK) {
spin_unlock(&info->lock);
ret = -EAGAIN;
} else {
@@ -1355,11 +1355,11 @@ static int do_mq_notify(mqd_t mqdes, const struct sigevent *notification)
/* and attach it to the socket */
retry:
f = fdget(notification->sigev_signo);
- if (!f.file) {
+ if (!fd_file(f)) {
ret = -EBADF;
goto out;
}
- sock = netlink_getsockbyfilp(f.file);
+ sock = netlink_getsockbyfilp(fd_file(f));
fdput(f);
if (IS_ERR(sock)) {
ret = PTR_ERR(sock);
@@ -1378,13 +1378,13 @@ static int do_mq_notify(mqd_t mqdes, const struct sigevent *notification)
}
f = fdget(mqdes);
- if (!f.file) {
+ if (!fd_file(f)) {
ret = -EBADF;
goto out;
}
- inode = file_inode(f.file);
- if (unlikely(f.file->f_op != &mqueue_file_operations)) {
+ inode = file_inode(fd_file(f));
+ if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
ret = -EBADF;
goto out_fput;
}
@@ -1459,31 +1459,31 @@ static int do_mq_getsetattr(int mqdes, struct mq_attr *new, struct mq_attr *old)
return -EINVAL;
f = fdget(mqdes);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (unlikely(f.file->f_op != &mqueue_file_operations)) {
+ if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
fdput(f);
return -EBADF;
}
- inode = file_inode(f.file);
+ inode = file_inode(fd_file(f));
info = MQUEUE_I(inode);
spin_lock(&info->lock);
if (old) {
*old = info->attr;
- old->mq_flags = f.file->f_flags & O_NONBLOCK;
+ old->mq_flags = fd_file(f)->f_flags & O_NONBLOCK;
}
if (new) {
audit_mq_getsetattr(mqdes, new);
- spin_lock(&f.file->f_lock);
+ spin_lock(&fd_file(f)->f_lock);
if (new->mq_flags & O_NONBLOCK)
- f.file->f_flags |= O_NONBLOCK;
+ fd_file(f)->f_flags |= O_NONBLOCK;
else
- f.file->f_flags &= ~O_NONBLOCK;
- spin_unlock(&f.file->f_lock);
+ fd_file(f)->f_flags &= ~O_NONBLOCK;
+ spin_unlock(&fd_file(f)->f_lock);
inode_set_atime_to_ts(inode, inode_set_ctime_current(inode));
}
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
index b0ef45db207c..0a79aee6523d 100644
--- a/kernel/bpf/bpf_inode_storage.c
+++ b/kernel/bpf/bpf_inode_storage.c
@@ -80,10 +80,10 @@ static void *bpf_fd_inode_storage_lookup_elem(struct bpf_map *map, void *key)
struct bpf_local_storage_data *sdata;
struct fd f = fdget_raw(*(int *)key);
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- sdata = inode_storage_lookup(file_inode(f.file), map, true);
+ sdata = inode_storage_lookup(file_inode(fd_file(f)), map, true);
fdput(f);
return sdata ? sdata->data : NULL;
}
@@ -94,14 +94,14 @@ static long bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key,
struct bpf_local_storage_data *sdata;
struct fd f = fdget_raw(*(int *)key);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (!inode_storage_ptr(file_inode(f.file))) {
+ if (!inode_storage_ptr(file_inode(fd_file(f)))) {
fdput(f);
return -EBADF;
}
- sdata = bpf_local_storage_update(file_inode(f.file),
+ sdata = bpf_local_storage_update(file_inode(fd_file(f)),
(struct bpf_local_storage_map *)map,
value, map_flags, GFP_ATOMIC);
fdput(f);
@@ -126,10 +126,10 @@ static long bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key)
struct fd f = fdget_raw(*(int *)key);
int err;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- err = inode_storage_delete(file_inode(f.file), map);
+ err = inode_storage_delete(file_inode(fd_file(f)), map);
fdput(f);
return err;
}
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 821063660d9f..a0e812c29c97 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7509,15 +7509,15 @@ struct btf *btf_get_by_fd(int fd)
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- if (f.file->f_op != &btf_fops) {
+ if (fd_file(f)->f_op != &btf_fops) {
fdput(f);
return ERR_PTR(-EINVAL);
}
- btf = f.file->private_data;
+ btf = fd_file(f)->private_data;
refcount_inc(&btf->refcnt);
fdput(f);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2222c3ff88e7..477bb89f03aa 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -829,7 +829,7 @@ static int bpf_map_release(struct inode *inode, struct file *filp)
static fmode_t map_get_sys_perms(struct bpf_map *map, struct fd f)
{
- fmode_t mode = f.file->f_mode;
+ fmode_t mode = fd_file(f)->f_mode;
/* Our file permissions may have been overridden by global
* map permissions facing syscall side.
@@ -1423,14 +1423,14 @@ static int map_create(union bpf_attr *attr)
*/
struct bpf_map *__bpf_map_get(struct fd f)
{
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- if (f.file->f_op != &bpf_map_fops) {
+ if (fd_file(f)->f_op != &bpf_map_fops) {
fdput(f);
return ERR_PTR(-EINVAL);
}
- return f.file->private_data;
+ return fd_file(f)->private_data;
}
void bpf_map_inc(struct bpf_map *map)
@@ -1651,7 +1651,7 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
goto free_key;
}
- err = bpf_map_update_value(map, f.file, key, value, attr->flags);
+ err = bpf_map_update_value(map, fd_file(f), key, value, attr->flags);
if (!err)
maybe_wait_bpf_programs(map);
@@ -2409,14 +2409,14 @@ int bpf_prog_new_fd(struct bpf_prog *prog)
static struct bpf_prog *____bpf_prog_get(struct fd f)
{
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- if (f.file->f_op != &bpf_prog_fops) {
+ if (fd_file(f)->f_op != &bpf_prog_fops) {
fdput(f);
return ERR_PTR(-EINVAL);
}
- return f.file->private_data;
+ return fd_file(f)->private_data;
}
void bpf_prog_add(struct bpf_prog *prog, int i)
@@ -3237,14 +3237,14 @@ struct bpf_link *bpf_link_get_from_fd(u32 ufd)
struct fd f = fdget(ufd);
struct bpf_link *link;
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- if (f.file->f_op != &bpf_link_fops) {
+ if (fd_file(f)->f_op != &bpf_link_fops) {
fdput(f);
return ERR_PTR(-EINVAL);
}
- link = f.file->private_data;
+ link = fd_file(f)->private_data;
bpf_link_inc(link);
fdput(f);
@@ -4960,19 +4960,19 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
return -EINVAL;
f = fdget(ufd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADFD;
- if (f.file->f_op == &bpf_prog_fops)
- err = bpf_prog_get_info_by_fd(f.file, f.file->private_data, attr,
+ if (fd_file(f)->f_op == &bpf_prog_fops)
+ err = bpf_prog_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
uattr);
- else if (f.file->f_op == &bpf_map_fops)
- err = bpf_map_get_info_by_fd(f.file, f.file->private_data, attr,
+ else if (fd_file(f)->f_op == &bpf_map_fops)
+ err = bpf_map_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
uattr);
- else if (f.file->f_op == &btf_fops)
- err = bpf_btf_get_info_by_fd(f.file, f.file->private_data, attr, uattr);
- else if (f.file->f_op == &bpf_link_fops)
- err = bpf_link_get_info_by_fd(f.file, f.file->private_data,
+ else if (fd_file(f)->f_op == &btf_fops)
+ err = bpf_btf_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr, uattr);
+ else if (fd_file(f)->f_op == &bpf_link_fops)
+ err = bpf_link_get_info_by_fd(fd_file(f), fd_file(f)->private_data,
attr, uattr);
else
err = -EINVAL;
@@ -5193,7 +5193,7 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
else if (cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH)
BPF_DO_BATCH(map->ops->map_lookup_and_delete_batch, map, attr, uattr);
else if (cmd == BPF_MAP_UPDATE_BATCH)
- BPF_DO_BATCH(map->ops->map_update_batch, map, f.file, attr, uattr);
+ BPF_DO_BATCH(map->ops->map_update_batch, map, fd_file(f), attr, uattr);
else
BPF_DO_BATCH(map->ops->map_delete_batch, map, attr, uattr);
err_put:
diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
index d6ccf8d00eab..9a1d356e79ed 100644
--- a/kernel/bpf/token.c
+++ b/kernel/bpf/token.c
@@ -122,10 +122,10 @@ int bpf_token_create(union bpf_attr *attr)
int err, fd;
f = fdget(attr->token_create.bpffs_fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- path = f.file->f_path;
+ path = fd_file(f)->f_path;
path_get(&path);
fdput(f);
@@ -235,14 +235,14 @@ struct bpf_token *bpf_token_get_from_fd(u32 ufd)
struct fd f = fdget(ufd);
struct bpf_token *token;
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- if (f.file->f_op != &bpf_token_fops) {
+ if (fd_file(f)->f_op != &bpf_token_fops) {
fdput(f);
return ERR_PTR(-EINVAL);
}
- token = f.file->private_data;
+ token = fd_file(f)->private_data;
bpf_token_inc(token);
fdput(f);
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index e32b6972c478..0a6e9f566ca6 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6882,10 +6882,10 @@ struct cgroup *cgroup_v1v2_get_from_fd(int fd)
{
struct cgroup *cgrp;
struct fd f = fdget_raw(fd);
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- cgrp = cgroup_v1v2_get_from_file(f.file);
+ cgrp = cgroup_v1v2_get_from_file(fd_file(f));
fdput(f);
return cgrp;
}
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f0128c5ff278..7acf44111a6e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -933,10 +933,10 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event,
struct fd f = fdget(fd);
int ret = 0;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- css = css_tryget_online_from_dir(f.file->f_path.dentry,
+ css = css_tryget_online_from_dir(fd_file(f)->f_path.dentry,
&perf_event_cgrp_subsys);
if (IS_ERR(css)) {
ret = PTR_ERR(css);
@@ -5873,10 +5873,10 @@ static const struct file_operations perf_fops;
static inline int perf_fget_light(int fd, struct fd *p)
{
struct fd f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (f.file->f_op != &perf_fops) {
+ if (fd_file(f)->f_op != &perf_fops) {
fdput(f);
return -EBADF;
}
@@ -5936,7 +5936,7 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
ret = perf_fget_light(arg, &output);
if (ret)
return ret;
- output_event = output.file->private_data;
+ output_event = fd_file(output)->private_data;
ret = perf_event_set_output(event, output_event);
fdput(output);
} else {
@@ -12513,7 +12513,7 @@ SYSCALL_DEFINE5(perf_event_open,
err = perf_fget_light(group_fd, &group);
if (err)
goto err_fd;
- group_leader = group.file->private_data;
+ group_leader = fd_file(group)->private_data;
if (flags & PERF_FLAG_FD_OUTPUT)
output_event = group_leader;
if (flags & PERF_FLAG_FD_NO_GROUP)
diff --git a/kernel/module/main.c b/kernel/module/main.c
index d18a94b973e1..98fda13fdca7 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -3208,7 +3208,7 @@ SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags)
return -EINVAL;
f = fdget(fd);
- err = idempotent_init_module(f.file, uargs, flags);
+ err = idempotent_init_module(fd_file(f), uargs, flags);
fdput(f);
return err;
}
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 6ec3deec68c2..dc952c3b05af 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -550,15 +550,15 @@ SYSCALL_DEFINE2(setns, int, fd, int, flags)
struct nsset nsset = {};
int err = 0;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- if (proc_ns_file(f.file)) {
- ns = get_proc_ns(file_inode(f.file));
+ if (proc_ns_file(fd_file(f))) {
+ ns = get_proc_ns(file_inode(fd_file(f)));
if (flags && (ns->ops->type != flags))
err = -EINVAL;
flags = ns->ops->type;
- } else if (!IS_ERR(pidfd_pid(f.file))) {
+ } else if (!IS_ERR(pidfd_pid(fd_file(f)))) {
err = check_setns_flags(flags);
} else {
err = -EINVAL;
@@ -570,10 +570,10 @@ SYSCALL_DEFINE2(setns, int, fd, int, flags)
if (err)
goto out;
- if (proc_ns_file(f.file))
+ if (proc_ns_file(fd_file(f)))
err = validate_ns(&nsset, ns);
else
- err = validate_nsset(&nsset, pidfd_pid(f.file));
+ err = validate_nsset(&nsset, pidfd_pid(fd_file(f)));
if (!err) {
commit_nsset(&nsset);
perf_event_namespaces(current);
diff --git a/kernel/pid.c b/kernel/pid.c
index da76ed1873f7..2715afb77eab 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -540,13 +540,13 @@ struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
struct pid *pid;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- pid = pidfd_pid(f.file);
+ pid = pidfd_pid(fd_file(f));
if (!IS_ERR(pid)) {
get_pid(pid);
- *flags = f.file->f_flags;
+ *flags = fd_file(f)->f_flags;
}
fdput(f);
@@ -755,10 +755,10 @@ SYSCALL_DEFINE3(pidfd_getfd, int, pidfd, int, fd,
return -EINVAL;
f = fdget(pidfd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- pid = pidfd_pid(f.file);
+ pid = pidfd_pid(fd_file(f));
if (IS_ERR(pid))
ret = PTR_ERR(pid);
else
diff --git a/kernel/signal.c b/kernel/signal.c
index 1f9dd41c04be..6c438fd436d8 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3914,11 +3914,11 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
return -EINVAL;
f = fdget(pidfd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
/* Is this a pidfd? */
- pid = pidfd_to_pid(f.file);
+ pid = pidfd_to_pid(fd_file(f));
if (IS_ERR(pid)) {
ret = PTR_ERR(pid);
goto err;
@@ -3931,7 +3931,7 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
switch (flags) {
case 0:
/* Infer scope from the type of pidfd. */
- if (f.file->f_flags & PIDFD_THREAD)
+ if (fd_file(f)->f_flags & PIDFD_THREAD)
type = PIDTYPE_PID;
else
type = PIDTYPE_TGID;
diff --git a/kernel/sys.c b/kernel/sys.c
index 3a2df1bd9f64..a4be1e568ff5 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1916,10 +1916,10 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
int err;
exe = fdget(fd);
- if (!exe.file)
+ if (!fd_file(exe))
return -EBADF;
- inode = file_inode(exe.file);
+ inode = file_inode(fd_file(exe));
/*
* Because the original mm->exe_file points to executable file, make
@@ -1927,14 +1927,14 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
* overall picture.
*/
err = -EACCES;
- if (!S_ISREG(inode->i_mode) || path_noexec(&exe.file->f_path))
+ if (!S_ISREG(inode->i_mode) || path_noexec(&fd_file(exe)->f_path))
goto exit;
- err = file_permission(exe.file, MAY_EXEC);
+ err = file_permission(fd_file(exe), MAY_EXEC);
if (err)
goto exit;
- err = replace_mm_exe_file(mm, exe.file);
+ err = replace_mm_exe_file(mm, fd_file(exe));
exit:
fdput(exe);
return err;
diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 4354ea231fab..0700f40c53ac 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -419,7 +419,7 @@ static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
fd = nla_get_u32(info->attrs[CGROUPSTATS_CMD_ATTR_FD]);
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return 0;
size = nla_total_size(sizeof(struct cgroupstats));
@@ -440,7 +440,7 @@ static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
stats = nla_data(na);
memset(stats, 0, sizeof(*stats));
- rc = cgroupstats_build(stats, f.file->f_path.dentry);
+ rc = cgroupstats_build(stats, fd_file(f)->f_path.dentry);
if (rc < 0) {
nlmsg_free(rep_skb);
goto err;
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
index 03b90d7d2175..d36242fd4936 100644
--- a/kernel/watch_queue.c
+++ b/kernel/watch_queue.c
@@ -666,8 +666,8 @@ struct watch_queue *get_watch_queue(int fd)
struct fd f;
f = fdget(fd);
- if (f.file) {
- pipe = get_pipe_info(f.file, false);
+ if (fd_file(f)) {
+ pipe = get_pipe_info(fd_file(f), false);
if (pipe && pipe->watch_queue) {
wqueue = pipe->watch_queue;
kref_get(&wqueue->usage);
diff --git a/mm/fadvise.c b/mm/fadvise.c
index 6c39d42f16dc..532dee205c6e 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -193,10 +193,10 @@ int ksys_fadvise64_64(int fd, loff_t offset, loff_t len, int advice)
struct fd f = fdget(fd);
int ret;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
- ret = vfs_fadvise(f.file, offset, len, advice);
+ ret = vfs_fadvise(fd_file(f), offset, len, advice);
fdput(f);
return ret;
diff --git a/mm/filemap.c b/mm/filemap.c
index 382c3d06bfb1..c79c2c773171 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4379,7 +4379,7 @@ SYSCALL_DEFINE4(cachestat, unsigned int, fd,
struct cachestat cs;
pgoff_t first_index, last_index;
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
if (copy_from_user(&csr, cstat_range,
@@ -4389,7 +4389,7 @@ SYSCALL_DEFINE4(cachestat, unsigned int, fd,
}
/* hugetlbfs is not supported */
- if (is_file_hugepages(f.file)) {
+ if (is_file_hugepages(fd_file(f))) {
fdput(f);
return -EOPNOTSUPP;
}
@@ -4403,7 +4403,7 @@ SYSCALL_DEFINE4(cachestat, unsigned int, fd,
last_index =
csr.len == 0 ? ULONG_MAX : (csr.off + csr.len - 1) >> PAGE_SHIFT;
memset(&cs, 0, sizeof(struct cachestat));
- mapping = f.file->f_mapping;
+ mapping = fd_file(f)->f_mapping;
filemap_cachestat(mapping, first_index, last_index, &cs);
fdput(f);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7fad15b2290c..58d000013024 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5299,26 +5299,26 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
INIT_WORK(&event->remove, memcg_event_remove);
efile = fdget(efd);
- if (!efile.file) {
+ if (!fd_file(efile)) {
ret = -EBADF;
goto out_kfree;
}
- event->eventfd = eventfd_ctx_fileget(efile.file);
+ event->eventfd = eventfd_ctx_fileget(fd_file(efile));
if (IS_ERR(event->eventfd)) {
ret = PTR_ERR(event->eventfd);
goto out_put_efile;
}
cfile = fdget(cfd);
- if (!cfile.file) {
+ if (!fd_file(cfile)) {
ret = -EBADF;
goto out_put_eventfd;
}
/* the process need read permission on control file */
/* AV: shouldn't we check that it's been opened for read instead? */
- ret = file_permission(cfile.file, MAY_READ);
+ ret = file_permission(fd_file(cfile), MAY_READ);
if (ret < 0)
goto out_put_cfile;
@@ -5326,7 +5326,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
* The control file must be a regular cgroup1 file. As a regular cgroup
* file can't be renamed, it's safe to access its name afterwards.
*/
- cdentry = cfile.file->f_path.dentry;
+ cdentry = fd_file(cfile)->f_path.dentry;
if (cdentry->d_sb->s_type != &cgroup_fs_type || !d_is_reg(cdentry)) {
ret = -EINVAL;
goto out_put_cfile;
@@ -5378,7 +5378,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
if (ret)
goto out_put_css;
- vfs_poll(efile.file, &event->pt);
+ vfs_poll(fd_file(efile), &event->pt);
spin_lock_irq(&memcg->event_list_lock);
list_add(&event->list, &memcg->event_list);
diff --git a/mm/readahead.c b/mm/readahead.c
index c1b23989d9ca..2be4603488c5 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -726,7 +726,7 @@ ssize_t ksys_readahead(int fd, loff_t offset, size_t count)
ret = -EBADF;
f = fdget(fd);
- if (!f.file || !(f.file->f_mode & FMODE_READ))
+ if (!fd_file(f) || !(fd_file(f)->f_mode & FMODE_READ))
goto out;
/*
@@ -735,12 +735,12 @@ ssize_t ksys_readahead(int fd, loff_t offset, size_t count)
* on this file, then we must return -EINVAL.
*/
ret = -EINVAL;
- if (!f.file->f_mapping || !f.file->f_mapping->a_ops ||
- (!S_ISREG(file_inode(f.file)->i_mode) &&
- !S_ISBLK(file_inode(f.file)->i_mode)))
+ if (!fd_file(f)->f_mapping || !fd_file(f)->f_mapping->a_ops ||
+ (!S_ISREG(file_inode(fd_file(f))->i_mode) &&
+ !S_ISBLK(file_inode(fd_file(f))->i_mode)))
goto out;
- ret = vfs_fadvise(f.file, offset, count, POSIX_FADV_WILLNEED);
+ ret = vfs_fadvise(fd_file(f), offset, count, POSIX_FADV_WILLNEED);
out:
fdput(f);
return ret;
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 4f7a61688d18..fd536e385b83 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -706,11 +706,11 @@ struct net *get_net_ns_by_fd(int fd)
struct fd f = fdget(fd);
struct net *net = ERR_PTR(-EINVAL);
- if (!f.file)
+ if (!fd_file(f))
return ERR_PTR(-EBADF);
- if (proc_ns_file(f.file)) {
- struct ns_common *ns = get_proc_ns(file_inode(f.file));
+ if (proc_ns_file(fd_file(f))) {
+ struct ns_common *ns = get_proc_ns(file_inode(fd_file(f)));
if (ns->ops == &netns_operations)
net = get_net(container_of(ns, struct net, ns));
}
diff --git a/net/socket.c b/net/socket.c
index e416920e9399..50b074f52147 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -556,8 +556,8 @@ static struct socket *sockfd_lookup_light(int fd, int *err, int *fput_needed)
struct socket *sock;
*err = -EBADF;
- if (f.file) {
- sock = sock_from_file(f.file);
+ if (fd_file(f)) {
+ sock = sock_from_file(fd_file(f));
if (likely(sock)) {
*fput_needed = f.flags & FDPUT_FPUT;
return sock;
@@ -1996,8 +1996,8 @@ int __sys_accept4(int fd, struct sockaddr __user *upeer_sockaddr,
struct fd f;
f = fdget(fd);
- if (f.file) {
- ret = __sys_accept4_file(f.file, upeer_sockaddr,
+ if (fd_file(f)) {
+ ret = __sys_accept4_file(fd_file(f), upeer_sockaddr,
upeer_addrlen, flags);
fdput(f);
}
@@ -2058,12 +2058,12 @@ int __sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen)
struct fd f;
f = fdget(fd);
- if (f.file) {
+ if (fd_file(f)) {
struct sockaddr_storage address;
ret = move_addr_to_kernel(uservaddr, addrlen, &address);
if (!ret)
- ret = __sys_connect_file(f.file, &address, addrlen, 0);
+ ret = __sys_connect_file(fd_file(f), &address, addrlen, 0);
fdput(f);
}
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index f04f43af651c..e7c1d3ae33fe 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -1068,10 +1068,10 @@ void ima_kexec_cmdline(int kernel_fd, const void *buf, int size)
return;
f = fdget(kernel_fd);
- if (!f.file)
+ if (!fd_file(f))
return;
- process_buffer_measurement(file_mnt_idmap(f.file), file_inode(f.file),
+ process_buffer_measurement(file_mnt_idmap(fd_file(f)), file_inode(fd_file(f)),
buf, size, "kexec-cmdline", KEXEC_CMDLINE, 0,
NULL, false, NULL, 0);
fdput(f);
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 03b470f5a85a..44edbf57644d 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -238,19 +238,19 @@ static struct landlock_ruleset *get_ruleset_from_fd(const int fd,
struct landlock_ruleset *ruleset;
ruleset_f = fdget(fd);
- if (!ruleset_f.file)
+ if (!fd_file(ruleset_f))
return ERR_PTR(-EBADF);
/* Checks FD type and access right. */
- if (ruleset_f.file->f_op != &ruleset_fops) {
+ if (fd_file(ruleset_f)->f_op != &ruleset_fops) {
ruleset = ERR_PTR(-EBADFD);
goto out_fdput;
}
- if (!(ruleset_f.file->f_mode & mode)) {
+ if (!(fd_file(ruleset_f)->f_mode & mode)) {
ruleset = ERR_PTR(-EPERM);
goto out_fdput;
}
- ruleset = ruleset_f.file->private_data;
+ ruleset = fd_file(ruleset_f)->private_data;
if (WARN_ON_ONCE(ruleset->num_layers != 1)) {
ruleset = ERR_PTR(-EINVAL);
goto out_fdput;
@@ -277,22 +277,22 @@ static int get_path_from_fd(const s32 fd, struct path *const path)
/* Handles O_PATH. */
f = fdget_raw(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
/*
* Forbids ruleset FDs, internal filesystems (e.g. nsfs), including
* pseudo filesystems that will never be mountable (e.g. sockfs,
* pipefs).
*/
- if ((f.file->f_op == &ruleset_fops) ||
- (f.file->f_path.mnt->mnt_flags & MNT_INTERNAL) ||
- (f.file->f_path.dentry->d_sb->s_flags & SB_NOUSER) ||
- d_is_negative(f.file->f_path.dentry) ||
- IS_PRIVATE(d_backing_inode(f.file->f_path.dentry))) {
+ if ((fd_file(f)->f_op == &ruleset_fops) ||
+ (fd_file(f)->f_path.mnt->mnt_flags & MNT_INTERNAL) ||
+ (fd_file(f)->f_path.dentry->d_sb->s_flags & SB_NOUSER) ||
+ d_is_negative(fd_file(f)->f_path.dentry) ||
+ IS_PRIVATE(d_backing_inode(fd_file(f)->f_path.dentry))) {
err = -EBADFD;
goto out_fdput;
}
- *path = f.file->f_path;
+ *path = fd_file(f)->f_path;
path_get(path);
out_fdput:
diff --git a/security/loadpin/loadpin.c b/security/loadpin/loadpin.c
index 93fd4d47b334..02144ec39f43 100644
--- a/security/loadpin/loadpin.c
+++ b/security/loadpin/loadpin.c
@@ -296,7 +296,7 @@ static int read_trusted_verity_root_digests(unsigned int fd)
return -EPERM;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EINVAL;
data = kzalloc(SZ_4K, GFP_KERNEL);
@@ -305,7 +305,7 @@ static int read_trusted_verity_root_digests(unsigned int fd)
goto err;
}
- rc = kernel_read_file(f.file, 0, (void **)&data, SZ_4K - 1, NULL, READING_POLICY);
+ rc = kernel_read_file(fd_file(f), 0, (void **)&data, SZ_4K - 1, NULL, READING_POLICY);
if (rc < 0)
goto err;
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 521ba56392a0..388fdc226c1a 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2242,12 +2242,12 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
bool nonatomic = substream->pcm->nonatomic;
CLASS(fd, f)(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADFD;
- if (!is_pcm_file(f.file))
+ if (!is_pcm_file(fd_file(f)))
return -EBADFD;
- pcm_file = f.file->private_data;
+ pcm_file = fd_file(f)->private_data;
substream1 = pcm_file->substream;
if (substream == substream1)
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 229570059a1b..65efb3735e79 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -327,12 +327,12 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
seqcount_spinlock_init(&irqfd->irq_entry_sc, &kvm->irqfds.lock);
f = fdget(args->fd);
- if (!f.file) {
+ if (!fd_file(f)) {
ret = -EBADF;
goto out;
}
- eventfd = eventfd_ctx_fileget(f.file);
+ eventfd = eventfd_ctx_fileget(fd_file(f));
if (IS_ERR(eventfd)) {
ret = PTR_ERR(eventfd);
goto fail;
@@ -419,7 +419,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
* Check if there was an event already pending on the eventfd
* before we registered, and trigger it as if we didn't miss it.
*/
- events = vfs_poll(f.file, &irqfd->pt);
+ events = vfs_poll(fd_file(f), &irqfd->pt);
if (events & EPOLLIN)
schedule_work(&irqfd->inject);
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 76b7f6085dcd..388ae471d258 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -194,7 +194,7 @@ static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
int ret;
f = fdget(fd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
ret = -ENOENT;
@@ -202,7 +202,7 @@ static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
mutex_lock(&kv->lock);
list_for_each_entry(kvf, &kv->file_list, node) {
- if (kvf->file != f.file)
+ if (kvf->file != fd_file(f))
continue;
list_del(&kvf->node);
@@ -240,7 +240,7 @@ static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
return -EFAULT;
f = fdget(param.groupfd);
- if (!f.file)
+ if (!fd_file(f))
return -EBADF;
ret = -ENOENT;
@@ -248,7 +248,7 @@ static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
mutex_lock(&kv->lock);
list_for_each_entry(kvf, &kv->file_list, node) {
- if (kvf->file != f.file)
+ if (kvf->file != fd_file(f))
continue;
if (!kvf->iommu_group) {
--
2.39.2
* [PATCH 04/19] struct fd: representation change
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
2024-06-07 1:59 ` [PATCH 02/19] lirc: rc_dev_get_from_fd(): fix file leak Al Viro
2024-06-07 1:59 ` [PATCH 03/19] introduce fd_file(), convert all accessors to it Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 5:55 ` Amir Goldstein
2024-06-07 1:59 ` [PATCH 05/19] add struct fd constructors, get rid of __to_fd() Al Viro
` (15 subsequent siblings)
18 siblings, 1 reply; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
The vast majority of instances come from fdget() and its
relatives; the underlying primitives actually return a struct file
reference and a couple of flags encoded into an unsigned long - the lower
two bits of a struct file address are always zero, so we can stash the flags
into those. On the way out we use __to_fd() to unpack that unsigned
long into struct fd.
Let's use that representation for struct fd itself - make it
a structure with a single unsigned long member (.word), with the value
equal either to (unsigned long)p | flags, p being an address of some
struct file instance, or to 0 for an empty fd.
Note that we never used a struct fd instance with NULL ->file
and non-zero ->flags; the emptiness had been checked as (!f.file) and
we expected e.g. fdput(empty) to be a no-op. With the new representation
we can use (!f.word) for the emptiness check; that is enough for the compiler
to figure out that (f.word & FDPUT_FPUT) will be false and that fdput(f)
will be a no-op in that case.
For now the new predicate (fd_empty(f)) has no users; all the
existing checks have the form (!fd_file(f)). We will convert to fd_empty()
use later; here we only define it (and tell the compiler that it's
unlikely to return true).
This commit only deals with representation change; there will
be followups.
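For illustration only, the packing described above can be sketched in
userspace C. This is a mock-up, not the kernel code: struct file here is
a stand-in, fd_pack() is a hypothetical helper that does not exist in the
patch, and the sketch assumes unsigned long is as wide as a pointer (as it
is on the ABIs the kernel supports).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct file.  The long member
 * gives it at least 4-byte alignment on common ABIs, so the two low
 * bits of any instance's address are zero. */
struct file {
	long refcount;
};

#define FDPUT_FPUT       1UL
#define FDPUT_POS_UNLOCK 2UL

struct fd {
	unsigned long word;	/* (unsigned long)file | flags, or 0 if empty */
};

/* Strip the flag bits to recover the pointer. */
static inline struct file *fd_file(struct fd f)
{
	return (struct file *)(f.word & ~3UL);
}

/* An empty fd is represented by an all-zero word. */
static inline bool fd_empty(struct fd f)
{
	return !f.word;
}

/* Hypothetical helper (not in the patch): combine pointer and flags. */
static inline struct fd fd_pack(struct file *file, unsigned long flags)
{
	return (struct fd){ (unsigned long)file | flags };
}
```

Since a zero word has no flag bits set, a test like (f.word & FDPUT_FPUT)
is trivially false for an empty fd, which is what lets the compiler drop
fdput() on paths where emptiness has already been established.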
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
drivers/infiniband/core/uverbs_cmd.c | 2 +-
fs/overlayfs/file.c | 28 +++++++++++++++-------------
fs/xfs/xfs_handle.c | 2 +-
include/linux/file.h | 22 ++++++++++++++++------
kernel/events/core.c | 2 +-
net/socket.c | 2 +-
6 files changed, 35 insertions(+), 23 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 03ea3afcb31f..efe3cc3debba 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -572,7 +572,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
struct inode *inode = NULL;
int new_xrcd = 0;
struct ib_device *ib_dev;
- struct fd f = {};
+ struct fd f = EMPTY_FD;
int ret;
ret = uverbs_request(attrs, &cmd, sizeof(cmd));
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index c4963d0c5549..458299873780 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -93,11 +93,11 @@ static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
bool allow_meta)
{
struct dentry *dentry = file_dentry(file);
+ struct file *private = file->private_data;
struct path realpath;
int err;
- real->flags = 0;
- real->file = file->private_data;
+ real->word = (unsigned long)private;
if (allow_meta) {
ovl_path_real(dentry, &realpath);
@@ -113,16 +113,17 @@ static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
return -EIO;
/* Has it been copied up since we'd opened it? */
- if (unlikely(file_inode(real->file) != d_inode(realpath.dentry))) {
- real->flags = FDPUT_FPUT;
- real->file = ovl_open_realfile(file, &realpath);
-
- return PTR_ERR_OR_ZERO(real->file);
+ if (unlikely(file_inode(private) != d_inode(realpath.dentry))) {
+ struct file *f = ovl_open_realfile(file, &realpath);
+ if (IS_ERR(f))
+ return PTR_ERR(f);
+ real->word = (unsigned long)f | FDPUT_FPUT;
+ return 0;
}
/* Did the flags change since open? */
- if (unlikely((file->f_flags ^ real->file->f_flags) & ~OVL_OPEN_FLAGS))
- return ovl_change_flags(real->file, file->f_flags);
+ if (unlikely((file->f_flags ^ private->f_flags) & ~OVL_OPEN_FLAGS))
+ return ovl_change_flags(private, file->f_flags);
return 0;
}
@@ -130,10 +131,11 @@ static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
static int ovl_real_fdget(const struct file *file, struct fd *real)
{
if (d_is_dir(file_dentry(file))) {
- real->flags = 0;
- real->file = ovl_dir_real_file(file, false);
-
- return PTR_ERR_OR_ZERO(real->file);
+ struct file *f = ovl_dir_real_file(file, false);
+ if (IS_ERR(f))
+ return PTR_ERR(f);
+ real->word = (unsigned long)f;
+ return 0;
}
return ovl_real_fdget_meta(file, real, false);
diff --git a/fs/xfs/xfs_handle.c b/fs/xfs/xfs_handle.c
index 445a2daff233..bb250f4246b3 100644
--- a/fs/xfs/xfs_handle.c
+++ b/fs/xfs/xfs_handle.c
@@ -86,7 +86,7 @@ xfs_find_handle(
int hsize;
xfs_handle_t handle;
struct inode *inode;
- struct fd f = {NULL};
+ struct fd f = EMPTY_FD;
struct path path;
int error;
struct xfs_inode *ip;
diff --git a/include/linux/file.h b/include/linux/file.h
index 0964408727a7..39eb10a1bbfc 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -35,18 +35,28 @@ static inline void fput_light(struct file *file, int fput_needed)
fput(file);
}
+/* either a reference to struct file + flags
+ * (cloned vs. borrowed, pos locked), with
+ * flags stored in lower bits of value,
+ * or empty (represented by 0).
+ */
struct fd {
- struct file *file;
- unsigned int flags;
+ unsigned long word;
};
#define FDPUT_FPUT 1
#define FDPUT_POS_UNLOCK 2
-#define fd_file(f) ((f).file)
+#define fd_file(f) ((struct file *)((f).word & ~3))
+static inline bool fd_empty(struct fd f)
+{
+ return unlikely(!f.word);
+}
+
+#define EMPTY_FD (struct fd){0}
static inline void fdput(struct fd fd)
{
- if (fd.flags & FDPUT_FPUT)
+ if (fd.word & FDPUT_FPUT)
fput(fd_file(fd));
}
@@ -60,7 +70,7 @@ extern void __f_unlock_pos(struct file *);
static inline struct fd __to_fd(unsigned long v)
{
- return (struct fd){(struct file *)(v & ~3),v & 3};
+ return (struct fd){v};
}
static inline struct fd fdget(unsigned int fd)
@@ -80,7 +90,7 @@ static inline struct fd fdget_pos(int fd)
static inline void fdput_pos(struct fd f)
{
- if (f.flags & FDPUT_POS_UNLOCK)
+ if (f.word & FDPUT_POS_UNLOCK)
__f_unlock_pos(fd_file(f));
fdput(f);
}
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7acf44111a6e..fd4621cd9c23 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -12438,7 +12438,7 @@ SYSCALL_DEFINE5(perf_event_open,
struct perf_event_attr attr;
struct perf_event_context *ctx;
struct file *event_file = NULL;
- struct fd group = {NULL, 0};
+ struct fd group = EMPTY_FD;
struct task_struct *task = NULL;
struct pmu *pmu;
int event_fd;
diff --git a/net/socket.c b/net/socket.c
index 50b074f52147..a2c509363d4d 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -559,7 +559,7 @@ static struct socket *sockfd_lookup_light(int fd, int *err, int *fput_needed)
if (fd_file(f)) {
sock = sock_from_file(fd_file(f));
if (likely(sock)) {
- *fput_needed = f.flags & FDPUT_FPUT;
+ *fput_needed = f.word & FDPUT_FPUT;
return sock;
}
*err = -ENOTSOCK;
--
2.39.2
* [PATCH 05/19] add struct fd constructors, get rid of __to_fd()
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (2 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 04/19] struct fd: representation change Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 06/19] net/socket.c: switch to CLASS(fd) Al Viro
` (14 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
Make __fdget() et al. return struct fd directly.
New helpers: BORROWED_FD(file) and CLONED_FD(file), for
borrowed and cloned file references resp.
NOTE: this might need tuning; in particular, inline on
__fget_light() is there to keep the code generation the same as
before - we probably want to keep it inlined in fdget() et al.
(especially so in fdget_pos()), but that needs profiling.
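A rough userspace sketch of how the two constructors pair with fdput()
may help here. Everything below is a mock-up under stated assumptions:
struct file and fput() are stand-ins, and mock_fdget() with its
table_shared argument is a hypothetical replacement for the real
descriptor table lookup in __fget_light().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct file {
	long count;		/* mock reference count */
};

#define FDPUT_FPUT 1UL

struct fd {
	unsigned long word;
};

#define EMPTY_FD (struct fd){0}
#define fd_file(f) ((struct file *)((f).word & ~3UL))

/* Borrowed reference: no flag set, so fdput() leaves the count alone. */
static inline struct fd BORROWED_FD(struct file *f)
{
	return (struct fd){(unsigned long)f};
}

/* Cloned reference: FDPUT_FPUT tells fdput() to drop it. */
static inline struct fd CLONED_FD(struct file *f)
{
	return (struct fd){(unsigned long)f | FDPUT_FPUT};
}

/* Mock fput(): only decrements the counter. */
static void fput(struct file *file)
{
	file->count--;
}

static inline void fdput(struct fd fd)
{
	if (fd.word & FDPUT_FPUT)
		fput(fd_file(fd));
}

/* Hypothetical lookup; table_shared stands in for the kernel's
 * "descriptor table has more than one user" check. */
static struct fd mock_fdget(struct file *file, bool table_shared)
{
	if (!file)
		return EMPTY_FD;
	if (!table_shared)
		return BORROWED_FD(file);
	file->count++;
	return CLONED_FD(file);
}
```

In this sketch fdput() is balanced in all three cases: it does nothing for
EMPTY_FD and BORROWED_FD() and undoes the increment for CLONED_FD().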
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
fs/file.c | 26 +++++++++++++-------------
include/linux/file.h | 33 +++++++++++----------------------
2 files changed, 24 insertions(+), 35 deletions(-)
diff --git a/fs/file.c b/fs/file.c
index 8076aef9c210..f78e114a03cc 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -1128,7 +1128,7 @@ EXPORT_SYMBOL(task_lookup_next_fdget_rcu);
* The fput_needed flag returned by fget_light should be passed to the
* corresponding fput_light.
*/
-static unsigned long __fget_light(unsigned int fd, fmode_t mask)
+static inline struct fd __fget_light(unsigned int fd, fmode_t mask)
{
struct files_struct *files = current->files;
struct file *file;
@@ -1145,22 +1145,22 @@ static unsigned long __fget_light(unsigned int fd, fmode_t mask)
if (likely(atomic_read_acquire(&files->count) == 1)) {
file = files_lookup_fd_raw(files, fd);
if (!file || unlikely(file->f_mode & mask))
- return 0;
- return (unsigned long)file;
+ return EMPTY_FD;
+ return BORROWED_FD(file);
} else {
file = __fget_files(files, fd, mask);
if (!file)
- return 0;
- return FDPUT_FPUT | (unsigned long)file;
+ return EMPTY_FD;
+ return CLONED_FD(file);
}
}
-unsigned long __fdget(unsigned int fd)
+struct fd fdget(unsigned int fd)
{
return __fget_light(fd, FMODE_PATH);
}
-EXPORT_SYMBOL(__fdget);
+EXPORT_SYMBOL(fdget);
-unsigned long __fdget_raw(unsigned int fd)
+struct fd fdget_raw(unsigned int fd)
{
return __fget_light(fd, 0);
}
@@ -1181,16 +1181,16 @@ static inline bool file_needs_f_pos_lock(struct file *file)
(file_count(file) > 1 || file->f_op->iterate_shared);
}
-unsigned long __fdget_pos(unsigned int fd)
+struct fd fdget_pos(unsigned int fd)
{
- unsigned long v = __fdget(fd);
- struct file *file = (struct file *)(v & ~3);
+ struct fd f = fdget(fd);
+ struct file *file = fd_file(f);
if (file && file_needs_f_pos_lock(file)) {
- v |= FDPUT_POS_UNLOCK;
+ f.word |= FDPUT_POS_UNLOCK;
mutex_lock(&file->f_pos_lock);
}
- return v;
+ return f;
}
void __f_unlock_pos(struct file *f)
diff --git a/include/linux/file.h b/include/linux/file.h
index 39eb10a1bbfc..7bc6e24e86ce 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -53,6 +53,14 @@ static inline bool fd_empty(struct fd f)
}
#define EMPTY_FD (struct fd){0}
+static inline struct fd BORROWED_FD(struct file *f)
+{
+ return (struct fd){(unsigned long)f};
+}
+static inline struct fd CLONED_FD(struct file *f)
+{
+ return (struct fd){(unsigned long)f | FDPUT_FPUT};
+}
static inline void fdput(struct fd fd)
{
@@ -63,30 +71,11 @@ static inline void fdput(struct fd fd)
extern struct file *fget(unsigned int fd);
extern struct file *fget_raw(unsigned int fd);
extern struct file *fget_task(struct task_struct *task, unsigned int fd);
-extern unsigned long __fdget(unsigned int fd);
-extern unsigned long __fdget_raw(unsigned int fd);
-extern unsigned long __fdget_pos(unsigned int fd);
extern void __f_unlock_pos(struct file *);
-static inline struct fd __to_fd(unsigned long v)
-{
- return (struct fd){v};
-}
-
-static inline struct fd fdget(unsigned int fd)
-{
- return __to_fd(__fdget(fd));
-}
-
-static inline struct fd fdget_raw(unsigned int fd)
-{
- return __to_fd(__fdget_raw(fd));
-}
-
-static inline struct fd fdget_pos(int fd)
-{
- return __to_fd(__fdget_pos(fd));
-}
+struct fd fdget(unsigned int fd);
+struct fd fdget_raw(unsigned int fd);
+struct fd fdget_pos(unsigned int fd);
static inline void fdput_pos(struct fd f)
{
--
2.39.2
* [PATCH 06/19] net/socket.c: switch to CLASS(fd)
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (3 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 05/19] add struct fd constructors, get rid of __to_fd() Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 07/19] introduce struct fderr, convert overlayfs uses to that Al Viro
` (13 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
I strongly suspect that the important part in sockfd_lookup_light()
is avoiding needless file refcount operations, not the marginal reduction
of the register pressure from not keeping a struct file pointer in
the caller.
If that's true, we should get the same benefits from straight
fdget()/fdput(). And AFAICS with sane use of CLASS(fd) we can get
better code generation...
Would be nice if somebody tested it on networking test suites
(including benchmarks)...
sockfd_lookup_light() does fdget(), uses sock_from_file() to
get the associated socket and returns the struct socket reference to
the caller, along with a "do we need to fput()" flag. There is no matching
fdput(); the caller does its equivalent manually, using the fact that
sock->file points to the struct file the socket has come from.
Get rid of that - have the callers do fdget()/fdput() and
use sock_from_file() directly. That kills sockfd_lookup_light()
and fput_light() (no users left).
What's more, we can get rid of explicit fdget()/fdput() by
switching to CLASS(fd, ...) - code generation does not suffer, since
the fdput() now inserted on the "descriptor is not opened" failure exit
is recognized to be a no-op by the compiler.
We could split that commit in two (getting rid of sockfd_lookup_light()
and switch to CLASS(fd, ...)), but AFAICS it ends up being harder to read
that way.
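To make the shape of the conversion concrete, here is a userspace
approximation of the pattern. It is a sketch, not the patch: CLASS_FD()
is a simplified stand-in for the kernel's CLASS(fd, name)(arg) built on
the GCC/Clang cleanup attribute, and mock_table, the is_socket field and
mock_sys_listen() are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct file {
	long count;		/* mock reference count */
	int is_socket;		/* stand-in for the f_op check in sock_from_file() */
};

struct fd {
	unsigned long word;
};

#define FDPUT_FPUT 1UL
#define fd_file(f) ((struct file *)((f).word & ~3UL))

static struct file *mock_table[4];	/* hypothetical descriptor table */

/* Mock fdget(): always takes a cloned reference, for simplicity. */
static struct fd fdget(unsigned int fd)
{
	struct file *file = fd < 4 ? mock_table[fd] : NULL;

	if (!file)
		return (struct fd){0};
	file->count++;
	return (struct fd){(unsigned long)file | FDPUT_FPUT};
}

static void fdput(struct fd f)
{
	if (f.word & FDPUT_FPUT)
		fd_file(f)->count--;
}

/* Runs automatically when the variable goes out of scope. */
static inline void class_fd_destructor(struct fd *p)
{
	fdput(*p);
}

/* Simplified stand-in for CLASS(fd, name)(arg). */
#define CLASS_FD(name, arg) \
	struct fd name __attribute__((cleanup(class_fd_destructor))) = fdget(arg)

/* Hypothetical syscall body: every return path drops the reference. */
static int mock_sys_listen(unsigned int fd)
{
	CLASS_FD(f, fd);

	if (!fd_file(f))
		return -EBADF;
	if (!fd_file(f)->is_socket)
		return -ENOTSOCK;
	return 0;
}
```

On the -EBADF exit the destructor still runs, but f.word is zero there, so
the FDPUT_FPUT test is false and the call reduces to nothing.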
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
include/linux/file.h | 6 -
net/socket.c | 325 ++++++++++++++++++++-----------------------
2 files changed, 148 insertions(+), 183 deletions(-)
diff --git a/include/linux/file.h b/include/linux/file.h
index 7bc6e24e86ce..744a6315f1ac 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -29,12 +29,6 @@ extern struct file *alloc_file_pseudo_noaccount(struct inode *, struct vfsmount
extern struct file *alloc_file_clone(struct file *, int flags,
const struct file_operations *);
-static inline void fput_light(struct file *file, int fput_needed)
-{
- if (fput_needed)
- fput(file);
-}
-
/* either a reference to struct file + flags
* (cloned vs. borrowed, pos locked), with
* flags stored in lower bits of value,
diff --git a/net/socket.c b/net/socket.c
index a2c509363d4d..ca3fd754fb5b 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -510,7 +510,7 @@ static int sock_map_fd(struct socket *sock, int flags)
struct socket *sock_from_file(struct file *file)
{
- if (file->f_op == &socket_file_ops)
+ if (likely(file->f_op == &socket_file_ops))
return file->private_data; /* set in sock_alloc_file */
return NULL;
@@ -550,24 +550,6 @@ struct socket *sockfd_lookup(int fd, int *err)
}
EXPORT_SYMBOL(sockfd_lookup);
-static struct socket *sockfd_lookup_light(int fd, int *err, int *fput_needed)
-{
- struct fd f = fdget(fd);
- struct socket *sock;
-
- *err = -EBADF;
- if (fd_file(f)) {
- sock = sock_from_file(fd_file(f));
- if (likely(sock)) {
- *fput_needed = f.word & FDPUT_FPUT;
- return sock;
- }
- *err = -ENOTSOCK;
- fdput(f);
- }
- return NULL;
-}
-
static ssize_t sockfs_listxattr(struct dentry *dentry, char *buffer,
size_t size)
{
@@ -1834,23 +1816,25 @@ int __sys_bind(int fd, struct sockaddr __user *umyaddr, int addrlen)
{
struct socket *sock;
struct sockaddr_storage address;
- int err, fput_needed;
-
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (sock) {
- err = move_addr_to_kernel(umyaddr, addrlen, &address);
- if (!err) {
- err = security_socket_bind(sock,
- (struct sockaddr *)&address,
- addrlen);
- if (!err)
- err = READ_ONCE(sock->ops)->bind(sock,
- (struct sockaddr *)
- &address, addrlen);
- }
- fput_light(sock->file, fput_needed);
- }
- return err;
+ CLASS(fd, f)(fd);
+ int err;
+
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
+
+ err = move_addr_to_kernel(umyaddr, addrlen, &address);
+ if (unlikely(err))
+ return err;
+
+ err = security_socket_bind(sock, (struct sockaddr *)&address, addrlen);
+ if (unlikely(err))
+ return err;
+
+ return READ_ONCE(sock->ops)->bind(sock,
+ (struct sockaddr *)&address, addrlen);
}
SYSCALL_DEFINE3(bind, int, fd, struct sockaddr __user *, umyaddr, int, addrlen)
@@ -1867,21 +1851,24 @@ SYSCALL_DEFINE3(bind, int, fd, struct sockaddr __user *, umyaddr, int, addrlen)
int __sys_listen(int fd, int backlog)
{
struct socket *sock;
- int err, fput_needed;
+ CLASS(fd, f)(fd);
int somaxconn;
+ int err;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (sock) {
- somaxconn = READ_ONCE(sock_net(sock->sk)->core.sysctl_somaxconn);
- if ((unsigned int)backlog > somaxconn)
- backlog = somaxconn;
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
- err = security_socket_listen(sock, backlog);
- if (!err)
- err = READ_ONCE(sock->ops)->listen(sock, backlog);
+ somaxconn = READ_ONCE(sock_net(sock->sk)->core.sysctl_somaxconn);
+ if ((unsigned int)backlog > somaxconn)
+ backlog = somaxconn;
+
+ err = security_socket_listen(sock, backlog);
+ if (!err)
+ err = READ_ONCE(sock->ops)->listen(sock, backlog);
- fput_light(sock->file, fput_needed);
- }
return err;
}
@@ -1992,17 +1979,12 @@ static int __sys_accept4_file(struct file *file, struct sockaddr __user *upeer_s
int __sys_accept4(int fd, struct sockaddr __user *upeer_sockaddr,
int __user *upeer_addrlen, int flags)
{
- int ret = -EBADF;
- struct fd f;
+ CLASS(fd, f)(fd);
- f = fdget(fd);
- if (fd_file(f)) {
- ret = __sys_accept4_file(fd_file(f), upeer_sockaddr,
+ if (fd_empty(f))
+ return -EBADF;
+ return __sys_accept4_file(fd_file(f), upeer_sockaddr,
upeer_addrlen, flags);
- fdput(f);
- }
-
- return ret;
}
SYSCALL_DEFINE4(accept4, int, fd, struct sockaddr __user *, upeer_sockaddr,
@@ -2054,20 +2036,18 @@ int __sys_connect_file(struct file *file, struct sockaddr_storage *address,
int __sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen)
{
- int ret = -EBADF;
- struct fd f;
+ struct sockaddr_storage address;
+ CLASS(fd, f)(fd);
+ int ret;
- f = fdget(fd);
- if (fd_file(f)) {
- struct sockaddr_storage address;
+ if (fd_empty(f))
+ return -EBADF;
- ret = move_addr_to_kernel(uservaddr, addrlen, &address);
- if (!ret)
- ret = __sys_connect_file(fd_file(f), &address, addrlen, 0);
- fdput(f);
- }
+ ret = move_addr_to_kernel(uservaddr, addrlen, &address);
+ if (ret)
+ return ret;
- return ret;
+ return __sys_connect_file(fd_file(f), &address, addrlen, 0);
}
SYSCALL_DEFINE3(connect, int, fd, struct sockaddr __user *, uservaddr,
@@ -2086,26 +2066,25 @@ int __sys_getsockname(int fd, struct sockaddr __user *usockaddr,
{
struct socket *sock;
struct sockaddr_storage address;
- int err, fput_needed;
+ CLASS(fd, f)(fd);
+ int err;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- goto out;
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
err = security_socket_getsockname(sock);
if (err)
- goto out_put;
+ return err;
err = READ_ONCE(sock->ops)->getname(sock, (struct sockaddr *)&address, 0);
if (err < 0)
- goto out_put;
- /* "err" is actually length in this case */
- err = move_addr_to_user(&address, err, usockaddr, usockaddr_len);
+ return err;
-out_put:
- fput_light(sock->file, fput_needed);
-out:
- return err;
+ /* "err" is actually length in this case */
+ return move_addr_to_user(&address, err, usockaddr, usockaddr_len);
}
SYSCALL_DEFINE3(getsockname, int, fd, struct sockaddr __user *, usockaddr,
@@ -2124,26 +2103,25 @@ int __sys_getpeername(int fd, struct sockaddr __user *usockaddr,
{
struct socket *sock;
struct sockaddr_storage address;
- int err, fput_needed;
+ CLASS(fd, f)(fd);
+ int err;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (sock != NULL) {
- const struct proto_ops *ops = READ_ONCE(sock->ops);
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
- err = security_socket_getpeername(sock);
- if (err) {
- fput_light(sock->file, fput_needed);
- return err;
- }
+ err = security_socket_getpeername(sock);
+ if (err)
+ return err;
- err = ops->getname(sock, (struct sockaddr *)&address, 1);
- if (err >= 0)
- /* "err" is actually length in this case */
- err = move_addr_to_user(&address, err, usockaddr,
- usockaddr_len);
- fput_light(sock->file, fput_needed);
- }
- return err;
+ err = READ_ONCE(sock->ops)->getname(sock, (struct sockaddr *)&address, 1);
+ if (err < 0)
+ return err;
+
+ /* "err" is actually length in this case */
+ return move_addr_to_user(&address, err, usockaddr, usockaddr_len);
}
SYSCALL_DEFINE3(getpeername, int, fd, struct sockaddr __user *, usockaddr,
@@ -2164,14 +2142,17 @@ int __sys_sendto(int fd, void __user *buff, size_t len, unsigned int flags,
struct sockaddr_storage address;
int err;
struct msghdr msg;
- int fput_needed;
err = import_ubuf(ITER_SOURCE, buff, len, &msg.msg_iter);
if (unlikely(err))
return err;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- goto out;
+
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
msg.msg_name = NULL;
msg.msg_control = NULL;
@@ -2181,7 +2162,7 @@ int __sys_sendto(int fd, void __user *buff, size_t len, unsigned int flags,
if (addr) {
err = move_addr_to_kernel(addr, addr_len, &address);
if (err < 0)
- goto out_put;
+ return err;
msg.msg_name = (struct sockaddr *)&address;
msg.msg_namelen = addr_len;
}
@@ -2189,12 +2170,7 @@ int __sys_sendto(int fd, void __user *buff, size_t len, unsigned int flags,
if (sock->file->f_flags & O_NONBLOCK)
flags |= MSG_DONTWAIT;
msg.msg_flags = flags;
- err = __sock_sendmsg(sock, &msg);
-
-out_put:
- fput_light(sock->file, fput_needed);
-out:
- return err;
+ return __sock_sendmsg(sock, &msg);
}
SYSCALL_DEFINE6(sendto, int, fd, void __user *, buff, size_t, len,
@@ -2229,14 +2205,18 @@ int __sys_recvfrom(int fd, void __user *ubuf, size_t size, unsigned int flags,
};
struct socket *sock;
int err, err2;
- int fput_needed;
err = import_ubuf(ITER_DEST, ubuf, size, &msg.msg_iter);
if (unlikely(err))
return err;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- goto out;
+
+ CLASS(fd, f)(fd);
+
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
if (sock->file->f_flags & O_NONBLOCK)
flags |= MSG_DONTWAIT;
@@ -2248,9 +2228,6 @@ int __sys_recvfrom(int fd, void __user *ubuf, size_t size, unsigned int flags,
if (err2 < 0)
err = err2;
}
-
- fput_light(sock->file, fput_needed);
-out:
return err;
}
@@ -2325,17 +2302,16 @@ int __sys_setsockopt(int fd, int level, int optname, char __user *user_optval,
{
sockptr_t optval = USER_SOCKPTR(user_optval);
bool compat = in_compat_syscall();
- int err, fput_needed;
struct socket *sock;
+ CLASS(fd, f)(fd);
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- return err;
-
- err = do_sock_setsockopt(sock, compat, level, optname, optval, optlen);
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
- fput_light(sock->file, fput_needed);
- return err;
+ return do_sock_setsockopt(sock, compat, level, optname, optval, optlen);
}
SYSCALL_DEFINE5(setsockopt, int, fd, int, level, int, optname,
@@ -2391,20 +2367,17 @@ EXPORT_SYMBOL(do_sock_getsockopt);
int __sys_getsockopt(int fd, int level, int optname, char __user *optval,
int __user *optlen)
{
- int err, fput_needed;
struct socket *sock;
- bool compat;
+ CLASS(fd, f)(fd);
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- return err;
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
- compat = in_compat_syscall();
- err = do_sock_getsockopt(sock, compat, level, optname,
+ return do_sock_getsockopt(sock, in_compat_syscall(), level, optname,
USER_SOCKPTR(optval), USER_SOCKPTR(optlen));
-
- fput_light(sock->file, fput_needed);
- return err;
}
SYSCALL_DEFINE5(getsockopt, int, fd, int, level, int, optname,
@@ -2430,15 +2403,16 @@ int __sys_shutdown_sock(struct socket *sock, int how)
int __sys_shutdown(int fd, int how)
{
- int err, fput_needed;
struct socket *sock;
+ CLASS(fd, f)(fd);
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (sock != NULL) {
- err = __sys_shutdown_sock(sock, how);
- fput_light(sock->file, fput_needed);
- }
- return err;
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
+
+ return __sys_shutdown_sock(sock, how);
}
SYSCALL_DEFINE2(shutdown, int, fd, int, how)
@@ -2654,22 +2628,21 @@ long __sys_sendmsg_sock(struct socket *sock, struct msghdr *msg,
long __sys_sendmsg(int fd, struct user_msghdr __user *msg, unsigned int flags,
bool forbid_cmsg_compat)
{
- int fput_needed, err;
struct msghdr msg_sys;
struct socket *sock;
if (forbid_cmsg_compat && (flags & MSG_CMSG_COMPAT))
return -EINVAL;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- goto out;
+ CLASS(fd, f)(fd);
- err = ___sys_sendmsg(sock, msg, &msg_sys, flags, NULL, 0);
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
- fput_light(sock->file, fput_needed);
-out:
- return err;
+ return ___sys_sendmsg(sock, msg, &msg_sys, flags, NULL, 0);
}
SYSCALL_DEFINE3(sendmsg, int, fd, struct user_msghdr __user *, msg, unsigned int, flags)
@@ -2684,7 +2657,7 @@ SYSCALL_DEFINE3(sendmsg, int, fd, struct user_msghdr __user *, msg, unsigned int
int __sys_sendmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
unsigned int flags, bool forbid_cmsg_compat)
{
- int fput_needed, err, datagrams;
+ int err, datagrams;
struct socket *sock;
struct mmsghdr __user *entry;
struct compat_mmsghdr __user *compat_entry;
@@ -2700,9 +2673,13 @@ int __sys_sendmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
datagrams = 0;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- return err;
+ CLASS(fd, f)(fd);
+
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
used_address.name_len = UINT_MAX;
entry = mmsg;
@@ -2739,8 +2716,6 @@ int __sys_sendmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
cond_resched();
}
- fput_light(sock->file, fput_needed);
-
/* We only return an error if no datagrams were able to be sent */
if (datagrams != 0)
return datagrams;
@@ -2862,22 +2837,21 @@ long __sys_recvmsg_sock(struct socket *sock, struct msghdr *msg,
long __sys_recvmsg(int fd, struct user_msghdr __user *msg, unsigned int flags,
bool forbid_cmsg_compat)
{
- int fput_needed, err;
struct msghdr msg_sys;
struct socket *sock;
if (forbid_cmsg_compat && (flags & MSG_CMSG_COMPAT))
return -EINVAL;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- goto out;
+ CLASS(fd, f)(fd);
- err = ___sys_recvmsg(sock, msg, &msg_sys, flags, 0);
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
- fput_light(sock->file, fput_needed);
-out:
- return err;
+ return ___sys_recvmsg(sock, msg, &msg_sys, flags, 0);
}
SYSCALL_DEFINE3(recvmsg, int, fd, struct user_msghdr __user *, msg,
@@ -2894,7 +2868,7 @@ static int do_recvmmsg(int fd, struct mmsghdr __user *mmsg,
unsigned int vlen, unsigned int flags,
struct timespec64 *timeout)
{
- int fput_needed, err, datagrams;
+ int err, datagrams;
struct socket *sock;
struct mmsghdr __user *entry;
struct compat_mmsghdr __user *compat_entry;
@@ -2909,16 +2883,18 @@ static int do_recvmmsg(int fd, struct mmsghdr __user *mmsg,
datagrams = 0;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- return err;
+ CLASS(fd, f)(fd);
+
+ if (fd_empty(f))
+ return -EBADF;
+ sock = sock_from_file(fd_file(f));
+ if (unlikely(!sock))
+ return -ENOTSOCK;
if (likely(!(flags & MSG_ERRQUEUE))) {
err = sock_error(sock->sk);
- if (err) {
- datagrams = err;
- goto out_put;
- }
+ if (err)
+ return err;
}
entry = mmsg;
@@ -2975,12 +2951,10 @@ static int do_recvmmsg(int fd, struct mmsghdr __user *mmsg,
}
if (err == 0)
- goto out_put;
+ return datagrams;
- if (datagrams == 0) {
- datagrams = err;
- goto out_put;
- }
+ if (datagrams == 0)
+ return err;
/*
* We may return less entries than requested (vlen) if the
@@ -2995,9 +2969,6 @@ static int do_recvmmsg(int fd, struct mmsghdr __user *mmsg,
*/
WRITE_ONCE(sock->sk->sk_err, -err);
}
-out_put:
- fput_light(sock->file, fput_needed);
-
return datagrams;
}
--
2.39.2
* [PATCH 07/19] introduce struct fderr, convert overlayfs uses to that
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (4 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 06/19] net/socket.c: switch to CLASS(fd) Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 08/19] fdget_raw() users: switch to CLASS(fd_raw, ...) Al Viro
` (12 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
Similar to struct fd; unlike struct fd, it can represent
error values.
Accessors:
* fd_empty(f):  true if f represents an error
* fd_file(f):   just as for struct fd it yields a pointer to
                struct file if fd_empty(f) is false. If
                fd_empty(f) is true, fd_file(f) is guaranteed
                _not_ to be an address of any object (IS_ERR()
                will be true in that case)
* fd_error(f):  if f represents an error, returns that error,
                otherwise the return value is junk.
Constructors:
* ERR_FD(-E...):        an instance encoding given error [ERR_FDERR, perhaps?]
* BORROWED_FDERR(file): if file points to a struct file instance,
                        return a struct fderr representing that file
                        reference with no flags set.
                        if file is an ERR_PTR(-E...), return a struct
                        fderr representing that error.
                        file MUST NOT be NULL.
* CLONED_FDERR(file):   similar, but in case when file points to
                        a struct file instance, set FDPUT_FPUT in flags.
fdput_err() serves as a destructor.
See fs/overlayfs/file.c for an example of use.
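The encoding described above can be sketched in userspace roughly as
follows. This is an illustration under assumptions, not the kernel
implementation: struct file_model and every *_model name are hypothetical,
the error range check mimics IS_ERR_VALUE() with an assumed MAX_ERRNO of
4095, and "dropping a reference" is just a counter decrement.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO_MODEL	4095		/* assumed size of the error range */
#define FPUT_FLAG_MODEL	((uintptr_t)1)	/* "cloned" flag in the low bit */

struct file_model { int refs; };	/* int member keeps low two bits of the address clear */
struct fderr_model { uintptr_t word; };	/* pointer plus flags, or negative errno */

/* Errors occupy the top MAX_ERRNO_MODEL values of the address space. */
static int word_is_err_model(uintptr_t w)
{
	return w >= (uintptr_t)-MAX_ERRNO_MODEL;
}

/* Constructors */
static struct fderr_model err_fd_model(long err)
{
	return (struct fderr_model){ (uintptr_t)err };
}

static struct fderr_model borrowed_fderr_model(struct file_model *f)
{
	return (struct fderr_model){ (uintptr_t)f };
}

static struct fderr_model cloned_fderr_model(struct file_model *f)
{
	if (word_is_err_model((uintptr_t)f))	/* never set flag bits on an error */
		return borrowed_fderr_model(f);
	return (struct fderr_model){ (uintptr_t)f | FPUT_FLAG_MODEL };
}

/* Accessors */
static int fderr_empty_model(struct fderr_model f)
{
	return word_is_err_model(f.word);
}

static struct file_model *fderr_file_model(struct fderr_model f)
{
	return (struct file_model *)(f.word & ~(uintptr_t)3);
}

static long fderr_error_model(struct fderr_model f)
{
	return (long)(intptr_t)f.word;		/* junk unless fderr_empty_model(f) */
}

/* Destructor: only a cloned, non-error value drops a reference. */
static void fderr_put_model(struct fderr_model f)
{
	if (!fderr_empty_model(f) && (f.word & FPUT_FLAG_MODEL))
		fderr_file_model(f)->refs--;
}

/* Walk through each constructor once; returns the final refcount. */
static int fderr_demo(void)
{
	struct file_model x = { 2 };
	struct fderr_model e = err_fd_model(-EIO);
	struct fderr_model c = cloned_fderr_model(&x);
	struct fderr_model b = borrowed_fderr_model(&x);

	assert(fderr_empty_model(e) && fderr_error_model(e) == -EIO);
	assert(!fderr_empty_model(c) && fderr_file_model(c) == &x);
	assert(!fderr_empty_model(b) && fderr_file_model(b) == &x);
	fderr_put_model(e);	/* no-op: error value */
	fderr_put_model(b);	/* no-op: borrowed reference */
	fderr_put_model(c);	/* drops exactly one reference */
	return x.refs;
}
```

Only the cloned value changes the count, so fderr_demo() ends with one
reference fewer than it started with. The emptiness check has to come
before the flag test in the destructor because an error value may well
have its low bit set.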
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
fs/overlayfs/file.c | 149 +++++++++++++++++--------------------------
include/linux/file.h | 39 ++++++++++-
2 files changed, 97 insertions(+), 91 deletions(-)
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 458299873780..d57106966084 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -89,58 +89,51 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
return 0;
}
-static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
- bool allow_meta)
+static struct fderr ovl_real_fdget_meta(const struct file *file, bool allow_meta)
{
struct dentry *dentry = file_dentry(file);
struct file *private = file->private_data;
struct path realpath;
int err;
- real->word = (unsigned long)private;
-
if (allow_meta) {
ovl_path_real(dentry, &realpath);
} else {
/* lazy lookup and verify of lowerdata */
err = ovl_verify_lowerdata(dentry);
if (err)
- return err;
+ return ERR_FD(err);
ovl_path_realdata(dentry, &realpath);
}
if (!realpath.dentry)
- return -EIO;
+ return ERR_FD(-EIO);
/* Has it been copied up since we'd opened it? */
if (unlikely(file_inode(private) != d_inode(realpath.dentry))) {
- struct file *f = ovl_open_realfile(file, &realpath);
- if (IS_ERR(f))
- return PTR_ERR(f);
- real->word = (unsigned long)ovl_open_realfile(file, &realpath) | FDPUT_FPUT;
- return 0;
+ return CLONED_FDERR(ovl_open_realfile(file, &realpath));
}
/* Did the flags change since open? */
- if (unlikely((file->f_flags ^ private->f_flags) & ~OVL_OPEN_FLAGS))
- return ovl_change_flags(private, file->f_flags);
+ if (unlikely((file->f_flags ^ private->f_flags) & ~OVL_OPEN_FLAGS)) {
+ err = ovl_change_flags(private, file->f_flags);
+ if (err)
+ return ERR_FD(err);
+ }
- return 0;
+ return BORROWED_FDERR(private);
}
-static int ovl_real_fdget(const struct file *file, struct fd *real)
+static struct fderr ovl_real_fdget(const struct file *file)
{
- if (d_is_dir(file_dentry(file))) {
- struct file *f = ovl_dir_real_file(file, false);
- if (IS_ERR(f))
- return PTR_ERR(f);
- real->word = (unsigned long)f;
- return 0;
- }
+ if (d_is_dir(file_dentry(file)))
+ return BORROWED_FDERR(ovl_dir_real_file(file, false));
- return ovl_real_fdget_meta(file, real, false);
+ return ovl_real_fdget_meta(file, false);
}
+DEFINE_CLASS(fd_real, struct fderr, fdput_err(_T), ovl_real_fdget(file), struct file *file)
+
static int ovl_open(struct inode *inode, struct file *file)
{
struct dentry *dentry = file_dentry(file);
@@ -183,7 +176,6 @@ static int ovl_release(struct inode *inode, struct file *file)
static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
{
struct inode *inode = file_inode(file);
- struct fd real;
const struct cred *old_cred;
loff_t ret;
@@ -199,9 +191,9 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
return vfs_setpos(file, 0, 0);
}
- ret = ovl_real_fdget(file, &real);
- if (ret)
- return ret;
+ CLASS(fd_real, real)(file);
+ if (fd_empty(real))
+ return fd_error(real);
/*
* Overlay file f_pos is the master copy that is preserved
@@ -220,8 +212,6 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
file->f_pos = fd_file(real)->f_pos;
ovl_inode_unlock(inode);
- fdput(real);
-
return ret;
}
@@ -262,8 +252,6 @@ static void ovl_file_accessed(struct file *file)
static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
{
struct file *file = iocb->ki_filp;
- struct fd real;
- ssize_t ret;
struct backing_file_ctx ctx = {
.cred = ovl_creds(file_inode(file)->i_sb),
.user_file = file,
@@ -273,22 +261,18 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
if (!iov_iter_count(iter))
return 0;
- ret = ovl_real_fdget(file, &real);
- if (ret)
- return ret;
-
- ret = backing_file_read_iter(fd_file(real), iter, iocb, iocb->ki_flags,
- &ctx);
- fdput(real);
+ CLASS(fd_real, real)(file);
+ if (fd_empty(real))
+ return fd_error(real);
- return ret;
+ return backing_file_read_iter(fd_file(real), iter, iocb, iocb->ki_flags,
+ &ctx);
}
static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
{
struct file *file = iocb->ki_filp;
struct inode *inode = file_inode(file);
- struct fd real;
ssize_t ret;
int ifl = iocb->ki_flags;
struct backing_file_ctx ctx = {
@@ -304,9 +288,11 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
/* Update mode */
ovl_copyattr(inode);
- ret = ovl_real_fdget(file, &real);
- if (ret)
+ CLASS(fd_real, real)(file);
+ if (fd_empty(real)) {
+ ret = fd_error(real);
goto out_unlock;
+ }
if (!ovl_should_sync(OVL_FS(inode->i_sb)))
ifl &= ~(IOCB_DSYNC | IOCB_SYNC);
@@ -317,7 +303,6 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
*/
ifl &= ~IOCB_DIO_CALLER_COMP;
ret = backing_file_write_iter(fd_file(real), iter, iocb, ifl, &ctx);
- fdput(real);
out_unlock:
inode_unlock(inode);
@@ -329,22 +314,18 @@ static ssize_t ovl_splice_read(struct file *in, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len,
unsigned int flags)
{
- struct fd real;
- ssize_t ret;
+ CLASS(fd_real, real)(in);
struct backing_file_ctx ctx = {
.cred = ovl_creds(file_inode(in)->i_sb),
.user_file = in,
.accessed = ovl_file_accessed,
};
- ret = ovl_real_fdget(in, &real);
- if (ret)
- return ret;
-
- ret = backing_file_splice_read(fd_file(real), ppos, pipe, len, flags, &ctx);
- fdput(real);
+ if (fd_empty(real))
+ return fd_error(real);
- return ret;
+ return backing_file_splice_read(fd_file(real), ppos, pipe, len, flags,
+ &ctx);
}
/*
@@ -358,7 +339,6 @@ static ssize_t ovl_splice_read(struct file *in, loff_t *ppos,
static ssize_t ovl_splice_write(struct pipe_inode_info *pipe, struct file *out,
loff_t *ppos, size_t len, unsigned int flags)
{
- struct fd real;
struct inode *inode = file_inode(out);
ssize_t ret;
struct backing_file_ctx ctx = {
@@ -371,13 +351,13 @@ static ssize_t ovl_splice_write(struct pipe_inode_info *pipe, struct file *out,
/* Update mode */
ovl_copyattr(inode);
- ret = ovl_real_fdget(out, &real);
- if (ret)
+ CLASS(fd_real, real)(out);
+ if (fd_empty(real)) {
+ ret = fd_error(real);
goto out_unlock;
+ }
ret = backing_file_splice_write(pipe, fd_file(real), ppos, len, flags, &ctx);
- fdput(real);
-
out_unlock:
inode_unlock(inode);
@@ -386,7 +366,7 @@ static ssize_t ovl_splice_write(struct pipe_inode_info *pipe, struct file *out,
static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
{
- struct fd real;
+ struct fderr real;
const struct cred *old_cred;
int ret;
@@ -394,9 +374,9 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
if (ret <= 0)
return ret;
- ret = ovl_real_fdget_meta(file, &real, !datasync);
- if (ret)
- return ret;
+ real = ovl_real_fdget_meta(file, !datasync);
+ if (fd_empty(real))
+ return fd_error(real);
/* Don't sync lower file for fear of receiving EROFS error */
if (file_inode(fd_file(real)) == ovl_inode_upper(file_inode(file))) {
@@ -405,7 +385,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
revert_creds(old_cred);
}
- fdput(real);
+ fdput_err(real);
return ret;
}
@@ -425,7 +405,6 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
{
struct inode *inode = file_inode(file);
- struct fd real;
const struct cred *old_cred;
int ret;
@@ -436,9 +415,11 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
if (ret)
goto out_unlock;
- ret = ovl_real_fdget(file, &real);
- if (ret)
+ CLASS(fd_real, real)(file);
+ if (fd_empty(real)) {
+ ret = fd_error(real);
goto out_unlock;
+ }
old_cred = ovl_override_creds(file_inode(file)->i_sb);
ret = vfs_fallocate(fd_file(real), mode, offset, len);
@@ -447,8 +428,6 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
/* Update size */
ovl_file_modified(file);
- fdput(real);
-
out_unlock:
inode_unlock(inode);
@@ -457,20 +436,17 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
static int ovl_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
{
- struct fd real;
+ CLASS(fd_real, real)(file);
const struct cred *old_cred;
int ret;
- ret = ovl_real_fdget(file, &real);
- if (ret)
- return ret;
+ if (fd_empty(real))
+ return fd_error(real);
old_cred = ovl_override_creds(file_inode(file)->i_sb);
ret = vfs_fadvise(fd_file(real), offset, len, advice);
revert_creds(old_cred);
- fdput(real);
-
return ret;
}
@@ -485,7 +461,6 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
loff_t len, unsigned int flags, enum ovl_copyop op)
{
struct inode *inode_out = file_inode(file_out);
- struct fd real_in, real_out;
const struct cred *old_cred;
loff_t ret;
@@ -498,13 +473,15 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
goto out_unlock;
}
- ret = ovl_real_fdget(file_out, &real_out);
- if (ret)
+ CLASS(fd_real, real_out)(file_out);
+ if (fd_empty(real_out)) {
+ ret = fd_error(real_out);
goto out_unlock;
+ }
- ret = ovl_real_fdget(file_in, &real_in);
- if (ret) {
- fdput(real_out);
+ CLASS(fd_real, real_in)(file_in);
+ if (fd_empty(real_in)) {
+ ret = fd_error(real_in);
goto out_unlock;
}
@@ -531,9 +508,6 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
/* Update size */
ovl_file_modified(file_out);
- fdput(real_in);
- fdput(real_out);
-
out_unlock:
inode_unlock(inode_out);
@@ -577,21 +551,18 @@ static loff_t ovl_remap_file_range(struct file *file_in, loff_t pos_in,
static int ovl_flush(struct file *file, fl_owner_t id)
{
- struct fd real;
+ CLASS(fd_real, real)(file);
const struct cred *old_cred;
- int err;
+ int err = 0;
- err = ovl_real_fdget(file, &real);
- if (err)
- return err;
+ if (fd_empty(real))
+ return fd_error(real);
if (fd_file(real)->f_op->flush) {
old_cred = ovl_override_creds(file_inode(file)->i_sb);
err = fd_file(real)->f_op->flush(fd_file(real), id);
revert_creds(old_cred);
}
- fdput(real);
-
return err;
}
diff --git a/include/linux/file.h b/include/linux/file.h
index 744a6315f1ac..6571ef345d35 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -10,6 +10,7 @@
#include <linux/types.h>
#include <linux/posix_types.h>
#include <linux/errno.h>
+#include <linux/err.h>
#include <linux/cleanup.h>
struct file;
@@ -37,13 +38,26 @@ extern struct file *alloc_file_clone(struct file *, int flags,
struct fd {
unsigned long word;
};
+
+/* either a reference to struct file + flags
+ * (cloned vs. borrowed, pos locked), with
+ * flags stored in lower bits of value,
+ * or an error (represented by small negative value).
+ */
+struct fderr {
+ unsigned long word;
+};
+
#define FDPUT_FPUT 1
#define FDPUT_POS_UNLOCK 2
+#define fd_empty(f) _Generic((f), \
+ struct fd: unlikely(!(f).word), \
+ struct fderr: IS_ERR_VALUE((f).word))
#define fd_file(f) ((struct file *)((f).word & ~3))
-static inline bool fd_empty(struct fd f)
+static inline long fd_error(struct fderr f)
{
- return unlikely(!f.word);
+ return (long)f.word;
}
#define EMPTY_FD (struct fd){0}
@@ -56,12 +70,33 @@ static inline struct fd CLONED_FD(struct file *f)
return (struct fd){(unsigned long)f | FDPUT_FPUT};
}
+static inline struct fderr ERR_FD(long n)
+{
+ return (struct fderr){(unsigned long)n};
+}
+static inline struct fderr BORROWED_FDERR(struct file *f)
+{
+ return (struct fderr){(unsigned long)f};
+}
+static inline struct fderr CLONED_FDERR(struct file *f)
+{
+ if (IS_ERR(f))
+ return BORROWED_FDERR(f);
+ return (struct fderr){(unsigned long)f | FDPUT_FPUT};
+}
+
static inline void fdput(struct fd fd)
{
if (fd.word & FDPUT_FPUT)
fput(fd_file(fd));
}
+static inline void fdput_err(struct fderr fd)
+{
+ if (!fd_empty(fd) && fd.word & FDPUT_FPUT)
+ fput(fd_file(fd));
+}
+
extern struct file *fget(unsigned int fd);
extern struct file *fget_raw(unsigned int fd);
extern struct file *fget_task(struct task_struct *task, unsigned int fd);
--
2.39.2
* [PATCH 08/19] fdget_raw() users: switch to CLASS(fd_raw, ...)
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (5 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 07/19] introduce struct fderr, convert overlayfs uses to that Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 15:20 ` Christian Brauner
2024-06-07 1:59 ` [PATCH 09/19] css_set_fork(): " Al Viro
` (11 subsequent siblings)
18 siblings, 1 reply; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
arch/arm/kernel/sys_oabi-compat.c | 10 +++-----
fs/fcntl.c | 42 +++++++++++++------------------
fs/namei.c | 13 +++-------
fs/open.c | 13 +++-------
fs/quota/quota.c | 12 +++------
fs/stat.c | 10 +++-----
fs/statfs.c | 12 ++++-----
kernel/bpf/bpf_inode_storage.c | 25 ++++++------------
kernel/cgroup/cgroup.c | 9 +++----
security/landlock/syscalls.c | 19 +++++---------
10 files changed, 58 insertions(+), 107 deletions(-)
diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c
index f5781ff54a5c..2944721e82a2 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -235,12 +235,12 @@ asmlinkage long sys_oabi_fcntl64(unsigned int fd, unsigned int cmd,
unsigned long arg)
{
void __user *argp = (void __user *)arg;
- struct fd f = fdget_raw(fd);
+ CLASS(fd_raw, f)(fd);
struct flock64 flock;
- long err = -EBADF;
+ long err;
- if (!fd_file(f))
- goto out;
+ if (fd_empty(f))
+ return -EBADF;
switch (cmd) {
case F_GETLK64:
@@ -271,8 +271,6 @@ asmlinkage long sys_oabi_fcntl64(unsigned int fd, unsigned int cmd,
err = sys_fcntl64(fd, cmd, arg);
break;
}
- fdput(f);
-out:
return err;
}
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 2b5616762354..f96328ee3853 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -476,24 +476,21 @@ static int check_fcntl_cmd(unsigned cmd)
SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
{
- struct fd f = fdget_raw(fd);
- long err = -EBADF;
+ CLASS(fd_raw, f)(fd);
+ long err;
- if (!fd_file(f))
- goto out;
+ if (fd_empty(f))
+ return -EBADF;
if (unlikely(fd_file(f)->f_mode & FMODE_PATH)) {
if (!check_fcntl_cmd(cmd))
- goto out1;
+ return -EBADF;
}
err = security_file_fcntl(fd_file(f), cmd, arg);
if (!err)
err = do_fcntl(fd, cmd, arg, fd_file(f));
-out1:
- fdput(f);
-out:
return err;
}
@@ -502,21 +499,21 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
unsigned long, arg)
{
void __user *argp = (void __user *)arg;
- struct fd f = fdget_raw(fd);
+ CLASS(fd_raw, f)(fd);
struct flock64 flock;
- long err = -EBADF;
+ long err;
- if (!fd_file(f))
- goto out;
+ if (fd_empty(f))
+ return -EBADF;
if (unlikely(fd_file(f)->f_mode & FMODE_PATH)) {
if (!check_fcntl_cmd(cmd))
- goto out1;
+ return -EBADF;
}
err = security_file_fcntl(fd_file(f), cmd, arg);
if (err)
- goto out1;
+ return err;
switch (cmd) {
case F_GETLK64:
@@ -541,9 +538,6 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
err = do_fcntl(fd, cmd, arg, fd_file(f));
break;
}
-out1:
- fdput(f);
-out:
return err;
}
#endif
@@ -639,21 +633,21 @@ static int fixup_compat_flock(struct flock *flock)
static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
compat_ulong_t arg)
{
- struct fd f = fdget_raw(fd);
+ CLASS(fd_raw, f)(fd);
struct flock flock;
- long err = -EBADF;
+ long err;
- if (!fd_file(f))
- return err;
+ if (fd_empty(f))
+ return -EBADF;
if (unlikely(fd_file(f)->f_mode & FMODE_PATH)) {
if (!check_fcntl_cmd(cmd))
- goto out_put;
+ return -EBADF;
}
err = security_file_fcntl(fd_file(f), cmd, arg);
if (err)
- goto out_put;
+ return err;
switch (cmd) {
case F_GETLK:
@@ -696,8 +690,6 @@ static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
err = do_fcntl(fd, cmd, arg, fd_file(f));
break;
}
-out_put:
- fdput(f);
return err;
}
diff --git a/fs/namei.c b/fs/namei.c
index 72736b6328a6..ae916b49a25f 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2416,26 +2416,22 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
}
} else {
/* Caller must check execute permissions on the starting path component */
- struct fd f = fdget_raw(nd->dfd);
+ CLASS(fd_raw, f)(nd->dfd);
struct dentry *dentry;
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
if (flags & LOOKUP_LINKAT_EMPTY) {
if (fd_file(f)->f_cred != current_cred() &&
- !ns_capable(fd_file(f)->f_cred->user_ns, CAP_DAC_READ_SEARCH)) {
- fdput(f);
+ !ns_capable(fd_file(f)->f_cred->user_ns, CAP_DAC_READ_SEARCH))
return ERR_PTR(-ENOENT);
- }
}
dentry = fd_file(f)->f_path.dentry;
- if (*s && unlikely(!d_can_lookup(dentry))) {
- fdput(f);
+ if (*s && unlikely(!d_can_lookup(dentry)))
return ERR_PTR(-ENOTDIR);
- }
nd->path = fd_file(f)->f_path;
if (flags & LOOKUP_RCU) {
@@ -2445,7 +2441,6 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
path_get(&nd->path);
nd->inode = nd->path.dentry->d_inode;
}
- fdput(f);
}
/* For scoped-lookups we need to set the root to the dirfd as well. */
diff --git a/fs/open.c b/fs/open.c
index 29e0ec819bc4..4cb5e12e84a5 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -577,23 +577,18 @@ SYSCALL_DEFINE1(chdir, const char __user *, filename)
SYSCALL_DEFINE1(fchdir, unsigned int, fd)
{
- struct fd f = fdget_raw(fd);
+ CLASS(fd_raw, f)(fd);
int error;
- error = -EBADF;
- if (!fd_file(f))
- goto out;
+ if (fd_empty(f))
+ return -EBADF;
- error = -ENOTDIR;
if (!d_can_lookup(fd_file(f)->f_path.dentry))
- goto out_putf;
+ return -ENOTDIR;
error = file_permission(fd_file(f), MAY_EXEC | MAY_CHDIR);
if (!error)
set_fs_pwd(current->fs, &fd_file(f)->f_path);
-out_putf:
- fdput(f);
-out:
return error;
}
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 290157bc7bec..7c2b75a44485 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -976,21 +976,19 @@ SYSCALL_DEFINE4(quotactl_fd, unsigned int, fd, unsigned int, cmd,
struct super_block *sb;
unsigned int cmds = cmd >> SUBCMDSHIFT;
unsigned int type = cmd & SUBCMDMASK;
- struct fd f;
+ CLASS(fd_raw, f)(fd);
int ret;
- f = fdget_raw(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
- ret = -EINVAL;
if (type >= MAXQUOTAS)
- goto out;
+ return -EINVAL;
if (quotactl_cmd_write(cmds)) {
ret = mnt_want_write(fd_file(f)->f_path.mnt);
if (ret)
- goto out;
+ return ret;
}
sb = fd_file(f)->f_path.mnt->mnt_sb;
@@ -1008,7 +1006,5 @@ SYSCALL_DEFINE4(quotactl_fd, unsigned int, fd, unsigned int, cmd,
if (quotactl_cmd_write(cmds))
mnt_drop_write(fd_file(f)->f_path.mnt);
-out:
- fdput(f);
return ret;
}
diff --git a/fs/stat.c b/fs/stat.c
index 740e5997da09..e593fbbfed83 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -189,15 +189,11 @@ EXPORT_SYMBOL(vfs_getattr);
*/
int vfs_fstat(int fd, struct kstat *stat)
{
- struct fd f;
- int error;
+ CLASS(fd_raw, f)(fd);
- f = fdget_raw(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
- error = vfs_getattr(&fd_file(f)->f_path, stat, STATX_BASIC_STATS, 0);
- fdput(f);
- return error;
+ return vfs_getattr(&fd_file(f)->f_path, stat, STATX_BASIC_STATS, 0);
}
int getname_statx_lookup_flags(int flags)
diff --git a/fs/statfs.c b/fs/statfs.c
index 9c7bb27e7932..a45ac85e6048 100644
--- a/fs/statfs.c
+++ b/fs/statfs.c
@@ -114,13 +114,11 @@ int user_statfs(const char __user *pathname, struct kstatfs *st)
int fd_statfs(int fd, struct kstatfs *st)
{
- struct fd f = fdget_raw(fd);
- int error = -EBADF;
- if (fd_file(f)) {
- error = vfs_statfs(&fd_file(f)->f_path, st);
- fdput(f);
- }
- return error;
+ CLASS(fd_raw, f)(fd);
+
+ if (fd_empty(f))
+ return -EBADF;
+ return vfs_statfs(&fd_file(f)->f_path, st);
}
static int do_statfs_native(struct kstatfs *st, struct statfs __user *p)
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
index 0a79aee6523d..a2c05db49ebc 100644
--- a/kernel/bpf/bpf_inode_storage.c
+++ b/kernel/bpf/bpf_inode_storage.c
@@ -78,13 +78,12 @@ void bpf_inode_storage_free(struct inode *inode)
static void *bpf_fd_inode_storage_lookup_elem(struct bpf_map *map, void *key)
{
struct bpf_local_storage_data *sdata;
- struct fd f = fdget_raw(*(int *)key);
+ CLASS(fd_raw, f)(*(int *)key);
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
sdata = inode_storage_lookup(file_inode(fd_file(f)), map, true);
- fdput(f);
return sdata ? sdata->data : NULL;
}
@@ -92,19 +91,16 @@ static long bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key,
void *value, u64 map_flags)
{
struct bpf_local_storage_data *sdata;
- struct fd f = fdget_raw(*(int *)key);
+ CLASS(fd_raw, f)(*(int *)key);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
- if (!inode_storage_ptr(file_inode(fd_file(f)))) {
- fdput(f);
+ if (!inode_storage_ptr(file_inode(fd_file(f))))
return -EBADF;
- }
sdata = bpf_local_storage_update(file_inode(fd_file(f)),
(struct bpf_local_storage_map *)map,
value, map_flags, GFP_ATOMIC);
- fdput(f);
return PTR_ERR_OR_ZERO(sdata);
}
@@ -123,15 +119,10 @@ static int inode_storage_delete(struct inode *inode, struct bpf_map *map)
static long bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key)
{
- struct fd f = fdget_raw(*(int *)key);
- int err;
-
- if (!fd_file(f))
+ CLASS(fd_raw, f)(*(int *)key);
+ if (fd_empty(f))
return -EBADF;
-
- err = inode_storage_delete(file_inode(fd_file(f)), map);
- fdput(f);
- return err;
+ return inode_storage_delete(file_inode(fd_file(f)), map);
}
/* *gfp_flags* is a hidden argument provided by the verifier */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0a6e9f566ca6..d53673ccaefc 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6880,14 +6880,11 @@ EXPORT_SYMBOL_GPL(cgroup_get_from_path);
*/
struct cgroup *cgroup_v1v2_get_from_fd(int fd)
{
- struct cgroup *cgrp;
- struct fd f = fdget_raw(fd);
- if (!fd_file(f))
+ CLASS(fd_raw, f)(fd);
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
- cgrp = cgroup_v1v2_get_from_file(fd_file(f));
- fdput(f);
- return cgrp;
+ return cgroup_v1v2_get_from_file(fd_file(f));
}
/**
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 44edbf57644d..97b3df540dc7 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -269,15 +269,12 @@ static struct landlock_ruleset *get_ruleset_from_fd(const int fd,
*/
static int get_path_from_fd(const s32 fd, struct path *const path)
{
- struct fd f;
- int err = 0;
+ CLASS(fd_raw, f)(fd);
BUILD_BUG_ON(!__same_type(
fd, ((struct landlock_path_beneath_attr *)NULL)->parent_fd));
- /* Handles O_PATH. */
- f = fdget_raw(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
/*
* Forbids ruleset FDs, internal filesystems (e.g. nsfs), including
@@ -288,16 +285,12 @@ static int get_path_from_fd(const s32 fd, struct path *const path)
(fd_file(f)->f_path.mnt->mnt_flags & MNT_INTERNAL) ||
(fd_file(f)->f_path.dentry->d_sb->s_flags & SB_NOUSER) ||
d_is_negative(fd_file(f)->f_path.dentry) ||
- IS_PRIVATE(d_backing_inode(fd_file(f)->f_path.dentry))) {
- err = -EBADFD;
- goto out_fdput;
- }
+ IS_PRIVATE(d_backing_inode(fd_file(f)->f_path.dentry)))
+ return -EBADFD;
+
*path = fd_file(f)->f_path;
path_get(path);
-
-out_fdput:
- fdput(f);
- return err;
+ return 0;
}
static int add_rule_path_beneath(struct landlock_ruleset *const ruleset,
--
2.39.2
* [PATCH 09/19] css_set_fork(): switch to CLASS(fd_raw, ...)
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
the reference acquired there by fget_raw() is not stashed anywhere,
so we might as well borrow it instead of grabbing and dropping one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
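Not for the commit log: a rough user-space sketch of the difference
between grabbing a reference and borrowing one. Everything named toy_*
is hypothetical, and the "take a reference only if the table is shared"
rule is an assumption made for this model, not a copy of what the kernel
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's struct file or struct fd. */
struct toy_file { int refcount; };

struct toy_fd {
	struct toy_file *file;
	bool need_put;	/* true only if this lookup took its own reference */
};

static struct toy_file toy_slot = { .refcount = 1 };	/* reference held by the table */
static bool toy_table_shared;	/* models a descriptor table shared between threads */

/* Owning lookup: always bumps the count; the caller must drop it on every path. */
struct toy_file *toy_get(void)
{
	toy_slot.refcount++;
	return &toy_slot;
}

void toy_put(struct toy_file *file)
{
	file->refcount--;
}

/* Borrowing lookup: pins the file only when the table may change under us. */
struct toy_fd toy_borrow(void)
{
	struct toy_fd f = { &toy_slot, toy_table_shared };

	if (f.need_put)
		toy_slot.refcount++;
	return f;
}

/* Cleanup handler: undoes exactly what toy_borrow() did. */
void toy_borrow_end(struct toy_fd *f)
{
	if (f->need_put)
		f->file->refcount--;
}

/* Reports the refcount seen while the borrow is live; cleanup runs on return. */
int toy_peak_refcount(void)
{
	struct toy_fd f __attribute__((cleanup(toy_borrow_end))) = toy_borrow();

	assert(f.file != NULL);
	return f.file->refcount;
}
```

Since nothing keeps the pointer past the end of the function, the
borrowing form is sufficient, and the cleanup handler takes the place of
the explicit puts on the exit paths.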
kernel/cgroup/cgroup.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index d53673ccaefc..b3a5ae2807b4 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6394,7 +6394,6 @@ static int cgroup_css_set_fork(struct kernel_clone_args *kargs)
struct cgroup *dst_cgrp = NULL;
struct css_set *cset;
struct super_block *sb;
- struct file *f;
if (kargs->flags & CLONE_INTO_CGROUP)
cgroup_lock();
@@ -6411,14 +6410,14 @@ static int cgroup_css_set_fork(struct kernel_clone_args *kargs)
return 0;
}
- f = fget_raw(kargs->cgroup);
- if (!f) {
+ CLASS(fd_raw, f)(kargs->cgroup);
+ if (fd_empty(f)) {
ret = -EBADF;
goto err;
}
- sb = f->f_path.dentry->d_sb;
+ sb = fd_file(f)->f_path.dentry->d_sb;
- dst_cgrp = cgroup_get_from_file(f);
+ dst_cgrp = cgroup_get_from_file(fd_file(f));
if (IS_ERR(dst_cgrp)) {
ret = PTR_ERR(dst_cgrp);
dst_cgrp = NULL;
@@ -6466,15 +6465,12 @@ static int cgroup_css_set_fork(struct kernel_clone_args *kargs)
}
put_css_set(cset);
- fput(f);
kargs->cgrp = dst_cgrp;
return ret;
err:
cgroup_threadgroup_change_end(current);
cgroup_unlock();
- if (f)
- fput(f);
if (dst_cgrp)
cgroup_put(dst_cgrp);
put_css_set(cset);
--
2.39.2
* [PATCH 10/19] introduce "fd_pos" class
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
fdget_pos() is the constructor, fdput_pos() is the cleanup; all users
of fdget_pos()/fdput_pos() are converted.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
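Not for the commit log: a rough user-space sketch of what a class with a
constructor and a cleanup buys at the call sites. MOCK_DEFINE_CLASS and
MOCK_CLASS only imitate the shape of DEFINE_CLASS()/CLASS() using the
GCC/Clang cleanup attribute; every mock_* name, the two-slot table and
the pos_locked flag are made up for illustration and are not the kernel
implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct mock_file { int refs; int pos_locked; };
struct mock_fd { struct mock_file *file; };

static struct mock_file mock_table[2];

/* Constructor: look up the slot, take a reference and the position lock. */
static struct mock_fd mock_fdget_pos(int fd)
{
	struct mock_fd f = { NULL };

	if (fd >= 0 && fd < 2) {
		f.file = &mock_table[fd];
		f.file->refs++;
		f.file->pos_locked = 1;
	}
	return f;
}

/* Cleanup: release the lock and the reference; an empty mock_fd is a no-op. */
static void mock_fdput_pos(struct mock_fd f)
{
	if (!f.file)
		return;
	f.file->pos_locked = 0;
	f.file->refs--;
}

/* Generates a typedef plus destructor/constructor helpers for one class. */
#define MOCK_DEFINE_CLASS(name, type, exit_expr, init_expr, init_args...)	\
typedef type class_##name##_t;						\
static inline void class_##name##_destructor(type *p)			\
{ type _T = *p; exit_expr; }						\
static inline type class_##name##_constructor(init_args)		\
{ type t = init_expr; return t; }

/* Declares a variable whose destructor runs when it goes out of scope. */
#define MOCK_CLASS(name, var)						\
	class_##name##_t var						\
	__attribute__((cleanup(class_##name##_destructor))) =		\
		class_##name##_constructor

MOCK_DEFINE_CLASS(fd_pos, struct mock_fd, mock_fdput_pos(_T),
		  mock_fdget_pos(fd), int fd)

/* Every return below runs mock_fdput_pos(); no goto labels are needed. */
long mock_seek(int fd, int whence)
{
	MOCK_CLASS(fd_pos, f)(fd);

	if (!f.file)
		return -EBADF;
	if (whence > 2)
		return -EINVAL;
	assert(f.file->pos_locked);
	return 0;
}
```

The early returns in mock_seek() correspond to the "goto out_putf"
exits removed by the conversions: the compiler emits the cleanup call on
each path, so the error paths cannot skip the put.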
arch/alpha/kernel/osf_sys.c | 5 ++---
fs/read_write.c | 34 +++++++++++++---------------------
fs/readdir.c | 28 ++++++++++------------------
include/linux/file.h | 1 +
4 files changed, 26 insertions(+), 42 deletions(-)
diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 56fea57f9642..09d82f0963d0 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -152,7 +152,7 @@ SYSCALL_DEFINE4(osf_getdirentries, unsigned int, fd,
long __user *, basep)
{
int error;
- struct fd arg = fdget_pos(fd);
+ CLASS(fd_pos, arg)(fd);
struct osf_dirent_callback buf = {
.ctx.actor = osf_filldir,
.dirent = dirent,
@@ -160,7 +160,7 @@ SYSCALL_DEFINE4(osf_getdirentries, unsigned int, fd,
.count = count
};
- if (!fd_file(arg))
+ if (fd_empty(arg))
return -EBADF;
error = iterate_dir(fd_file(arg), &buf.ctx);
@@ -169,7 +169,6 @@ SYSCALL_DEFINE4(osf_getdirentries, unsigned int, fd,
if (count != buf.count)
error = count - buf.count;
- fdput_pos(arg);
return error;
}
diff --git a/fs/read_write.c b/fs/read_write.c
index 415b509df359..6d49202e2507 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -293,8 +293,8 @@ EXPORT_SYMBOL(vfs_llseek);
static off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence)
{
off_t retval;
- struct fd f = fdget_pos(fd);
- if (!fd_file(f))
+ CLASS(fd_pos, f)(fd);
+ if (fd_empty(f))
return -EBADF;
retval = -EINVAL;
@@ -304,7 +304,6 @@ static off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence)
if (res != (loff_t)retval)
retval = -EOVERFLOW; /* LFS: should only happen on 32 bit platforms */
}
- fdput_pos(f);
return retval;
}
@@ -327,15 +326,14 @@ SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
unsigned int, whence)
{
int retval;
- struct fd f = fdget_pos(fd);
+ CLASS(fd_pos, f)(fd);
loff_t offset;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
- retval = -EINVAL;
if (whence > SEEK_MAX)
- goto out_putf;
+ return -EINVAL;
offset = vfs_llseek(fd_file(f), ((loff_t) offset_high << 32) | offset_low,
whence);
@@ -346,8 +344,6 @@ SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
if (!copy_to_user(result, &offset, sizeof(offset)))
retval = 0;
}
-out_putf:
- fdput_pos(f);
return retval;
}
#endif
@@ -607,10 +603,10 @@ static inline loff_t *file_ppos(struct file *file)
ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count)
{
- struct fd f = fdget_pos(fd);
+ CLASS(fd_pos, f)(fd);
ssize_t ret = -EBADF;
- if (fd_file(f)) {
+ if (!fd_empty(f)) {
loff_t pos, *ppos = file_ppos(fd_file(f));
if (ppos) {
pos = *ppos;
@@ -619,7 +615,6 @@ ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count)
ret = vfs_read(fd_file(f), buf, count, ppos);
if (ret >= 0 && ppos)
fd_file(f)->f_pos = pos;
- fdput_pos(f);
}
return ret;
}
@@ -631,10 +626,10 @@ SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count)
{
- struct fd f = fdget_pos(fd);
+ CLASS(fd_pos, f)(fd);
ssize_t ret = -EBADF;
- if (fd_file(f)) {
+ if (!fd_empty(f)) {
loff_t pos, *ppos = file_ppos(fd_file(f));
if (ppos) {
pos = *ppos;
@@ -643,7 +638,6 @@ ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count)
ret = vfs_write(fd_file(f), buf, count, ppos);
if (ret >= 0 && ppos)
fd_file(f)->f_pos = pos;
- fdput_pos(f);
}
return ret;
@@ -982,10 +976,10 @@ static ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
unsigned long vlen, rwf_t flags)
{
- struct fd f = fdget_pos(fd);
+ CLASS(fd_pos, f)(fd);
ssize_t ret = -EBADF;
- if (fd_file(f)) {
+ if (!fd_empty(f)) {
loff_t pos, *ppos = file_ppos(fd_file(f));
if (ppos) {
pos = *ppos;
@@ -994,7 +988,6 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
ret = vfs_readv(fd_file(f), vec, vlen, ppos, flags);
if (ret >= 0 && ppos)
fd_file(f)->f_pos = pos;
- fdput_pos(f);
}
if (ret > 0)
@@ -1006,10 +999,10 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
unsigned long vlen, rwf_t flags)
{
- struct fd f = fdget_pos(fd);
+ CLASS(fd_pos, f)(fd);
ssize_t ret = -EBADF;
- if (fd_file(f)) {
+ if (!fd_empty(f)) {
loff_t pos, *ppos = file_ppos(fd_file(f));
if (ppos) {
pos = *ppos;
@@ -1018,7 +1011,6 @@ static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
ret = vfs_writev(fd_file(f), vec, vlen, ppos, flags);
if (ret >= 0 && ppos)
fd_file(f)->f_pos = pos;
- fdput_pos(f);
}
if (ret > 0)
diff --git a/fs/readdir.c b/fs/readdir.c
index eea0e6e9abcf..80888fa467f3 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -221,20 +221,19 @@ SYSCALL_DEFINE3(old_readdir, unsigned int, fd,
struct old_linux_dirent __user *, dirent, unsigned int, count)
{
int error;
- struct fd f = fdget_pos(fd);
+ CLASS(fd_pos, f)(fd);
struct readdir_callback buf = {
.ctx.actor = fillonedir,
.dirent = dirent
};
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
error = iterate_dir(fd_file(f), &buf.ctx);
if (buf.result)
error = buf.result;
- fdput_pos(f);
return error;
}
@@ -311,7 +310,7 @@ static bool filldir(struct dir_context *ctx, const char *name, int namlen,
SYSCALL_DEFINE3(getdents, unsigned int, fd,
struct linux_dirent __user *, dirent, unsigned int, count)
{
- struct fd f;
+ CLASS(fd_pos, f)(fd);
struct getdents_callback buf = {
.ctx.actor = filldir,
.count = count,
@@ -319,8 +318,7 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
};
int error;
- f = fdget_pos(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
error = iterate_dir(fd_file(f), &buf.ctx);
@@ -335,7 +333,6 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
else
error = count - buf.count;
}
- fdput_pos(f);
return error;
}
@@ -394,7 +391,7 @@ static bool filldir64(struct dir_context *ctx, const char *name, int namlen,
SYSCALL_DEFINE3(getdents64, unsigned int, fd,
struct linux_dirent64 __user *, dirent, unsigned int, count)
{
- struct fd f;
+ CLASS(fd_pos, f)(fd);
struct getdents_callback64 buf = {
.ctx.actor = filldir64,
.count = count,
@@ -402,8 +399,7 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd,
};
int error;
- f = fdget_pos(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
error = iterate_dir(fd_file(f), &buf.ctx);
@@ -419,7 +415,6 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd,
else
error = count - buf.count;
}
- fdput_pos(f);
return error;
}
@@ -479,20 +474,19 @@ COMPAT_SYSCALL_DEFINE3(old_readdir, unsigned int, fd,
struct compat_old_linux_dirent __user *, dirent, unsigned int, count)
{
int error;
- struct fd f = fdget_pos(fd);
+ CLASS(fd_pos, f)(fd);
struct compat_readdir_callback buf = {
.ctx.actor = compat_fillonedir,
.dirent = dirent
};
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
error = iterate_dir(fd_file(f), &buf.ctx);
if (buf.result)
error = buf.result;
- fdput_pos(f);
return error;
}
@@ -562,7 +556,7 @@ static bool compat_filldir(struct dir_context *ctx, const char *name, int namlen
COMPAT_SYSCALL_DEFINE3(getdents, unsigned int, fd,
struct compat_linux_dirent __user *, dirent, unsigned int, count)
{
- struct fd f;
+ CLASS(fd_pos, f)(fd);
struct compat_getdents_callback buf = {
.ctx.actor = compat_filldir,
.current_dir = dirent,
@@ -570,8 +564,7 @@ COMPAT_SYSCALL_DEFINE3(getdents, unsigned int, fd,
};
int error;
- f = fdget_pos(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
error = iterate_dir(fd_file(f), &buf.ctx);
@@ -586,7 +579,6 @@ COMPAT_SYSCALL_DEFINE3(getdents, unsigned int, fd,
else
error = count - buf.count;
}
- fdput_pos(f);
return error;
}
#endif
diff --git a/include/linux/file.h b/include/linux/file.h
index 6571ef345d35..847f7cb26d4a 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -115,6 +115,7 @@ static inline void fdput_pos(struct fd f)
DEFINE_CLASS(fd, struct fd, fdput(_T), fdget(fd), int fd)
DEFINE_CLASS(fd_raw, struct fd, fdput(_T), fdget_raw(fd), int fd)
+DEFINE_CLASS(fd_pos, struct fd, fdput_pos(_T), fdget_pos(fd), int fd)
extern int f_dupfd(unsigned int from, struct file *file, unsigned flags);
extern int replace_fd(unsigned fd, struct file *file, unsigned flags);
--
2.39.2
* [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
low-hanging fruit; this is another likely source of conflicts
over the cycle, so it might make a lot of sense to split it.
Fortunately, it can be split pretty much on a per-function
basis - the chunks are independent of each other.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
arch/powerpc/kvm/book3s_64_vio.c | 7 +-
arch/powerpc/kvm/powerpc.c | 24 ++---
arch/powerpc/platforms/cell/spu_syscalls.c | 13 +--
arch/x86/kernel/cpu/sgx/main.c | 10 +-
arch/x86/kvm/svm/sev.c | 39 +++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 23 ++---
drivers/gpu/drm/drm_syncobj.c | 9 +-
drivers/infiniband/core/ucma.c | 19 ++--
drivers/media/mc/mc-request.c | 18 ++--
drivers/media/rc/lirc_dev.c | 13 +--
drivers/vfio/group.c | 6 +-
drivers/vfio/virqfd.c | 16 +--
drivers/virt/acrn/irqfd.c | 10 +-
drivers/xen/privcmd.c | 31 ++----
fs/btrfs/ioctl.c | 5 +-
fs/coda/inode.c | 11 +-
fs/eventfd.c | 9 +-
fs/eventpoll.c | 38 ++-----
fs/ext4/ioctl.c | 21 ++--
fs/f2fs/file.c | 15 +--
fs/fhandle.c | 5 +-
fs/fsopen.c | 19 ++--
fs/fuse/dev.c | 6 +-
fs/ioctl.c | 23 ++---
fs/kernel_read_file.c | 12 +--
fs/locks.c | 15 +--
fs/namespace.c | 47 +++------
fs/notify/fanotify/fanotify_user.c | 44 +++-----
fs/notify/inotify/inotify_user.c | 38 +++----
fs/ocfs2/cluster/heartbeat.c | 13 +--
fs/open.c | 48 ++++-----
fs/read_write.c | 111 ++++++++-------------
fs/remap_range.c | 11 +-
fs/select.c | 13 +--
fs/signalfd.c | 9 +-
fs/smb/client/ioctl.c | 11 +-
fs/splice.c | 47 ++++-----
fs/sync.c | 29 ++----
fs/utimes.c | 11 +-
fs/xattr.c | 40 +++-----
fs/xfs/xfs_exchrange.c | 10 +-
fs/xfs/xfs_ioctl.c | 69 ++++---------
io_uring/sqpoll.c | 29 ++----
ipc/mqueue.c | 76 +++++---------
kernel/bpf/btf.c | 11 +-
kernel/bpf/syscall.c | 32 ++----
kernel/bpf/token.c | 15 +--
kernel/events/core.c | 14 +--
kernel/nsproxy.c | 5 +-
kernel/pid.c | 20 ++--
kernel/signal.c | 29 ++----
kernel/sys.c | 15 +--
kernel/taskstats.c | 18 ++--
kernel/watch_queue.c | 6 +-
mm/fadvise.c | 10 +-
mm/filemap.c | 17 +---
mm/memcontrol.c | 29 ++----
mm/readahead.c | 17 +---
net/core/net_namespace.c | 10 +-
security/integrity/ima/ima_main.c | 7 +-
security/landlock/syscalls.c | 26 ++---
security/loadpin/loadpin.c | 8 +-
virt/kvm/eventfd.c | 15 +--
virt/kvm/vfio.c | 14 +--
64 files changed, 454 insertions(+), 937 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index ce8f9539af2b..742aa58a7c7e 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -115,10 +115,9 @@ long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
struct iommu_table_group *table_group;
long i;
struct kvmppc_spapr_tce_iommu_table *stit;
- struct fd f;
+ CLASS(fd, f)(tablefd);
- f = fdget(tablefd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
rcu_read_lock();
@@ -130,8 +129,6 @@ long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
}
rcu_read_unlock();
- fdput(f);
-
if (!found)
return -EINVAL;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index fd62144e497e..e31412069117 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1933,12 +1933,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
#endif
#ifdef CONFIG_KVM_MPIC
case KVM_CAP_IRQ_MPIC: {
- struct fd f;
+ CLASS(fd, f)(cap->args[0]);
struct kvm_device *dev;
r = -EBADF;
- f = fdget(cap->args[0]);
- if (!fd_file(f))
+ if (fd_empty(f))
break;
r = -EPERM;
@@ -1946,18 +1945,16 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
if (dev)
r = kvmppc_mpic_connect_vcpu(dev, vcpu, cap->args[1]);
- fdput(f);
break;
}
#endif
#ifdef CONFIG_KVM_XICS
case KVM_CAP_IRQ_XICS: {
- struct fd f;
+ CLASS(fd, f)(cap->args[0]);
struct kvm_device *dev;
r = -EBADF;
- f = fdget(cap->args[0]);
- if (!fd_file(f))
+ if (fd_empty(f))
break;
r = -EPERM;
@@ -1968,34 +1965,27 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
else
r = kvmppc_xics_connect_vcpu(dev, vcpu, cap->args[1]);
}
-
- fdput(f);
break;
}
#endif /* CONFIG_KVM_XICS */
#ifdef CONFIG_KVM_XIVE
case KVM_CAP_PPC_IRQ_XIVE: {
- struct fd f;
+ CLASS(fd, f)(cap->args[0]);
struct kvm_device *dev;
r = -EBADF;
- f = fdget(cap->args[0]);
- if (!fd_file(f))
+ if (fd_empty(f))
break;
r = -ENXIO;
- if (!xive_enabled()) {
- fdput(f);
+ if (!xive_enabled())
break;
- }
r = -EPERM;
dev = kvm_device_from_filp(fd_file(f));
if (dev)
r = kvmppc_xive_native_connect_vcpu(dev, vcpu,
cap->args[1]);
-
- fdput(f);
break;
}
#endif /* CONFIG_KVM_XIVE */
diff --git a/arch/powerpc/platforms/cell/spu_syscalls.c b/arch/powerpc/platforms/cell/spu_syscalls.c
index cd7d42fc12a6..ca602376d025 100644
--- a/arch/powerpc/platforms/cell/spu_syscalls.c
+++ b/arch/powerpc/platforms/cell/spu_syscalls.c
@@ -64,12 +64,10 @@ SYSCALL_DEFINE4(spu_create, const char __user *, name, unsigned int, flags,
return -ENOSYS;
if (flags & SPU_CREATE_AFFINITY_SPU) {
- struct fd neighbor = fdget(neighbor_fd);
+ CLASS(fd, neighbor)(neighbor_fd);
ret = -EBADF;
- if (fd_file(neighbor)) {
+ if (!fd_empty(neighbor))
ret = calls->create_thread(name, flags, mode, fd_file(neighbor));
- fdput(neighbor);
- }
} else
ret = calls->create_thread(name, flags, mode, NULL);
@@ -80,7 +78,6 @@ SYSCALL_DEFINE4(spu_create, const char __user *, name, unsigned int, flags,
SYSCALL_DEFINE3(spu_run,int, fd, __u32 __user *, unpc, __u32 __user *, ustatus)
{
long ret;
- struct fd arg;
struct spufs_calls *calls;
calls = spufs_calls_get();
@@ -88,11 +85,9 @@ SYSCALL_DEFINE3(spu_run,int, fd, __u32 __user *, unpc, __u32 __user *, ustatus)
return -ENOSYS;
ret = -EBADF;
- arg = fdget(fd);
- if (fd_file(arg)) {
+ CLASS(fd, arg)(fd);
+ if (!fd_empty(arg))
ret = calls->spu_run(fd_file(arg), unpc, ustatus);
- fdput(arg);
- }
spufs_calls_put(calls);
return ret;
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index d01deb386395..d91284001162 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -893,19 +893,15 @@ static struct miscdevice sgx_dev_provision = {
int sgx_set_attribute(unsigned long *allowed_attributes,
unsigned int attribute_fd)
{
- struct fd f = fdget(attribute_fd);
+ CLASS(fd, f)(attribute_fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EINVAL;
- if (fd_file(f)->f_op != &sgx_provision_fops) {
- fdput(f);
+ if (fd_file(f)->f_op != &sgx_provision_fops)
return -EINVAL;
- }
*allowed_attributes |= SGX_ATTR_PROVISIONKEY;
-
- fdput(f);
return 0;
}
EXPORT_SYMBOL_GPL(sgx_set_attribute);
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6a8154d6935a..197c80b809dc 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -373,17 +373,12 @@ static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
static int __sev_issue_cmd(int fd, int id, void *data, int *error)
{
- struct fd f;
- int ret;
+ CLASS(fd, f)(fd);
- f = fdget(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
- ret = sev_issue_cmd_external_user(fd_file(f), id, data, error);
-
- fdput(f);
- return ret;
+ return sev_issue_cmd_external_user(fd_file(f), id, data, error);
}
static int sev_issue_cmd(struct kvm *kvm, int id, void *data, int *error)
@@ -1908,23 +1903,21 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
{
struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
struct kvm_sev_info *src_sev, *cg_cleanup_sev;
- struct fd f = fdget(source_fd);
+ CLASS(fd, f)(source_fd);
struct kvm *source_kvm;
bool charged = false;
int ret;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
- if (!file_is_kvm(fd_file(f))) {
- ret = -EBADF;
- goto out_fput;
- }
+ if (!file_is_kvm(fd_file(f)))
+ return -EBADF;
source_kvm = fd_file(f)->private_data;
ret = sev_lock_two_vms(kvm, source_kvm);
if (ret)
- goto out_fput;
+ return ret;
if (kvm->arch.vm_type != source_kvm->arch.vm_type ||
sev_guest(kvm) || !sev_guest(source_kvm)) {
@@ -1971,8 +1964,6 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
cg_cleanup_sev->misc_cg = NULL;
out_unlock:
sev_unlock_two_vms(kvm, source_kvm);
-out_fput:
- fdput(f);
return ret;
}
@@ -2209,23 +2200,21 @@ int sev_mem_enc_unregister_region(struct kvm *kvm,
int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
{
- struct fd f = fdget(source_fd);
+ CLASS(fd, f)(source_fd);
struct kvm *source_kvm;
struct kvm_sev_info *source_sev, *mirror_sev;
int ret;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
- if (!file_is_kvm(fd_file(f))) {
- ret = -EBADF;
- goto e_source_fput;
- }
+ if (!file_is_kvm(fd_file(f)))
+ return -EBADF;
source_kvm = fd_file(f)->private_data;
ret = sev_lock_two_vms(kvm, source_kvm);
if (ret)
- goto e_source_fput;
+ return ret;
/*
* Mirrors of mirrors should work, but let's not get silly. Also
@@ -2268,8 +2257,6 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
e_unlock:
sev_unlock_two_vms(kvm, source_kvm);
-e_source_fput:
- fdput(f);
return ret;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
index a9298cb8d19a..570634654489 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
@@ -36,21 +36,19 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
int fd,
int32_t priority)
{
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
struct amdgpu_fpriv *fpriv;
struct amdgpu_ctx_mgr *mgr;
struct amdgpu_ctx *ctx;
uint32_t id;
int r;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EINVAL;
r = amdgpu_file_to_fpriv(fd_file(f), &fpriv);
- if (r) {
- fdput(f);
+ if (r)
return r;
- }
mgr = &fpriv->ctx_mgr;
mutex_lock(&mgr->lock);
@@ -58,7 +56,6 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
amdgpu_ctx_priority_override(ctx, priority);
mutex_unlock(&mgr->lock);
- fdput(f);
return 0;
}
@@ -67,31 +64,25 @@ static int amdgpu_sched_context_priority_override(struct amdgpu_device *adev,
unsigned ctx_id,
int32_t priority)
{
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
struct amdgpu_fpriv *fpriv;
struct amdgpu_ctx *ctx;
int r;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EINVAL;
r = amdgpu_file_to_fpriv(fd_file(f), &fpriv);
- if (r) {
- fdput(f);
+ if (r)
return r;
- }
ctx = amdgpu_ctx_get(fpriv, ctx_id);
- if (!ctx) {
- fdput(f);
+ if (!ctx)
return -EINVAL;
- }
amdgpu_ctx_priority_override(ctx, priority);
amdgpu_ctx_put(ctx);
- fdput(f);
-
return 0;
}
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 7fb31ca3b5fc..4eaebd69253b 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -712,16 +712,14 @@ static int drm_syncobj_fd_to_handle(struct drm_file *file_private,
int fd, u32 *handle)
{
struct drm_syncobj *syncobj;
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
int ret;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EINVAL;
- if (fd_file(f)->f_op != &drm_syncobj_file_fops) {
- fdput(f);
+ if (fd_file(f)->f_op != &drm_syncobj_file_fops)
return -EINVAL;
- }
/* take a reference to put in the idr */
syncobj = fd_file(f)->private_data;
@@ -739,7 +737,6 @@ static int drm_syncobj_fd_to_handle(struct drm_file *file_private,
} else
drm_syncobj_put(syncobj);
- fdput(f);
return ret;
}
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index dc57d07a1f45..cc95718fd24b 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1615,7 +1615,6 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
struct ucma_event *uevent, *tmp;
struct ucma_context *ctx;
LIST_HEAD(event_list);
- struct fd f;
struct ucma_file *cur_file;
int ret = 0;
@@ -1623,21 +1622,17 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
return -EFAULT;
/* Get current fd to protect against it being closed */
- f = fdget(cmd.fd);
- if (!fd_file(f))
+ CLASS(fd, f)(cmd.fd);
+ if (fd_empty(f))
return -ENOENT;
- if (fd_file(f)->f_op != &ucma_fops) {
- ret = -EINVAL;
- goto file_put;
- }
+ if (fd_file(f)->f_op != &ucma_fops)
+ return -EINVAL;
cur_file = fd_file(f)->private_data;
/* Validate current fd and prevent destruction of id. */
ctx = ucma_get_ctx(cur_file, cmd.id);
- if (IS_ERR(ctx)) {
- ret = PTR_ERR(ctx);
- goto file_put;
- }
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
rdma_lock_handler(ctx->cm_id);
/*
@@ -1678,8 +1673,6 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
err_unlock:
rdma_unlock_handler(ctx->cm_id);
ucma_put_ctx(ctx);
-file_put:
- fdput(f);
return ret;
}
diff --git a/drivers/media/mc/mc-request.c b/drivers/media/mc/mc-request.c
index e064914c476e..df39c8c11e9a 100644
--- a/drivers/media/mc/mc-request.c
+++ b/drivers/media/mc/mc-request.c
@@ -246,22 +246,21 @@ static const struct file_operations request_fops = {
struct media_request *
media_request_get_by_fd(struct media_device *mdev, int request_fd)
{
- struct fd f;
struct media_request *req;
if (!mdev || !mdev->ops ||
!mdev->ops->req_validate || !mdev->ops->req_queue)
return ERR_PTR(-EBADR);
- f = fdget(request_fd);
- if (!fd_file(f))
- goto err_no_req_fd;
+ CLASS(fd, f)(request_fd);
+ if (fd_empty(f))
+ goto err;
if (fd_file(f)->f_op != &request_fops)
- goto err_fput;
+ goto err;
req = fd_file(f)->private_data;
if (req->mdev != mdev)
- goto err_fput;
+ goto err;
/*
* Note: as long as someone has an open filehandle of the request,
@@ -272,14 +271,9 @@ media_request_get_by_fd(struct media_device *mdev, int request_fd)
* before media_request_get() is called.
*/
media_request_get(req);
- fdput(f);
-
return req;
-err_fput:
- fdput(f);
-
-err_no_req_fd:
+err:
dev_dbg(mdev->dev, "cannot find request_fd %d\n", request_fd);
return ERR_PTR(-EINVAL);
}
diff --git a/drivers/media/rc/lirc_dev.c b/drivers/media/rc/lirc_dev.c
index b8dfd530fab7..af6337489d21 100644
--- a/drivers/media/rc/lirc_dev.c
+++ b/drivers/media/rc/lirc_dev.c
@@ -816,28 +816,23 @@ void __exit lirc_dev_exit(void)
struct rc_dev *rc_dev_get_from_fd(int fd, bool write)
{
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
struct lirc_fh *fh;
struct rc_dev *dev;
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
- if (fd_file(f)->f_op != &lirc_fops) {
- fdput(f);
+ if (fd_file(f)->f_op != &lirc_fops)
return ERR_PTR(-EINVAL);
- }
- if (write && !(fd_file(f)->f_mode & FMODE_WRITE)) {
- fdput(f);
+ if (write && !(fd_file(f)->f_mode & FMODE_WRITE))
return ERR_PTR(-EPERM);
- }
fh = fd_file(f)->private_data;
dev = fh->rc;
get_device(&dev->dev);
- fdput(f);
return dev;
}
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index f0d77e3c7196..8bad9b11c844 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -104,15 +104,14 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
{
struct vfio_container *container;
struct iommufd_ctx *iommufd;
- struct fd f;
int ret;
int fd;
if (get_user(fd, arg))
return -EFAULT;
- f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return -EBADF;
mutex_lock(&group->group_lock);
@@ -153,7 +152,6 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
out_unlock:
mutex_unlock(&group->group_lock);
- fdput(f);
return ret;
}
diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
index d22881245e89..aa2891f97508 100644
--- a/drivers/vfio/virqfd.c
+++ b/drivers/vfio/virqfd.c
@@ -113,7 +113,6 @@ int vfio_virqfd_enable(void *opaque,
void (*thread)(void *, void *),
void *data, struct virqfd **pvirqfd, int fd)
{
- struct fd irqfd;
struct eventfd_ctx *ctx;
struct virqfd *virqfd;
int ret = 0;
@@ -133,8 +132,8 @@ int vfio_virqfd_enable(void *opaque,
INIT_WORK(&virqfd->inject, virqfd_inject);
INIT_WORK(&virqfd->flush_inject, virqfd_flush_inject);
- irqfd = fdget(fd);
- if (!fd_file(irqfd)) {
+ CLASS(fd, irqfd)(fd);
+ if (fd_empty(irqfd)) {
ret = -EBADF;
goto err_fd;
}
@@ -142,7 +141,7 @@ int vfio_virqfd_enable(void *opaque,
ctx = eventfd_ctx_fileget(fd_file(irqfd));
if (IS_ERR(ctx)) {
ret = PTR_ERR(ctx);
- goto err_ctx;
+ goto err_fd;
}
virqfd->eventfd = ctx;
@@ -181,18 +180,9 @@ int vfio_virqfd_enable(void *opaque,
if ((!handler || handler(opaque, data)) && thread)
schedule_work(&virqfd->inject);
}
-
- /*
- * Do not drop the file until the irqfd is fully initialized,
- * otherwise we might race against the EPOLLHUP.
- */
- fdput(irqfd);
-
return 0;
err_busy:
eventfd_ctx_put(ctx);
-err_ctx:
- fdput(irqfd);
err_fd:
kfree(virqfd);
diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c
index 9994d818bb7e..f4d37ceea2ab 100644
--- a/drivers/virt/acrn/irqfd.c
+++ b/drivers/virt/acrn/irqfd.c
@@ -112,7 +112,6 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
struct eventfd_ctx *eventfd = NULL;
struct hsm_irqfd *irqfd, *tmp;
__poll_t events;
- struct fd f;
int ret = 0;
irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
@@ -124,8 +123,8 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
INIT_LIST_HEAD(&irqfd->list);
INIT_WORK(&irqfd->shutdown, hsm_irqfd_shutdown_work);
- f = fdget(args->fd);
- if (!fd_file(f)) {
+ CLASS(fd, f)(args->fd);
+ if (fd_empty(f)) {
ret = -EBADF;
goto out;
}
@@ -133,7 +132,7 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
eventfd = eventfd_ctx_fileget(fd_file(f));
if (IS_ERR(eventfd)) {
ret = PTR_ERR(eventfd);
- goto fail;
+ goto out;
}
irqfd->eventfd = eventfd;
@@ -162,13 +161,10 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
if (events & EPOLLIN)
acrn_irqfd_inject(irqfd);
- fdput(f);
return 0;
fail:
if (eventfd && !IS_ERR(eventfd))
eventfd_ctx_put(eventfd);
-
- fdput(f);
out:
kfree(irqfd);
return ret;
diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index c35c2455aa61..ba772a6347f6 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -930,7 +930,6 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
{
struct privcmd_kernel_irqfd *kirqfd, *tmp;
__poll_t events;
- struct fd f;
void *dm_op;
int ret;
@@ -949,8 +948,8 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
kirqfd->dom = irqfd->dom;
INIT_WORK(&kirqfd->shutdown, irqfd_shutdown);
- f = fdget(irqfd->fd);
- if (!fd_file(f)) {
+ CLASS(fd, f)(irqfd->fd);
+ if (fd_empty(f)) {
ret = -EBADF;
goto error_kfree;
}
@@ -958,7 +957,7 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
kirqfd->eventfd = eventfd_ctx_fileget(fd_file(f));
if (IS_ERR(kirqfd->eventfd)) {
ret = PTR_ERR(kirqfd->eventfd);
- goto error_fd_put;
+ goto error_kfree;
}
/*
@@ -989,19 +988,11 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
if (events & EPOLLIN)
irqfd_inject(kirqfd);
- /*
- * Do not drop the file until the kirqfd is fully initialized, otherwise
- * we might race against the EPOLLHUP.
- */
- fdput(f);
return 0;
error_eventfd:
eventfd_ctx_put(kirqfd->eventfd);
-error_fd_put:
- fdput(f);
-
error_kfree:
kfree(kirqfd);
return ret;
@@ -1310,7 +1301,6 @@ static int privcmd_ioeventfd_assign(struct privcmd_ioeventfd *ioeventfd)
struct privcmd_kernel_ioeventfd *kioeventfd;
struct privcmd_kernel_ioreq *kioreq;
unsigned long flags;
- struct fd f;
int ret;
/* Check for range overflow */
@@ -1330,14 +1320,15 @@ static int privcmd_ioeventfd_assign(struct privcmd_ioeventfd *ioeventfd)
if (!kioeventfd)
return -ENOMEM;
- f = fdget(ioeventfd->event_fd);
- if (!fd_file(f)) {
- ret = -EBADF;
- goto error_kfree;
- }
+ {
+ CLASS(fd, f)(ioeventfd->event_fd);
+ if (fd_empty(f)) {
+ ret = -EBADF;
+ goto error_kfree;
+ }
- kioeventfd->eventfd = eventfd_ctx_fileget(fd_file(f));
- fdput(f);
+ kioeventfd->eventfd = eventfd_ctx_fileget(fd_file(f));
+ }
if (IS_ERR(kioeventfd->eventfd)) {
ret = PTR_ERR(kioeventfd->eventfd);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 2dccb7f90e4d..1703ba0b07e6 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1309,9 +1309,9 @@ static noinline int __btrfs_ioctl_snap_create(struct file *file,
ret = btrfs_mksubvol(&file->f_path, idmap, name,
namelen, NULL, readonly, inherit);
} else {
- struct fd src = fdget(fd);
+ CLASS(fd, src)(fd);
struct inode *src_inode;
- if (!fd_file(src)) {
+ if (fd_empty(src)) {
ret = -EINVAL;
goto out_drop_write;
}
@@ -1342,7 +1342,6 @@ static noinline int __btrfs_ioctl_snap_create(struct file *file,
BTRFS_I(src_inode)->root,
readonly, inherit);
}
- fdput(src);
}
out_drop_write:
mnt_drop_write_file(file);
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index 7d56b6d1e4c3..293cf5e6dfeb 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -122,22 +122,17 @@ static const struct fs_parameter_spec coda_param_specs[] = {
static int coda_parse_fd(struct fs_context *fc, int fd)
{
struct coda_fs_context *ctx = fc->fs_private;
- struct fd f;
+ CLASS(fd, f)(fd);
struct inode *inode;
int idx;
- f = fdget(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
inode = file_inode(fd_file(f));
- if (!S_ISCHR(inode->i_mode) || imajor(inode) != CODA_PSDEV_MAJOR) {
- fdput(f);
+ if (!S_ISCHR(inode->i_mode) || imajor(inode) != CODA_PSDEV_MAJOR)
return invalf(fc, "code: Not coda psdev");
- }
idx = iminor(inode);
- fdput(f);
-
if (idx < 0 || idx >= MAX_CODADEVS)
return invalf(fc, "coda: Bad minor number");
ctx->idx = idx;
diff --git a/fs/eventfd.c b/fs/eventfd.c
index 22c934f3a080..76129bfcd663 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -347,13 +347,10 @@ EXPORT_SYMBOL_GPL(eventfd_fget);
*/
struct eventfd_ctx *eventfd_ctx_fdget(int fd)
{
- struct eventfd_ctx *ctx;
- struct fd f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
- ctx = eventfd_ctx_fileget(fd_file(f));
- fdput(f);
- return ctx;
+ return eventfd_ctx_fileget(fd_file(f));
}
EXPORT_SYMBOL_GPL(eventfd_ctx_fdget);
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 28d1a754cf33..1e63f3b03ca5 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2259,25 +2259,22 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
{
int error;
int full_check = 0;
- struct fd f, tf;
struct eventpoll *ep;
struct epitem *epi;
struct eventpoll *tep = NULL;
- error = -EBADF;
- f = fdget(epfd);
- if (!fd_file(f))
- goto error_return;
+ CLASS(fd, f)(epfd);
+ if (fd_empty(f))
+ return -EBADF;
/* Get the "struct file *" for the target file */
- tf = fdget(fd);
- if (!fd_file(tf))
- goto error_fput;
+ CLASS(fd, tf)(fd);
+ if (fd_empty(tf))
+ return -EBADF;
/* The target file descriptor must support poll */
- error = -EPERM;
if (!file_can_poll(fd_file(tf)))
- goto error_tgt_fput;
+ return -EPERM;
/* Check if EPOLLWAKEUP is allowed */
if (ep_op_has_event(op))
@@ -2396,12 +2393,6 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
loop_check_gen++;
mutex_unlock(&epnested_mutex);
}
-
- fdput(tf);
-error_fput:
- fdput(f);
-error_return:
-
return error;
}
@@ -2429,8 +2420,6 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
static int do_epoll_wait(int epfd, struct epoll_event __user *events,
int maxevents, struct timespec64 *to)
{
- int error;
- struct fd f;
struct eventpoll *ep;
/* The maximum number of event must be greater than zero */
@@ -2442,17 +2431,16 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
return -EFAULT;
/* Get the "struct file *" for the eventpoll file */
- f = fdget(epfd);
- if (!fd_file(f))
+ CLASS(fd, f)(epfd);
+ if (fd_empty(f))
return -EBADF;
/*
* We have to check that the file structure underneath the fd
* the user passed to us _is_ an eventpoll file.
*/
- error = -EINVAL;
if (!is_file_epoll(fd_file(f)))
- goto error_fput;
+ return -EINVAL;
/*
* At this point it is safe to assume that the "private_data" contains
@@ -2461,11 +2449,7 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
ep = fd_file(f)->private_data;
/* Time to fish for events ... */
- error = ep_poll(ep, events, maxevents, to);
-
-error_fput:
- fdput(f);
- return error;
+ return ep_poll(ep, events, maxevents, to);
}
SYSCALL_DEFINE4(epoll_wait, int, epfd, struct epoll_event __user *, events,
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 63eaa1fa2556..5503c92cdb6d 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -1330,7 +1330,6 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
case EXT4_IOC_MOVE_EXT: {
struct move_extent me;
- struct fd donor;
int err;
if (!(filp->f_mode & FMODE_READ) ||
@@ -1342,30 +1341,26 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
return -EFAULT;
me.moved_len = 0;
- donor = fdget(me.donor_fd);
- if (!fd_file(donor))
+ CLASS(fd, donor)(me.donor_fd);
+ if (fd_empty(donor))
return -EBADF;
- if (!(fd_file(donor)->f_mode & FMODE_WRITE)) {
- err = -EBADF;
- goto mext_out;
- }
+ if (!(fd_file(donor)->f_mode & FMODE_WRITE))
+ return -EBADF;
if (ext4_has_feature_bigalloc(sb)) {
ext4_msg(sb, KERN_ERR,
"Online defrag not supported with bigalloc");
- err = -EOPNOTSUPP;
- goto mext_out;
+ return -EOPNOTSUPP;
} else if (IS_DAX(inode)) {
ext4_msg(sb, KERN_ERR,
"Online defrag not supported with DAX");
- err = -EOPNOTSUPP;
- goto mext_out;
+ return -EOPNOTSUPP;
}
err = mnt_want_write_file(filp);
if (err)
- goto mext_out;
+ return err;
err = ext4_move_extents(filp, fd_file(donor), me.orig_start,
me.donor_start, me.len, &me.moved_len);
@@ -1374,8 +1369,6 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
if (copy_to_user((struct move_extent __user *)arg,
&me, sizeof(me)))
err = -EFAULT;
-mext_out:
- fdput(donor);
return err;
}
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 89db22f9488b..c70191b8b345 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2962,32 +2962,27 @@ static int f2fs_move_file_range(struct file *file_in, loff_t pos_in,
static int __f2fs_ioc_move_range(struct file *filp,
struct f2fs_move_range *range)
{
- struct fd dst;
int err;
if (!(filp->f_mode & FMODE_READ) ||
!(filp->f_mode & FMODE_WRITE))
return -EBADF;
- dst = fdget(range->dst_fd);
- if (!fd_file(dst))
+ CLASS(fd, dst)(range->dst_fd);
+ if (fd_empty(dst))
return -EBADF;
- if (!(fd_file(dst)->f_mode & FMODE_WRITE)) {
- err = -EBADF;
- goto err_out;
- }
+ if (!(fd_file(dst)->f_mode & FMODE_WRITE))
+ return -EBADF;
err = mnt_want_write_file(filp);
if (err)
- goto err_out;
+ return err;
err = f2fs_move_file_range(filp, range->pos_in, fd_file(dst),
range->pos_out, range->len);
mnt_drop_write_file(filp);
-err_out:
- fdput(dst);
return err;
}
diff --git a/fs/fhandle.c b/fs/fhandle.c
index 94fc4126eaa4..c6ae83e456b2 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -125,11 +125,10 @@ static struct vfsmount *get_vfsmount_from_fd(int fd)
mnt = mntget(fs->pwd.mnt);
spin_unlock(&fs->lock);
} else {
- struct fd f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
mnt = mntget(fd_file(f)->f_path.mnt);
- fdput(f);
}
return mnt;
}
diff --git a/fs/fsopen.c b/fs/fsopen.c
index cf9f37b2a5d4..f779836c7288 100644
--- a/fs/fsopen.c
+++ b/fs/fsopen.c
@@ -354,7 +354,6 @@ SYSCALL_DEFINE5(fsconfig,
int, aux)
{
struct fs_context *fc;
- struct fd f;
int ret;
int lookup_flags = 0;
@@ -397,12 +396,11 @@ SYSCALL_DEFINE5(fsconfig,
return -EOPNOTSUPP;
}
- f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return -EBADF;
- ret = -EINVAL;
if (fd_file(f)->f_op != &fscontext_fops)
- goto out_f;
+ return -EINVAL;
fc = fd_file(f)->private_data;
if (fc->ops == &legacy_fs_context_ops) {
@@ -411,17 +409,14 @@ SYSCALL_DEFINE5(fsconfig,
case FSCONFIG_SET_PATH:
case FSCONFIG_SET_PATH_EMPTY:
case FSCONFIG_SET_FD:
- ret = -EOPNOTSUPP;
- goto out_f;
+ return -EOPNOTSUPP;
}
}
if (_key) {
param.key = strndup_user(_key, 256);
- if (IS_ERR(param.key)) {
- ret = PTR_ERR(param.key);
- goto out_f;
- }
+ if (IS_ERR(param.key))
+ return PTR_ERR(param.key);
}
switch (cmd) {
@@ -500,7 +495,5 @@ SYSCALL_DEFINE5(fsconfig,
}
out_key:
kfree(param.key);
-out_f:
- fdput(f);
return ret;
}
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 991b9ae8e7c9..27826116a4fb 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -2315,13 +2315,12 @@ static long fuse_dev_ioctl_clone(struct file *file, __u32 __user *argp)
int res;
int oldfd;
struct fuse_dev *fud = NULL;
- struct fd f;
if (get_user(oldfd, argp))
return -EFAULT;
- f = fdget(oldfd);
- if (!fd_file(f))
+ CLASS(fd, f)(oldfd);
+ if (fd_empty(f))
return -EINVAL;
/*
@@ -2338,7 +2337,6 @@ static long fuse_dev_ioctl_clone(struct file *file, __u32 __user *argp)
mutex_unlock(&fuse_mutex);
}
- fdput(f);
return res;
}
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 6e0c954388d4..638a36be31c1 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -231,11 +231,11 @@ static int ioctl_fiemap(struct file *filp, struct fiemap __user *ufiemap)
static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
u64 off, u64 olen, u64 destoff)
{
- struct fd src_file = fdget(srcfd);
+ CLASS(fd, src_file)(srcfd);
loff_t cloned;
int ret;
- if (!fd_file(src_file))
+ if (fd_empty(src_file))
return -EBADF;
cloned = vfs_clone_file_range(fd_file(src_file), off, dst_file, destoff,
olen, 0);
@@ -245,7 +245,6 @@ static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
ret = -EINVAL;
else
ret = 0;
- fdput(src_file);
return ret;
}
@@ -892,22 +891,20 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd,
SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
{
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
int error;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
error = security_file_ioctl(fd_file(f), cmd, arg);
if (error)
- goto out;
+ return error;
error = do_vfs_ioctl(fd_file(f), fd, cmd, arg);
if (error == -ENOIOCTLCMD)
error = vfs_ioctl(fd_file(f), cmd, arg);
-out:
- fdput(f);
return error;
}
@@ -950,15 +947,15 @@ EXPORT_SYMBOL(compat_ptr_ioctl);
COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
compat_ulong_t, arg)
{
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
int error;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
error = security_file_ioctl_compat(fd_file(f), cmd, arg);
if (error)
- goto out;
+ return error;
switch (cmd) {
/* FICLONE takes an int argument, so don't use compat_ptr() */
@@ -1009,10 +1006,6 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
error = -ENOTTY;
break;
}
-
- out:
- fdput(f);
-
return error;
}
#endif
diff --git a/fs/kernel_read_file.c b/fs/kernel_read_file.c
index 9ff37ae650ea..de32c95d823d 100644
--- a/fs/kernel_read_file.c
+++ b/fs/kernel_read_file.c
@@ -175,15 +175,11 @@ ssize_t kernel_read_file_from_fd(int fd, loff_t offset, void **buf,
size_t buf_size, size_t *file_size,
enum kernel_read_file_id id)
{
- struct fd f = fdget(fd);
- ssize_t ret = -EBADF;
+ CLASS(fd, f)(fd);
- if (!fd_file(f) || !(fd_file(f)->f_mode & FMODE_READ))
- goto out;
+ if (fd_empty(f) || !(fd_file(f)->f_mode & FMODE_READ))
+ return -EBADF;
- ret = kernel_read_file(fd_file(f), offset, buf, buf_size, file_size, id);
-out:
- fdput(f);
- return ret;
+ return kernel_read_file(fd_file(f), offset, buf, buf_size, file_size, id);
}
EXPORT_SYMBOL_GPL(kernel_read_file_from_fd);
diff --git a/fs/locks.c b/fs/locks.c
index ee8e1925dc42..4170035a2bc8 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2132,7 +2132,6 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
{
int can_sleep, error, type;
struct file_lock fl;
- struct fd f;
/*
* LOCK_MAND locks were broken for a long time in that they never
@@ -2151,19 +2150,18 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
if (type < 0)
return type;
- error = -EBADF;
- f = fdget(fd);
- if (!fd_file(f))
- return error;
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
if (type != F_UNLCK && !(fd_file(f)->f_mode & (FMODE_READ | FMODE_WRITE)))
- goto out_putf;
+ return -EBADF;
flock_make_lock(fd_file(f), &fl, type);
error = security_file_lock(fd_file(f), fl.c.flc_type);
if (error)
- goto out_putf;
+ return error;
can_sleep = !(cmd & LOCK_NB);
if (can_sleep)
@@ -2177,9 +2175,6 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
error = locks_lock_file_wait(fd_file(f), &fl);
locks_release_private(&fl);
- out_putf:
- fdput(f);
-
return error;
}
diff --git a/fs/namespace.c b/fs/namespace.c
index 7c8248aca8bd..6fbb70272ebd 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3948,7 +3948,6 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
struct file *file;
struct path newmount;
struct mount *mnt;
- struct fd f;
unsigned int mnt_flags = 0;
long ret;
@@ -3976,19 +3975,18 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
return -EINVAL;
}
- f = fdget(fs_fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fs_fd);
+ if (fd_empty(f))
return -EBADF;
- ret = -EINVAL;
if (fd_file(f)->f_op != &fscontext_fops)
- goto err_fsfd;
+ return -EINVAL;
fc = fd_file(f)->private_data;
ret = mutex_lock_interruptible(&fc->uapi_mutex);
if (ret < 0)
- goto err_fsfd;
+ return ret;
/* There must be a valid superblock or we can't mount it */
ret = -EINVAL;
@@ -4055,8 +4053,6 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
path_put(&newmount);
err_unlock:
mutex_unlock(&fc->uapi_mutex);
-err_fsfd:
- fdput(f);
return ret;
}
@@ -4507,10 +4503,8 @@ static int do_mount_setattr(struct path *path, struct mount_kattr *kattr)
static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
struct mount_kattr *kattr, unsigned int flags)
{
- int err = 0;
struct ns_common *ns;
struct user_namespace *mnt_userns;
- struct fd f;
if (!((attr->attr_set | attr->attr_clr) & MOUNT_ATTR_IDMAP))
return 0;
@@ -4526,20 +4520,16 @@ static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
if (attr->userns_fd > INT_MAX)
return -EINVAL;
- f = fdget(attr->userns_fd);
- if (!fd_file(f))
+ CLASS(fd, f)(attr->userns_fd);
+ if (fd_empty(f))
return -EBADF;
- if (!proc_ns_file(fd_file(f))) {
- err = -EINVAL;
- goto out_fput;
- }
+ if (!proc_ns_file(fd_file(f)))
+ return -EINVAL;
ns = get_proc_ns(file_inode(fd_file(f)));
- if (ns->ops->type != CLONE_NEWUSER) {
- err = -EINVAL;
- goto out_fput;
- }
+ if (ns->ops->type != CLONE_NEWUSER)
+ return -EINVAL;
/*
* The initial idmapping cannot be used to create an idmapped
@@ -4550,22 +4540,15 @@ static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
* result.
*/
mnt_userns = container_of(ns, struct user_namespace, ns);
- if (mnt_userns == &init_user_ns) {
- err = -EPERM;
- goto out_fput;
- }
+ if (mnt_userns == &init_user_ns)
+ return -EPERM;
/* We're not controlling the target namespace. */
- if (!ns_capable(mnt_userns, CAP_SYS_ADMIN)) {
- err = -EPERM;
- goto out_fput;
- }
+ if (!ns_capable(mnt_userns, CAP_SYS_ADMIN))
+ return -EPERM;
kattr->mnt_userns = get_user_ns(mnt_userns);
-
-out_fput:
- fdput(f);
- return err;
+ return 0;
}
static int build_mount_kattr(const struct mount_attr *attr, size_t usize,
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 13454e5fd3fb..7b5abc1b8c8f 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1003,22 +1003,17 @@ static int fanotify_find_path(int dfd, const char __user *filename,
dfd, filename, flags);
if (filename == NULL) {
- struct fd f = fdget(dfd);
+ CLASS(fd, f)(dfd);
- ret = -EBADF;
- if (!fd_file(f))
- goto out;
+ if (fd_empty(f))
+ return -EBADF;
- ret = -ENOTDIR;
if ((flags & FAN_MARK_ONLYDIR) &&
- !(S_ISDIR(file_inode(fd_file(f))->i_mode))) {
- fdput(f);
- goto out;
- }
+ !(S_ISDIR(file_inode(fd_file(f))->i_mode)))
+ return -ENOTDIR;
*path = fd_file(f)->f_path;
path_get(path);
- fdput(f);
} else {
unsigned int lookup_flags = 0;
@@ -1682,7 +1677,6 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
struct inode *inode = NULL;
struct vfsmount *mnt = NULL;
struct fsnotify_group *group;
- struct fd f;
struct path path;
struct fan_fsid __fsid, *fsid = NULL;
u32 valid_mask = FANOTIFY_EVENTS | FANOTIFY_EVENT_FLAGS;
@@ -1752,14 +1746,13 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
umask = FANOTIFY_EVENT_FLAGS;
}
- f = fdget(fanotify_fd);
- if (unlikely(!fd_file(f)))
+ CLASS(fd, f)(fanotify_fd);
+ if (fd_empty(f))
return -EBADF;
/* verify that this is indeed an fanotify instance */
- ret = -EINVAL;
if (unlikely(fd_file(f)->f_op != &fanotify_fops))
- goto fput_and_out;
+ return -EINVAL;
group = fd_file(f)->private_data;
/*
@@ -1767,23 +1760,21 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
* marks. This also includes setting up such marks by a group that
* was initialized by an unprivileged user.
*/
- ret = -EPERM;
if ((!capable(CAP_SYS_ADMIN) ||
FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&
mark_type != FAN_MARK_INODE)
- goto fput_and_out;
+ return -EPERM;
/*
* Permission events require minimum priority FAN_CLASS_CONTENT.
*/
- ret = -EINVAL;
if (mask & FANOTIFY_PERM_EVENTS &&
group->priority < FSNOTIFY_PRIO_CONTENT)
- goto fput_and_out;
+ return -EINVAL;
if (mask & FAN_FS_ERROR &&
mark_type != FAN_MARK_FILESYSTEM)
- goto fput_and_out;
+ return -EINVAL;
/*
* Evictable is only relevant for inode marks, because only inode object
@@ -1791,7 +1782,7 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
*/
if (flags & FAN_MARK_EVICTABLE &&
mark_type != FAN_MARK_INODE)
- goto fput_and_out;
+ return -EINVAL;
/*
* Events that do not carry enough information to report
@@ -1803,7 +1794,7 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
if (mask & ~(FANOTIFY_FD_EVENTS|FANOTIFY_EVENT_FLAGS) &&
(!fid_mode || mark_type == FAN_MARK_MOUNT))
- goto fput_and_out;
+ return -EINVAL;
/*
* FAN_RENAME uses special info type records to report the old and
@@ -1811,23 +1802,22 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
* useful and was not implemented.
*/
if (mask & FAN_RENAME && !(fid_mode & FAN_REPORT_NAME))
- goto fput_and_out;
+ return -EINVAL;
if (mark_cmd == FAN_MARK_FLUSH) {
- ret = 0;
if (mark_type == FAN_MARK_MOUNT)
fsnotify_clear_vfsmount_marks_by_group(group);
else if (mark_type == FAN_MARK_FILESYSTEM)
fsnotify_clear_sb_marks_by_group(group);
else
fsnotify_clear_inode_marks_by_group(group);
- goto fput_and_out;
+ return 0;
}
ret = fanotify_find_path(dfd, pathname, &path, flags,
(mask & ALL_FSNOTIFY_EVENTS), obj_type);
if (ret)
- goto fput_and_out;
+ return ret;
if (mark_cmd == FAN_MARK_ADD) {
ret = fanotify_events_supported(group, &path, mask, flags);
@@ -1906,8 +1896,6 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
path_put_and_out:
path_put(&path);
-fput_and_out:
- fdput(f);
return ret;
}
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index c7e451d5bd51..f46ec6afee3c 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -732,7 +732,6 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
struct fsnotify_group *group;
struct inode *inode;
struct path path;
- struct fd f;
int ret;
unsigned flags = 0;
@@ -752,21 +751,17 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
if (unlikely(!(mask & ALL_INOTIFY_BITS)))
return -EINVAL;
- f = fdget(fd);
- if (unlikely(!fd_file(f)))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return -EBADF;
/* IN_MASK_ADD and IN_MASK_CREATE don't make sense together */
- if (unlikely((mask & IN_MASK_ADD) && (mask & IN_MASK_CREATE))) {
- ret = -EINVAL;
- goto fput_and_out;
- }
+ if (unlikely((mask & IN_MASK_ADD) && (mask & IN_MASK_CREATE)))
+ return -EINVAL;
/* verify that this is indeed an inotify instance */
- if (unlikely(fd_file(f)->f_op != &inotify_fops)) {
- ret = -EINVAL;
- goto fput_and_out;
- }
+ if (unlikely(fd_file(f)->f_op != &inotify_fops))
+ return -EINVAL;
if (!(mask & IN_DONT_FOLLOW))
flags |= LOOKUP_FOLLOW;
@@ -776,7 +771,7 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
ret = inotify_find_inode(pathname, &path, flags,
(mask & IN_ALL_EVENTS));
if (ret)
- goto fput_and_out;
+ return ret;
/* inode held in place by reference to path; group by fget on fd */
inode = path.dentry->d_inode;
@@ -785,8 +780,6 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
/* create/update an inode mark */
ret = inotify_update_watch(group, inode, mask);
path_put(&path);
-fput_and_out:
- fdput(f);
return ret;
}
@@ -794,33 +787,26 @@ SYSCALL_DEFINE2(inotify_rm_watch, int, fd, __s32, wd)
{
struct fsnotify_group *group;
struct inotify_inode_mark *i_mark;
- struct fd f;
- int ret = -EINVAL;
+ CLASS(fd, f)(fd);
- f = fdget(fd);
- if (unlikely(!fd_file(f)))
+ if (fd_empty(f))
return -EBADF;
/* verify that this is indeed an inotify instance */
if (unlikely(fd_file(f)->f_op != &inotify_fops))
- goto out;
+ return -EINVAL;
group = fd_file(f)->private_data;
i_mark = inotify_idr_find(group, wd);
if (unlikely(!i_mark))
- goto out;
-
- ret = 0;
+ return -EINVAL;
fsnotify_destroy_mark(&i_mark->fsn_mark, group);
/* match ref taken by inotify_idr_find */
fsnotify_put_mark(&i_mark->fsn_mark);
-
-out:
- fdput(f);
- return ret;
+ return 0;
}
/*
diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 4b9f45d7049e..08d9ac1b137f 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -1765,7 +1765,6 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
long fd;
int sectsize;
char *p = (char *)page;
- struct fd f;
ssize_t ret = -EINVAL;
int live_threshold;
@@ -1784,23 +1783,23 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
if (fd < 0 || fd >= INT_MAX)
goto out;
- f = fdget(fd);
- if (fd_file(f) == NULL)
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
goto out;
if (reg->hr_blocks == 0 || reg->hr_start_block == 0 ||
reg->hr_block_bytes == 0)
- goto out2;
+ goto out;
if (!S_ISBLK(fd_file(f)->f_mapping->host->i_mode))
- goto out2;
+ goto out;
reg->hr_bdev_file = bdev_file_open_by_dev(fd_file(f)->f_mapping->host->i_rdev,
BLK_OPEN_WRITE | BLK_OPEN_READ, NULL, NULL);
if (IS_ERR(reg->hr_bdev_file)) {
ret = PTR_ERR(reg->hr_bdev_file);
reg->hr_bdev_file = NULL;
- goto out2;
+ goto out;
}
sectsize = bdev_logical_block_size(reg_bdev(reg));
@@ -1906,8 +1905,6 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
fput(reg->hr_bdev_file);
reg->hr_bdev_file = NULL;
}
-out2:
- fdput(f);
out:
return ret;
}
diff --git a/fs/open.c b/fs/open.c
index 4cb5e12e84a5..71e166e0907c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -187,19 +187,13 @@ long do_ftruncate(struct file *file, loff_t length, int small)
long do_sys_ftruncate(unsigned int fd, loff_t length, int small)
{
- struct fd f;
- int error;
-
if (length < 0)
return -EINVAL;
- f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return -EBADF;
- error = do_ftruncate(fd_file(f), length, small);
-
- fdput(f);
- return error;
+ return do_ftruncate(fd_file(f), length, small);
}
SYSCALL_DEFINE2(ftruncate, unsigned int, fd, unsigned long, length)
@@ -346,14 +340,12 @@ EXPORT_SYMBOL_GPL(vfs_fallocate);
int ksys_fallocate(int fd, int mode, loff_t offset, loff_t len)
{
- struct fd f = fdget(fd);
- int error = -EBADF;
+ CLASS(fd, f)(fd);
- if (fd_file(f)) {
- error = vfs_fallocate(fd_file(f), mode, offset, len);
- fdput(f);
- }
- return error;
+ if (fd_empty(f))
+ return -EBADF;
+
+ return vfs_fallocate(fd_file(f), mode, offset, len);
}
SYSCALL_DEFINE4(fallocate, int, fd, int, mode, loff_t, offset, loff_t, len)
@@ -663,14 +655,12 @@ int vfs_fchmod(struct file *file, umode_t mode)
SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
{
- struct fd f = fdget(fd);
- int err = -EBADF;
+ CLASS(fd, f)(fd);
- if (fd_file(f)) {
- err = vfs_fchmod(fd_file(f), mode);
- fdput(f);
- }
- return err;
+ if (fd_empty(f))
+ return -EBADF;
+
+ return vfs_fchmod(fd_file(f), mode);
}
static int do_fchmodat(int dfd, const char __user *filename, umode_t mode,
@@ -857,14 +847,12 @@ int vfs_fchown(struct file *file, uid_t user, gid_t group)
int ksys_fchown(unsigned int fd, uid_t user, gid_t group)
{
- struct fd f = fdget(fd);
- int error = -EBADF;
+ CLASS(fd, f)(fd);
- if (fd_file(f)) {
- error = vfs_fchown(fd_file(f), user, group);
- fdput(f);
- }
- return error;
+ if (fd_empty(f))
+ return -EBADF;
+
+ return vfs_fchown(fd_file(f), user, group);
}
SYSCALL_DEFINE3(fchown, unsigned int, fd, uid_t, user, gid_t, group)
diff --git a/fs/read_write.c b/fs/read_write.c
index 6d49202e2507..4d3067381915 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -652,21 +652,17 @@ SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
loff_t pos)
{
- struct fd f;
- ssize_t ret = -EBADF;
-
if (pos < 0)
return -EINVAL;
- f = fdget(fd);
- if (fd_file(f)) {
- ret = -ESPIPE;
- if (fd_file(f)->f_mode & FMODE_PREAD)
- ret = vfs_read(fd_file(f), buf, count, &pos);
- fdput(f);
- }
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
- return ret;
+ if (fd_file(f)->f_mode & FMODE_PREAD)
+ return vfs_read(fd_file(f), buf, count, &pos);
+
+ return -ESPIPE;
}
SYSCALL_DEFINE4(pread64, unsigned int, fd, char __user *, buf,
@@ -686,21 +682,17 @@ COMPAT_SYSCALL_DEFINE5(pread64, unsigned int, fd, char __user *, buf,
ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf,
size_t count, loff_t pos)
{
- struct fd f;
- ssize_t ret = -EBADF;
-
if (pos < 0)
return -EINVAL;
- f = fdget(fd);
- if (fd_file(f)) {
- ret = -ESPIPE;
- if (fd_file(f)->f_mode & FMODE_PWRITE)
- ret = vfs_write(fd_file(f), buf, count, &pos);
- fdput(f);
- }
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
- return ret;
+ if (fd_file(f)->f_mode & FMODE_PWRITE)
+ return vfs_write(fd_file(f), buf, count, &pos);
+
+ return -ESPIPE;
}
SYSCALL_DEFINE4(pwrite64, unsigned int, fd, const char __user *, buf,
@@ -1028,18 +1020,16 @@ static inline loff_t pos_from_hilo(unsigned long high, unsigned long low)
static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
unsigned long vlen, loff_t pos, rwf_t flags)
{
- struct fd f;
ssize_t ret = -EBADF;
if (pos < 0)
return -EINVAL;
- f = fdget(fd);
- if (fd_file(f)) {
+ CLASS(fd, f)(fd);
+ if (!fd_empty(f)) {
ret = -ESPIPE;
if (fd_file(f)->f_mode & FMODE_PREAD)
ret = vfs_readv(fd_file(f), vec, vlen, &pos, flags);
- fdput(f);
}
if (ret > 0)
@@ -1051,18 +1041,16 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
unsigned long vlen, loff_t pos, rwf_t flags)
{
- struct fd f;
ssize_t ret = -EBADF;
if (pos < 0)
return -EINVAL;
- f = fdget(fd);
- if (fd_file(f)) {
+ CLASS(fd, f)(fd);
+ if (!fd_empty(f)) {
ret = -ESPIPE;
if (fd_file(f)->f_mode & FMODE_PWRITE)
ret = vfs_writev(fd_file(f), vec, vlen, &pos, flags);
- fdput(f);
}
if (ret > 0)
@@ -1214,7 +1202,6 @@ COMPAT_SYSCALL_DEFINE6(pwritev2, compat_ulong_t, fd,
static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
size_t count, loff_t max)
{
- struct fd in, out;
struct inode *in_inode, *out_inode;
struct pipe_inode_info *opipe;
loff_t pos;
@@ -1225,35 +1212,32 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
/*
* Get input file, and verify that it is ok..
*/
- retval = -EBADF;
- in = fdget(in_fd);
- if (!fd_file(in))
- goto out;
+ CLASS(fd, in)(in_fd);
+ if (fd_empty(in))
+ return -EBADF;
if (!(fd_file(in)->f_mode & FMODE_READ))
- goto fput_in;
- retval = -ESPIPE;
+ return -EBADF;
if (!ppos) {
pos = fd_file(in)->f_pos;
} else {
pos = *ppos;
if (!(fd_file(in)->f_mode & FMODE_PREAD))
- goto fput_in;
+ return -ESPIPE;
}
retval = rw_verify_area(READ, fd_file(in), &pos, count);
if (retval < 0)
- goto fput_in;
+ return retval;
if (count > MAX_RW_COUNT)
count = MAX_RW_COUNT;
/*
* Get output file, and verify that it is ok..
*/
- retval = -EBADF;
- out = fdget(out_fd);
- if (!fd_file(out))
- goto fput_in;
+ CLASS(fd, out)(out_fd);
+ if (fd_empty(out))
+ return -EBADF;
if (!(fd_file(out)->f_mode & FMODE_WRITE))
- goto fput_out;
+ return -EBADF;
in_inode = file_inode(fd_file(in));
out_inode = file_inode(fd_file(out));
out_pos = fd_file(out)->f_pos;
@@ -1262,9 +1246,8 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
max = min(in_inode->i_sb->s_maxbytes, out_inode->i_sb->s_maxbytes);
if (unlikely(pos + count > max)) {
- retval = -EOVERFLOW;
if (pos >= max)
- goto fput_out;
+ return -EOVERFLOW;
count = max - pos;
}
@@ -1283,7 +1266,7 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
if (!opipe) {
retval = rw_verify_area(WRITE, fd_file(out), &out_pos, count);
if (retval < 0)
- goto fput_out;
+ return retval;
retval = do_splice_direct(fd_file(in), &pos, fd_file(out), &out_pos,
count, fl);
} else {
@@ -1309,12 +1292,6 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
inc_syscw(current);
if (pos > max)
retval = -EOVERFLOW;
-
-fput_out:
- fdput(out);
-fput_in:
- fdput(in);
-out:
return retval;
}
@@ -1570,36 +1547,32 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
{
loff_t pos_in;
loff_t pos_out;
- struct fd f_in;
- struct fd f_out;
ssize_t ret = -EBADF;
- f_in = fdget(fd_in);
- if (!fd_file(f_in))
- goto out2;
+ CLASS(fd, f_in)(fd_in);
+ if (fd_empty(f_in))
+ return -EBADF;
- f_out = fdget(fd_out);
- if (!fd_file(f_out))
- goto out1;
+ CLASS(fd, f_out)(fd_out);
+ if (fd_empty(f_out))
+ return -EBADF;
- ret = -EFAULT;
if (off_in) {
if (copy_from_user(&pos_in, off_in, sizeof(loff_t)))
- goto out;
+ return -EFAULT;
} else {
pos_in = fd_file(f_in)->f_pos;
}
if (off_out) {
if (copy_from_user(&pos_out, off_out, sizeof(loff_t)))
- goto out;
+ return -EFAULT;
} else {
pos_out = fd_file(f_out)->f_pos;
}
- ret = -EINVAL;
if (flags != 0)
- goto out;
+ return -EINVAL;
ret = vfs_copy_file_range(fd_file(f_in), pos_in, fd_file(f_out), pos_out, len,
flags);
@@ -1621,12 +1594,6 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
fd_file(f_out)->f_pos = pos_out;
}
}
-
-out:
- fdput(f_out);
-out1:
- fdput(f_in);
-out2:
return ret;
}
diff --git a/fs/remap_range.c b/fs/remap_range.c
index 4403d5c68fcb..26afbbbfb10c 100644
--- a/fs/remap_range.c
+++ b/fs/remap_range.c
@@ -536,20 +536,19 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
}
for (i = 0, info = same->info; i < count; i++, info++) {
- struct fd dst_fd = fdget(info->dest_fd);
- struct file *dst_file = fd_file(dst_fd);
+ CLASS(fd, dst_fd)(info->dest_fd);
- if (!dst_file) {
+ if (fd_empty(dst_fd)) {
info->status = -EBADF;
goto next_loop;
}
if (info->reserved) {
info->status = -EINVAL;
- goto next_fdput;
+ goto next_loop;
}
- deduped = vfs_dedupe_file_range_one(file, off, dst_file,
+ deduped = vfs_dedupe_file_range_one(file, off, fd_file(dst_fd),
info->dest_offset, len,
REMAP_FILE_CAN_SHORTEN);
if (deduped == -EBADE)
@@ -559,8 +558,6 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
else
info->bytes_deduped = len;
-next_fdput:
- fdput(dst_fd);
next_loop:
if (fatal_signal_pending(current))
break;
diff --git a/fs/select.c b/fs/select.c
index 97e1009dde00..0befca98af60 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -525,19 +525,16 @@ static noinline_for_stack int do_select(int n, fd_set_bits *fds, struct timespec
}
for (j = 0; j < BITS_PER_LONG; ++j, ++i, bit <<= 1) {
- struct fd f;
if (i >= n)
break;
if (!(bit & all_bits))
continue;
mask = EPOLLNVAL;
- f = fdget(i);
- if (fd_file(f)) {
+ CLASS(fd, f)(i);
+ if (!fd_empty(f)) {
wait_key_set(wait, in, out, bit,
busy_flag);
mask = vfs_poll(fd_file(f), wait);
-
- fdput(f);
}
if ((mask & POLLIN_SET) && (in & bit)) {
res_in |= bit;
@@ -858,13 +855,12 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
{
int fd = pollfd->fd;
__poll_t mask = 0, filter;
- struct fd f;
if (fd < 0)
goto out;
mask = EPOLLNVAL;
- f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
goto out;
/* userland u16 ->events contains POLL... bitmap */
@@ -874,7 +870,6 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
if (mask & busy_flag)
*can_busy_poll = true;
mask &= filter; /* Mask out unneeded events. */
- fdput(f);
out:
/* ... and so does ->revents */
diff --git a/fs/signalfd.c b/fs/signalfd.c
index c39cf00ab28a..cc7af00b8527 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -292,20 +292,17 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
*/
fd_install(ufd, file);
} else {
- struct fd f = fdget(ufd);
- if (!fd_file(f))
+ CLASS(fd, f)(ufd);
+ if (fd_empty(f))
return -EBADF;
ctx = fd_file(f)->private_data;
- if (fd_file(f)->f_op != &signalfd_fops) {
- fdput(f);
+ if (fd_file(f)->f_op != &signalfd_fops)
return -EINVAL;
- }
spin_lock_irq(&current->sighand->siglock);
ctx->sigmask = *mask;
spin_unlock_irq(&current->sighand->siglock);
wake_up(&current->sighand->signalfd_wqh);
- fdput(f);
}
return ufd;
diff --git a/fs/smb/client/ioctl.c b/fs/smb/client/ioctl.c
index 94bf2e5014d9..6d9df3646df3 100644
--- a/fs/smb/client/ioctl.c
+++ b/fs/smb/client/ioctl.c
@@ -72,7 +72,6 @@ static long cifs_ioctl_copychunk(unsigned int xid, struct file *dst_file,
unsigned long srcfd)
{
int rc;
- struct fd src_file;
struct inode *src_inode;
cifs_dbg(FYI, "ioctl copychunk range\n");
@@ -89,8 +88,8 @@ static long cifs_ioctl_copychunk(unsigned int xid, struct file *dst_file,
return rc;
}
- src_file = fdget(srcfd);
- if (!fd_file(src_file)) {
+ CLASS(fd, src_file)(srcfd);
+ if (fd_empty(src_file)) {
rc = -EBADF;
goto out_drop_write;
}
@@ -98,20 +97,18 @@ static long cifs_ioctl_copychunk(unsigned int xid, struct file *dst_file,
if (fd_file(src_file)->f_op->unlocked_ioctl != cifs_ioctl) {
rc = -EBADF;
cifs_dbg(VFS, "src file seems to be from a different filesystem type\n");
- goto out_fput;
+ goto out_drop_write;
}
src_inode = file_inode(fd_file(src_file));
rc = -EINVAL;
if (S_ISDIR(src_inode->i_mode))
- goto out_fput;
+ goto out_drop_write;
rc = cifs_file_copychunk_range(xid, fd_file(src_file), 0, dst_file, 0,
src_inode->i_size, 0);
if (rc > 0)
rc = 0;
-out_fput:
- fdput(src_file);
out_drop_write:
mnt_drop_write_file(dst_file);
return rc;
diff --git a/fs/splice.c b/fs/splice.c
index 06232d7e505f..42aa7bc46be5 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1626,8 +1626,6 @@ SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
error = vmsplice_to_user(fd_file(f), &iter, flags);
kfree(iov);
-out_fdput:
- fdput(f);
return error;
}
@@ -1635,27 +1633,22 @@ SYSCALL_DEFINE6(splice, int, fd_in, loff_t __user *, off_in,
int, fd_out, loff_t __user *, off_out,
size_t, len, unsigned int, flags)
{
- struct fd in, out;
- ssize_t error;
-
if (unlikely(!len))
return 0;
if (unlikely(flags & ~SPLICE_F_ALL))
return -EINVAL;
- error = -EBADF;
- in = fdget(fd_in);
- if (fd_file(in)) {
- out = fdget(fd_out);
- if (fd_file(out)) {
- error = __do_splice(fd_file(in), off_in, fd_file(out), off_out,
+ CLASS(fd, in)(fd_in);
+ if (fd_empty(in))
+ return -EBADF;
+
+ CLASS(fd, out)(fd_out);
+ if (fd_empty(out))
+ return -EBADF;
+
+ return __do_splice(fd_file(in), off_in, fd_file(out), off_out,
len, flags);
- fdput(out);
- }
- fdput(in);
- }
- return error;
}
/*
@@ -2005,25 +1998,19 @@ ssize_t do_tee(struct file *in, struct file *out, size_t len,
SYSCALL_DEFINE4(tee, int, fdin, int, fdout, size_t, len, unsigned int, flags)
{
- struct fd in, out;
- ssize_t error;
-
if (unlikely(flags & ~SPLICE_F_ALL))
return -EINVAL;
if (unlikely(!len))
return 0;
- error = -EBADF;
- in = fdget(fdin);
- if (fd_file(in)) {
- out = fdget(fdout);
- if (fd_file(out)) {
- error = do_tee(fd_file(in), fd_file(out), len, flags);
- fdput(out);
- }
- fdput(in);
- }
+ CLASS(fd, in)(fdin);
+ if (fd_empty(in))
+ return -EBADF;
- return error;
+ CLASS(fd, out)(fdout);
+ if (fd_empty(out))
+ return -EBADF;
+
+ return do_tee(fd_file(in), fd_file(out), len, flags);
}
diff --git a/fs/sync.c b/fs/sync.c
index 67df255eb189..2955cd4c77a3 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -148,11 +148,11 @@ void emergency_sync(void)
*/
SYSCALL_DEFINE1(syncfs, int, fd)
{
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
struct super_block *sb;
int ret, ret2;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
sb = fd_file(f)->f_path.dentry->d_sb;
@@ -162,7 +162,6 @@ SYSCALL_DEFINE1(syncfs, int, fd)
ret2 = errseq_check_and_advance(&sb->s_wb_err, &fd_file(f)->f_sb_err);
- fdput(f);
return ret ? ret : ret2;
}
@@ -205,14 +204,12 @@ EXPORT_SYMBOL(vfs_fsync);
static int do_fsync(unsigned int fd, int datasync)
{
- struct fd f = fdget(fd);
- int ret = -EBADF;
+ CLASS(fd, f)(fd);
- if (fd_file(f)) {
- ret = vfs_fsync(fd_file(f), datasync);
- fdput(f);
- }
- return ret;
+ if (fd_empty(f))
+ return -EBADF;
+
+ return vfs_fsync(fd_file(f), datasync);
}
SYSCALL_DEFINE1(fsync, unsigned int, fd)
@@ -355,16 +352,12 @@ int sync_file_range(struct file *file, loff_t offset, loff_t nbytes,
int ksys_sync_file_range(int fd, loff_t offset, loff_t nbytes,
unsigned int flags)
{
- int ret;
- struct fd f;
+ CLASS(fd, f)(fd);
- ret = -EBADF;
- f = fdget(fd);
- if (fd_file(f))
- ret = sync_file_range(fd_file(f), offset, nbytes, flags);
+ if (fd_empty(f))
+ return -EBADF;
- fdput(f);
- return ret;
+ return sync_file_range(fd_file(f), offset, nbytes, flags);
}
SYSCALL_DEFINE4(sync_file_range, int, fd, loff_t, offset, loff_t, nbytes,
diff --git a/fs/utimes.c b/fs/utimes.c
index 99b26f792b89..c7c7958e57b2 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -108,18 +108,13 @@ static int do_utimes_path(int dfd, const char __user *filename,
static int do_utimes_fd(int fd, struct timespec64 *times, int flags)
{
- struct fd f;
- int error;
-
if (flags)
return -EINVAL;
- f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return -EBADF;
- error = vfs_utimes(&fd_file(f)->f_path, times);
- fdput(f);
- return error;
+ return vfs_utimes(&fd_file(f)->f_path, times);
}
/*
diff --git a/fs/xattr.c b/fs/xattr.c
index d4f84f57e703..980dc1710e97 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -697,11 +697,11 @@ SYSCALL_DEFINE5(lsetxattr, const char __user *, pathname,
SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
const void __user *,value, size_t, size, int, flags)
{
- struct fd f = fdget(fd);
- int error = -EBADF;
+ CLASS(fd, f)(fd);
+ int error;
- if (!fd_file(f))
- return error;
+ if (fd_empty(f))
+ return -EBADF;
audit_file(fd_file(f));
error = mnt_want_write_file(fd_file(f));
if (!error) {
@@ -710,7 +710,6 @@ SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
value, size, flags);
mnt_drop_write_file(fd_file(f));
}
- fdput(f);
return error;
}
@@ -808,16 +807,13 @@ SYSCALL_DEFINE4(lgetxattr, const char __user *, pathname,
SYSCALL_DEFINE4(fgetxattr, int, fd, const char __user *, name,
void __user *, value, size_t, size)
{
- struct fd f = fdget(fd);
- ssize_t error = -EBADF;
+ CLASS(fd, f)(fd);
- if (!fd_file(f))
- return error;
+ if (fd_empty(f))
+ return -EBADF;
audit_file(fd_file(f));
- error = getxattr(file_mnt_idmap(fd_file(f)), fd_file(f)->f_path.dentry,
+ return getxattr(file_mnt_idmap(fd_file(f)), fd_file(f)->f_path.dentry,
name, value, size);
- fdput(f);
- return error;
}
/*
@@ -884,15 +880,12 @@ SYSCALL_DEFINE3(llistxattr, const char __user *, pathname, char __user *, list,
SYSCALL_DEFINE3(flistxattr, int, fd, char __user *, list, size_t, size)
{
- struct fd f = fdget(fd);
- ssize_t error = -EBADF;
+ CLASS(fd, f)(fd);
- if (!fd_file(f))
- return error;
+ if (fd_empty(f))
+ return -EBADF;
audit_file(fd_file(f));
- error = listxattr(fd_file(f)->f_path.dentry, list, size);
- fdput(f);
- return error;
+ return listxattr(fd_file(f)->f_path.dentry, list, size);
}
/*
@@ -953,11 +946,11 @@ SYSCALL_DEFINE2(lremovexattr, const char __user *, pathname,
SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
{
- struct fd f = fdget(fd);
- int error = -EBADF;
+ CLASS(fd, f)(fd);
+ int error;
- if (!fd_file(f))
- return error;
+ if (fd_empty(f))
+ return -EBADF;
audit_file(fd_file(f));
error = mnt_want_write_file(fd_file(f));
if (!error) {
@@ -965,7 +958,6 @@ SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
fd_file(f)->f_path.dentry, name);
mnt_drop_write_file(fd_file(f));
}
- fdput(f);
return error;
}
diff --git a/fs/xfs/xfs_exchrange.c b/fs/xfs/xfs_exchrange.c
index 9790e0f45d14..35b9b58a4f6f 100644
--- a/fs/xfs/xfs_exchrange.c
+++ b/fs/xfs/xfs_exchrange.c
@@ -778,8 +778,6 @@ xfs_ioc_exchange_range(
.file2 = file,
};
struct xfs_exchange_range args;
- struct fd file1;
- int error;
if (copy_from_user(&args, argp, sizeof(args)))
return -EFAULT;
@@ -793,12 +791,10 @@ xfs_ioc_exchange_range(
fxr.length = args.length;
fxr.flags = args.flags;
- file1 = fdget(args.file1_fd);
- if (!fd_file(file1))
+ CLASS(fd, file1)(args.file1_fd);
+ if (fd_empty(file1))
return -EBADF;
fxr.file1 = fd_file(file1);
- error = xfs_exchange_range(&fxr);
- fdput(file1);
- return error;
+ return xfs_exchange_range(&fxr);
}
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index c8b432fb7b40..4458ddf5dec5 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1060,41 +1060,29 @@ xfs_ioc_swapext(
xfs_swapext_t *sxp)
{
xfs_inode_t *ip, *tip;
- struct fd f, tmp;
- int error = 0;
/* Pull information for the target fd */
- f = fdget((int)sxp->sx_fdtarget);
- if (!fd_file(f)) {
- error = -EINVAL;
- goto out;
- }
+ CLASS(fd, f)((int)sxp->sx_fdtarget);
+ if (fd_empty(f))
+ return -EINVAL;
if (!(fd_file(f)->f_mode & FMODE_WRITE) ||
!(fd_file(f)->f_mode & FMODE_READ) ||
- (fd_file(f)->f_flags & O_APPEND)) {
- error = -EBADF;
- goto out_put_file;
- }
+ (fd_file(f)->f_flags & O_APPEND))
+ return -EBADF;
- tmp = fdget((int)sxp->sx_fdtmp);
- if (!fd_file(tmp)) {
- error = -EINVAL;
- goto out_put_file;
- }
+ CLASS(fd, tmp)((int)sxp->sx_fdtmp);
+ if (fd_empty(tmp))
+ return -EINVAL;
if (!(fd_file(tmp)->f_mode & FMODE_WRITE) ||
!(fd_file(tmp)->f_mode & FMODE_READ) ||
- (fd_file(tmp)->f_flags & O_APPEND)) {
- error = -EBADF;
- goto out_put_tmp_file;
- }
+ (fd_file(tmp)->f_flags & O_APPEND))
+ return -EBADF;
if (IS_SWAPFILE(file_inode(fd_file(f))) ||
- IS_SWAPFILE(file_inode(fd_file(tmp)))) {
- error = -EINVAL;
- goto out_put_tmp_file;
- }
+ IS_SWAPFILE(file_inode(fd_file(tmp))))
+ return -EINVAL;
/*
* We need to ensure that the fds passed in point to XFS inodes
@@ -1102,37 +1090,22 @@ xfs_ioc_swapext(
* control over what the user passes us here.
*/
if (fd_file(f)->f_op != &xfs_file_operations ||
- fd_file(tmp)->f_op != &xfs_file_operations) {
- error = -EINVAL;
- goto out_put_tmp_file;
- }
+ fd_file(tmp)->f_op != &xfs_file_operations)
+ return -EINVAL;
ip = XFS_I(file_inode(fd_file(f)));
tip = XFS_I(file_inode(fd_file(tmp)));
- if (ip->i_mount != tip->i_mount) {
- error = -EINVAL;
- goto out_put_tmp_file;
- }
-
- if (ip->i_ino == tip->i_ino) {
- error = -EINVAL;
- goto out_put_tmp_file;
- }
+ if (ip->i_mount != tip->i_mount)
+ return -EINVAL;
- if (xfs_is_shutdown(ip->i_mount)) {
- error = -EIO;
- goto out_put_tmp_file;
- }
+ if (ip->i_ino == tip->i_ino)
+ return -EINVAL;
- error = xfs_swap_extents(ip, tip, sxp);
+ if (xfs_is_shutdown(ip->i_mount))
+ return -EIO;
- out_put_tmp_file:
- fdput(tmp);
- out_put_file:
- fdput(f);
- out:
- return error;
+ return xfs_swap_extents(ip, tip, sxp);
}
static int
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
index ffa7d341bd95..26b1c14d5967 100644
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -105,29 +105,21 @@ static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
{
struct io_ring_ctx *ctx_attach;
struct io_sq_data *sqd;
- struct fd f;
+ CLASS(fd, f)(p->wq_fd);
- f = fdget(p->wq_fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-ENXIO);
- if (!io_is_uring_fops(fd_file(f))) {
- fdput(f);
+ if (!io_is_uring_fops(fd_file(f)))
return ERR_PTR(-EINVAL);
- }
ctx_attach = fd_file(f)->private_data;
sqd = ctx_attach->sq_data;
- if (!sqd) {
- fdput(f);
+ if (!sqd)
return ERR_PTR(-EINVAL);
- }
- if (sqd->task_tgid != current->tgid) {
- fdput(f);
+ if (sqd->task_tgid != current->tgid)
return ERR_PTR(-EPERM);
- }
refcount_inc(&sqd->refs);
- fdput(f);
return sqd;
}
@@ -415,16 +407,11 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
/* Retain compatibility with failing for an invalid attach attempt */
if ((ctx->flags & (IORING_SETUP_ATTACH_WQ | IORING_SETUP_SQPOLL)) ==
IORING_SETUP_ATTACH_WQ) {
- struct fd f;
-
- f = fdget(p->wq_fd);
- if (!fd_file(f))
+ CLASS(fd, f)(p->wq_fd);
+ if (fd_empty(f))
return -ENXIO;
- if (!io_is_uring_fops(fd_file(f))) {
- fdput(f);
+ if (!io_is_uring_fops(fd_file(f)))
return -EINVAL;
- }
- fdput(f);
}
if (ctx->flags & IORING_SETUP_SQPOLL) {
struct task_struct *tsk;
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 9133a52be69b..c72ef725e845 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1062,7 +1062,6 @@ static int do_mq_timedsend(mqd_t mqdes, const char __user *u_msg_ptr,
size_t msg_len, unsigned int msg_prio,
struct timespec64 *ts)
{
- struct fd f;
struct inode *inode;
struct ext_wait_queue wait;
struct ext_wait_queue *receiver;
@@ -1083,37 +1082,27 @@ static int do_mq_timedsend(mqd_t mqdes, const char __user *u_msg_ptr,
audit_mq_sendrecv(mqdes, msg_len, msg_prio, ts);
- f = fdget(mqdes);
- if (unlikely(!fd_file(f))) {
- ret = -EBADF;
- goto out;
- }
+ CLASS(fd, f)(mqdes);
+ if (fd_empty(f))
+ return -EBADF;
inode = file_inode(fd_file(f));
- if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
- ret = -EBADF;
- goto out_fput;
- }
+ if (unlikely(fd_file(f)->f_op != &mqueue_file_operations))
+ return -EBADF;
info = MQUEUE_I(inode);
audit_file(fd_file(f));
- if (unlikely(!(fd_file(f)->f_mode & FMODE_WRITE))) {
- ret = -EBADF;
- goto out_fput;
- }
+ if (unlikely(!(fd_file(f)->f_mode & FMODE_WRITE)))
+ return -EBADF;
- if (unlikely(msg_len > info->attr.mq_msgsize)) {
- ret = -EMSGSIZE;
- goto out_fput;
- }
+ if (unlikely(msg_len > info->attr.mq_msgsize))
+ return -EMSGSIZE;
/* First try to allocate memory, before doing anything with
* existing queues. */
msg_ptr = load_msg(u_msg_ptr, msg_len);
- if (IS_ERR(msg_ptr)) {
- ret = PTR_ERR(msg_ptr);
- goto out_fput;
- }
+ if (IS_ERR(msg_ptr))
+ return PTR_ERR(msg_ptr);
msg_ptr->m_ts = msg_len;
msg_ptr->m_type = msg_prio;
@@ -1171,9 +1160,6 @@ static int do_mq_timedsend(mqd_t mqdes, const char __user *u_msg_ptr,
out_free:
if (ret)
free_msg(msg_ptr);
-out_fput:
- fdput(f);
-out:
return ret;
}
@@ -1183,7 +1169,6 @@ static int do_mq_timedreceive(mqd_t mqdes, char __user *u_msg_ptr,
{
ssize_t ret;
struct msg_msg *msg_ptr;
- struct fd f;
struct inode *inode;
struct mqueue_inode_info *info;
struct ext_wait_queue wait;
@@ -1197,30 +1182,22 @@ static int do_mq_timedreceive(mqd_t mqdes, char __user *u_msg_ptr,
audit_mq_sendrecv(mqdes, msg_len, 0, ts);
- f = fdget(mqdes);
- if (unlikely(!fd_file(f))) {
- ret = -EBADF;
- goto out;
- }
+ CLASS(fd, f)(mqdes);
+ if (fd_empty(f))
+ return -EBADF;
inode = file_inode(fd_file(f));
- if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
- ret = -EBADF;
- goto out_fput;
- }
+ if (unlikely(fd_file(f)->f_op != &mqueue_file_operations))
+ return -EBADF;
info = MQUEUE_I(inode);
audit_file(fd_file(f));
- if (unlikely(!(fd_file(f)->f_mode & FMODE_READ))) {
- ret = -EBADF;
- goto out_fput;
- }
+ if (unlikely(!(fd_file(f)->f_mode & FMODE_READ)))
+ return -EBADF;
/* checks if buffer is big enough */
- if (unlikely(msg_len < info->attr.mq_msgsize)) {
- ret = -EMSGSIZE;
- goto out_fput;
- }
+ if (unlikely(msg_len < info->attr.mq_msgsize))
+ return -EMSGSIZE;
/*
* msg_insert really wants us to have a valid, spare node struct so
@@ -1274,9 +1251,6 @@ static int do_mq_timedreceive(mqd_t mqdes, char __user *u_msg_ptr,
}
free_msg(msg_ptr);
}
-out_fput:
- fdput(f);
-out:
return ret;
}
@@ -1451,21 +1425,18 @@ SYSCALL_DEFINE2(mq_notify, mqd_t, mqdes,
static int do_mq_getsetattr(int mqdes, struct mq_attr *new, struct mq_attr *old)
{
- struct fd f;
struct inode *inode;
struct mqueue_inode_info *info;
if (new && (new->mq_flags & (~O_NONBLOCK)))
return -EINVAL;
- f = fdget(mqdes);
- if (!fd_file(f))
+ CLASS(fd, f)(mqdes);
+ if (fd_empty(f))
return -EBADF;
- if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
- fdput(f);
+ if (unlikely(fd_file(f)->f_op != &mqueue_file_operations))
return -EBADF;
- }
inode = file_inode(fd_file(f));
info = MQUEUE_I(inode);
@@ -1489,7 +1460,6 @@ static int do_mq_getsetattr(int mqdes, struct mq_attr *new, struct mq_attr *old)
}
spin_unlock(&info->lock);
- fdput(f);
return 0;
}
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index a0e812c29c97..d0adca07b0b5 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7505,21 +7505,16 @@ int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
struct btf *btf_get_by_fd(int fd)
{
struct btf *btf;
- struct fd f;
+ CLASS(fd, f)(fd);
- f = fdget(fd);
-
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
- if (fd_file(f)->f_op != &btf_fops) {
- fdput(f);
+ if (fd_file(f)->f_op != &btf_fops)
return ERR_PTR(-EINVAL);
- }
btf = fd_file(f)->private_data;
refcount_inc(&btf->refcnt);
- fdput(f);
return btf;
}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 477bb89f03aa..fd833a2b7c1b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3234,20 +3234,16 @@ int bpf_link_new_fd(struct bpf_link *link)
struct bpf_link *bpf_link_get_from_fd(u32 ufd)
{
- struct fd f = fdget(ufd);
+ CLASS(fd, f)(ufd);
struct bpf_link *link;
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
- if (fd_file(f)->f_op != &bpf_link_fops) {
- fdput(f);
+ if (fd_file(f)->f_op != &bpf_link_fops)
return ERR_PTR(-EINVAL);
- }
link = fd_file(f)->private_data;
bpf_link_inc(link);
- fdput(f);
-
return link;
}
EXPORT_SYMBOL(bpf_link_get_from_fd);
@@ -4952,33 +4948,25 @@ static int bpf_link_get_info_by_fd(struct file *file,
static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
union bpf_attr __user *uattr)
{
- int ufd = attr->info.bpf_fd;
- struct fd f;
- int err;
-
if (CHECK_ATTR(BPF_OBJ_GET_INFO_BY_FD))
return -EINVAL;
- f = fdget(ufd);
- if (!fd_file(f))
+ CLASS(fd, f)(attr->info.bpf_fd);
+ if (fd_empty(f))
return -EBADFD;
if (fd_file(f)->f_op == &bpf_prog_fops)
- err = bpf_prog_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
+ return bpf_prog_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
uattr);
else if (fd_file(f)->f_op == &bpf_map_fops)
- err = bpf_map_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
+ return bpf_map_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
uattr);
else if (fd_file(f)->f_op == &btf_fops)
- err = bpf_btf_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr, uattr);
+ return bpf_btf_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr, uattr);
else if (fd_file(f)->f_op == &bpf_link_fops)
- err = bpf_link_get_info_by_fd(fd_file(f), fd_file(f)->private_data,
+ return bpf_link_get_info_by_fd(fd_file(f), fd_file(f)->private_data,
attr, uattr);
- else
- err = -EINVAL;
-
- fdput(f);
- return err;
+ return -EINVAL;
}
#define BPF_BTF_LOAD_LAST_FIELD btf_token_fd
diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
index 9a1d356e79ed..4164feec9a3b 100644
--- a/kernel/bpf/token.c
+++ b/kernel/bpf/token.c
@@ -117,17 +117,15 @@ int bpf_token_create(union bpf_attr *attr)
struct inode *inode;
struct file *file;
struct path path;
- struct fd f;
+ CLASS(fd, f)(attr->token_create.bpffs_fd);
umode_t mode;
int err, fd;
- f = fdget(attr->token_create.bpffs_fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
path = fd_file(f)->f_path;
path_get(&path);
- fdput(f);
if (path.dentry != path.mnt->mnt_sb->s_root) {
err = -EINVAL;
@@ -232,19 +230,16 @@ int bpf_token_create(union bpf_attr *attr)
struct bpf_token *bpf_token_get_from_fd(u32 ufd)
{
- struct fd f = fdget(ufd);
+ CLASS(fd, f)(ufd);
struct bpf_token *token;
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
- if (fd_file(f)->f_op != &bpf_token_fops) {
- fdput(f);
+ if (fd_file(f)->f_op != &bpf_token_fops)
return ERR_PTR(-EINVAL);
- }
token = fd_file(f)->private_data;
bpf_token_inc(token);
- fdput(f);
return token;
}
diff --git a/kernel/events/core.c b/kernel/events/core.c
index fd4621cd9c23..bc4910442642 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -930,22 +930,20 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event,
{
struct perf_cgroup *cgrp;
struct cgroup_subsys_state *css;
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
int ret = 0;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
css = css_tryget_online_from_dir(fd_file(f)->f_path.dentry,
&perf_event_cgrp_subsys);
- if (IS_ERR(css)) {
- ret = PTR_ERR(css);
- goto out;
- }
+ if (IS_ERR(css))
+ return PTR_ERR(css);
ret = perf_cgroup_ensure_storage(event, css);
if (ret)
- goto out;
+ return ret;
cgrp = container_of(css, struct perf_cgroup, css);
event->cgrp = cgrp;
@@ -959,8 +957,6 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event,
perf_detach_cgroup(event);
ret = -EINVAL;
}
-out:
- fdput(f);
return ret;
}
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index dc952c3b05af..c9d97ed20122 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -545,12 +545,12 @@ static void commit_nsset(struct nsset *nsset)
SYSCALL_DEFINE2(setns, int, fd, int, flags)
{
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
struct ns_common *ns = NULL;
struct nsset nsset = {};
int err = 0;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
if (proc_ns_file(fd_file(f))) {
@@ -580,7 +580,6 @@ SYSCALL_DEFINE2(setns, int, fd, int, flags)
}
put_nsset(&nsset);
out:
- fdput(f);
return err;
}
diff --git a/kernel/pid.c b/kernel/pid.c
index 2715afb77eab..115448e89c3e 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -536,11 +536,10 @@ EXPORT_SYMBOL_GPL(find_ge_pid);
struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
{
- struct fd f;
+ CLASS(fd, f)(fd);
struct pid *pid;
- f = fdget(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
pid = pidfd_pid(fd_file(f));
@@ -548,8 +547,6 @@ struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
get_pid(pid);
*flags = fd_file(f)->f_flags;
}
-
- fdput(f);
return pid;
}
@@ -747,23 +744,18 @@ SYSCALL_DEFINE3(pidfd_getfd, int, pidfd, int, fd,
unsigned int, flags)
{
struct pid *pid;
- struct fd f;
- int ret;
/* flags is currently unused - make sure it's unset */
if (flags)
return -EINVAL;
- f = fdget(pidfd);
- if (!fd_file(f))
+ CLASS(fd, f)(pidfd);
+ if (fd_empty(f))
return -EBADF;
pid = pidfd_pid(fd_file(f));
if (IS_ERR(pid))
- ret = PTR_ERR(pid);
- else
- ret = pidfd_getfd(pid, fd);
+ return PTR_ERR(pid);
- fdput(f);
- return ret;
+ return pidfd_getfd(pid, fd);
}
diff --git a/kernel/signal.c b/kernel/signal.c
index 6c438fd436d8..9f4949ac8a3c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3900,7 +3900,6 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
siginfo_t __user *, info, unsigned int, flags)
{
int ret;
- struct fd f;
struct pid *pid;
kernel_siginfo_t kinfo;
enum pid_type type;
@@ -3913,20 +3912,17 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
if (hweight32(flags & PIDFD_SEND_SIGNAL_FLAGS) > 1)
return -EINVAL;
- f = fdget(pidfd);
- if (!fd_file(f))
+ CLASS(fd, f)(pidfd);
+ if (fd_empty(f))
return -EBADF;
/* Is this a pidfd? */
pid = pidfd_to_pid(fd_file(f));
- if (IS_ERR(pid)) {
- ret = PTR_ERR(pid);
- goto err;
- }
+ if (IS_ERR(pid))
+ return PTR_ERR(pid);
- ret = -EINVAL;
if (!access_pidfd_pidns(pid))
- goto err;
+ return -EINVAL;
switch (flags) {
case 0:
@@ -3950,28 +3946,23 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
if (info) {
ret = copy_siginfo_from_user_any(&kinfo, info);
if (unlikely(ret))
- goto err;
+ return ret;
- ret = -EINVAL;
if (unlikely(sig != kinfo.si_signo))
- goto err;
+ return -EINVAL;
/* Only allow sending arbitrary signals to yourself. */
- ret = -EPERM;
if ((task_pid(current) != pid || type > PIDTYPE_TGID) &&
(kinfo.si_code >= 0 || kinfo.si_code == SI_TKILL))
- goto err;
+ return -EPERM;
} else {
prepare_kill_siginfo(sig, &kinfo, type);
}
if (type == PIDTYPE_PGID)
- ret = kill_pgrp_info(sig, &kinfo, pid);
+ return kill_pgrp_info(sig, &kinfo, pid);
else
- ret = kill_pid_info_type(sig, &kinfo, pid, type);
-err:
- fdput(f);
- return ret;
+ return kill_pid_info_type(sig, &kinfo, pid, type);
}
static int
diff --git a/kernel/sys.c b/kernel/sys.c
index a4be1e568ff5..243d58916899 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1911,12 +1911,11 @@ SYSCALL_DEFINE1(umask, int, mask)
static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
{
- struct fd exe;
+ CLASS(fd, exe)(fd);
struct inode *inode;
int err;
- exe = fdget(fd);
- if (!fd_file(exe))
+ if (fd_empty(exe))
return -EBADF;
inode = file_inode(fd_file(exe));
@@ -1926,18 +1925,14 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
* sure that this one is executable as well, to avoid breaking an
* overall picture.
*/
- err = -EACCES;
if (!S_ISREG(inode->i_mode) || path_noexec(&fd_file(exe)->f_path))
- goto exit;
+ return -EACCES;
err = file_permission(fd_file(exe), MAY_EXEC);
if (err)
- goto exit;
+ return err;
- err = replace_mm_exe_file(mm, fd_file(exe));
-exit:
- fdput(exe);
- return err;
+ return replace_mm_exe_file(mm, fd_file(exe));
}
/*
diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 0700f40c53ac..0cd680ccc7e5 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -411,15 +411,14 @@ static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
struct nlattr *na;
size_t size;
u32 fd;
- struct fd f;
na = info->attrs[CGROUPSTATS_CMD_ATTR_FD];
if (!na)
return -EINVAL;
fd = nla_get_u32(info->attrs[CGROUPSTATS_CMD_ATTR_FD]);
- f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return 0;
size = nla_total_size(sizeof(struct cgroupstats));
@@ -427,14 +426,13 @@ static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
rc = prepare_reply(info, CGROUPSTATS_CMD_NEW, &rep_skb,
size);
if (rc < 0)
- goto err;
+ return rc;
na = nla_reserve(rep_skb, CGROUPSTATS_TYPE_CGROUP_STATS,
sizeof(struct cgroupstats));
if (na == NULL) {
nlmsg_free(rep_skb);
- rc = -EMSGSIZE;
- goto err;
+ return -EMSGSIZE;
}
stats = nla_data(na);
@@ -443,14 +441,10 @@ static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
rc = cgroupstats_build(stats, fd_file(f)->f_path.dentry);
if (rc < 0) {
nlmsg_free(rep_skb);
- goto err;
+ return rc;
}
- rc = send_reply(rep_skb, info);
-
-err:
- fdput(f);
- return rc;
+ return send_reply(rep_skb, info);
}
static int cmd_attr_register_cpumask(struct genl_info *info)
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
index d36242fd4936..1895fbc32bcb 100644
--- a/kernel/watch_queue.c
+++ b/kernel/watch_queue.c
@@ -663,16 +663,14 @@ struct watch_queue *get_watch_queue(int fd)
{
struct pipe_inode_info *pipe;
struct watch_queue *wqueue = ERR_PTR(-EINVAL);
- struct fd f;
+ CLASS(fd, f)(fd);
- f = fdget(fd);
- if (fd_file(f)) {
+ if (!fd_empty(f)) {
pipe = get_pipe_info(fd_file(f), false);
if (pipe && pipe->watch_queue) {
wqueue = pipe->watch_queue;
kref_get(&wqueue->usage);
}
- fdput(f);
}
return wqueue;
diff --git a/mm/fadvise.c b/mm/fadvise.c
index 532dee205c6e..588fe76c5a14 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -190,16 +190,12 @@ EXPORT_SYMBOL(vfs_fadvise);
int ksys_fadvise64_64(int fd, loff_t offset, loff_t len, int advice)
{
- struct fd f = fdget(fd);
- int ret;
+ CLASS(fd, f)(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
- ret = vfs_fadvise(fd_file(f), offset, len, advice);
-
- fdput(f);
- return ret;
+ return vfs_fadvise(fd_file(f), offset, len, advice);
}
SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
diff --git a/mm/filemap.c b/mm/filemap.c
index c79c2c773171..13b2f133796d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4373,31 +4373,25 @@ SYSCALL_DEFINE4(cachestat, unsigned int, fd,
struct cachestat_range __user *, cstat_range,
struct cachestat __user *, cstat, unsigned int, flags)
{
- struct fd f = fdget(fd);
+ CLASS(fd, f)(fd);
struct address_space *mapping;
struct cachestat_range csr;
struct cachestat cs;
pgoff_t first_index, last_index;
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
if (copy_from_user(&csr, cstat_range,
- sizeof(struct cachestat_range))) {
- fdput(f);
+ sizeof(struct cachestat_range)))
return -EFAULT;
- }
/* hugetlbfs is not supported */
- if (is_file_hugepages(fd_file(f))) {
- fdput(f);
+ if (is_file_hugepages(fd_file(f)))
return -EOPNOTSUPP;
- }
- if (flags != 0) {
- fdput(f);
+ if (flags != 0)
return -EINVAL;
- }
first_index = csr.off >> PAGE_SHIFT;
last_index =
@@ -4405,7 +4399,6 @@ SYSCALL_DEFINE4(cachestat, unsigned int, fd,
memset(&cs, 0, sizeof(struct cachestat));
mapping = fd_file(f)->f_mapping;
filemap_cachestat(mapping, first_index, last_index, &cs);
- fdput(f);
if (copy_to_user(cstat, &cs, sizeof(struct cachestat)))
return -EFAULT;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 58d000013024..46240137fee3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5266,8 +5266,6 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
struct mem_cgroup_event *event;
struct cgroup_subsys_state *cfile_css;
unsigned int efd, cfd;
- struct fd efile;
- struct fd cfile;
struct dentry *cdentry;
const char *name;
char *endp;
@@ -5298,8 +5296,8 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
init_waitqueue_func_entry(&event->wait, memcg_event_wake);
INIT_WORK(&event->remove, memcg_event_remove);
- efile = fdget(efd);
- if (!fd_file(efile)) {
+ CLASS(fd, efile)(efd);
+ if (fd_empty(efile)) {
ret = -EBADF;
goto out_kfree;
}
@@ -5307,11 +5305,11 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
event->eventfd = eventfd_ctx_fileget(fd_file(efile));
if (IS_ERR(event->eventfd)) {
ret = PTR_ERR(event->eventfd);
- goto out_put_efile;
+ goto out_kfree;
}
- cfile = fdget(cfd);
- if (!fd_file(cfile)) {
+ CLASS(fd, cfile)(cfd);
+ if (fd_empty(cfile)) {
ret = -EBADF;
goto out_put_eventfd;
}
@@ -5320,7 +5318,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
/* AV: shouldn't we check that it's been opened for read instead? */
ret = file_permission(fd_file(cfile), MAY_READ);
if (ret < 0)
- goto out_put_cfile;
+ goto out_put_eventfd;
/*
* The control file must be a regular cgroup1 file. As a regular cgroup
@@ -5329,7 +5327,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
cdentry = fd_file(cfile)->f_path.dentry;
if (cdentry->d_sb->s_type != &cgroup_fs_type || !d_is_reg(cdentry)) {
ret = -EINVAL;
- goto out_put_cfile;
+ goto out_put_eventfd;
}
/*
@@ -5356,7 +5354,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
event->unregister_event = memsw_cgroup_usage_unregister_event;
} else {
ret = -EINVAL;
- goto out_put_cfile;
+ goto out_put_eventfd;
}
/*
@@ -5368,10 +5366,10 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
&memory_cgrp_subsys);
ret = -EINVAL;
if (IS_ERR(cfile_css))
- goto out_put_cfile;
+ goto out_put_eventfd;
if (cfile_css != css) {
css_put(cfile_css);
- goto out_put_cfile;
+ goto out_put_eventfd;
}
ret = event->register_event(memcg, event->eventfd, buf);
@@ -5384,19 +5382,12 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
list_add(&event->list, &memcg->event_list);
spin_unlock_irq(&memcg->event_list_lock);
- fdput(cfile);
- fdput(efile);
-
return nbytes;
out_put_css:
css_put(css);
-out_put_cfile:
- fdput(cfile);
out_put_eventfd:
eventfd_ctx_put(event->eventfd);
-out_put_efile:
- fdput(efile);
out_kfree:
kfree(event);
diff --git a/mm/readahead.c b/mm/readahead.c
index 2be4603488c5..3ce1269b972a 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -721,29 +721,22 @@ EXPORT_SYMBOL_GPL(page_cache_async_ra);
ssize_t ksys_readahead(int fd, loff_t offset, size_t count)
{
- ssize_t ret;
- struct fd f;
+ CLASS(fd, f)(fd);
- ret = -EBADF;
- f = fdget(fd);
- if (!fd_file(f) || !(fd_file(f)->f_mode & FMODE_READ))
- goto out;
+ if (fd_empty(f) || !(fd_file(f)->f_mode & FMODE_READ))
+ return -EBADF;
/*
* The readahead() syscall is intended to run only on files
* that can execute readahead. If readahead is not possible
* on this file, then we must return -EINVAL.
*/
- ret = -EINVAL;
if (!fd_file(f)->f_mapping || !fd_file(f)->f_mapping->a_ops ||
(!S_ISREG(file_inode(fd_file(f))->i_mode) &&
!S_ISBLK(file_inode(fd_file(f))->i_mode)))
- goto out;
+ return -EINVAL;
- ret = vfs_fadvise(fd_file(f), offset, count, POSIX_FADV_WILLNEED);
-out:
- fdput(f);
- return ret;
+ return vfs_fadvise(fd_file(f), offset, count, POSIX_FADV_WILLNEED);
}
SYSCALL_DEFINE3(readahead, int, fd, loff_t, offset, size_t, count)
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index fd536e385b83..e1e56cbf5cbf 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -703,20 +703,18 @@ EXPORT_SYMBOL_GPL(get_net_ns);
struct net *get_net_ns_by_fd(int fd)
{
- struct fd f = fdget(fd);
- struct net *net = ERR_PTR(-EINVAL);
+ CLASS(fd, f)(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return ERR_PTR(-EBADF);
if (proc_ns_file(fd_file(f))) {
struct ns_common *ns = get_proc_ns(file_inode(fd_file(f)));
if (ns->ops == &netns_operations)
- net = get_net(container_of(ns, struct net, ns));
+ return get_net(container_of(ns, struct net, ns));
}
- fdput(f);
- return net;
+ return ERR_PTR(-EINVAL);
}
EXPORT_SYMBOL_GPL(get_net_ns_by_fd);
#endif
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index e7c1d3ae33fe..e6b4cdd6e84c 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -1062,19 +1062,16 @@ int process_buffer_measurement(struct mnt_idmap *idmap,
*/
void ima_kexec_cmdline(int kernel_fd, const void *buf, int size)
{
- struct fd f;
-
if (!buf || !size)
return;
- f = fdget(kernel_fd);
- if (!fd_file(f))
+ CLASS(fd, f)(kernel_fd);
+ if (fd_empty(f))
return;
process_buffer_measurement(file_mnt_idmap(fd_file(f)), file_inode(fd_file(f)),
buf, size, "kexec-cmdline", KEXEC_CMDLINE, 0,
NULL, false, NULL, 0);
- fdput(f);
}
/**
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 97b3df540dc7..a661d373ea92 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -234,31 +234,21 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
static struct landlock_ruleset *get_ruleset_from_fd(const int fd,
const fmode_t mode)
{
- struct fd ruleset_f;
+ CLASS(fd, ruleset_f)(fd);
struct landlock_ruleset *ruleset;
- ruleset_f = fdget(fd);
- if (!fd_file(ruleset_f))
+ if (fd_empty(ruleset_f))
return ERR_PTR(-EBADF);
/* Checks FD type and access right. */
- if (fd_file(ruleset_f)->f_op != &ruleset_fops) {
- ruleset = ERR_PTR(-EBADFD);
- goto out_fdput;
- }
- if (!(fd_file(ruleset_f)->f_mode & mode)) {
- ruleset = ERR_PTR(-EPERM);
- goto out_fdput;
- }
+ if (fd_file(ruleset_f)->f_op != &ruleset_fops)
+ return ERR_PTR(-EBADFD);
+ if (!(fd_file(ruleset_f)->f_mode & mode))
+ return ERR_PTR(-EPERM);
ruleset = fd_file(ruleset_f)->private_data;
- if (WARN_ON_ONCE(ruleset->num_layers != 1)) {
- ruleset = ERR_PTR(-EINVAL);
- goto out_fdput;
- }
+ if (WARN_ON_ONCE(ruleset->num_layers != 1))
+ return ERR_PTR(-EINVAL);
landlock_get_ruleset(ruleset);
-
-out_fdput:
- fdput(ruleset_f);
return ruleset;
}
diff --git a/security/loadpin/loadpin.c b/security/loadpin/loadpin.c
index 02144ec39f43..68252452b66c 100644
--- a/security/loadpin/loadpin.c
+++ b/security/loadpin/loadpin.c
@@ -283,7 +283,6 @@ enum loadpin_securityfs_interface_index {
static int read_trusted_verity_root_digests(unsigned int fd)
{
- struct fd f;
void *data;
int rc;
char *p, *d;
@@ -295,8 +294,8 @@ static int read_trusted_verity_root_digests(unsigned int fd)
if (!list_empty(&dm_verity_loadpin_trusted_root_digests))
return -EPERM;
- f = fdget(fd);
- if (!fd_file(f))
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
return -EINVAL;
data = kzalloc(SZ_4K, GFP_KERNEL);
@@ -359,7 +358,6 @@ static int read_trusted_verity_root_digests(unsigned int fd)
}
kfree(data);
- fdput(f);
return 0;
@@ -379,8 +377,6 @@ static int read_trusted_verity_root_digests(unsigned int fd)
/* disallow further attempts after reading a corrupt/invalid file */
deny_reading_verity_digests = true;
- fdput(f);
-
return rc;
}
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 65efb3735e79..70bc0d1f5f6a 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -303,7 +303,6 @@ static int
kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
{
struct kvm_kernel_irqfd *irqfd, *tmp;
- struct fd f;
struct eventfd_ctx *eventfd = NULL, *resamplefd = NULL;
int ret;
__poll_t events;
@@ -326,8 +325,8 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
seqcount_spinlock_init(&irqfd->irq_entry_sc, &kvm->irqfds.lock);
- f = fdget(args->fd);
- if (!fd_file(f)) {
+ CLASS(fd, f)(args->fd);
+ if (fd_empty(f)) {
ret = -EBADF;
goto out;
}
@@ -335,7 +334,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
eventfd = eventfd_ctx_fileget(fd_file(f));
if (IS_ERR(eventfd)) {
ret = PTR_ERR(eventfd);
- goto fail;
+ goto out;
}
irqfd->eventfd = eventfd;
@@ -439,12 +438,6 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
#endif
srcu_read_unlock(&kvm->irq_srcu, idx);
-
- /*
- * do not drop the file until the irqfd is fully initialized, otherwise
- * we might race against the EPOLLHUP
- */
- fdput(f);
return 0;
fail:
@@ -457,8 +450,6 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
if (eventfd && !IS_ERR(eventfd))
eventfd_ctx_put(eventfd);
- fdput(f);
-
out:
kfree(irqfd);
return ret;
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 388ae471d258..72aa1fdeb699 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -190,11 +190,10 @@ static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
{
struct kvm_vfio *kv = dev->private;
struct kvm_vfio_file *kvf;
- struct fd f;
+ CLASS(fd, f)(fd);
int ret;
- f = fdget(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADF;
ret = -ENOENT;
@@ -220,9 +219,6 @@ static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
kvm_vfio_update_coherency(dev);
mutex_unlock(&kv->lock);
-
- fdput(f);
-
return ret;
}
@@ -233,14 +229,13 @@ static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
struct kvm_vfio_spapr_tce param;
struct kvm_vfio *kv = dev->private;
struct kvm_vfio_file *kvf;
- struct fd f;
int ret;
if (copy_from_user(&param, arg, sizeof(struct kvm_vfio_spapr_tce)))
return -EFAULT;
- f = fdget(param.groupfd);
- if (!fd_file(f))
+ CLASS(fd, f)(param.groupfd);
+ if (fd_empty(f))
return -EBADF;
ret = -ENOENT;
@@ -266,7 +261,6 @@ static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
err_fdput:
mutex_unlock(&kv->lock);
- fdput(f);
return ret;
}
#endif
--
2.39.2
* [PATCH 12/19] bpf: switch to CLASS(fd, ...)
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (9 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...) Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 15:27 ` Christian Brauner
2024-06-07 1:59 ` [PATCH 13/19] convert vmsplice() " Al Viro
` (7 subsequent siblings)
18 siblings, 1 reply; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
Calling conventions for __bpf_map_get() would be more convenient
if it left fdput() on failure to callers. Makes for simpler logic
in the callers.
Among other things, the proof of memory safety no longer has to
rely upon file->private_data never being ERR_PTR(...) for bpffs files.
Original calling conventions made it impossible for the caller to tell
whether __bpf_map_get() has returned ERR_PTR(-EINVAL) because it has found
the file not to be a bpf map one (in which case it would've done fdput())
or because it found that ERR_PTR(-EINVAL) in file->private_data of a
bpf map file (in which case fdput() would _not_ have been done).
With that change of calling conventions it's easy to switch all
bpffs users to CLASS(fd, ...).
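For illustration only, not part of the patch: a minimal userspace sketch
of the convention described above, assuming the GCC/Clang cleanup
attribute as a stand-in for CLASS(fd, ...). Every name ending in _sketch
is hypothetical, and the reference counting is a toy model rather than
the kernel's. The point is that the lookup helper never drops the
reference, so the caller's scope exit is the only place where it happens.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct file. */
struct file_sketch {
	int is_map;		/* models f_op == &bpf_map_fops */
	void *private_data;
	int refs;		/* toy reference count */
};

/* Hypothetical stand-in for struct fd. */
struct fd_sketch {
	struct file_sketch *file;
};

static struct fd_sketch fdget_sketch(struct file_sketch *file)
{
	if (file)
		file->refs++;
	return (struct fd_sketch){ file };
}

/* Cleanup handler: runs whenever the guarded variable leaves scope. */
static void fdput_sketch(struct fd_sketch *f)
{
	if (f->file)
		f->file->refs--;
}

/* Pure lookup: it never releases anything, whatever it returns. */
static void *file_to_map_sketch(struct file_sketch *file)
{
	return file->is_map ? file->private_data : NULL;
}

/* Returns 0 on success, -1 for an empty fd, -2 if the file is not a map. */
static int lookup_sketch(struct file_sketch *file, void **out)
{
	struct fd_sketch f __attribute__((cleanup(fdput_sketch))) =
		fdget_sketch(file);

	if (!f.file)
		return -1;
	*out = file_to_map_sketch(f.file);
	if (!*out)
		return -2;
	return 0;	/* the reference is dropped here, as on every path */
}
```

Since every return path goes through the same cleanup, the caller no
longer needs to know why the lookup failed in order to decide whether
the reference is still held.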
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
include/linux/bpf.h | 9 ++-
kernel/bpf/map_in_map.c | 37 +++-------
kernel/bpf/syscall.c | 149 ++++++++++++----------------------------
kernel/bpf/verifier.c | 20 +-----
net/core/sock_map.c | 23 ++-----
5 files changed, 72 insertions(+), 166 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5e694a308081..a98f21fc4ac9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2226,7 +2226,14 @@ void __bpf_obj_drop_impl(void *p, const struct btf_record *rec, bool percpu);
struct bpf_map *bpf_map_get(u32 ufd);
struct bpf_map *bpf_map_get_with_uref(u32 ufd);
-struct bpf_map *__bpf_map_get(struct fd f);
+
+struct bpf_map *bpf_file_to_map(struct file *f);
+static inline struct bpf_map *__bpf_map_get(struct fd f)
+{
+ if (fd_empty(f))
+ return ERR_PTR(-EBADF);
+ return bpf_file_to_map(fd_file(f));
+}
void bpf_map_inc(struct bpf_map *map);
void bpf_map_inc_with_uref(struct bpf_map *map);
struct bpf_map *__bpf_map_inc_not_zero(struct bpf_map *map, bool uref);
diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c
index b4f18c85d7bc..21536904b42e 100644
--- a/kernel/bpf/map_in_map.c
+++ b/kernel/bpf/map_in_map.c
@@ -11,24 +11,18 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
{
struct bpf_map *inner_map, *inner_map_meta;
u32 inner_map_meta_size;
- struct fd f;
- int ret;
+ CLASS(fd, f)(inner_map_ufd);
- f = fdget(inner_map_ufd);
inner_map = __bpf_map_get(f);
if (IS_ERR(inner_map))
return inner_map;
/* Does not support >1 level map-in-map */
- if (inner_map->inner_map_meta) {
- ret = -EINVAL;
- goto put;
- }
+ if (inner_map->inner_map_meta)
+ return ERR_PTR(-EINVAL);
- if (!inner_map->ops->map_meta_equal) {
- ret = -ENOTSUPP;
- goto put;
- }
+ if (!inner_map->ops->map_meta_equal)
+ return ERR_PTR(-ENOTSUPP);
inner_map_meta_size = sizeof(*inner_map_meta);
/* In some cases verifier needs to access beyond just base map. */
@@ -36,10 +30,8 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
inner_map_meta_size = sizeof(struct bpf_array);
inner_map_meta = kzalloc(inner_map_meta_size, GFP_USER);
- if (!inner_map_meta) {
- ret = -ENOMEM;
- goto put;
- }
+ if (!inner_map_meta)
+ return ERR_PTR(-ENOMEM);
inner_map_meta->map_type = inner_map->map_type;
inner_map_meta->key_size = inner_map->key_size;
@@ -53,8 +45,9 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
* invalid/empty/valid, but ERR_PTR in case of errors. During
* equality NULL or IS_ERR is equivalent.
*/
- ret = PTR_ERR(inner_map_meta->record);
- goto free;
+ struct bpf_map *ret = ERR_CAST(inner_map_meta->record);
+ kfree(inner_map_meta);
+ return ret;
}
/* Note: We must use the same BTF, as we also used btf_record_dup above
* which relies on BTF being same for both maps, as some members like
@@ -78,13 +71,7 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
inner_map_meta->bypass_spec_v1 = inner_map->bypass_spec_v1;
}
- fdput(f);
return inner_map_meta;
-free:
- kfree(inner_map_meta);
-put:
- fdput(f);
- return ERR_PTR(ret);
}
void bpf_map_meta_free(struct bpf_map *map_meta)
@@ -110,9 +97,8 @@ void *bpf_map_fd_get_ptr(struct bpf_map *map,
int ufd)
{
struct bpf_map *inner_map, *inner_map_meta;
- struct fd f;
+ CLASS(fd, f)(ufd);
- f = fdget(ufd);
inner_map = __bpf_map_get(f);
if (IS_ERR(inner_map))
return inner_map;
@@ -123,7 +109,6 @@ void *bpf_map_fd_get_ptr(struct bpf_map *map,
else
inner_map = ERR_PTR(-EINVAL);
- fdput(f);
return inner_map;
}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index fd833a2b7c1b..4b3d770673dd 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1418,19 +1418,11 @@ static int map_create(union bpf_attr *attr)
return err;
}
-/* if error is returned, fd is released.
- * On success caller should complete fd access with matching fdput()
- */
-struct bpf_map *__bpf_map_get(struct fd f)
+struct bpf_map *bpf_file_to_map(struct file *f)
{
- if (!fd_file(f))
- return ERR_PTR(-EBADF);
- if (fd_file(f)->f_op != &bpf_map_fops) {
- fdput(f);
+ if (unlikely(f->f_op != &bpf_map_fops))
return ERR_PTR(-EINVAL);
- }
-
- return fd_file(f)->private_data;
+ return f->private_data;
}
void bpf_map_inc(struct bpf_map *map)
@@ -1448,15 +1440,11 @@ EXPORT_SYMBOL_GPL(bpf_map_inc_with_uref);
struct bpf_map *bpf_map_get(u32 ufd)
{
- struct fd f = fdget(ufd);
- struct bpf_map *map;
-
- map = __bpf_map_get(f);
- if (IS_ERR(map))
- return map;
+ CLASS(fd, f)(ufd);
+ struct bpf_map *map = __bpf_map_get(f);
- bpf_map_inc(map);
- fdput(f);
+ if (!IS_ERR(map))
+ bpf_map_inc(map);
return map;
}
@@ -1464,15 +1452,11 @@ EXPORT_SYMBOL(bpf_map_get);
struct bpf_map *bpf_map_get_with_uref(u32 ufd)
{
- struct fd f = fdget(ufd);
- struct bpf_map *map;
-
- map = __bpf_map_get(f);
- if (IS_ERR(map))
- return map;
+ CLASS(fd, f)(ufd);
+ struct bpf_map *map = __bpf_map_get(f);
- bpf_map_inc_with_uref(map);
- fdput(f);
+ if (!IS_ERR(map))
+ bpf_map_inc_with_uref(map);
return map;
}
@@ -1537,11 +1521,9 @@ static int map_lookup_elem(union bpf_attr *attr)
{
void __user *ukey = u64_to_user_ptr(attr->key);
void __user *uvalue = u64_to_user_ptr(attr->value);
- int ufd = attr->map_fd;
struct bpf_map *map;
void *key, *value;
u32 value_size;
- struct fd f;
int err;
if (CHECK_ATTR(BPF_MAP_LOOKUP_ELEM))
@@ -1550,26 +1532,20 @@ static int map_lookup_elem(union bpf_attr *attr)
if (attr->flags & ~BPF_F_LOCK)
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->map_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
- if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
- err = -EPERM;
- goto err_put;
- }
+ if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ))
+ return -EPERM;
if ((attr->flags & BPF_F_LOCK) &&
- !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
- err = -EINVAL;
- goto err_put;
- }
+ !btf_record_has_field(map->record, BPF_SPIN_LOCK))
+ return -EINVAL;
key = __bpf_copy_key(ukey, map->key_size);
- if (IS_ERR(key)) {
- err = PTR_ERR(key);
- goto err_put;
- }
+ if (IS_ERR(key))
+ return PTR_ERR(key);
value_size = bpf_map_value_size(map);
@@ -1600,8 +1576,6 @@ static int map_lookup_elem(union bpf_attr *attr)
kvfree(value);
free_key:
kvfree(key);
-err_put:
- fdput(f);
return err;
}
@@ -1612,17 +1586,15 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
{
bpfptr_t ukey = make_bpfptr(attr->key, uattr.is_kernel);
bpfptr_t uvalue = make_bpfptr(attr->value, uattr.is_kernel);
- int ufd = attr->map_fd;
struct bpf_map *map;
void *key, *value;
u32 value_size;
- struct fd f;
int err;
if (CHECK_ATTR(BPF_MAP_UPDATE_ELEM))
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->map_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
@@ -1660,7 +1632,6 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
kvfree(key);
err_put:
bpf_map_write_active_dec(map);
- fdput(f);
return err;
}
@@ -1669,16 +1640,14 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
static int map_delete_elem(union bpf_attr *attr, bpfptr_t uattr)
{
bpfptr_t ukey = make_bpfptr(attr->key, uattr.is_kernel);
- int ufd = attr->map_fd;
struct bpf_map *map;
- struct fd f;
void *key;
int err;
if (CHECK_ATTR(BPF_MAP_DELETE_ELEM))
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->map_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
@@ -1715,7 +1684,6 @@ static int map_delete_elem(union bpf_attr *attr, bpfptr_t uattr)
kvfree(key);
err_put:
bpf_map_write_active_dec(map);
- fdput(f);
return err;
}
@@ -1726,30 +1694,24 @@ static int map_get_next_key(union bpf_attr *attr)
{
void __user *ukey = u64_to_user_ptr(attr->key);
void __user *unext_key = u64_to_user_ptr(attr->next_key);
- int ufd = attr->map_fd;
struct bpf_map *map;
void *key, *next_key;
- struct fd f;
int err;
if (CHECK_ATTR(BPF_MAP_GET_NEXT_KEY))
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->map_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
- if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
- err = -EPERM;
- goto err_put;
- }
+ if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ))
+ return -EPERM;
if (ukey) {
key = __bpf_copy_key(ukey, map->key_size);
- if (IS_ERR(key)) {
- err = PTR_ERR(key);
- goto err_put;
- }
+ if (IS_ERR(key))
+ return PTR_ERR(key);
} else {
key = NULL;
}
@@ -1781,8 +1743,6 @@ static int map_get_next_key(union bpf_attr *attr)
kvfree(next_key);
free_key:
kvfree(key);
-err_put:
- fdput(f);
return err;
}
@@ -2011,11 +1971,9 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
{
void __user *ukey = u64_to_user_ptr(attr->key);
void __user *uvalue = u64_to_user_ptr(attr->value);
- int ufd = attr->map_fd;
struct bpf_map *map;
void *key, *value;
u32 value_size;
- struct fd f;
int err;
if (CHECK_ATTR(BPF_MAP_LOOKUP_AND_DELETE_ELEM))
@@ -2024,7 +1982,7 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
if (attr->flags & ~BPF_F_LOCK)
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->map_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
@@ -2094,7 +2052,6 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
kvfree(key);
err_put:
bpf_map_write_active_dec(map);
- fdput(f);
return err;
}
@@ -2102,27 +2059,22 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
static int map_freeze(const union bpf_attr *attr)
{
- int err = 0, ufd = attr->map_fd;
+ int err = 0;
struct bpf_map *map;
- struct fd f;
if (CHECK_ATTR(BPF_MAP_FREEZE))
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->map_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
- if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS || !IS_ERR_OR_NULL(map->record)) {
- fdput(f);
+ if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS || !IS_ERR_OR_NULL(map->record))
return -ENOTSUPP;
- }
- if (!(map_get_sys_perms(map, f) & FMODE_CAN_WRITE)) {
- fdput(f);
+ if (!(map_get_sys_perms(map, f) & FMODE_CAN_WRITE))
return -EPERM;
- }
mutex_lock(&map->freeze_mutex);
if (bpf_map_write_active(map)) {
@@ -2137,7 +2089,6 @@ static int map_freeze(const union bpf_attr *attr)
WRITE_ONCE(map->frozen, true);
err_put:
mutex_unlock(&map->freeze_mutex);
- fdput(f);
return err;
}
@@ -2407,18 +2358,6 @@ int bpf_prog_new_fd(struct bpf_prog *prog)
O_RDWR | O_CLOEXEC);
}
-static struct bpf_prog *____bpf_prog_get(struct fd f)
-{
- if (!fd_file(f))
- return ERR_PTR(-EBADF);
- if (fd_file(f)->f_op != &bpf_prog_fops) {
- fdput(f);
- return ERR_PTR(-EINVAL);
- }
-
- return fd_file(f)->private_data;
-}
-
void bpf_prog_add(struct bpf_prog *prog, int i)
{
atomic64_add(i, &prog->aux->refcnt);
@@ -2474,20 +2413,22 @@ bool bpf_prog_get_ok(struct bpf_prog *prog,
static struct bpf_prog *__bpf_prog_get(u32 ufd, enum bpf_prog_type *attach_type,
bool attach_drv)
{
- struct fd f = fdget(ufd);
+ CLASS(fd, f)(ufd);
struct bpf_prog *prog;
- prog = ____bpf_prog_get(f);
- if (IS_ERR(prog))
+ if (fd_empty(f))
+ return ERR_PTR(-EBADF);
+ if (fd_file(f)->f_op != &bpf_prog_fops)
+ return ERR_PTR(-EINVAL);
+
+ prog = fd_file(f)->private_data;
+ if (IS_ERR(prog)) // can that actually happen?
return prog;
- if (!bpf_prog_get_ok(prog, attach_type, attach_drv)) {
- prog = ERR_PTR(-EINVAL);
- goto out;
- }
+
+ if (!bpf_prog_get_ok(prog, attach_type, attach_drv))
+ return ERR_PTR(-EINVAL);
bpf_prog_inc(prog);
-out:
- fdput(f);
return prog;
}
@@ -5154,14 +5095,13 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH;
bool has_write = cmd != BPF_MAP_LOOKUP_BATCH;
struct bpf_map *map;
- int err, ufd;
- struct fd f;
+ int err;
if (CHECK_ATTR(BPF_MAP_BATCH))
return -EINVAL;
- ufd = attr->batch.map_fd;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->batch.map_fd);
+
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
@@ -5189,7 +5129,6 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
maybe_wait_bpf_programs(map);
bpf_map_write_active_dec(map);
}
- fdput(f);
return err;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 77da1f438bec..cfd30e79ac76 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18399,7 +18399,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) {
struct bpf_insn_aux_data *aux;
struct bpf_map *map;
- struct fd f;
u64 addr;
u32 fd;
@@ -18462,7 +18461,8 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
break;
}
- f = fdget(fd);
+ CLASS(fd, f)(fd);
+
map = __bpf_map_get(f);
if (IS_ERR(map)) {
verbose(env, "fd %d is not pointing to valid bpf_map\n", fd);
@@ -18470,10 +18470,8 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
}
err = check_map_prog_compatibility(env, map, env->prog);
- if (err) {
- fdput(f);
+ if (err)
return err;
- }
aux = &env->insn_aux_data[i];
if (insn[0].src_reg == BPF_PSEUDO_MAP_FD ||
@@ -18484,13 +18482,11 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
if (off >= BPF_MAX_VAR_OFF) {
verbose(env, "direct value offset of %u is not allowed\n", off);
- fdput(f);
return -EINVAL;
}
if (!map->ops->map_direct_value_addr) {
verbose(env, "no direct value access support for this map type\n");
- fdput(f);
return -EINVAL;
}
@@ -18498,7 +18494,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
if (err) {
verbose(env, "invalid access to map value pointer, value_size=%u off=%u\n",
map->value_size, off);
- fdput(f);
return err;
}
@@ -18513,7 +18508,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
for (j = 0; j < env->used_map_cnt; j++) {
if (env->used_maps[j] == map) {
aux->map_index = j;
- fdput(f);
goto next_insn;
}
}
@@ -18521,7 +18515,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
if (env->used_map_cnt >= MAX_USED_MAPS) {
verbose(env, "The total number of maps per program has reached the limit of %u\n",
MAX_USED_MAPS);
- fdput(f);
return -E2BIG;
}
@@ -18540,39 +18533,32 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
if (bpf_map_is_cgroup_storage(map) &&
bpf_cgroup_storage_assign(env->prog->aux, map)) {
verbose(env, "only one cgroup storage of each type is allowed\n");
- fdput(f);
return -EBUSY;
}
if (map->map_type == BPF_MAP_TYPE_ARENA) {
if (env->prog->aux->arena) {
verbose(env, "Only one arena per program\n");
- fdput(f);
return -EBUSY;
}
if (!env->allow_ptr_leaks || !env->bpf_capable) {
verbose(env, "CAP_BPF and CAP_PERFMON are required to use arena\n");
- fdput(f);
return -EPERM;
}
if (!env->prog->jit_requested) {
verbose(env, "JIT is required to use arena\n");
- fdput(f);
return -EOPNOTSUPP;
}
if (!bpf_jit_supports_arena()) {
verbose(env, "JIT doesn't support arena\n");
- fdput(f);
return -EOPNOTSUPP;
}
env->prog->aux->arena = (void *)map;
if (!bpf_arena_get_user_vm_start(env->prog->aux->arena)) {
verbose(env, "arena's user address must be set via map_extra or mmap()\n");
- fdput(f);
return -EINVAL;
}
}
- fdput(f);
next_insn:
insn++;
i++;
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 9402889840bf..3151dc1c8b3d 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -67,46 +67,39 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
int sock_map_get_from_fd(const union bpf_attr *attr, struct bpf_prog *prog)
{
- u32 ufd = attr->target_fd;
struct bpf_map *map;
- struct fd f;
int ret;
if (attr->attach_flags || attr->replace_bpf_fd)
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->target_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
mutex_lock(&sockmap_mutex);
ret = sock_map_prog_update(map, prog, NULL, NULL, attr->attach_type);
mutex_unlock(&sockmap_mutex);
- fdput(f);
return ret;
}
int sock_map_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype)
{
- u32 ufd = attr->target_fd;
struct bpf_prog *prog;
struct bpf_map *map;
- struct fd f;
int ret;
if (attr->attach_flags || attr->replace_bpf_fd)
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->target_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
prog = bpf_prog_get(attr->attach_bpf_fd);
- if (IS_ERR(prog)) {
- ret = PTR_ERR(prog);
- goto put_map;
- }
+ if (IS_ERR(prog))
+ return PTR_ERR(prog);
if (prog->type != ptype) {
ret = -EINVAL;
@@ -118,8 +111,6 @@ int sock_map_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype)
mutex_unlock(&sockmap_mutex);
put_prog:
bpf_prog_put(prog);
-put_map:
- fdput(f);
return ret;
}
@@ -1556,18 +1547,17 @@ int sock_map_bpf_prog_query(const union bpf_attr *attr,
union bpf_attr __user *uattr)
{
__u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
- u32 prog_cnt = 0, flags = 0, ufd = attr->target_fd;
+ u32 prog_cnt = 0, flags = 0;
struct bpf_prog **pprog;
struct bpf_prog *prog;
struct bpf_map *map;
- struct fd f;
u32 id = 0;
int ret;
if (attr->query.query_flags)
return -EINVAL;
- f = fdget(ufd);
+ CLASS(fd, f)(attr->target_fd);
map = __bpf_map_get(f);
if (IS_ERR(map))
return PTR_ERR(map);
@@ -1599,7 +1589,6 @@ int sock_map_bpf_prog_query(const union bpf_attr *attr,
copy_to_user(&uattr->query.prog_cnt, &prog_cnt, sizeof(prog_cnt)))
ret = -EFAULT;
- fdput(f);
return ret;
}
--
2.39.2
* [PATCH 13/19] convert vmsplice() to CLASS(fd, ...)
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (10 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 12/19] bpf: switch to CLASS(fd, ...) Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 14/19] finit_module(): convert " Al Viro
` (6 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
Irregularity here is fdput() not in the same scope as fdget(); we could
just lift it out of vmsplice_type() into vmsplice(2), but there's not much
point keeping vmsplice_type() separate after that...
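To illustrate why having acquisition and release in the same scope matters, here is a minimal userspace sketch, assuming GNU C (GCC or Clang) and its cleanup variable attribute, which is the mechanism the kernel's CLASS() helpers are built on. The names struct handle, handle_release(), SCOPED_HANDLE() and use_handle() are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a borrowed file reference. */
struct handle {
	int *release_count;	/* bumped on release; NULL means "empty" */
};

/* Called automatically when a SCOPED_HANDLE variable leaves scope. */
static void handle_release(struct handle *h)
{
	if (h->release_count)
		(*h->release_count)++;
}

/* Declare a handle whose release is tied to the enclosing scope. */
#define SCOPED_HANDLE(name, counter) \
	struct handle name __attribute__((cleanup(handle_release))) = { (counter) }

/*
 * Returns 0 on success and -1 on the early failure path.
 * Neither path contains an explicit release call.
 */
static int use_handle(int *counter, bool fail_early)
{
	SCOPED_HANDLE(h, counter);

	if (fail_early)
		return -1;	/* handle_release() still runs here */
	return 0;
}
```

With the release bound to the variable's scope, a helper that acquires in one function and releases in another no longer fits the pattern, which is the irregularity the commit message describes.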
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
fs/splice.c | 31 ++++++++++---------------------
1 file changed, 10 insertions(+), 21 deletions(-)
diff --git a/fs/splice.c b/fs/splice.c
index 42aa7bc46be5..2898fa1e9e63 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1564,21 +1564,6 @@ static ssize_t vmsplice_to_pipe(struct file *file, struct iov_iter *iter,
return ret;
}
-static int vmsplice_type(struct fd f, int *type)
-{
- if (!fd_file(f))
- return -EBADF;
- if (fd_file(f)->f_mode & FMODE_WRITE) {
- *type = ITER_SOURCE;
- } else if (fd_file(f)->f_mode & FMODE_READ) {
- *type = ITER_DEST;
- } else {
- fdput(f);
- return -EBADF;
- }
- return 0;
-}
-
/*
* Note that vmsplice only really supports true splicing _from_ user memory
* to a pipe, not the other way around. Splicing from user memory is a simple
@@ -1602,21 +1587,25 @@ SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
struct iovec *iov = iovstack;
struct iov_iter iter;
ssize_t error;
- struct fd f;
int type;
if (unlikely(flags & ~SPLICE_F_ALL))
return -EINVAL;
- f = fdget(fd);
- error = vmsplice_type(f, &type);
- if (error)
- return error;
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
+ if (fd_file(f)->f_mode & FMODE_WRITE)
+ type = ITER_SOURCE;
+ else if (fd_file(f)->f_mode & FMODE_READ)
+ type = ITER_DEST;
+ else
+ return -EBADF;
error = import_iovec(type, uiov, nr_segs,
ARRAY_SIZE(iovstack), &iov, &iter);
if (error < 0)
- goto out_fdput;
+ return error;
if (!iov_iter_count(&iter))
error = 0;
--
2.39.2
* [PATCH 14/19] finit_module(): convert to CLASS(fd, ...)
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (11 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 13/19] convert vmsplice() " Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 15/19] timerfd: switch " Al Viro
` (5 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
Slightly unidiomatic emptiness check; just lift it out of idempotent_init_module()
and into finit_module(2) and that's it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
kernel/module/main.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 98fda13fdca7..f2f045b3740d 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -3177,7 +3177,7 @@ static int idempotent_init_module(struct file *f, const char __user * uargs, int
{
struct idempotent idem;
- if (!f || !(f->f_mode & FMODE_READ))
+ if (!(f->f_mode & FMODE_READ))
return -EBADF;
/* See if somebody else is doing the operation? */
@@ -3193,10 +3193,7 @@ static int idempotent_init_module(struct file *f, const char __user * uargs, int
SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags)
{
- int err;
- struct fd f;
-
- err = may_init_module();
+ int err = may_init_module();
if (err)
return err;
@@ -3207,10 +3204,10 @@ SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags)
|MODULE_INIT_COMPRESSED_FILE))
return -EINVAL;
- f = fdget(fd);
- err = idempotent_init_module(fd_file(f), uargs, flags);
- fdput(f);
- return err;
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
+ return idempotent_init_module(fd_file(f), uargs, flags);
}
/* Keep in sync with MODULE_FLAGS_BUF_SIZE !!! */
--
2.39.2
* [PATCH 15/19] timerfd: switch to CLASS(fd, ...)
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (12 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 14/19] finit_module(): convert " Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 16/19] do_mq_notify(): " Al Viro
` (4 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
Fold timerfd_fget() into both callers to have fdget() and fdput() in
the same scope. Could be done in different ways, but this is probably
the smallest solution.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
fs/timerfd.c | 38 ++++++++++++--------------------------
1 file changed, 12 insertions(+), 26 deletions(-)
diff --git a/fs/timerfd.c b/fs/timerfd.c
index 137523e0bb21..a72f83ed2e69 100644
--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -394,19 +394,6 @@ static const struct file_operations timerfd_fops = {
.unlocked_ioctl = timerfd_ioctl,
};
-static int timerfd_fget(int fd, struct fd *p)
-{
- struct fd f = fdget(fd);
- if (!fd_file(f))
- return -EBADF;
- if (fd_file(f)->f_op != &timerfd_fops) {
- fdput(f);
- return -EINVAL;
- }
- *p = f;
- return 0;
-}
-
SYSCALL_DEFINE2(timerfd_create, int, clockid, int, flags)
{
int ufd;
@@ -471,7 +458,6 @@ static int do_timerfd_settime(int ufd, int flags,
const struct itimerspec64 *new,
struct itimerspec64 *old)
{
- struct fd f;
struct timerfd_ctx *ctx;
int ret;
@@ -479,15 +465,15 @@ static int do_timerfd_settime(int ufd, int flags,
!itimerspec64_valid(new))
return -EINVAL;
- ret = timerfd_fget(ufd, &f);
- if (ret)
- return ret;
+ CLASS(fd, f)(ufd);
+ if (fd_empty(f))
+ return -EBADF;
+ if (fd_file(f)->f_op != &timerfd_fops)
+ return -EINVAL;
ctx = fd_file(f)->private_data;
- if (isalarm(ctx) && !capable(CAP_WAKE_ALARM)) {
- fdput(f);
+ if (isalarm(ctx) && !capable(CAP_WAKE_ALARM))
return -EPERM;
- }
timerfd_setup_cancel(ctx, flags);
@@ -535,17 +521,18 @@ static int do_timerfd_settime(int ufd, int flags,
ret = timerfd_setup(ctx, flags, new);
spin_unlock_irq(&ctx->wqh.lock);
- fdput(f);
return ret;
}
static int do_timerfd_gettime(int ufd, struct itimerspec64 *t)
{
- struct fd f;
+ CLASS(fd, f)(ufd);
struct timerfd_ctx *ctx;
- int ret = timerfd_fget(ufd, &f);
- if (ret)
- return ret;
+
+ if (fd_empty(f))
+ return -EBADF;
+ if (fd_file(f)->f_op != &timerfd_fops)
+ return -EINVAL;
ctx = fd_file(f)->private_data;
spin_lock_irq(&ctx->wqh.lock);
@@ -567,7 +554,6 @@ static int do_timerfd_gettime(int ufd, struct itimerspec64 *t)
t->it_value = ktime_to_timespec64(timerfd_get_remaining(ctx));
t->it_interval = ktime_to_timespec64(ctx->tintv);
spin_unlock_irq(&ctx->wqh.lock);
- fdput(f);
return 0;
}
--
2.39.2
* [PATCH 16/19] do_mq_notify(): switch to CLASS(fd, ...)
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (13 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 15/19] timerfd: switch " Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 17/19] simplify xfs_find_handle() a bit Al Viro
` (3 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
a minor twist is the reuse of struct fd instance in there
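The twist can be modelled in userspace: a nested block bounds the lifetime of the first instance, so the same name can be declared again afterwards. The sketch below assumes GNU C's cleanup attribute; struct res, SCOPED_RES(), note() and two_lookups() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <string.h>

static char trace[16];	/* records the order of events */

/* Append one character to the trace, keeping it NUL-terminated. */
static void note(char c)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace))
		trace[n] = c;
}

struct res {
	char tag;
};

/* Cleanup callback: log which instance is being released. */
static void res_release(struct res *r)
{
	note(r->tag);
}

#define SCOPED_RES(name, t) \
	struct res name __attribute__((cleanup(res_release))) = { (t) }

static void two_lookups(void)
{
	{
		SCOPED_RES(f, 'a');	/* first lookup */
		note('1');
	}				/* 'a' is released at this brace */
	note('2');
	SCOPED_RES(f, 'b');		/* same name, new instance */
	note('3');
}					/* 'b' is released on return */
```

After one call to two_lookups() the trace reads "1a23b": the first instance is gone before the second is created, mirroring the extra braces added around the netlink lookup.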
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
ipc/mqueue.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index c72ef725e845..d798a43fe981 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1290,7 +1290,6 @@ SYSCALL_DEFINE5(mq_timedreceive, mqd_t, mqdes, char __user *, u_msg_ptr,
static int do_mq_notify(mqd_t mqdes, const struct sigevent *notification)
{
int ret;
- struct fd f;
struct sock *sock;
struct inode *inode;
struct mqueue_inode_info *info;
@@ -1328,13 +1327,14 @@ static int do_mq_notify(mqd_t mqdes, const struct sigevent *notification)
skb_put(nc, NOTIFY_COOKIE_LEN);
/* and attach it to the socket */
retry:
- f = fdget(notification->sigev_signo);
- if (!fd_file(f)) {
- ret = -EBADF;
- goto out;
+ {
+ CLASS(fd, f)(notification->sigev_signo);
+ if (fd_empty(f)) {
+ ret = -EBADF;
+ goto out;
+ }
+ sock = netlink_getsockbyfilp(fd_file(f));
}
- sock = netlink_getsockbyfilp(fd_file(f));
- fdput(f);
if (IS_ERR(sock)) {
ret = PTR_ERR(sock);
goto free_skb;
@@ -1351,8 +1351,8 @@ static int do_mq_notify(mqd_t mqdes, const struct sigevent *notification)
}
}
- f = fdget(mqdes);
- if (!fd_file(f)) {
+ CLASS(fd, f)(mqdes);
+ if (fd_empty(f)) {
ret = -EBADF;
goto out;
}
@@ -1360,7 +1360,7 @@ static int do_mq_notify(mqd_t mqdes, const struct sigevent *notification)
inode = file_inode(fd_file(f));
if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
ret = -EBADF;
- goto out_fput;
+ goto out;
}
info = MQUEUE_I(inode);
@@ -1399,8 +1399,6 @@ static int do_mq_notify(mqd_t mqdes, const struct sigevent *notification)
inode_set_atime_to_ts(inode, inode_set_ctime_current(inode));
}
spin_unlock(&info->lock);
-out_fput:
- fdput(f);
out:
if (sock)
netlink_detachskb(sock, nc);
--
2.39.2
* [PATCH 17/19] simplify xfs_find_handle() a bit
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (14 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 16/19] do_mq_notify(): " Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 18/19] convert kernel/events/core.c Al Viro
` (2 subsequent siblings)
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
XFS_IOC_FD_TO_HANDLE can grab a reference to copied ->f_path and
let the file go; results in simpler control flow - cleanup is
the same for both "by descriptor" and "by pathname" cases.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
fs/xfs/xfs_handle.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/fs/xfs/xfs_handle.c b/fs/xfs/xfs_handle.c
index bb250f4246b3..ccf87940f264 100644
--- a/fs/xfs/xfs_handle.c
+++ b/fs/xfs/xfs_handle.c
@@ -86,22 +86,23 @@ xfs_find_handle(
int hsize;
xfs_handle_t handle;
struct inode *inode;
- struct fd f = EMPTY_FD;
struct path path;
int error;
struct xfs_inode *ip;
if (cmd == XFS_IOC_FD_TO_HANDLE) {
- f = fdget(hreq->fd);
- if (!fd_file(f))
+ CLASS(fd, f)(hreq->fd);
+
+ if (fd_empty(f))
return -EBADF;
- inode = file_inode(fd_file(f));
+ path = fd_file(f)->f_path;
+ path_get(&path);
} else {
error = user_path_at(AT_FDCWD, hreq->path, 0, &path);
if (error)
return error;
- inode = d_inode(path.dentry);
}
+ inode = d_inode(path.dentry);
ip = XFS_I(inode);
/*
@@ -135,10 +136,7 @@ xfs_find_handle(
error = 0;
out_put:
- if (cmd == XFS_IOC_FD_TO_HANDLE)
- fdput(f);
- else
- path_put(&path);
+ path_put(&path);
return error;
}
--
2.39.2
* [PATCH 18/19] convert kernel/events/core.c
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (15 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 17/19] simplify xfs_find_handle() a bit Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 1:59 ` [PATCH 19/19] deal with the last remaining boolean uses of fd_file() Al Viro
2024-06-07 15:16 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Christian Brauner
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
a questionable trick in perf_event_open(2) - deliberate call of
fdget(-1), expecting it to yield empty.
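A rough userspace model of the trick, under the assumption that a lookup with an out-of-range descriptor simply yields an empty value rather than an error. The table, struct ref, lookup(), is_kind() and open_with_group() are hypothetical; the kind value 7 stands in for the "is this a perf file" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 4
#define KIND_PERF  7	/* hypothetical marker for the expected file type */

struct entry {
	int kind;
};

struct ref {
	struct entry *e;	/* NULL means empty */
};

static struct entry *table[TABLE_SIZE];

/* Out-of-range descriptors (including -1) yield an empty ref. */
static struct ref lookup(int fd)
{
	if (fd < 0 || fd >= TABLE_SIZE)
		return (struct ref){ NULL };
	return (struct ref){ table[fd] };
}

/* Non-empty and of the expected kind. */
static bool is_kind(struct ref r, int kind)
{
	return r.e && r.e->kind == kind;
}

/* Returns 0 for "no group", 1 for a valid group, -1 for a bad descriptor. */
static int open_with_group(int group_fd)
{
	struct ref group = lookup(group_fd);	/* -1 just gives empty */

	if (group_fd != -1) {
		if (!is_kind(group, KIND_PERF))
			return -1;
		return 1;
	}
	return 0;
}
```

The lookup is done unconditionally so that the variable lives for the whole function; the price is relying on the lookup to treat -1 as "no such descriptor", which is what makes the trick questionable.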
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
kernel/events/core.c | 47 +++++++++++++++-----------------------------
1 file changed, 16 insertions(+), 31 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index bc4910442642..0cb3ecdaecae 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5866,18 +5866,9 @@ EXPORT_SYMBOL_GPL(perf_event_period);
static const struct file_operations perf_fops;
-static inline int perf_fget_light(int fd, struct fd *p)
+static inline bool is_perf_file(struct fd f)
{
- struct fd f = fdget(fd);
- if (!fd_file(f))
- return -EBADF;
-
- if (fd_file(f)->f_op != &perf_fops) {
- fdput(f);
- return -EBADF;
- }
- *p = f;
- return 0;
+ return !fd_empty(f) && fd_file(f)->f_op == &perf_fops;
}
static int perf_event_set_output(struct perf_event *event,
@@ -5925,20 +5916,16 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
case PERF_EVENT_IOC_SET_OUTPUT:
{
- int ret;
if (arg != -1) {
struct perf_event *output_event;
- struct fd output;
- ret = perf_fget_light(arg, &output);
- if (ret)
- return ret;
+ CLASS(fd, output)(arg);
+ if (!is_perf_file(output))
+ return -EBADF;
output_event = fd_file(output)->private_data;
- ret = perf_event_set_output(event, output_event);
- fdput(output);
+ return perf_event_set_output(event, output_event);
} else {
- ret = perf_event_set_output(event, NULL);
+ return perf_event_set_output(event, NULL);
}
- return ret;
}
case PERF_EVENT_IOC_SET_FILTER:
@@ -12434,7 +12421,6 @@ SYSCALL_DEFINE5(perf_event_open,
struct perf_event_attr attr;
struct perf_event_context *ctx;
struct file *event_file = NULL;
- struct fd group = EMPTY_FD;
struct task_struct *task = NULL;
struct pmu *pmu;
int event_fd;
@@ -12505,10 +12491,12 @@ SYSCALL_DEFINE5(perf_event_open,
if (event_fd < 0)
return event_fd;
+ CLASS(fd, group)(group_fd); // group_fd == -1 => empty
if (group_fd != -1) {
- err = perf_fget_light(group_fd, &group);
- if (err)
+ if (!is_perf_file(group)) {
+ err = -EBADF;
goto err_fd;
+ }
group_leader = fd_file(group)->private_data;
if (flags & PERF_FLAG_FD_OUTPUT)
output_event = group_leader;
@@ -12520,7 +12508,7 @@ SYSCALL_DEFINE5(perf_event_open,
task = find_lively_task_by_vpid(pid);
if (IS_ERR(task)) {
err = PTR_ERR(task);
- goto err_group_fd;
+ goto err_fd;
}
}
@@ -12787,12 +12775,11 @@ SYSCALL_DEFINE5(perf_event_open,
mutex_unlock(&current->perf_event_mutex);
/*
- * Drop the reference on the group_event after placing the
- * new event on the sibling_list. This ensures destruction
- * of the group leader will find the pointer to itself in
- * perf_group_detach().
+ * File reference in group guarantees that group_leader has been
+ * kept alive until we place the new event on the sibling_list.
+ * This ensures destruction of the group leader will find
+ * the pointer to itself in perf_group_detach().
*/
- fdput(group);
fd_install(event_fd, event_file);
return event_fd;
@@ -12811,8 +12798,6 @@ SYSCALL_DEFINE5(perf_event_open,
err_task:
if (task)
put_task_struct(task);
-err_group_fd:
- fdput(group);
err_fd:
put_unused_fd(event_fd);
return err;
--
2.39.2
* [PATCH 19/19] deal with the last remaining boolean uses of fd_file()
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (16 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 18/19] convert kernel/events/core.c Al Viro
@ 2024-06-07 1:59 ` Al Viro
2024-06-07 15:16 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Christian Brauner
18 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 1:59 UTC (permalink / raw)
To: linux-fsdevel; +Cc: brauner, torvalds
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
drivers/infiniband/core/uverbs_cmd.c | 8 +++-----
include/linux/cleanup.h | 2 +-
sound/core/pcm_native.c | 2 +-
3 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index efe3cc3debba..717880ebdd88 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -584,7 +584,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
if (cmd.fd != -1) {
/* search for file descriptor */
f = fdget(cmd.fd);
- if (!fd_file(f)) {
+ if (fd_empty(f)) {
ret = -EBADF;
goto err_tree_mutex_unlock;
}
@@ -632,8 +632,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
atomic_inc(&xrcd->usecnt);
}
- if (fd_file(f))
- fdput(f);
+ fdput(f);
mutex_unlock(&ibudev->xrcd_tree_mutex);
uobj_finalize_uobj_create(&obj->uobject, attrs);
@@ -648,8 +647,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
uobj_alloc_abort(&obj->uobject, attrs);
err_tree_mutex_unlock:
- if (fd_file(f))
- fdput(f);
+ fdput(f);
mutex_unlock(&ibudev->xrcd_tree_mutex);
diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 22cda00cf6a8..d3f943ea1d60 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -95,7 +95,7 @@ const volatile void * __must_check_fn(const volatile void *val)
* DEFINE_CLASS(fdget, struct fd, fdput(_T), fdget(fd), int fd)
*
* CLASS(fdget, f)(fd);
- * if (!fd_file(f))
+ * if (fd_empty(f))
* return -EBADF;
*
* // use 'f' without concern
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 388fdc226c1a..af5f651c9cc6 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2242,7 +2242,7 @@ static int snd_pcm_link(struct snd_pcm_substream *substream, int fd)
bool nonatomic = substream->pcm->nonatomic;
CLASS(fd, f)(fd);
- if (!fd_file(f))
+ if (fd_empty(f))
return -EBADFD;
if (!is_pcm_file(fd_file(f)))
return -EBADFD;
--
2.39.2
* Re: [PATCH 04/19] struct fd: representation change
2024-06-07 1:59 ` [PATCH 04/19] struct fd: representation change Al Viro
@ 2024-06-07 5:55 ` Amir Goldstein
0 siblings, 0 replies; 41+ messages in thread
From: Amir Goldstein @ 2024-06-07 5:55 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, brauner, torvalds, Miklos Szeredi
On Fri, Jun 7, 2024 at 5:00 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> The absolute majority of instances comes from fdget() and its
> relatives; the underlying primitives actually return a struct file
> reference and a couple of flags encoded into an unsigned long - the lower
> two bits of file address are always zero, so we can stash the flags
> into those. On the way out we use __to_fd() to unpack that unsigned
> long into struct fd.
>
> Let's use that representation for struct fd itself - make it
> a structure with a single unsigned long member (.word), with the value
> equal either to (unsigned long)p | flags, p being an address of some
> struct file instance, or to 0 for an empty fd.
>
> Note that we never used a struct fd instance with NULL ->file
> and non-zero ->flags; the emptiness had been checked as (!f.file) and
> we expected e.g. fdput(empty) to be a no-op. With new representation
> we can use (!f.word) for emptiness check; that is enough for compiler
> to figure out that (f.word & FDPUT_FPUT) will be false and that fdput(f)
> will be a no-op in such case.
>
> For now the new predicate (fd_empty(f)) has no users; all the
> existing checks have form (!fd_file(f)). We will convert to fd_empty()
> use later; here we only define it (and tell the compiler that it's
> unlikely to return true).
>
> This commit only deals with representation change; there will
> be followups.
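The representation described above can be sketched in userspace as a single word holding an aligned pointer with two flag bits in its low bits. This is only a model: struct object, struct tagged and the tagged_*() helpers are hypothetical names, and the flag values merely mirror FDPUT_FPUT and FDPUT_POS_UNLOCK from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits, mirroring FDPUT_FPUT / FDPUT_POS_UNLOCK. */
#define TAG_OWNED	((uintptr_t)1)
#define TAG_POS_LOCKED	((uintptr_t)2)
#define TAG_MASK	((uintptr_t)3)

/* Stand-in for struct file; its alignment keeps the low two bits clear. */
struct object {
	long refcount;
};

struct tagged {
	uintptr_t word;	/* 0 means empty, otherwise (address | flags) */
};

/* Pack a pointer and its flags into one word. */
static struct tagged tagged_make(struct object *p, uintptr_t flags)
{
	assert(((uintptr_t)p & TAG_MASK) == 0);
	return (struct tagged){ (uintptr_t)p | flags };
}

/* Strip the flag bits to recover the pointer. */
static struct object *tagged_ptr(struct tagged t)
{
	return (struct object *)(t.word & ~TAG_MASK);
}

/* Emptiness is a plain test of the whole word against zero. */
static bool tagged_empty(struct tagged t)
{
	return t.word == 0;
}

/* Drop the reference only if one is owned; a no-op for an empty value. */
static void tagged_put(struct tagged t)
{
	if (t.word & TAG_OWNED)
		tagged_ptr(t)->refcount--;
}
```

Because an empty value is all zeroes, the owned bit is necessarily clear, so tagged_put() on an empty value does nothing without a separate NULL check.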
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> drivers/infiniband/core/uverbs_cmd.c | 2 +-
> fs/overlayfs/file.c | 28 +++++++++++++++-------------
> fs/xfs/xfs_handle.c | 2 +-
> include/linux/file.h | 22 ++++++++++++++++------
> kernel/events/core.c | 2 +-
> net/socket.c | 2 +-
> 6 files changed, 35 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index 03ea3afcb31f..efe3cc3debba 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -572,7 +572,7 @@ static int ib_uverbs_open_xrcd(struct uverbs_attr_bundle *attrs)
> struct inode *inode = NULL;
> int new_xrcd = 0;
> struct ib_device *ib_dev;
> - struct fd f = {};
> + struct fd f = EMPTY_FD;
> int ret;
>
> ret = uverbs_request(attrs, &cmd, sizeof(cmd));
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index c4963d0c5549..458299873780 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -93,11 +93,11 @@ static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
> bool allow_meta)
> {
> struct dentry *dentry = file_dentry(file);
> + struct file *private = file->private_data;
In future versions please rename s/private/realfile.
Thanks,
Amir.
> struct path realpath;
> int err;
>
> - real->flags = 0;
> - real->file = file->private_data;
> + real->word = (unsigned long)private;
>
> if (allow_meta) {
> ovl_path_real(dentry, &realpath);
> @@ -113,16 +113,17 @@ static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
> return -EIO;
>
> /* Has it been copied up since we'd opened it? */
> - if (unlikely(file_inode(real->file) != d_inode(realpath.dentry))) {
> - real->flags = FDPUT_FPUT;
> - real->file = ovl_open_realfile(file, &realpath);
> -
> - return PTR_ERR_OR_ZERO(real->file);
> + if (unlikely(file_inode(private) != d_inode(realpath.dentry))) {
> + struct file *f = ovl_open_realfile(file, &realpath);
> + if (IS_ERR(f))
> + return PTR_ERR(f);
> real->word = (unsigned long)f | FDPUT_FPUT;
> + return 0;
> }
>
> /* Did the flags change since open? */
> - if (unlikely((file->f_flags ^ real->file->f_flags) & ~OVL_OPEN_FLAGS))
> - return ovl_change_flags(real->file, file->f_flags);
> + if (unlikely((file->f_flags ^ private->f_flags) & ~OVL_OPEN_FLAGS))
> + return ovl_change_flags(private, file->f_flags);
>
> return 0;
> }
> @@ -130,10 +131,11 @@ static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
> static int ovl_real_fdget(const struct file *file, struct fd *real)
> {
> if (d_is_dir(file_dentry(file))) {
> - real->flags = 0;
> - real->file = ovl_dir_real_file(file, false);
> -
> - return PTR_ERR_OR_ZERO(real->file);
> + struct file *f = ovl_dir_real_file(file, false);
> + if (IS_ERR(f))
> + return PTR_ERR(f);
> + real->word = (unsigned long)f;
> + return 0;
> }
>
> return ovl_real_fdget_meta(file, real, false);
> diff --git a/fs/xfs/xfs_handle.c b/fs/xfs/xfs_handle.c
> index 445a2daff233..bb250f4246b3 100644
> --- a/fs/xfs/xfs_handle.c
> +++ b/fs/xfs/xfs_handle.c
> @@ -86,7 +86,7 @@ xfs_find_handle(
> int hsize;
> xfs_handle_t handle;
> struct inode *inode;
> - struct fd f = {NULL};
> + struct fd f = EMPTY_FD;
> struct path path;
> int error;
> struct xfs_inode *ip;
> diff --git a/include/linux/file.h b/include/linux/file.h
> index 0964408727a7..39eb10a1bbfc 100644
> --- a/include/linux/file.h
> +++ b/include/linux/file.h
> @@ -35,18 +35,28 @@ static inline void fput_light(struct file *file, int fput_needed)
> fput(file);
> }
>
> +/* either a reference to struct file + flags
> + * (cloned vs. borrowed, pos locked), with
> + * flags stored in lower bits of value,
> + * or empty (represented by 0).
> + */
> struct fd {
> - struct file *file;
> - unsigned int flags;
> + unsigned long word;
> };
> #define FDPUT_FPUT 1
> #define FDPUT_POS_UNLOCK 2
>
> -#define fd_file(f) ((f).file)
> +#define fd_file(f) ((struct file *)((f).word & ~3))
> +static inline bool fd_empty(struct fd f)
> +{
> + return unlikely(!f.word);
> +}
> +
> +#define EMPTY_FD (struct fd){0}
>
> static inline void fdput(struct fd fd)
> {
> - if (fd.flags & FDPUT_FPUT)
> + if (fd.word & FDPUT_FPUT)
> fput(fd_file(fd));
> }
>
> @@ -60,7 +70,7 @@ extern void __f_unlock_pos(struct file *);
>
> static inline struct fd __to_fd(unsigned long v)
> {
> - return (struct fd){(struct file *)(v & ~3),v & 3};
> + return (struct fd){v};
> }
>
> static inline struct fd fdget(unsigned int fd)
> @@ -80,7 +90,7 @@ static inline struct fd fdget_pos(int fd)
>
> static inline void fdput_pos(struct fd f)
> {
> - if (f.flags & FDPUT_POS_UNLOCK)
> + if (f.word & FDPUT_POS_UNLOCK)
> __f_unlock_pos(fd_file(f));
> fdput(f);
> }
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 7acf44111a6e..fd4621cd9c23 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -12438,7 +12438,7 @@ SYSCALL_DEFINE5(perf_event_open,
> struct perf_event_attr attr;
> struct perf_event_context *ctx;
> struct file *event_file = NULL;
> - struct fd group = {NULL, 0};
> + struct fd group = EMPTY_FD;
> struct task_struct *task = NULL;
> struct pmu *pmu;
> int event_fd;
> diff --git a/net/socket.c b/net/socket.c
> index 50b074f52147..a2c509363d4d 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -559,7 +559,7 @@ static struct socket *sockfd_lookup_light(int fd, int *err, int *fput_needed)
> if (fd_file(f)) {
> sock = sock_from_file(fd_file(f));
> if (likely(sock)) {
> - *fput_needed = f.flags & FDPUT_FPUT;
> + *fput_needed = f.word & FDPUT_FPUT;
> return sock;
> }
> *err = -ENOTSOCK;
> --
> 2.39.2
>
>
* Re: [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
` (17 preceding siblings ...)
2024-06-07 1:59 ` [PATCH 19/19] deal with the last remaining boolean uses of fd_file() Al Viro
@ 2024-06-07 15:16 ` Christian Brauner
18 siblings, 0 replies; 41+ messages in thread
From: Christian Brauner @ 2024-06-07 15:16 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 02:59:39AM +0100, Al Viro wrote:
> missing fdput() on one of the failure exits
>
> Fixes: eacc56bb9de3e # v5.2
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
Reviewed-by: Christian Brauner <brauner@kernel.org>
* Re: [PATCH 02/19] lirc: rc_dev_get_from_fd(): fix file leak
2024-06-07 1:59 ` [PATCH 02/19] lirc: rc_dev_get_from_fd(): fix file leak Al Viro
@ 2024-06-07 15:17 ` Christian Brauner
0 siblings, 0 replies; 41+ messages in thread
From: Christian Brauner @ 2024-06-07 15:17 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 02:59:40AM +0100, Al Viro wrote:
> missing fdput() on a failure exit
>
> Fixes: 6a9d552483d50 "media: rc: bpf attach/detach requires write permission" # v6.9
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
Reviewed-by: Christian Brauner <brauner@kernel.org>
* Re: [PATCH 08/19] fdget_raw() users: switch to CLASS(fd_raw, ...)
2024-06-07 1:59 ` [PATCH 08/19] fdget_raw() users: switch to CLASS(fd_raw, ...) Al Viro
@ 2024-06-07 15:20 ` Christian Brauner
0 siblings, 0 replies; 41+ messages in thread
From: Christian Brauner @ 2024-06-07 15:20 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 02:59:46AM +0100, Al Viro wrote:
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
Reviewed-by: Christian Brauner <brauner@kernel.org>
* Re: [PATCH 09/19] css_set_fork(): switch to CLASS(fd_raw, ...)
2024-06-07 1:59 ` [PATCH 09/19] css_set_fork(): " Al Viro
@ 2024-06-07 15:21 ` Christian Brauner
0 siblings, 0 replies; 41+ messages in thread
From: Christian Brauner @ 2024-06-07 15:21 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 02:59:47AM +0100, Al Viro wrote:
> reference acquired there by fget_raw() is not stashed anywhere -
> we could as well borrow instead.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
Reviewed-by: Christian Brauner <brauner@kernel.org>
* Re: [PATCH 10/19] introduce "fd_pos" class
2024-06-07 1:59 ` [PATCH 10/19] introduce "fd_pos" class Al Viro
@ 2024-06-07 15:21 ` Christian Brauner
0 siblings, 0 replies; 41+ messages in thread
From: Christian Brauner @ 2024-06-07 15:21 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 02:59:48AM +0100, Al Viro wrote:
> fdget_pos() for constructor, fdput_pos() for cleanup, users of
> fd..._pos() converted.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
Reviewed-by: Christian Brauner <brauner@kernel.org>
* Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)
2024-06-07 1:59 ` [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...) Al Viro
@ 2024-06-07 15:26 ` Christian Brauner
2024-06-07 16:10 ` Al Viro
0 siblings, 1 reply; 41+ messages in thread
From: Christian Brauner @ 2024-06-07 15:26 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 02:59:49AM +0100, Al Viro wrote:
> low-hanging fruit; that's another likely source of conflicts
> over the cycle and it might make a lot of sense to split;
> fortunately, it can be split pretty much on per-function
> basis - chunks are independent from each other.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
I can pick conversions from you for files where I already have changes
in the tree anyway or already have done conversion as part of other
patches.
> arch/powerpc/kvm/book3s_64_vio.c | 7 +-
> arch/powerpc/kvm/powerpc.c | 24 ++---
> arch/powerpc/platforms/cell/spu_syscalls.c | 13 +--
> arch/x86/kernel/cpu/sgx/main.c | 10 +-
> arch/x86/kvm/svm/sev.c | 39 +++-----
> drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 23 ++---
> drivers/gpu/drm/drm_syncobj.c | 9 +-
> drivers/infiniband/core/ucma.c | 19 ++--
> drivers/media/mc/mc-request.c | 18 ++--
> drivers/media/rc/lirc_dev.c | 13 +--
> drivers/vfio/group.c | 6 +-
> drivers/vfio/virqfd.c | 16 +--
> drivers/virt/acrn/irqfd.c | 10 +-
> drivers/xen/privcmd.c | 31 ++----
> fs/btrfs/ioctl.c | 5 +-
> fs/coda/inode.c | 11 +-
> fs/eventfd.c | 9 +-
> fs/eventpoll.c | 38 ++-----
> fs/ext4/ioctl.c | 21 ++--
> fs/f2fs/file.c | 15 +--
> fs/fhandle.c | 5 +-
> fs/fsopen.c | 19 ++--
> fs/fuse/dev.c | 6 +-
> fs/ioctl.c | 23 ++---
> fs/kernel_read_file.c | 12 +--
> fs/locks.c | 15 +--
> fs/namespace.c | 47 +++------
> fs/notify/fanotify/fanotify_user.c | 44 +++-----
> fs/notify/inotify/inotify_user.c | 38 +++----
> fs/ocfs2/cluster/heartbeat.c | 13 +--
> fs/open.c | 48 ++++-----
> fs/read_write.c | 111 ++++++++-------------
> fs/remap_range.c | 11 +-
> fs/select.c | 13 +--
> fs/signalfd.c | 9 +-
> fs/smb/client/ioctl.c | 11 +-
> fs/splice.c | 47 ++++-----
> fs/sync.c | 29 ++----
> fs/utimes.c | 11 +-
> fs/xattr.c | 40 +++-----
> fs/xfs/xfs_exchrange.c | 10 +-
> fs/xfs/xfs_ioctl.c | 69 ++++---------
> io_uring/sqpoll.c | 29 ++----
> ipc/mqueue.c | 76 +++++---------
> kernel/bpf/btf.c | 11 +-
> kernel/bpf/syscall.c | 32 ++----
> kernel/bpf/token.c | 15 +--
> kernel/events/core.c | 14 +--
> kernel/nsproxy.c | 5 +-
> kernel/pid.c | 20 ++--
> kernel/signal.c | 29 ++----
> kernel/sys.c | 15 +--
> kernel/taskstats.c | 18 ++--
> kernel/watch_queue.c | 6 +-
> mm/fadvise.c | 10 +-
> mm/filemap.c | 17 +---
> mm/memcontrol.c | 29 ++----
> mm/readahead.c | 17 +---
> net/core/net_namespace.c | 10 +-
> security/integrity/ima/ima_main.c | 7 +-
> security/landlock/syscalls.c | 26 ++---
> security/loadpin/loadpin.c | 8 +-
> virt/kvm/eventfd.c | 15 +--
> virt/kvm/vfio.c | 14 +--
> 64 files changed, 454 insertions(+), 937 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
> index ce8f9539af2b..742aa58a7c7e 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -115,10 +115,9 @@ long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
> struct iommu_table_group *table_group;
> long i;
> struct kvmppc_spapr_tce_iommu_table *stit;
> - struct fd f;
> + CLASS(fd, f)(tablefd);
>
> - f = fdget(tablefd);
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> rcu_read_lock();
> @@ -130,8 +129,6 @@ long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
> }
> rcu_read_unlock();
>
> - fdput(f);
> -
> if (!found)
> return -EINVAL;
>
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index fd62144e497e..e31412069117 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -1933,12 +1933,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
> #endif
> #ifdef CONFIG_KVM_MPIC
> case KVM_CAP_IRQ_MPIC: {
> - struct fd f;
> + CLASS(fd, f)(cap->args[0]);
> struct kvm_device *dev;
>
> r = -EBADF;
> - f = fdget(cap->args[0]);
> - if (!fd_file(f))
> + if (fd_empty(f))
> break;
>
> r = -EPERM;
> @@ -1946,18 +1945,16 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
> if (dev)
> r = kvmppc_mpic_connect_vcpu(dev, vcpu, cap->args[1]);
>
> - fdput(f);
> break;
> }
> #endif
> #ifdef CONFIG_KVM_XICS
> case KVM_CAP_IRQ_XICS: {
> - struct fd f;
> + CLASS(fd, f)(cap->args[0]);
> struct kvm_device *dev;
>
> r = -EBADF;
> - f = fdget(cap->args[0]);
> - if (!fd_file(f))
> + if (fd_empty(f))
> break;
>
> r = -EPERM;
> @@ -1968,34 +1965,27 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
> else
> r = kvmppc_xics_connect_vcpu(dev, vcpu, cap->args[1]);
> }
> -
> - fdput(f);
> break;
> }
> #endif /* CONFIG_KVM_XICS */
> #ifdef CONFIG_KVM_XIVE
> case KVM_CAP_PPC_IRQ_XIVE: {
> - struct fd f;
> + CLASS(fd, f)(cap->args[0]);
> struct kvm_device *dev;
>
> r = -EBADF;
> - f = fdget(cap->args[0]);
> - if (!fd_file(f))
> + if (fd_empty(f))
> break;
>
> r = -ENXIO;
> - if (!xive_enabled()) {
> - fdput(f);
> + if (!xive_enabled())
> break;
> - }
>
> r = -EPERM;
> dev = kvm_device_from_filp(fd_file(f));
> if (dev)
> r = kvmppc_xive_native_connect_vcpu(dev, vcpu,
> cap->args[1]);
> -
> - fdput(f);
> break;
> }
> #endif /* CONFIG_KVM_XIVE */
> diff --git a/arch/powerpc/platforms/cell/spu_syscalls.c b/arch/powerpc/platforms/cell/spu_syscalls.c
> index cd7d42fc12a6..ca602376d025 100644
> --- a/arch/powerpc/platforms/cell/spu_syscalls.c
> +++ b/arch/powerpc/platforms/cell/spu_syscalls.c
> @@ -64,12 +64,10 @@ SYSCALL_DEFINE4(spu_create, const char __user *, name, unsigned int, flags,
> return -ENOSYS;
>
> if (flags & SPU_CREATE_AFFINITY_SPU) {
> - struct fd neighbor = fdget(neighbor_fd);
> + CLASS(fd, neighbor)(neighbor_fd);
> ret = -EBADF;
> - if (fd_file(neighbor)) {
> + if (!fd_empty(neighbor))
> ret = calls->create_thread(name, flags, mode, fd_file(neighbor));
> - fdput(neighbor);
> - }
> } else
> ret = calls->create_thread(name, flags, mode, NULL);
>
> @@ -80,7 +78,6 @@ SYSCALL_DEFINE4(spu_create, const char __user *, name, unsigned int, flags,
> SYSCALL_DEFINE3(spu_run,int, fd, __u32 __user *, unpc, __u32 __user *, ustatus)
> {
> long ret;
> - struct fd arg;
> struct spufs_calls *calls;
>
> calls = spufs_calls_get();
> @@ -88,11 +85,9 @@ SYSCALL_DEFINE3(spu_run,int, fd, __u32 __user *, unpc, __u32 __user *, ustatus)
> return -ENOSYS;
>
> ret = -EBADF;
> - arg = fdget(fd);
> - if (fd_file(arg)) {
> + CLASS(fd, arg)(fd);
> + if (!fd_empty(arg))
> ret = calls->spu_run(fd_file(arg), unpc, ustatus);
> - fdput(arg);
> - }
>
> spufs_calls_put(calls);
> return ret;
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index d01deb386395..d91284001162 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -893,19 +893,15 @@ static struct miscdevice sgx_dev_provision = {
> int sgx_set_attribute(unsigned long *allowed_attributes,
> unsigned int attribute_fd)
> {
> - struct fd f = fdget(attribute_fd);
> + CLASS(fd, f)(attribute_fd);
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EINVAL;
>
> - if (fd_file(f)->f_op != &sgx_provision_fops) {
> - fdput(f);
> + if (fd_file(f)->f_op != &sgx_provision_fops)
> return -EINVAL;
> - }
>
> *allowed_attributes |= SGX_ATTR_PROVISIONKEY;
> -
> - fdput(f);
> return 0;
> }
> EXPORT_SYMBOL_GPL(sgx_set_attribute);
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 6a8154d6935a..197c80b809dc 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -373,17 +373,12 @@ static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
>
> static int __sev_issue_cmd(int fd, int id, void *data, int *error)
> {
> - struct fd f;
> - int ret;
> + CLASS(fd, f)(fd);
>
> - f = fdget(fd);
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> - ret = sev_issue_cmd_external_user(fd_file(f), id, data, error);
> -
> - fdput(f);
> - return ret;
> + return sev_issue_cmd_external_user(fd_file(f), id, data, error);
> }
>
> static int sev_issue_cmd(struct kvm *kvm, int id, void *data, int *error)
> @@ -1908,23 +1903,21 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
> {
> struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
> struct kvm_sev_info *src_sev, *cg_cleanup_sev;
> - struct fd f = fdget(source_fd);
> + CLASS(fd, f)(source_fd);
> struct kvm *source_kvm;
> bool charged = false;
> int ret;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> - if (!file_is_kvm(fd_file(f))) {
> - ret = -EBADF;
> - goto out_fput;
> - }
> + if (!file_is_kvm(fd_file(f)))
> + return -EBADF;
>
> source_kvm = fd_file(f)->private_data;
> ret = sev_lock_two_vms(kvm, source_kvm);
> if (ret)
> - goto out_fput;
> + return ret;
>
> if (kvm->arch.vm_type != source_kvm->arch.vm_type ||
> sev_guest(kvm) || !sev_guest(source_kvm)) {
> @@ -1971,8 +1964,6 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
> cg_cleanup_sev->misc_cg = NULL;
> out_unlock:
> sev_unlock_two_vms(kvm, source_kvm);
> -out_fput:
> - fdput(f);
> return ret;
> }
>
> @@ -2209,23 +2200,21 @@ int sev_mem_enc_unregister_region(struct kvm *kvm,
>
> int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
> {
> - struct fd f = fdget(source_fd);
> + CLASS(fd, f)(source_fd);
> struct kvm *source_kvm;
> struct kvm_sev_info *source_sev, *mirror_sev;
> int ret;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> - if (!file_is_kvm(fd_file(f))) {
> - ret = -EBADF;
> - goto e_source_fput;
> - }
> + if (!file_is_kvm(fd_file(f)))
> + return -EBADF;
>
> source_kvm = fd_file(f)->private_data;
> ret = sev_lock_two_vms(kvm, source_kvm);
> if (ret)
> - goto e_source_fput;
> + return ret;
>
> /*
> * Mirrors of mirrors should work, but let's not get silly. Also
> @@ -2268,8 +2257,6 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
>
> e_unlock:
> sev_unlock_two_vms(kvm, source_kvm);
> -e_source_fput:
> - fdput(f);
> return ret;
> }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
> index a9298cb8d19a..570634654489 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
> @@ -36,21 +36,19 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
> int fd,
> int32_t priority)
> {
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> struct amdgpu_fpriv *fpriv;
> struct amdgpu_ctx_mgr *mgr;
> struct amdgpu_ctx *ctx;
> uint32_t id;
> int r;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EINVAL;
>
> r = amdgpu_file_to_fpriv(fd_file(f), &fpriv);
> - if (r) {
> - fdput(f);
> + if (r)
> return r;
> - }
>
> mgr = &fpriv->ctx_mgr;
> mutex_lock(&mgr->lock);
> @@ -58,7 +56,6 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
> amdgpu_ctx_priority_override(ctx, priority);
> mutex_unlock(&mgr->lock);
>
> - fdput(f);
> return 0;
> }
>
> @@ -67,31 +64,25 @@ static int amdgpu_sched_context_priority_override(struct amdgpu_device *adev,
> unsigned ctx_id,
> int32_t priority)
> {
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> struct amdgpu_fpriv *fpriv;
> struct amdgpu_ctx *ctx;
> int r;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EINVAL;
>
> r = amdgpu_file_to_fpriv(fd_file(f), &fpriv);
> - if (r) {
> - fdput(f);
> + if (r)
> return r;
> - }
>
> ctx = amdgpu_ctx_get(fpriv, ctx_id);
>
> - if (!ctx) {
> - fdput(f);
> + if (!ctx)
> return -EINVAL;
> - }
>
> amdgpu_ctx_priority_override(ctx, priority);
> amdgpu_ctx_put(ctx);
> - fdput(f);
> -
> return 0;
> }
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index 7fb31ca3b5fc..4eaebd69253b 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -712,16 +712,14 @@ static int drm_syncobj_fd_to_handle(struct drm_file *file_private,
> int fd, u32 *handle)
> {
> struct drm_syncobj *syncobj;
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> int ret;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EINVAL;
>
> - if (fd_file(f)->f_op != &drm_syncobj_file_fops) {
> - fdput(f);
> + if (fd_file(f)->f_op != &drm_syncobj_file_fops)
> return -EINVAL;
> - }
>
> /* take a reference to put in the idr */
> syncobj = fd_file(f)->private_data;
> @@ -739,7 +737,6 @@ static int drm_syncobj_fd_to_handle(struct drm_file *file_private,
> } else
> drm_syncobj_put(syncobj);
>
> - fdput(f);
> return ret;
> }
>
> diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
> index dc57d07a1f45..cc95718fd24b 100644
> --- a/drivers/infiniband/core/ucma.c
> +++ b/drivers/infiniband/core/ucma.c
> @@ -1615,7 +1615,6 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
> struct ucma_event *uevent, *tmp;
> struct ucma_context *ctx;
> LIST_HEAD(event_list);
> - struct fd f;
> struct ucma_file *cur_file;
> int ret = 0;
>
> @@ -1623,21 +1622,17 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
> return -EFAULT;
>
> /* Get current fd to protect against it being closed */
> - f = fdget(cmd.fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(cmd.fd);
> + if (fd_empty(f))
> return -ENOENT;
> - if (fd_file(f)->f_op != &ucma_fops) {
> - ret = -EINVAL;
> - goto file_put;
> - }
> + if (fd_file(f)->f_op != &ucma_fops)
> + return -EINVAL;
> cur_file = fd_file(f)->private_data;
>
> /* Validate current fd and prevent destruction of id. */
> ctx = ucma_get_ctx(cur_file, cmd.id);
> - if (IS_ERR(ctx)) {
> - ret = PTR_ERR(ctx);
> - goto file_put;
> - }
> + if (IS_ERR(ctx))
> + return PTR_ERR(ctx);
>
> rdma_lock_handler(ctx->cm_id);
> /*
> @@ -1678,8 +1673,6 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
> err_unlock:
> rdma_unlock_handler(ctx->cm_id);
> ucma_put_ctx(ctx);
> -file_put:
> - fdput(f);
> return ret;
> }
>
> diff --git a/drivers/media/mc/mc-request.c b/drivers/media/mc/mc-request.c
> index e064914c476e..df39c8c11e9a 100644
> --- a/drivers/media/mc/mc-request.c
> +++ b/drivers/media/mc/mc-request.c
> @@ -246,22 +246,21 @@ static const struct file_operations request_fops = {
> struct media_request *
> media_request_get_by_fd(struct media_device *mdev, int request_fd)
> {
> - struct fd f;
> struct media_request *req;
>
> if (!mdev || !mdev->ops ||
> !mdev->ops->req_validate || !mdev->ops->req_queue)
> return ERR_PTR(-EBADR);
>
> - f = fdget(request_fd);
> - if (!fd_file(f))
> - goto err_no_req_fd;
> + CLASS(fd, f)(request_fd);
> + if (fd_empty(f))
> + goto err;
>
> if (fd_file(f)->f_op != &request_fops)
> - goto err_fput;
> + goto err;
> req = fd_file(f)->private_data;
> if (req->mdev != mdev)
> - goto err_fput;
> + goto err;
>
> /*
> * Note: as long as someone has an open filehandle of the request,
> @@ -272,14 +271,9 @@ media_request_get_by_fd(struct media_device *mdev, int request_fd)
> * before media_request_get() is called.
> */
> media_request_get(req);
> - fdput(f);
> -
> return req;
>
> -err_fput:
> - fdput(f);
> -
> -err_no_req_fd:
> +err:
> dev_dbg(mdev->dev, "cannot find request_fd %d\n", request_fd);
> return ERR_PTR(-EINVAL);
> }
> diff --git a/drivers/media/rc/lirc_dev.c b/drivers/media/rc/lirc_dev.c
> index b8dfd530fab7..af6337489d21 100644
> --- a/drivers/media/rc/lirc_dev.c
> +++ b/drivers/media/rc/lirc_dev.c
> @@ -816,28 +816,23 @@ void __exit lirc_dev_exit(void)
>
> struct rc_dev *rc_dev_get_from_fd(int fd, bool write)
> {
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> struct lirc_fh *fh;
> struct rc_dev *dev;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return ERR_PTR(-EBADF);
>
> - if (fd_file(f)->f_op != &lirc_fops) {
> - fdput(f);
> + if (fd_file(f)->f_op != &lirc_fops)
> return ERR_PTR(-EINVAL);
> - }
>
> - if (write && !(fd_file(f)->f_mode & FMODE_WRITE)) {
> - fdput(f);
> + if (write && !(fd_file(f)->f_mode & FMODE_WRITE))
> return ERR_PTR(-EPERM);
> - }
>
> fh = fd_file(f)->private_data;
> dev = fh->rc;
>
> get_device(&dev->dev);
> - fdput(f);
>
> return dev;
> }
> diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
> index f0d77e3c7196..8bad9b11c844 100644
> --- a/drivers/vfio/group.c
> +++ b/drivers/vfio/group.c
> @@ -104,15 +104,14 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
> {
> struct vfio_container *container;
> struct iommufd_ctx *iommufd;
> - struct fd f;
> int ret;
> int fd;
>
> if (get_user(fd, arg))
> return -EFAULT;
>
> - f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return -EBADF;
>
> mutex_lock(&group->group_lock);
> @@ -153,7 +152,6 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
>
> out_unlock:
> mutex_unlock(&group->group_lock);
> - fdput(f);
> return ret;
> }
>
> diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
> index d22881245e89..aa2891f97508 100644
> --- a/drivers/vfio/virqfd.c
> +++ b/drivers/vfio/virqfd.c
> @@ -113,7 +113,6 @@ int vfio_virqfd_enable(void *opaque,
> void (*thread)(void *, void *),
> void *data, struct virqfd **pvirqfd, int fd)
> {
> - struct fd irqfd;
> struct eventfd_ctx *ctx;
> struct virqfd *virqfd;
> int ret = 0;
> @@ -133,8 +132,8 @@ int vfio_virqfd_enable(void *opaque,
> INIT_WORK(&virqfd->inject, virqfd_inject);
> INIT_WORK(&virqfd->flush_inject, virqfd_flush_inject);
>
> - irqfd = fdget(fd);
> - if (!fd_file(irqfd)) {
> + CLASS(fd, irqfd)(fd);
> + if (fd_empty(irqfd)) {
> ret = -EBADF;
> goto err_fd;
> }
> @@ -142,7 +141,7 @@ int vfio_virqfd_enable(void *opaque,
> ctx = eventfd_ctx_fileget(fd_file(irqfd));
> if (IS_ERR(ctx)) {
> ret = PTR_ERR(ctx);
> - goto err_ctx;
> + goto err_fd;
> }
>
> virqfd->eventfd = ctx;
> @@ -181,18 +180,9 @@ int vfio_virqfd_enable(void *opaque,
> if ((!handler || handler(opaque, data)) && thread)
> schedule_work(&virqfd->inject);
> }
> -
> - /*
> - * Do not drop the file until the irqfd is fully initialized,
> - * otherwise we might race against the EPOLLHUP.
> - */
> - fdput(irqfd);
> -
> return 0;
> err_busy:
> eventfd_ctx_put(ctx);
> -err_ctx:
> - fdput(irqfd);
> err_fd:
> kfree(virqfd);
>
> diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c
> index 9994d818bb7e..f4d37ceea2ab 100644
> --- a/drivers/virt/acrn/irqfd.c
> +++ b/drivers/virt/acrn/irqfd.c
> @@ -112,7 +112,6 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
> struct eventfd_ctx *eventfd = NULL;
> struct hsm_irqfd *irqfd, *tmp;
> __poll_t events;
> - struct fd f;
> int ret = 0;
>
> irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
> @@ -124,8 +123,8 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
> INIT_LIST_HEAD(&irqfd->list);
> INIT_WORK(&irqfd->shutdown, hsm_irqfd_shutdown_work);
>
> - f = fdget(args->fd);
> - if (!fd_file(f)) {
> + CLASS(fd, f)(args->fd);
> + if (fd_empty(f)) {
> ret = -EBADF;
> goto out;
> }
> @@ -133,7 +132,7 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
> eventfd = eventfd_ctx_fileget(fd_file(f));
> if (IS_ERR(eventfd)) {
> ret = PTR_ERR(eventfd);
> - goto fail;
> + goto out;
> }
>
> irqfd->eventfd = eventfd;
> @@ -162,13 +161,10 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
> if (events & EPOLLIN)
> acrn_irqfd_inject(irqfd);
>
> - fdput(f);
> return 0;
> fail:
> if (eventfd && !IS_ERR(eventfd))
> eventfd_ctx_put(eventfd);
> -
> - fdput(f);
> out:
> kfree(irqfd);
> return ret;
> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> index c35c2455aa61..ba772a6347f6 100644
> --- a/drivers/xen/privcmd.c
> +++ b/drivers/xen/privcmd.c
> @@ -930,7 +930,6 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
> {
> struct privcmd_kernel_irqfd *kirqfd, *tmp;
> __poll_t events;
> - struct fd f;
> void *dm_op;
> int ret;
>
> @@ -949,8 +948,8 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
> kirqfd->dom = irqfd->dom;
> INIT_WORK(&kirqfd->shutdown, irqfd_shutdown);
>
> - f = fdget(irqfd->fd);
> - if (!fd_file(f)) {
> + CLASS(fd, f)(irqfd->fd);
> + if (fd_empty(f)) {
> ret = -EBADF;
> goto error_kfree;
> }
> @@ -958,7 +957,7 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
> kirqfd->eventfd = eventfd_ctx_fileget(fd_file(f));
> if (IS_ERR(kirqfd->eventfd)) {
> ret = PTR_ERR(kirqfd->eventfd);
> - goto error_fd_put;
> + goto error_kfree;
> }
>
> /*
> @@ -989,19 +988,11 @@ static int privcmd_irqfd_assign(struct privcmd_irqfd *irqfd)
> if (events & EPOLLIN)
> irqfd_inject(kirqfd);
>
> - /*
> - * Do not drop the file until the kirqfd is fully initialized, otherwise
> - * we might race against the EPOLLHUP.
> - */
> - fdput(f);
> return 0;
>
> error_eventfd:
> eventfd_ctx_put(kirqfd->eventfd);
>
> -error_fd_put:
> - fdput(f);
> -
> error_kfree:
> kfree(kirqfd);
> return ret;
> @@ -1310,7 +1301,6 @@ static int privcmd_ioeventfd_assign(struct privcmd_ioeventfd *ioeventfd)
> struct privcmd_kernel_ioeventfd *kioeventfd;
> struct privcmd_kernel_ioreq *kioreq;
> unsigned long flags;
> - struct fd f;
> int ret;
>
> /* Check for range overflow */
> @@ -1330,14 +1320,15 @@ static int privcmd_ioeventfd_assign(struct privcmd_ioeventfd *ioeventfd)
> if (!kioeventfd)
> return -ENOMEM;
>
> - f = fdget(ioeventfd->event_fd);
> - if (!fd_file(f)) {
> - ret = -EBADF;
> - goto error_kfree;
> - }
> + {
> + CLASS(fd, f)(ioeventfd->event_fd);
> + if (fd_empty(f)) {
> + ret = -EBADF;
> + goto error_kfree;
> + }
>
> - kioeventfd->eventfd = eventfd_ctx_fileget(fd_file(f));
> - fdput(f);
> + kioeventfd->eventfd = eventfd_ctx_fileget(fd_file(f));
> + }
>
> if (IS_ERR(kioeventfd->eventfd)) {
> ret = PTR_ERR(kioeventfd->eventfd);
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 2dccb7f90e4d..1703ba0b07e6 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1309,9 +1309,9 @@ static noinline int __btrfs_ioctl_snap_create(struct file *file,
> ret = btrfs_mksubvol(&file->f_path, idmap, name,
> namelen, NULL, readonly, inherit);
> } else {
> - struct fd src = fdget(fd);
> + CLASS(fd, src)(fd);
> struct inode *src_inode;
> - if (!fd_file(src)) {
> + if (fd_empty(src)) {
> ret = -EINVAL;
> goto out_drop_write;
> }
> @@ -1342,7 +1342,6 @@ static noinline int __btrfs_ioctl_snap_create(struct file *file,
> BTRFS_I(src_inode)->root,
> readonly, inherit);
> }
> - fdput(src);
> }
> out_drop_write:
> mnt_drop_write_file(file);
> diff --git a/fs/coda/inode.c b/fs/coda/inode.c
> index 7d56b6d1e4c3..293cf5e6dfeb 100644
> --- a/fs/coda/inode.c
> +++ b/fs/coda/inode.c
> @@ -122,22 +122,17 @@ static const struct fs_parameter_spec coda_param_specs[] = {
> static int coda_parse_fd(struct fs_context *fc, int fd)
> {
> struct coda_fs_context *ctx = fc->fs_private;
> - struct fd f;
> + CLASS(fd, f)(fd);
> struct inode *inode;
> int idx;
>
> - f = fdget(fd);
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
> inode = file_inode(fd_file(f));
> - if (!S_ISCHR(inode->i_mode) || imajor(inode) != CODA_PSDEV_MAJOR) {
> - fdput(f);
> + if (!S_ISCHR(inode->i_mode) || imajor(inode) != CODA_PSDEV_MAJOR)
> return invalf(fc, "code: Not coda psdev");
> - }
>
> idx = iminor(inode);
> - fdput(f);
> -
> if (idx < 0 || idx >= MAX_CODADEVS)
> return invalf(fc, "coda: Bad minor number");
> ctx->idx = idx;
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index 22c934f3a080..76129bfcd663 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -347,13 +347,10 @@ EXPORT_SYMBOL_GPL(eventfd_fget);
> */
> struct eventfd_ctx *eventfd_ctx_fdget(int fd)
> {
> - struct eventfd_ctx *ctx;
> - struct fd f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return ERR_PTR(-EBADF);
> - ctx = eventfd_ctx_fileget(fd_file(f));
> - fdput(f);
> - return ctx;
> + return eventfd_ctx_fileget(fd_file(f));
> }
> EXPORT_SYMBOL_GPL(eventfd_ctx_fdget);
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 28d1a754cf33..1e63f3b03ca5 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -2259,25 +2259,22 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
> {
> int error;
> int full_check = 0;
> - struct fd f, tf;
> struct eventpoll *ep;
> struct epitem *epi;
> struct eventpoll *tep = NULL;
>
> - error = -EBADF;
> - f = fdget(epfd);
> - if (!fd_file(f))
> - goto error_return;
> + CLASS(fd, f)(epfd);
> + if (fd_empty(f))
> + return -EBADF;
>
> /* Get the "struct file *" for the target file */
> - tf = fdget(fd);
> - if (!fd_file(tf))
> - goto error_fput;
> + CLASS(fd, tf)(fd);
> + if (fd_empty(tf))
> + return -EBADF;
>
> /* The target file descriptor must support poll */
> - error = -EPERM;
> if (!file_can_poll(fd_file(tf)))
> - goto error_tgt_fput;
> + return -EPERM;
>
> /* Check if EPOLLWAKEUP is allowed */
> if (ep_op_has_event(op))
> @@ -2396,12 +2393,6 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
> loop_check_gen++;
> mutex_unlock(&epnested_mutex);
> }
> -
> - fdput(tf);
> -error_fput:
> - fdput(f);
> -error_return:
> -
> return error;
> }
>
> @@ -2429,8 +2420,6 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
> static int do_epoll_wait(int epfd, struct epoll_event __user *events,
> int maxevents, struct timespec64 *to)
> {
> - int error;
> - struct fd f;
> struct eventpoll *ep;
>
> /* The maximum number of event must be greater than zero */
> @@ -2442,17 +2431,16 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
> return -EFAULT;
>
> /* Get the "struct file *" for the eventpoll file */
> - f = fdget(epfd);
> - if (!fd_file(f))
> + CLASS(fd, f)(epfd);
> + if (fd_empty(f))
> return -EBADF;
>
> /*
> * We have to check that the file structure underneath the fd
> * the user passed to us _is_ an eventpoll file.
> */
> - error = -EINVAL;
> if (!is_file_epoll(fd_file(f)))
> - goto error_fput;
> + return -EINVAL;
>
> /*
> * At this point it is safe to assume that the "private_data" contains
> @@ -2461,11 +2449,7 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
> ep = fd_file(f)->private_data;
>
> /* Time to fish for events ... */
> - error = ep_poll(ep, events, maxevents, to);
> -
> -error_fput:
> - fdput(f);
> - return error;
> + return ep_poll(ep, events, maxevents, to);
> }
>
> SYSCALL_DEFINE4(epoll_wait, int, epfd, struct epoll_event __user *, events,
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index 63eaa1fa2556..5503c92cdb6d 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -1330,7 +1330,6 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>
> case EXT4_IOC_MOVE_EXT: {
> struct move_extent me;
> - struct fd donor;
> int err;
>
> if (!(filp->f_mode & FMODE_READ) ||
> @@ -1342,30 +1341,26 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> return -EFAULT;
> me.moved_len = 0;
>
> - donor = fdget(me.donor_fd);
> - if (!fd_file(donor))
> + CLASS(fd, donor)(me.donor_fd);
> + if (fd_empty(donor))
> return -EBADF;
>
> - if (!(fd_file(donor)->f_mode & FMODE_WRITE)) {
> - err = -EBADF;
> - goto mext_out;
> - }
> + if (!(fd_file(donor)->f_mode & FMODE_WRITE))
> + return -EBADF;
>
> if (ext4_has_feature_bigalloc(sb)) {
> ext4_msg(sb, KERN_ERR,
> "Online defrag not supported with bigalloc");
> - err = -EOPNOTSUPP;
> - goto mext_out;
> + return -EOPNOTSUPP;
> } else if (IS_DAX(inode)) {
> ext4_msg(sb, KERN_ERR,
> "Online defrag not supported with DAX");
> - err = -EOPNOTSUPP;
> - goto mext_out;
> + return -EOPNOTSUPP;
> }
>
> err = mnt_want_write_file(filp);
> if (err)
> - goto mext_out;
> + return err;
>
> err = ext4_move_extents(filp, fd_file(donor), me.orig_start,
> me.donor_start, me.len, &me.moved_len);
> @@ -1374,8 +1369,6 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> if (copy_to_user((struct move_extent __user *)arg,
> &me, sizeof(me)))
> err = -EFAULT;
> -mext_out:
> - fdput(donor);
> return err;
> }
>
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 89db22f9488b..c70191b8b345 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -2962,32 +2962,27 @@ static int f2fs_move_file_range(struct file *file_in, loff_t pos_in,
> static int __f2fs_ioc_move_range(struct file *filp,
> struct f2fs_move_range *range)
> {
> - struct fd dst;
> int err;
>
> if (!(filp->f_mode & FMODE_READ) ||
> !(filp->f_mode & FMODE_WRITE))
> return -EBADF;
>
> - dst = fdget(range->dst_fd);
> - if (!fd_file(dst))
> + CLASS(fd, dst)(range->dst_fd);
> + if (fd_empty(dst))
> return -EBADF;
>
> - if (!(fd_file(dst)->f_mode & FMODE_WRITE)) {
> - err = -EBADF;
> - goto err_out;
> - }
> + if (!(fd_file(dst)->f_mode & FMODE_WRITE))
> + return -EBADF;
>
> err = mnt_want_write_file(filp);
> if (err)
> - goto err_out;
> + return err;
>
> err = f2fs_move_file_range(filp, range->pos_in, fd_file(dst),
> range->pos_out, range->len);
>
> mnt_drop_write_file(filp);
> -err_out:
> - fdput(dst);
> return err;
> }
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index 94fc4126eaa4..c6ae83e456b2 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -125,11 +125,10 @@ static struct vfsmount *get_vfsmount_from_fd(int fd)
> mnt = mntget(fs->pwd.mnt);
> spin_unlock(&fs->lock);
> } else {
> - struct fd f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return ERR_PTR(-EBADF);
> mnt = mntget(fd_file(f)->f_path.mnt);
> - fdput(f);
> }
> return mnt;
> }
> diff --git a/fs/fsopen.c b/fs/fsopen.c
> index cf9f37b2a5d4..f779836c7288 100644
> --- a/fs/fsopen.c
> +++ b/fs/fsopen.c
> @@ -354,7 +354,6 @@ SYSCALL_DEFINE5(fsconfig,
> int, aux)
> {
> struct fs_context *fc;
> - struct fd f;
> int ret;
> int lookup_flags = 0;
>
> @@ -397,12 +396,11 @@ SYSCALL_DEFINE5(fsconfig,
> return -EOPNOTSUPP;
> }
>
> - f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return -EBADF;
> - ret = -EINVAL;
> if (fd_file(f)->f_op != &fscontext_fops)
> - goto out_f;
> + return -EINVAL;
>
> fc = fd_file(f)->private_data;
> if (fc->ops == &legacy_fs_context_ops) {
> @@ -411,17 +409,14 @@ SYSCALL_DEFINE5(fsconfig,
> case FSCONFIG_SET_PATH:
> case FSCONFIG_SET_PATH_EMPTY:
> case FSCONFIG_SET_FD:
> - ret = -EOPNOTSUPP;
> - goto out_f;
> + return -EOPNOTSUPP;
> }
> }
>
> if (_key) {
> param.key = strndup_user(_key, 256);
> - if (IS_ERR(param.key)) {
> - ret = PTR_ERR(param.key);
> - goto out_f;
> - }
> + if (IS_ERR(param.key))
> + return PTR_ERR(param.key);
> }
>
> switch (cmd) {
> @@ -500,7 +495,5 @@ SYSCALL_DEFINE5(fsconfig,
> }
> out_key:
> kfree(param.key);
> -out_f:
> - fdput(f);
> return ret;
> }
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 991b9ae8e7c9..27826116a4fb 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -2315,13 +2315,12 @@ static long fuse_dev_ioctl_clone(struct file *file, __u32 __user *argp)
> int res;
> int oldfd;
> struct fuse_dev *fud = NULL;
> - struct fd f;
>
> if (get_user(oldfd, argp))
> return -EFAULT;
>
> - f = fdget(oldfd);
> - if (!fd_file(f))
> + CLASS(fd, f)(oldfd);
> + if (fd_empty(f))
> return -EINVAL;
>
> /*
> @@ -2338,7 +2337,6 @@ static long fuse_dev_ioctl_clone(struct file *file, __u32 __user *argp)
> mutex_unlock(&fuse_mutex);
> }
>
> - fdput(f);
> return res;
> }
>
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 6e0c954388d4..638a36be31c1 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -231,11 +231,11 @@ static int ioctl_fiemap(struct file *filp, struct fiemap __user *ufiemap)
> static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
> u64 off, u64 olen, u64 destoff)
> {
> - struct fd src_file = fdget(srcfd);
> + CLASS(fd, src_file)(srcfd);
> loff_t cloned;
> int ret;
>
> - if (!fd_file(src_file))
> + if (fd_empty(src_file))
> return -EBADF;
> cloned = vfs_clone_file_range(fd_file(src_file), off, dst_file, destoff,
> olen, 0);
> @@ -245,7 +245,6 @@ static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
> ret = -EINVAL;
> else
> ret = 0;
> - fdput(src_file);
> return ret;
> }
>
> @@ -892,22 +891,20 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd,
>
> SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
> {
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> int error;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> error = security_file_ioctl(fd_file(f), cmd, arg);
> if (error)
> - goto out;
> + return error;
>
> error = do_vfs_ioctl(fd_file(f), fd, cmd, arg);
> if (error == -ENOIOCTLCMD)
> error = vfs_ioctl(fd_file(f), cmd, arg);
>
> -out:
> - fdput(f);
> return error;
> }
>
> @@ -950,15 +947,15 @@ EXPORT_SYMBOL(compat_ptr_ioctl);
> COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
> compat_ulong_t, arg)
> {
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> int error;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> error = security_file_ioctl_compat(fd_file(f), cmd, arg);
> if (error)
> - goto out;
> + return error;
>
> switch (cmd) {
> /* FICLONE takes an int argument, so don't use compat_ptr() */
> @@ -1009,10 +1006,6 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
> error = -ENOTTY;
> break;
> }
> -
> - out:
> - fdput(f);
> -
> return error;
> }
> #endif
> diff --git a/fs/kernel_read_file.c b/fs/kernel_read_file.c
> index 9ff37ae650ea..de32c95d823d 100644
> --- a/fs/kernel_read_file.c
> +++ b/fs/kernel_read_file.c
> @@ -175,15 +175,11 @@ ssize_t kernel_read_file_from_fd(int fd, loff_t offset, void **buf,
> size_t buf_size, size_t *file_size,
> enum kernel_read_file_id id)
> {
> - struct fd f = fdget(fd);
> - ssize_t ret = -EBADF;
> + CLASS(fd, f)(fd);
>
> - if (!fd_file(f) || !(fd_file(f)->f_mode & FMODE_READ))
> - goto out;
> + if (fd_empty(f) || !(fd_file(f)->f_mode & FMODE_READ))
> + return -EBADF;
>
> - ret = kernel_read_file(fd_file(f), offset, buf, buf_size, file_size, id);
> -out:
> - fdput(f);
> - return ret;
> + return kernel_read_file(fd_file(f), offset, buf, buf_size, file_size, id);
> }
> EXPORT_SYMBOL_GPL(kernel_read_file_from_fd);
> diff --git a/fs/locks.c b/fs/locks.c
> index ee8e1925dc42..4170035a2bc8 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -2132,7 +2132,6 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
> {
> int can_sleep, error, type;
> struct file_lock fl;
> - struct fd f;
>
> /*
> * LOCK_MAND locks were broken for a long time in that they never
> @@ -2151,19 +2150,18 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
> if (type < 0)
> return type;
>
> - error = -EBADF;
> - f = fdget(fd);
> - if (!fd_file(f))
> - return error;
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> + return -EBADF;
>
> if (type != F_UNLCK && !(fd_file(f)->f_mode & (FMODE_READ | FMODE_WRITE)))
> - goto out_putf;
> + return -EBADF;
>
> flock_make_lock(fd_file(f), &fl, type);
>
> error = security_file_lock(fd_file(f), fl.c.flc_type);
> if (error)
> - goto out_putf;
> + return error;
>
> can_sleep = !(cmd & LOCK_NB);
> if (can_sleep)
> @@ -2177,9 +2175,6 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
> error = locks_lock_file_wait(fd_file(f), &fl);
>
> locks_release_private(&fl);
> - out_putf:
> - fdput(f);
> -
> return error;
> }
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 7c8248aca8bd..6fbb70272ebd 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -3948,7 +3948,6 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
> struct file *file;
> struct path newmount;
> struct mount *mnt;
> - struct fd f;
> unsigned int mnt_flags = 0;
> long ret;
>
> @@ -3976,19 +3975,18 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
> return -EINVAL;
> }
>
> - f = fdget(fs_fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fs_fd);
> + if (fd_empty(f))
> return -EBADF;
>
> - ret = -EINVAL;
> if (fd_file(f)->f_op != &fscontext_fops)
> - goto err_fsfd;
> + return -EINVAL;
>
> fc = fd_file(f)->private_data;
>
> ret = mutex_lock_interruptible(&fc->uapi_mutex);
> if (ret < 0)
> - goto err_fsfd;
> + return ret;
>
> /* There must be a valid superblock or we can't mount it */
> ret = -EINVAL;
> @@ -4055,8 +4053,6 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
> path_put(&newmount);
> err_unlock:
> mutex_unlock(&fc->uapi_mutex);
> -err_fsfd:
> - fdput(f);
> return ret;
> }
>
> @@ -4507,10 +4503,8 @@ static int do_mount_setattr(struct path *path, struct mount_kattr *kattr)
> static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
> struct mount_kattr *kattr, unsigned int flags)
> {
> - int err = 0;
> struct ns_common *ns;
> struct user_namespace *mnt_userns;
> - struct fd f;
>
> if (!((attr->attr_set | attr->attr_clr) & MOUNT_ATTR_IDMAP))
> return 0;
> @@ -4526,20 +4520,16 @@ static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
> if (attr->userns_fd > INT_MAX)
> return -EINVAL;
>
> - f = fdget(attr->userns_fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(attr->userns_fd);
> + if (fd_empty(f))
> return -EBADF;
>
> - if (!proc_ns_file(fd_file(f))) {
> - err = -EINVAL;
> - goto out_fput;
> - }
> + if (!proc_ns_file(fd_file(f)))
> + return -EINVAL;
>
> ns = get_proc_ns(file_inode(fd_file(f)));
> - if (ns->ops->type != CLONE_NEWUSER) {
> - err = -EINVAL;
> - goto out_fput;
> - }
> + if (ns->ops->type != CLONE_NEWUSER)
> + return -EINVAL;
>
> /*
> * The initial idmapping cannot be used to create an idmapped
> @@ -4550,22 +4540,15 @@ static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
> * result.
> */
> mnt_userns = container_of(ns, struct user_namespace, ns);
> - if (mnt_userns == &init_user_ns) {
> - err = -EPERM;
> - goto out_fput;
> - }
> + if (mnt_userns == &init_user_ns)
> + return -EPERM;
>
> /* We're not controlling the target namespace. */
> - if (!ns_capable(mnt_userns, CAP_SYS_ADMIN)) {
> - err = -EPERM;
> - goto out_fput;
> - }
> + if (!ns_capable(mnt_userns, CAP_SYS_ADMIN))
> + return -EPERM;
>
> kattr->mnt_userns = get_user_ns(mnt_userns);
> -
> -out_fput:
> - fdput(f);
> - return err;
> + return 0;
> }
>
> static int build_mount_kattr(const struct mount_attr *attr, size_t usize,
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 13454e5fd3fb..7b5abc1b8c8f 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -1003,22 +1003,17 @@ static int fanotify_find_path(int dfd, const char __user *filename,
> dfd, filename, flags);
>
> if (filename == NULL) {
> - struct fd f = fdget(dfd);
> + CLASS(fd, f)(dfd);
>
> - ret = -EBADF;
> - if (!fd_file(f))
> - goto out;
> + if (fd_empty(f))
> + return -EBADF;
>
> - ret = -ENOTDIR;
> if ((flags & FAN_MARK_ONLYDIR) &&
> - !(S_ISDIR(file_inode(fd_file(f))->i_mode))) {
> - fdput(f);
> - goto out;
> - }
> + !(S_ISDIR(file_inode(fd_file(f))->i_mode)))
> + return -ENOTDIR;
>
> *path = fd_file(f)->f_path;
> path_get(path);
> - fdput(f);
> } else {
> unsigned int lookup_flags = 0;
>
> @@ -1682,7 +1677,6 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> struct inode *inode = NULL;
> struct vfsmount *mnt = NULL;
> struct fsnotify_group *group;
> - struct fd f;
> struct path path;
> struct fan_fsid __fsid, *fsid = NULL;
> u32 valid_mask = FANOTIFY_EVENTS | FANOTIFY_EVENT_FLAGS;
> @@ -1752,14 +1746,13 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> umask = FANOTIFY_EVENT_FLAGS;
> }
>
> - f = fdget(fanotify_fd);
> - if (unlikely(!fd_file(f)))
> + CLASS(fd, f)(fanotify_fd);
> + if (fd_empty(f))
> return -EBADF;
>
> /* verify that this is indeed an fanotify instance */
> - ret = -EINVAL;
> if (unlikely(fd_file(f)->f_op != &fanotify_fops))
> - goto fput_and_out;
> + return -EINVAL;
> group = fd_file(f)->private_data;
>
> /*
> @@ -1767,23 +1760,21 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> * marks. This also includes setting up such marks by a group that
> * was initialized by an unprivileged user.
> */
> - ret = -EPERM;
> if ((!capable(CAP_SYS_ADMIN) ||
> FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&
> mark_type != FAN_MARK_INODE)
> - goto fput_and_out;
> + return -EPERM;
>
> /*
> * Permission events require minimum priority FAN_CLASS_CONTENT.
> */
> - ret = -EINVAL;
> if (mask & FANOTIFY_PERM_EVENTS &&
> group->priority < FSNOTIFY_PRIO_CONTENT)
> - goto fput_and_out;
> + return -EINVAL;
>
> if (mask & FAN_FS_ERROR &&
> mark_type != FAN_MARK_FILESYSTEM)
> - goto fput_and_out;
> + return -EINVAL;
>
> /*
> * Evictable is only relevant for inode marks, because only inode object
> @@ -1791,7 +1782,7 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> */
> if (flags & FAN_MARK_EVICTABLE &&
> mark_type != FAN_MARK_INODE)
> - goto fput_and_out;
> + return -EINVAL;
>
> /*
> * Events that do not carry enough information to report
> @@ -1803,7 +1794,7 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> if (mask & ~(FANOTIFY_FD_EVENTS|FANOTIFY_EVENT_FLAGS) &&
> (!fid_mode || mark_type == FAN_MARK_MOUNT))
> - goto fput_and_out;
> + return -EINVAL;
>
> /*
> * FAN_RENAME uses special info type records to report the old and
> @@ -1811,23 +1802,22 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> * useful and was not implemented.
> */
> if (mask & FAN_RENAME && !(fid_mode & FAN_REPORT_NAME))
> - goto fput_and_out;
> + return -EINVAL;
>
> if (mark_cmd == FAN_MARK_FLUSH) {
> - ret = 0;
> if (mark_type == FAN_MARK_MOUNT)
> fsnotify_clear_vfsmount_marks_by_group(group);
> else if (mark_type == FAN_MARK_FILESYSTEM)
> fsnotify_clear_sb_marks_by_group(group);
> else
> fsnotify_clear_inode_marks_by_group(group);
> - goto fput_and_out;
> + return 0;
> }
>
> ret = fanotify_find_path(dfd, pathname, &path, flags,
> (mask & ALL_FSNOTIFY_EVENTS), obj_type);
> if (ret)
> - goto fput_and_out;
> + return ret;
>
> if (mark_cmd == FAN_MARK_ADD) {
> ret = fanotify_events_supported(group, &path, mask, flags);
> @@ -1906,8 +1896,6 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
>
> path_put_and_out:
> path_put(&path);
> -fput_and_out:
> - fdput(f);
> return ret;
> }
>
> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
> index c7e451d5bd51..f46ec6afee3c 100644
> --- a/fs/notify/inotify/inotify_user.c
> +++ b/fs/notify/inotify/inotify_user.c
> @@ -732,7 +732,6 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
> struct fsnotify_group *group;
> struct inode *inode;
> struct path path;
> - struct fd f;
> int ret;
> unsigned flags = 0;
>
> @@ -752,21 +751,17 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
> if (unlikely(!(mask & ALL_INOTIFY_BITS)))
> return -EINVAL;
>
> - f = fdget(fd);
> - if (unlikely(!fd_file(f)))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return -EBADF;
>
> /* IN_MASK_ADD and IN_MASK_CREATE don't make sense together */
> - if (unlikely((mask & IN_MASK_ADD) && (mask & IN_MASK_CREATE))) {
> - ret = -EINVAL;
> - goto fput_and_out;
> - }
> + if (unlikely((mask & IN_MASK_ADD) && (mask & IN_MASK_CREATE)))
> + return -EINVAL;
>
> /* verify that this is indeed an inotify instance */
> - if (unlikely(fd_file(f)->f_op != &inotify_fops)) {
> - ret = -EINVAL;
> - goto fput_and_out;
> - }
> + if (unlikely(fd_file(f)->f_op != &inotify_fops))
> + return -EINVAL;
>
> if (!(mask & IN_DONT_FOLLOW))
> flags |= LOOKUP_FOLLOW;
> @@ -776,7 +771,7 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
> ret = inotify_find_inode(pathname, &path, flags,
> (mask & IN_ALL_EVENTS));
> if (ret)
> - goto fput_and_out;
> + return ret;
>
> /* inode held in place by reference to path; group by fget on fd */
> inode = path.dentry->d_inode;
> @@ -785,8 +780,6 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
> /* create/update an inode mark */
> ret = inotify_update_watch(group, inode, mask);
> path_put(&path);
> -fput_and_out:
> - fdput(f);
> return ret;
> }
>
> @@ -794,33 +787,26 @@ SYSCALL_DEFINE2(inotify_rm_watch, int, fd, __s32, wd)
> {
> struct fsnotify_group *group;
> struct inotify_inode_mark *i_mark;
> - struct fd f;
> - int ret = -EINVAL;
> + CLASS(fd, f)(fd);
>
> - f = fdget(fd);
> - if (unlikely(!fd_file(f)))
> + if (fd_empty(f))
> return -EBADF;
>
> /* verify that this is indeed an inotify instance */
> if (unlikely(fd_file(f)->f_op != &inotify_fops))
> - goto out;
> + return -EINVAL;
>
> group = fd_file(f)->private_data;
>
> i_mark = inotify_idr_find(group, wd);
> if (unlikely(!i_mark))
> - goto out;
> -
> - ret = 0;
> + return -EINVAL;
>
> fsnotify_destroy_mark(&i_mark->fsn_mark, group);
>
> /* match ref taken by inotify_idr_find */
> fsnotify_put_mark(&i_mark->fsn_mark);
> -
> -out:
> - fdput(f);
> - return ret;
> + return 0;
> }
>
> /*
> diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
> index 4b9f45d7049e..08d9ac1b137f 100644
> --- a/fs/ocfs2/cluster/heartbeat.c
> +++ b/fs/ocfs2/cluster/heartbeat.c
> @@ -1765,7 +1765,6 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
> long fd;
> int sectsize;
> char *p = (char *)page;
> - struct fd f;
> ssize_t ret = -EINVAL;
> int live_threshold;
>
> @@ -1784,23 +1783,23 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
> if (fd < 0 || fd >= INT_MAX)
> goto out;
>
> - f = fdget(fd);
> - if (fd_file(f) == NULL)
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> goto out;
>
> if (reg->hr_blocks == 0 || reg->hr_start_block == 0 ||
> reg->hr_block_bytes == 0)
> - goto out2;
> + goto out;
>
> if (!S_ISBLK(fd_file(f)->f_mapping->host->i_mode))
> - goto out2;
> + goto out;
>
> reg->hr_bdev_file = bdev_file_open_by_dev(fd_file(f)->f_mapping->host->i_rdev,
> BLK_OPEN_WRITE | BLK_OPEN_READ, NULL, NULL);
> if (IS_ERR(reg->hr_bdev_file)) {
> ret = PTR_ERR(reg->hr_bdev_file);
> reg->hr_bdev_file = NULL;
> - goto out2;
> + goto out;
> }
>
> sectsize = bdev_logical_block_size(reg_bdev(reg));
> @@ -1906,8 +1905,6 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
> fput(reg->hr_bdev_file);
> reg->hr_bdev_file = NULL;
> }
> -out2:
> - fdput(f);
> out:
> return ret;
> }
> diff --git a/fs/open.c b/fs/open.c
> index 4cb5e12e84a5..71e166e0907c 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -187,19 +187,13 @@ long do_ftruncate(struct file *file, loff_t length, int small)
>
> long do_sys_ftruncate(unsigned int fd, loff_t length, int small)
> {
> - struct fd f;
> - int error;
> -
> if (length < 0)
> return -EINVAL;
> - f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return -EBADF;
>
> - error = do_ftruncate(fd_file(f), length, small);
> -
> - fdput(f);
> - return error;
> + return do_ftruncate(fd_file(f), length, small);
> }
>
> SYSCALL_DEFINE2(ftruncate, unsigned int, fd, unsigned long, length)
> @@ -346,14 +340,12 @@ EXPORT_SYMBOL_GPL(vfs_fallocate);
>
> int ksys_fallocate(int fd, int mode, loff_t offset, loff_t len)
> {
> - struct fd f = fdget(fd);
> - int error = -EBADF;
> + CLASS(fd, f)(fd);
>
> - if (fd_file(f)) {
> - error = vfs_fallocate(fd_file(f), mode, offset, len);
> - fdput(f);
> - }
> - return error;
> + if (fd_empty(f))
> + return -EBADF;
> +
> + return vfs_fallocate(fd_file(f), mode, offset, len);
> }
>
> SYSCALL_DEFINE4(fallocate, int, fd, int, mode, loff_t, offset, loff_t, len)
> @@ -663,14 +655,12 @@ int vfs_fchmod(struct file *file, umode_t mode)
>
> SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> {
> - struct fd f = fdget(fd);
> - int err = -EBADF;
> + CLASS(fd, f)(fd);
>
> - if (fd_file(f)) {
> - err = vfs_fchmod(fd_file(f), mode);
> - fdput(f);
> - }
> - return err;
> + if (fd_empty(f))
> + return -EBADF;
> +
> + return vfs_fchmod(fd_file(f), mode);
> }
>
> static int do_fchmodat(int dfd, const char __user *filename, umode_t mode,
> @@ -857,14 +847,12 @@ int vfs_fchown(struct file *file, uid_t user, gid_t group)
>
> int ksys_fchown(unsigned int fd, uid_t user, gid_t group)
> {
> - struct fd f = fdget(fd);
> - int error = -EBADF;
> + CLASS(fd, f)(fd);
>
> - if (fd_file(f)) {
> - error = vfs_fchown(fd_file(f), user, group);
> - fdput(f);
> - }
> - return error;
> + if (fd_empty(f))
> + return -EBADF;
> +
> + return vfs_fchown(fd_file(f), user, group);
> }
>
> SYSCALL_DEFINE3(fchown, unsigned int, fd, uid_t, user, gid_t, group)
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 6d49202e2507..4d3067381915 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -652,21 +652,17 @@ SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
> ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
> loff_t pos)
> {
> - struct fd f;
> - ssize_t ret = -EBADF;
> -
> if (pos < 0)
> return -EINVAL;
>
> - f = fdget(fd);
> - if (fd_file(f)) {
> - ret = -ESPIPE;
> - if (fd_file(f)->f_mode & FMODE_PREAD)
> - ret = vfs_read(fd_file(f), buf, count, &pos);
> - fdput(f);
> - }
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> + return -EBADF;
>
> - return ret;
> + if (fd_file(f)->f_mode & FMODE_PREAD)
> + return vfs_read(fd_file(f), buf, count, &pos);
> +
> + return -ESPIPE;
> }
>
> SYSCALL_DEFINE4(pread64, unsigned int, fd, char __user *, buf,
> @@ -686,21 +682,17 @@ COMPAT_SYSCALL_DEFINE5(pread64, unsigned int, fd, char __user *, buf,
> ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf,
> size_t count, loff_t pos)
> {
> - struct fd f;
> - ssize_t ret = -EBADF;
> -
> if (pos < 0)
> return -EINVAL;
>
> - f = fdget(fd);
> - if (fd_file(f)) {
> - ret = -ESPIPE;
> - if (fd_file(f)->f_mode & FMODE_PWRITE)
> - ret = vfs_write(fd_file(f), buf, count, &pos);
> - fdput(f);
> - }
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> + return -EBADF;
>
> - return ret;
> + if (fd_file(f)->f_mode & FMODE_PWRITE)
> + return vfs_write(fd_file(f), buf, count, &pos);
> +
> + return -ESPIPE;
> }
>
> SYSCALL_DEFINE4(pwrite64, unsigned int, fd, const char __user *, buf,
> @@ -1028,18 +1020,16 @@ static inline loff_t pos_from_hilo(unsigned long high, unsigned long low)
> static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
> unsigned long vlen, loff_t pos, rwf_t flags)
> {
> - struct fd f;
> ssize_t ret = -EBADF;
>
> if (pos < 0)
> return -EINVAL;
>
> - f = fdget(fd);
> - if (fd_file(f)) {
> + CLASS(fd, f)(fd);
> + if (!fd_empty(f)) {
> ret = -ESPIPE;
> if (fd_file(f)->f_mode & FMODE_PREAD)
> ret = vfs_readv(fd_file(f), vec, vlen, &pos, flags);
> - fdput(f);
> }
>
> if (ret > 0)
> @@ -1051,18 +1041,16 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
> static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
> unsigned long vlen, loff_t pos, rwf_t flags)
> {
> - struct fd f;
> ssize_t ret = -EBADF;
>
> if (pos < 0)
> return -EINVAL;
>
> - f = fdget(fd);
> - if (fd_file(f)) {
> + CLASS(fd, f)(fd);
> + if (!fd_empty(f)) {
> ret = -ESPIPE;
> if (fd_file(f)->f_mode & FMODE_PWRITE)
> ret = vfs_writev(fd_file(f), vec, vlen, &pos, flags);
> - fdput(f);
> }
>
> if (ret > 0)
> @@ -1214,7 +1202,6 @@ COMPAT_SYSCALL_DEFINE6(pwritev2, compat_ulong_t, fd,
> static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
> size_t count, loff_t max)
> {
> - struct fd in, out;
> struct inode *in_inode, *out_inode;
> struct pipe_inode_info *opipe;
> loff_t pos;
> @@ -1225,35 +1212,32 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
> /*
> * Get input file, and verify that it is ok..
> */
> - retval = -EBADF;
> - in = fdget(in_fd);
> - if (!fd_file(in))
> - goto out;
> + CLASS(fd, in)(in_fd);
> + if (fd_empty(in))
> + return -EBADF;
> if (!(fd_file(in)->f_mode & FMODE_READ))
> - goto fput_in;
> - retval = -ESPIPE;
> + return -EBADF;
> if (!ppos) {
> pos = fd_file(in)->f_pos;
> } else {
> pos = *ppos;
> if (!(fd_file(in)->f_mode & FMODE_PREAD))
> - goto fput_in;
> + return -ESPIPE;
> }
> retval = rw_verify_area(READ, fd_file(in), &pos, count);
> if (retval < 0)
> - goto fput_in;
> + return retval;
> if (count > MAX_RW_COUNT)
> count = MAX_RW_COUNT;
>
> /*
> * Get output file, and verify that it is ok..
> */
> - retval = -EBADF;
> - out = fdget(out_fd);
> - if (!fd_file(out))
> - goto fput_in;
> + CLASS(fd, out)(out_fd);
> + if (fd_empty(out))
> + return -EBADF;
> if (!(fd_file(out)->f_mode & FMODE_WRITE))
> - goto fput_out;
> + return -EBADF;
> in_inode = file_inode(fd_file(in));
> out_inode = file_inode(fd_file(out));
> out_pos = fd_file(out)->f_pos;
> @@ -1262,9 +1246,8 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
> max = min(in_inode->i_sb->s_maxbytes, out_inode->i_sb->s_maxbytes);
>
> if (unlikely(pos + count > max)) {
> - retval = -EOVERFLOW;
> if (pos >= max)
> - goto fput_out;
> + return -EOVERFLOW;
> count = max - pos;
> }
>
> @@ -1283,7 +1266,7 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
> if (!opipe) {
> retval = rw_verify_area(WRITE, fd_file(out), &out_pos, count);
> if (retval < 0)
> - goto fput_out;
> + return retval;
> retval = do_splice_direct(fd_file(in), &pos, fd_file(out), &out_pos,
> count, fl);
> } else {
> @@ -1309,12 +1292,6 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
> inc_syscw(current);
> if (pos > max)
> retval = -EOVERFLOW;
> -
> -fput_out:
> - fdput(out);
> -fput_in:
> - fdput(in);
> -out:
> return retval;
> }
>
> @@ -1570,36 +1547,32 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
> {
> loff_t pos_in;
> loff_t pos_out;
> - struct fd f_in;
> - struct fd f_out;
> ssize_t ret = -EBADF;
>
> - f_in = fdget(fd_in);
> - if (!fd_file(f_in))
> - goto out2;
> + CLASS(fd, f_in)(fd_in);
> + if (fd_empty(f_in))
> + return -EBADF;
>
> - f_out = fdget(fd_out);
> - if (!fd_file(f_out))
> - goto out1;
> + CLASS(fd, f_out)(fd_out);
> + if (fd_empty(f_out))
> + return -EBADF;
>
> - ret = -EFAULT;
> if (off_in) {
> if (copy_from_user(&pos_in, off_in, sizeof(loff_t)))
> - goto out;
> + return -EFAULT;
> } else {
> pos_in = fd_file(f_in)->f_pos;
> }
>
> if (off_out) {
> if (copy_from_user(&pos_out, off_out, sizeof(loff_t)))
> - goto out;
> + return -EFAULT;
> } else {
> pos_out = fd_file(f_out)->f_pos;
> }
>
> - ret = -EINVAL;
> if (flags != 0)
> - goto out;
> + return -EINVAL;
>
> ret = vfs_copy_file_range(fd_file(f_in), pos_in, fd_file(f_out), pos_out, len,
> flags);
> @@ -1621,12 +1594,6 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
> fd_file(f_out)->f_pos = pos_out;
> }
> }
> -
> -out:
> - fdput(f_out);
> -out1:
> - fdput(f_in);
> -out2:
> return ret;
> }
>
> diff --git a/fs/remap_range.c b/fs/remap_range.c
> index 4403d5c68fcb..26afbbbfb10c 100644
> --- a/fs/remap_range.c
> +++ b/fs/remap_range.c
> @@ -536,20 +536,19 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
> }
>
> for (i = 0, info = same->info; i < count; i++, info++) {
> - struct fd dst_fd = fdget(info->dest_fd);
> - struct file *dst_file = fd_file(dst_fd);
> + CLASS(fd, dst_fd)(info->dest_fd);
>
> - if (!dst_file) {
> + if (fd_empty(dst_fd)) {
> info->status = -EBADF;
> goto next_loop;
> }
>
> if (info->reserved) {
> info->status = -EINVAL;
> - goto next_fdput;
> + goto next_loop;
> }
>
> - deduped = vfs_dedupe_file_range_one(file, off, dst_file,
> + deduped = vfs_dedupe_file_range_one(file, off, fd_file(dst_fd),
> info->dest_offset, len,
> REMAP_FILE_CAN_SHORTEN);
> if (deduped == -EBADE)
> @@ -559,8 +558,6 @@ int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
> else
> info->bytes_deduped = len;
>
> -next_fdput:
> - fdput(dst_fd);
> next_loop:
> if (fatal_signal_pending(current))
> break;
> diff --git a/fs/select.c b/fs/select.c
> index 97e1009dde00..0befca98af60 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -525,19 +525,16 @@ static noinline_for_stack int do_select(int n, fd_set_bits *fds, struct timespec
> }
>
> for (j = 0; j < BITS_PER_LONG; ++j, ++i, bit <<= 1) {
> - struct fd f;
> if (i >= n)
> break;
> if (!(bit & all_bits))
> continue;
> mask = EPOLLNVAL;
> - f = fdget(i);
> - if (fd_file(f)) {
> + CLASS(fd, f)(i);
> + if (!fd_empty(f)) {
> wait_key_set(wait, in, out, bit,
> busy_flag);
> mask = vfs_poll(fd_file(f), wait);
> -
> - fdput(f);
> }
> if ((mask & POLLIN_SET) && (in & bit)) {
> res_in |= bit;
> @@ -858,13 +855,12 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
> {
> int fd = pollfd->fd;
> __poll_t mask = 0, filter;
> - struct fd f;
>
> if (fd < 0)
> goto out;
> mask = EPOLLNVAL;
> - f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> goto out;
>
> /* userland u16 ->events contains POLL... bitmap */
> @@ -874,7 +870,6 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
> if (mask & busy_flag)
> *can_busy_poll = true;
> mask &= filter; /* Mask out unneeded events. */
> - fdput(f);
>
> out:
> /* ... and so does ->revents */
> diff --git a/fs/signalfd.c b/fs/signalfd.c
> index c39cf00ab28a..cc7af00b8527 100644
> --- a/fs/signalfd.c
> +++ b/fs/signalfd.c
> @@ -292,20 +292,17 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
> */
> fd_install(ufd, file);
> } else {
> - struct fd f = fdget(ufd);
> - if (!fd_file(f))
> + CLASS(fd, f)(ufd);
> + if (fd_empty(f))
> return -EBADF;
> ctx = fd_file(f)->private_data;
> - if (fd_file(f)->f_op != &signalfd_fops) {
> - fdput(f);
> + if (fd_file(f)->f_op != &signalfd_fops)
> return -EINVAL;
> - }
> spin_lock_irq(&current->sighand->siglock);
> ctx->sigmask = *mask;
> spin_unlock_irq(&current->sighand->siglock);
>
> wake_up(&current->sighand->signalfd_wqh);
> - fdput(f);
> }
>
> return ufd;
> diff --git a/fs/smb/client/ioctl.c b/fs/smb/client/ioctl.c
> index 94bf2e5014d9..6d9df3646df3 100644
> --- a/fs/smb/client/ioctl.c
> +++ b/fs/smb/client/ioctl.c
> @@ -72,7 +72,6 @@ static long cifs_ioctl_copychunk(unsigned int xid, struct file *dst_file,
> unsigned long srcfd)
> {
> int rc;
> - struct fd src_file;
> struct inode *src_inode;
>
> cifs_dbg(FYI, "ioctl copychunk range\n");
> @@ -89,8 +88,8 @@ static long cifs_ioctl_copychunk(unsigned int xid, struct file *dst_file,
> return rc;
> }
>
> - src_file = fdget(srcfd);
> - if (!fd_file(src_file)) {
> + CLASS(fd, src_file)(srcfd);
> + if (fd_empty(src_file)) {
> rc = -EBADF;
> goto out_drop_write;
> }
> @@ -98,20 +97,18 @@ static long cifs_ioctl_copychunk(unsigned int xid, struct file *dst_file,
> if (fd_file(src_file)->f_op->unlocked_ioctl != cifs_ioctl) {
> rc = -EBADF;
> cifs_dbg(VFS, "src file seems to be from a different filesystem type\n");
> - goto out_fput;
> + goto out_drop_write;
> }
>
> src_inode = file_inode(fd_file(src_file));
> rc = -EINVAL;
> if (S_ISDIR(src_inode->i_mode))
> - goto out_fput;
> + goto out_drop_write;
>
> rc = cifs_file_copychunk_range(xid, fd_file(src_file), 0, dst_file, 0,
> src_inode->i_size, 0);
> if (rc > 0)
> rc = 0;
> -out_fput:
> - fdput(src_file);
> out_drop_write:
> mnt_drop_write_file(dst_file);
> return rc;
> diff --git a/fs/splice.c b/fs/splice.c
> index 06232d7e505f..42aa7bc46be5 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -1626,8 +1626,6 @@ SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
> error = vmsplice_to_user(fd_file(f), &iter, flags);
>
> kfree(iov);
> -out_fdput:
> - fdput(f);
> return error;
> }
>
> @@ -1635,27 +1633,22 @@ SYSCALL_DEFINE6(splice, int, fd_in, loff_t __user *, off_in,
> int, fd_out, loff_t __user *, off_out,
> size_t, len, unsigned int, flags)
> {
> - struct fd in, out;
> - ssize_t error;
> -
> if (unlikely(!len))
> return 0;
>
> if (unlikely(flags & ~SPLICE_F_ALL))
> return -EINVAL;
>
> - error = -EBADF;
> - in = fdget(fd_in);
> - if (fd_file(in)) {
> - out = fdget(fd_out);
> - if (fd_file(out)) {
> - error = __do_splice(fd_file(in), off_in, fd_file(out), off_out,
> + CLASS(fd, in)(fd_in);
> + if (fd_empty(in))
> + return -EBADF;
> +
> + CLASS(fd, out)(fd_out);
> + if (fd_empty(out))
> + return -EBADF;
> +
> + return __do_splice(fd_file(in), off_in, fd_file(out), off_out,
> len, flags);
> - fdput(out);
> - }
> - fdput(in);
> - }
> - return error;
> }
>
> /*
> @@ -2005,25 +1998,19 @@ ssize_t do_tee(struct file *in, struct file *out, size_t len,
>
> SYSCALL_DEFINE4(tee, int, fdin, int, fdout, size_t, len, unsigned int, flags)
> {
> - struct fd in, out;
> - ssize_t error;
> -
> if (unlikely(flags & ~SPLICE_F_ALL))
> return -EINVAL;
>
> if (unlikely(!len))
> return 0;
>
> - error = -EBADF;
> - in = fdget(fdin);
> - if (fd_file(in)) {
> - out = fdget(fdout);
> - if (fd_file(out)) {
> - error = do_tee(fd_file(in), fd_file(out), len, flags);
> - fdput(out);
> - }
> - fdput(in);
> - }
> + CLASS(fd, in)(fdin);
> + if (fd_empty(in))
> + return -EBADF;
>
> - return error;
> + CLASS(fd, out)(fdout);
> + if (fd_empty(out))
> + return -EBADF;
> +
> + return do_tee(fd_file(in), fd_file(out), len, flags);
> }
> diff --git a/fs/sync.c b/fs/sync.c
> index 67df255eb189..2955cd4c77a3 100644
> --- a/fs/sync.c
> +++ b/fs/sync.c
> @@ -148,11 +148,11 @@ void emergency_sync(void)
> */
> SYSCALL_DEFINE1(syncfs, int, fd)
> {
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> struct super_block *sb;
> int ret, ret2;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
> sb = fd_file(f)->f_path.dentry->d_sb;
>
> @@ -162,7 +162,6 @@ SYSCALL_DEFINE1(syncfs, int, fd)
>
> ret2 = errseq_check_and_advance(&sb->s_wb_err, &fd_file(f)->f_sb_err);
>
> - fdput(f);
> return ret ? ret : ret2;
> }
>
> @@ -205,14 +204,12 @@ EXPORT_SYMBOL(vfs_fsync);
>
> static int do_fsync(unsigned int fd, int datasync)
> {
> - struct fd f = fdget(fd);
> - int ret = -EBADF;
> + CLASS(fd, f)(fd);
>
> - if (fd_file(f)) {
> - ret = vfs_fsync(fd_file(f), datasync);
> - fdput(f);
> - }
> - return ret;
> + if (fd_empty(f))
> + return -EBADF;
> +
> + return vfs_fsync(fd_file(f), datasync);
> }
>
> SYSCALL_DEFINE1(fsync, unsigned int, fd)
> @@ -355,16 +352,12 @@ int sync_file_range(struct file *file, loff_t offset, loff_t nbytes,
> int ksys_sync_file_range(int fd, loff_t offset, loff_t nbytes,
> unsigned int flags)
> {
> - int ret;
> - struct fd f;
> + CLASS(fd, f)(fd);
>
> - ret = -EBADF;
> - f = fdget(fd);
> - if (fd_file(f))
> - ret = sync_file_range(fd_file(f), offset, nbytes, flags);
> + if (fd_empty(f))
> + return -EBADF;
>
> - fdput(f);
> - return ret;
> + return sync_file_range(fd_file(f), offset, nbytes, flags);
> }
>
> SYSCALL_DEFINE4(sync_file_range, int, fd, loff_t, offset, loff_t, nbytes,
> diff --git a/fs/utimes.c b/fs/utimes.c
> index 99b26f792b89..c7c7958e57b2 100644
> --- a/fs/utimes.c
> +++ b/fs/utimes.c
> @@ -108,18 +108,13 @@ static int do_utimes_path(int dfd, const char __user *filename,
>
> static int do_utimes_fd(int fd, struct timespec64 *times, int flags)
> {
> - struct fd f;
> - int error;
> -
> if (flags)
> return -EINVAL;
>
> - f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return -EBADF;
> - error = vfs_utimes(&fd_file(f)->f_path, times);
> - fdput(f);
> - return error;
> + return vfs_utimes(&fd_file(f)->f_path, times);
> }
>
> /*
> diff --git a/fs/xattr.c b/fs/xattr.c
> index d4f84f57e703..980dc1710e97 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -697,11 +697,11 @@ SYSCALL_DEFINE5(lsetxattr, const char __user *, pathname,
> SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
> const void __user *,value, size_t, size, int, flags)
> {
> - struct fd f = fdget(fd);
> - int error = -EBADF;
> + CLASS(fd, f)(fd);
> + int error;
>
> - if (!fd_file(f))
> - return error;
> + if (fd_empty(f))
> + return -EBADF;
> audit_file(fd_file(f));
> error = mnt_want_write_file(fd_file(f));
> if (!error) {
> @@ -710,7 +710,6 @@ SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
> value, size, flags);
> mnt_drop_write_file(fd_file(f));
> }
> - fdput(f);
> return error;
> }
>
> @@ -808,16 +807,13 @@ SYSCALL_DEFINE4(lgetxattr, const char __user *, pathname,
> SYSCALL_DEFINE4(fgetxattr, int, fd, const char __user *, name,
> void __user *, value, size_t, size)
> {
> - struct fd f = fdget(fd);
> - ssize_t error = -EBADF;
> + CLASS(fd, f)(fd);
>
> - if (!fd_file(f))
> - return error;
> + if (fd_empty(f))
> + return -EBADF;
> audit_file(fd_file(f));
> - error = getxattr(file_mnt_idmap(fd_file(f)), fd_file(f)->f_path.dentry,
> + return getxattr(file_mnt_idmap(fd_file(f)), fd_file(f)->f_path.dentry,
> name, value, size);
> - fdput(f);
> - return error;
> }
>
> /*
> @@ -884,15 +880,12 @@ SYSCALL_DEFINE3(llistxattr, const char __user *, pathname, char __user *, list,
>
> SYSCALL_DEFINE3(flistxattr, int, fd, char __user *, list, size_t, size)
> {
> - struct fd f = fdget(fd);
> - ssize_t error = -EBADF;
> + CLASS(fd, f)(fd);
>
> - if (!fd_file(f))
> - return error;
> + if (fd_empty(f))
> + return -EBADF;
> audit_file(fd_file(f));
> - error = listxattr(fd_file(f)->f_path.dentry, list, size);
> - fdput(f);
> - return error;
> + return listxattr(fd_file(f)->f_path.dentry, list, size);
> }
>
> /*
> @@ -953,11 +946,11 @@ SYSCALL_DEFINE2(lremovexattr, const char __user *, pathname,
>
> SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
> {
> - struct fd f = fdget(fd);
> - int error = -EBADF;
> + CLASS(fd, f)(fd);
> + int error;
>
> - if (!fd_file(f))
> - return error;
> + if (fd_empty(f))
> + return -EBADF;
> audit_file(fd_file(f));
> error = mnt_want_write_file(fd_file(f));
> if (!error) {
> @@ -965,7 +958,6 @@ SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
> fd_file(f)->f_path.dentry, name);
> mnt_drop_write_file(fd_file(f));
> }
> - fdput(f);
> return error;
> }
>
> diff --git a/fs/xfs/xfs_exchrange.c b/fs/xfs/xfs_exchrange.c
> index 9790e0f45d14..35b9b58a4f6f 100644
> --- a/fs/xfs/xfs_exchrange.c
> +++ b/fs/xfs/xfs_exchrange.c
> @@ -778,8 +778,6 @@ xfs_ioc_exchange_range(
> .file2 = file,
> };
> struct xfs_exchange_range args;
> - struct fd file1;
> - int error;
>
> if (copy_from_user(&args, argp, sizeof(args)))
> return -EFAULT;
> @@ -793,12 +791,10 @@ xfs_ioc_exchange_range(
> fxr.length = args.length;
> fxr.flags = args.flags;
>
> - file1 = fdget(args.file1_fd);
> - if (!fd_file(file1))
> + CLASS(fd, file1)(args.file1_fd);
> + if (fd_empty(file1))
> return -EBADF;
> fxr.file1 = fd_file(file1);
>
> - error = xfs_exchange_range(&fxr);
> - fdput(file1);
> - return error;
> + return xfs_exchange_range(&fxr);
> }
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index c8b432fb7b40..4458ddf5dec5 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1060,41 +1060,29 @@ xfs_ioc_swapext(
> xfs_swapext_t *sxp)
> {
> xfs_inode_t *ip, *tip;
> - struct fd f, tmp;
> - int error = 0;
>
> /* Pull information for the target fd */
> - f = fdget((int)sxp->sx_fdtarget);
> - if (!fd_file(f)) {
> - error = -EINVAL;
> - goto out;
> - }
> + CLASS(fd, f)((int)sxp->sx_fdtarget);
> + if (fd_empty(f))
> + return -EINVAL;
>
> if (!(fd_file(f)->f_mode & FMODE_WRITE) ||
> !(fd_file(f)->f_mode & FMODE_READ) ||
> - (fd_file(f)->f_flags & O_APPEND)) {
> - error = -EBADF;
> - goto out_put_file;
> - }
> + (fd_file(f)->f_flags & O_APPEND))
> + return -EBADF;
>
> - tmp = fdget((int)sxp->sx_fdtmp);
> - if (!fd_file(tmp)) {
> - error = -EINVAL;
> - goto out_put_file;
> - }
> + CLASS(fd, tmp)((int)sxp->sx_fdtmp);
> + if (fd_empty(tmp))
> + return -EINVAL;
>
> if (!(fd_file(tmp)->f_mode & FMODE_WRITE) ||
> !(fd_file(tmp)->f_mode & FMODE_READ) ||
> - (fd_file(tmp)->f_flags & O_APPEND)) {
> - error = -EBADF;
> - goto out_put_tmp_file;
> - }
> + (fd_file(tmp)->f_flags & O_APPEND))
> + return -EBADF;
>
> if (IS_SWAPFILE(file_inode(fd_file(f))) ||
> - IS_SWAPFILE(file_inode(fd_file(tmp)))) {
> - error = -EINVAL;
> - goto out_put_tmp_file;
> - }
> + IS_SWAPFILE(file_inode(fd_file(tmp))))
> + return -EINVAL;
>
> /*
> * We need to ensure that the fds passed in point to XFS inodes
> @@ -1102,37 +1090,22 @@ xfs_ioc_swapext(
> * control over what the user passes us here.
> */
> if (fd_file(f)->f_op != &xfs_file_operations ||
> - fd_file(tmp)->f_op != &xfs_file_operations) {
> - error = -EINVAL;
> - goto out_put_tmp_file;
> - }
> + fd_file(tmp)->f_op != &xfs_file_operations)
> + return -EINVAL;
>
> ip = XFS_I(file_inode(fd_file(f)));
> tip = XFS_I(file_inode(fd_file(tmp)));
>
> - if (ip->i_mount != tip->i_mount) {
> - error = -EINVAL;
> - goto out_put_tmp_file;
> - }
> -
> - if (ip->i_ino == tip->i_ino) {
> - error = -EINVAL;
> - goto out_put_tmp_file;
> - }
> + if (ip->i_mount != tip->i_mount)
> + return -EINVAL;
>
> - if (xfs_is_shutdown(ip->i_mount)) {
> - error = -EIO;
> - goto out_put_tmp_file;
> - }
> + if (ip->i_ino == tip->i_ino)
> + return -EINVAL;
>
> - error = xfs_swap_extents(ip, tip, sxp);
> + if (xfs_is_shutdown(ip->i_mount))
> + return -EIO;
>
> - out_put_tmp_file:
> - fdput(tmp);
> - out_put_file:
> - fdput(f);
> - out:
> - return error;
> + return xfs_swap_extents(ip, tip, sxp);
> }
>
> static int
> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
> index ffa7d341bd95..26b1c14d5967 100644
> --- a/io_uring/sqpoll.c
> +++ b/io_uring/sqpoll.c
> @@ -105,29 +105,21 @@ static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
> {
> struct io_ring_ctx *ctx_attach;
> struct io_sq_data *sqd;
> - struct fd f;
> + CLASS(fd, f)(p->wq_fd);
>
> - f = fdget(p->wq_fd);
> - if (!fd_file(f))
> + if (fd_empty(f))
> return ERR_PTR(-ENXIO);
> - if (!io_is_uring_fops(fd_file(f))) {
> - fdput(f);
> + if (!io_is_uring_fops(fd_file(f)))
> return ERR_PTR(-EINVAL);
> - }
>
> ctx_attach = fd_file(f)->private_data;
> sqd = ctx_attach->sq_data;
> - if (!sqd) {
> - fdput(f);
> + if (!sqd)
> return ERR_PTR(-EINVAL);
> - }
> - if (sqd->task_tgid != current->tgid) {
> - fdput(f);
> + if (sqd->task_tgid != current->tgid)
> return ERR_PTR(-EPERM);
> - }
>
> refcount_inc(&sqd->refs);
> - fdput(f);
> return sqd;
> }
>
> @@ -415,16 +407,11 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
> /* Retain compatibility with failing for an invalid attach attempt */
> if ((ctx->flags & (IORING_SETUP_ATTACH_WQ | IORING_SETUP_SQPOLL)) ==
> IORING_SETUP_ATTACH_WQ) {
> - struct fd f;
> -
> - f = fdget(p->wq_fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(p->wq_fd);
> + if (fd_empty(f))
> return -ENXIO;
> - if (!io_is_uring_fops(fd_file(f))) {
> - fdput(f);
> + if (!io_is_uring_fops(fd_file(f)))
> return -EINVAL;
> - }
> - fdput(f);
> }
> if (ctx->flags & IORING_SETUP_SQPOLL) {
> struct task_struct *tsk;
> diff --git a/ipc/mqueue.c b/ipc/mqueue.c
> index 9133a52be69b..c72ef725e845 100644
> --- a/ipc/mqueue.c
> +++ b/ipc/mqueue.c
> @@ -1062,7 +1062,6 @@ static int do_mq_timedsend(mqd_t mqdes, const char __user *u_msg_ptr,
> size_t msg_len, unsigned int msg_prio,
> struct timespec64 *ts)
> {
> - struct fd f;
> struct inode *inode;
> struct ext_wait_queue wait;
> struct ext_wait_queue *receiver;
> @@ -1083,37 +1082,27 @@ static int do_mq_timedsend(mqd_t mqdes, const char __user *u_msg_ptr,
>
> audit_mq_sendrecv(mqdes, msg_len, msg_prio, ts);
>
> - f = fdget(mqdes);
> - if (unlikely(!fd_file(f))) {
> - ret = -EBADF;
> - goto out;
> - }
> + CLASS(fd, f)(mqdes);
> + if (fd_empty(f))
> + return -EBADF;
>
> inode = file_inode(fd_file(f));
> - if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
> - ret = -EBADF;
> - goto out_fput;
> - }
> + if (unlikely(fd_file(f)->f_op != &mqueue_file_operations))
> + return -EBADF;
> info = MQUEUE_I(inode);
> audit_file(fd_file(f));
>
> - if (unlikely(!(fd_file(f)->f_mode & FMODE_WRITE))) {
> - ret = -EBADF;
> - goto out_fput;
> - }
> + if (unlikely(!(fd_file(f)->f_mode & FMODE_WRITE)))
> + return -EBADF;
>
> - if (unlikely(msg_len > info->attr.mq_msgsize)) {
> - ret = -EMSGSIZE;
> - goto out_fput;
> - }
> + if (unlikely(msg_len > info->attr.mq_msgsize))
> + return -EMSGSIZE;
>
> /* First try to allocate memory, before doing anything with
> * existing queues. */
> msg_ptr = load_msg(u_msg_ptr, msg_len);
> - if (IS_ERR(msg_ptr)) {
> - ret = PTR_ERR(msg_ptr);
> - goto out_fput;
> - }
> + if (IS_ERR(msg_ptr))
> + return PTR_ERR(msg_ptr);
> msg_ptr->m_ts = msg_len;
> msg_ptr->m_type = msg_prio;
>
> @@ -1171,9 +1160,6 @@ static int do_mq_timedsend(mqd_t mqdes, const char __user *u_msg_ptr,
> out_free:
> if (ret)
> free_msg(msg_ptr);
> -out_fput:
> - fdput(f);
> -out:
> return ret;
> }
>
> @@ -1183,7 +1169,6 @@ static int do_mq_timedreceive(mqd_t mqdes, char __user *u_msg_ptr,
> {
> ssize_t ret;
> struct msg_msg *msg_ptr;
> - struct fd f;
> struct inode *inode;
> struct mqueue_inode_info *info;
> struct ext_wait_queue wait;
> @@ -1197,30 +1182,22 @@ static int do_mq_timedreceive(mqd_t mqdes, char __user *u_msg_ptr,
>
> audit_mq_sendrecv(mqdes, msg_len, 0, ts);
>
> - f = fdget(mqdes);
> - if (unlikely(!fd_file(f))) {
> - ret = -EBADF;
> - goto out;
> - }
> + CLASS(fd, f)(mqdes);
> + if (fd_empty(f))
> + return -EBADF;
>
> inode = file_inode(fd_file(f));
> - if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
> - ret = -EBADF;
> - goto out_fput;
> - }
> + if (unlikely(fd_file(f)->f_op != &mqueue_file_operations))
> + return -EBADF;
> info = MQUEUE_I(inode);
> audit_file(fd_file(f));
>
> - if (unlikely(!(fd_file(f)->f_mode & FMODE_READ))) {
> - ret = -EBADF;
> - goto out_fput;
> - }
> + if (unlikely(!(fd_file(f)->f_mode & FMODE_READ)))
> + return -EBADF;
>
> /* checks if buffer is big enough */
> - if (unlikely(msg_len < info->attr.mq_msgsize)) {
> - ret = -EMSGSIZE;
> - goto out_fput;
> - }
> + if (unlikely(msg_len < info->attr.mq_msgsize))
> + return -EMSGSIZE;
>
> /*
> * msg_insert really wants us to have a valid, spare node struct so
> @@ -1274,9 +1251,6 @@ static int do_mq_timedreceive(mqd_t mqdes, char __user *u_msg_ptr,
> }
> free_msg(msg_ptr);
> }
> -out_fput:
> - fdput(f);
> -out:
> return ret;
> }
>
> @@ -1451,21 +1425,18 @@ SYSCALL_DEFINE2(mq_notify, mqd_t, mqdes,
>
> static int do_mq_getsetattr(int mqdes, struct mq_attr *new, struct mq_attr *old)
> {
> - struct fd f;
> struct inode *inode;
> struct mqueue_inode_info *info;
>
> if (new && (new->mq_flags & (~O_NONBLOCK)))
> return -EINVAL;
>
> - f = fdget(mqdes);
> - if (!fd_file(f))
> + CLASS(fd, f)(mqdes);
> + if (fd_empty(f))
> return -EBADF;
>
> - if (unlikely(fd_file(f)->f_op != &mqueue_file_operations)) {
> - fdput(f);
> + if (unlikely(fd_file(f)->f_op != &mqueue_file_operations))
> return -EBADF;
> - }
>
> inode = file_inode(fd_file(f));
> info = MQUEUE_I(inode);
> @@ -1489,7 +1460,6 @@ static int do_mq_getsetattr(int mqdes, struct mq_attr *new, struct mq_attr *old)
> }
>
> spin_unlock(&info->lock);
> - fdput(f);
> return 0;
> }
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index a0e812c29c97..d0adca07b0b5 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -7505,21 +7505,16 @@ int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
> struct btf *btf_get_by_fd(int fd)
> {
> struct btf *btf;
> - struct fd f;
> + CLASS(fd, f)(fd);
>
> - f = fdget(fd);
> -
> - if (!fd_file(f))
> + if (fd_empty(f))
> return ERR_PTR(-EBADF);
>
> - if (fd_file(f)->f_op != &btf_fops) {
> - fdput(f);
> + if (fd_file(f)->f_op != &btf_fops)
> return ERR_PTR(-EINVAL);
> - }
>
> btf = fd_file(f)->private_data;
> refcount_inc(&btf->refcnt);
> - fdput(f);
>
> return btf;
> }
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 477bb89f03aa..fd833a2b7c1b 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3234,20 +3234,16 @@ int bpf_link_new_fd(struct bpf_link *link)
>
> struct bpf_link *bpf_link_get_from_fd(u32 ufd)
> {
> - struct fd f = fdget(ufd);
> + CLASS(fd, f)(ufd);
> struct bpf_link *link;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return ERR_PTR(-EBADF);
> - if (fd_file(f)->f_op != &bpf_link_fops) {
> - fdput(f);
> + if (fd_file(f)->f_op != &bpf_link_fops)
> return ERR_PTR(-EINVAL);
> - }
>
> link = fd_file(f)->private_data;
> bpf_link_inc(link);
> - fdput(f);
> -
> return link;
> }
> EXPORT_SYMBOL(bpf_link_get_from_fd);
> @@ -4952,33 +4948,25 @@ static int bpf_link_get_info_by_fd(struct file *file,
> static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
> union bpf_attr __user *uattr)
> {
> - int ufd = attr->info.bpf_fd;
> - struct fd f;
> - int err;
> -
> if (CHECK_ATTR(BPF_OBJ_GET_INFO_BY_FD))
> return -EINVAL;
>
> - f = fdget(ufd);
> - if (!fd_file(f))
> + CLASS(fd, f)(attr->info.bpf_fd);
> + if (fd_empty(f))
> return -EBADFD;
>
> if (fd_file(f)->f_op == &bpf_prog_fops)
> - err = bpf_prog_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
> + return bpf_prog_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
> uattr);
> else if (fd_file(f)->f_op == &bpf_map_fops)
> - err = bpf_map_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
> + return bpf_map_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr,
> uattr);
> else if (fd_file(f)->f_op == &btf_fops)
> - err = bpf_btf_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr, uattr);
> + return bpf_btf_get_info_by_fd(fd_file(f), fd_file(f)->private_data, attr, uattr);
> else if (fd_file(f)->f_op == &bpf_link_fops)
> - err = bpf_link_get_info_by_fd(fd_file(f), fd_file(f)->private_data,
> + return bpf_link_get_info_by_fd(fd_file(f), fd_file(f)->private_data,
> attr, uattr);
> - else
> - err = -EINVAL;
> -
> - fdput(f);
> - return err;
> + return -EINVAL;
> }
>
> #define BPF_BTF_LOAD_LAST_FIELD btf_token_fd
> diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
> index 9a1d356e79ed..4164feec9a3b 100644
> --- a/kernel/bpf/token.c
> +++ b/kernel/bpf/token.c
> @@ -117,17 +117,15 @@ int bpf_token_create(union bpf_attr *attr)
> struct inode *inode;
> struct file *file;
> struct path path;
> - struct fd f;
> + CLASS(fd, f)(attr->token_create.bpffs_fd);
> umode_t mode;
> int err, fd;
>
> - f = fdget(attr->token_create.bpffs_fd);
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> path = fd_file(f)->f_path;
> path_get(&path);
> - fdput(f);
>
> if (path.dentry != path.mnt->mnt_sb->s_root) {
> err = -EINVAL;
> @@ -232,19 +230,16 @@ int bpf_token_create(union bpf_attr *attr)
>
> struct bpf_token *bpf_token_get_from_fd(u32 ufd)
> {
> - struct fd f = fdget(ufd);
> + CLASS(fd, f)(ufd);
> struct bpf_token *token;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return ERR_PTR(-EBADF);
> - if (fd_file(f)->f_op != &bpf_token_fops) {
> - fdput(f);
> + if (fd_file(f)->f_op != &bpf_token_fops)
> return ERR_PTR(-EINVAL);
> - }
>
> token = fd_file(f)->private_data;
> bpf_token_inc(token);
> - fdput(f);
>
> return token;
> }
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index fd4621cd9c23..bc4910442642 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -930,22 +930,20 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event,
> {
> struct perf_cgroup *cgrp;
> struct cgroup_subsys_state *css;
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> int ret = 0;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> css = css_tryget_online_from_dir(fd_file(f)->f_path.dentry,
> &perf_event_cgrp_subsys);
> - if (IS_ERR(css)) {
> - ret = PTR_ERR(css);
> - goto out;
> - }
> + if (IS_ERR(css))
> + return PTR_ERR(css);
>
> ret = perf_cgroup_ensure_storage(event, css);
> if (ret)
> - goto out;
> + return ret;
>
> cgrp = container_of(css, struct perf_cgroup, css);
> event->cgrp = cgrp;
> @@ -959,8 +957,6 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event,
> perf_detach_cgroup(event);
> ret = -EINVAL;
> }
> -out:
> - fdput(f);
> return ret;
> }
>
> diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> index dc952c3b05af..c9d97ed20122 100644
> --- a/kernel/nsproxy.c
> +++ b/kernel/nsproxy.c
> @@ -545,12 +545,12 @@ static void commit_nsset(struct nsset *nsset)
>
> SYSCALL_DEFINE2(setns, int, fd, int, flags)
> {
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> struct ns_common *ns = NULL;
> struct nsset nsset = {};
> int err = 0;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> if (proc_ns_file(fd_file(f))) {
> @@ -580,7 +580,6 @@ SYSCALL_DEFINE2(setns, int, fd, int, flags)
> }
> put_nsset(&nsset);
> out:
> - fdput(f);
> return err;
> }
>
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 2715afb77eab..115448e89c3e 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -536,11 +536,10 @@ EXPORT_SYMBOL_GPL(find_ge_pid);
>
> struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
> {
> - struct fd f;
> + CLASS(fd, f)(fd);
> struct pid *pid;
>
> - f = fdget(fd);
> - if (!fd_file(f))
> + if (fd_empty(f))
> return ERR_PTR(-EBADF);
>
> pid = pidfd_pid(fd_file(f));
> @@ -548,8 +547,6 @@ struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
> get_pid(pid);
> *flags = fd_file(f)->f_flags;
> }
> -
> - fdput(f);
> return pid;
> }
>
> @@ -747,23 +744,18 @@ SYSCALL_DEFINE3(pidfd_getfd, int, pidfd, int, fd,
> unsigned int, flags)
> {
> struct pid *pid;
> - struct fd f;
> - int ret;
>
> /* flags is currently unused - make sure it's unset */
> if (flags)
> return -EINVAL;
>
> - f = fdget(pidfd);
> - if (!fd_file(f))
> + CLASS(fd, f)(pidfd);
> + if (fd_empty(f))
> return -EBADF;
>
> pid = pidfd_pid(fd_file(f));
> if (IS_ERR(pid))
> - ret = PTR_ERR(pid);
> - else
> - ret = pidfd_getfd(pid, fd);
> + return PTR_ERR(pid);
>
> - fdput(f);
> - return ret;
> + return pidfd_getfd(pid, fd);
> }
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 6c438fd436d8..9f4949ac8a3c 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -3900,7 +3900,6 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
> siginfo_t __user *, info, unsigned int, flags)
> {
> int ret;
> - struct fd f;
> struct pid *pid;
> kernel_siginfo_t kinfo;
> enum pid_type type;
> @@ -3913,20 +3912,17 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
> if (hweight32(flags & PIDFD_SEND_SIGNAL_FLAGS) > 1)
> return -EINVAL;
>
> - f = fdget(pidfd);
> - if (!fd_file(f))
> + CLASS(fd, f)(pidfd);
> + if (fd_empty(f))
> return -EBADF;
>
> /* Is this a pidfd? */
> pid = pidfd_to_pid(fd_file(f));
> - if (IS_ERR(pid)) {
> - ret = PTR_ERR(pid);
> - goto err;
> - }
> + if (IS_ERR(pid))
> + return PTR_ERR(pid);
>
> - ret = -EINVAL;
> if (!access_pidfd_pidns(pid))
> - goto err;
> + return -EINVAL;
>
> switch (flags) {
> case 0:
> @@ -3950,28 +3946,23 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
> if (info) {
> ret = copy_siginfo_from_user_any(&kinfo, info);
> if (unlikely(ret))
> - goto err;
> + return ret;
>
> - ret = -EINVAL;
> if (unlikely(sig != kinfo.si_signo))
> - goto err;
> + return -EINVAL;
>
> /* Only allow sending arbitrary signals to yourself. */
> - ret = -EPERM;
> if ((task_pid(current) != pid || type > PIDTYPE_TGID) &&
> (kinfo.si_code >= 0 || kinfo.si_code == SI_TKILL))
> - goto err;
> + return -EPERM;
> } else {
> prepare_kill_siginfo(sig, &kinfo, type);
> }
>
> if (type == PIDTYPE_PGID)
> - ret = kill_pgrp_info(sig, &kinfo, pid);
> + return kill_pgrp_info(sig, &kinfo, pid);
> else
> - ret = kill_pid_info_type(sig, &kinfo, pid, type);
> -err:
> - fdput(f);
> - return ret;
> + return kill_pid_info_type(sig, &kinfo, pid, type);
> }
>
> static int
> diff --git a/kernel/sys.c b/kernel/sys.c
> index a4be1e568ff5..243d58916899 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1911,12 +1911,11 @@ SYSCALL_DEFINE1(umask, int, mask)
>
> static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> {
> - struct fd exe;
> + CLASS(fd, exe)(fd);
> struct inode *inode;
> int err;
>
> - exe = fdget(fd);
> - if (!fd_file(exe))
> + if (fd_empty(exe))
> return -EBADF;
>
> inode = file_inode(fd_file(exe));
> @@ -1926,18 +1925,14 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> * sure that this one is executable as well, to avoid breaking an
> * overall picture.
> */
> - err = -EACCES;
> if (!S_ISREG(inode->i_mode) || path_noexec(&fd_file(exe)->f_path))
> - goto exit;
> + return -EACCES;
>
> err = file_permission(fd_file(exe), MAY_EXEC);
> if (err)
> - goto exit;
> + return err;
>
> - err = replace_mm_exe_file(mm, fd_file(exe));
> -exit:
> - fdput(exe);
> - return err;
> + return replace_mm_exe_file(mm, fd_file(exe));
> }
>
> /*
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 0700f40c53ac..0cd680ccc7e5 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -411,15 +411,14 @@ static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> struct nlattr *na;
> size_t size;
> u32 fd;
> - struct fd f;
>
> na = info->attrs[CGROUPSTATS_CMD_ATTR_FD];
> if (!na)
> return -EINVAL;
>
> fd = nla_get_u32(info->attrs[CGROUPSTATS_CMD_ATTR_FD]);
> - f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return 0;
>
> size = nla_total_size(sizeof(struct cgroupstats));
> @@ -427,14 +426,13 @@ static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> rc = prepare_reply(info, CGROUPSTATS_CMD_NEW, &rep_skb,
> size);
> if (rc < 0)
> - goto err;
> + return rc;
>
> na = nla_reserve(rep_skb, CGROUPSTATS_TYPE_CGROUP_STATS,
> sizeof(struct cgroupstats));
> if (na == NULL) {
> nlmsg_free(rep_skb);
> - rc = -EMSGSIZE;
> - goto err;
> + return -EMSGSIZE;
> }
>
> stats = nla_data(na);
> @@ -443,14 +441,10 @@ static int cgroupstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> rc = cgroupstats_build(stats, fd_file(f)->f_path.dentry);
> if (rc < 0) {
> nlmsg_free(rep_skb);
> - goto err;
> + return rc;
> }
>
> - rc = send_reply(rep_skb, info);
> -
> -err:
> - fdput(f);
> - return rc;
> + return send_reply(rep_skb, info);
> }
>
> static int cmd_attr_register_cpumask(struct genl_info *info)
> diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
> index d36242fd4936..1895fbc32bcb 100644
> --- a/kernel/watch_queue.c
> +++ b/kernel/watch_queue.c
> @@ -663,16 +663,14 @@ struct watch_queue *get_watch_queue(int fd)
> {
> struct pipe_inode_info *pipe;
> struct watch_queue *wqueue = ERR_PTR(-EINVAL);
> - struct fd f;
> + CLASS(fd, f)(fd);
>
> - f = fdget(fd);
> - if (fd_file(f)) {
> + if (!fd_empty(f)) {
> pipe = get_pipe_info(fd_file(f), false);
> if (pipe && pipe->watch_queue) {
> wqueue = pipe->watch_queue;
> kref_get(&wqueue->usage);
> }
> - fdput(f);
> }
>
> return wqueue;
> diff --git a/mm/fadvise.c b/mm/fadvise.c
> index 532dee205c6e..588fe76c5a14 100644
> --- a/mm/fadvise.c
> +++ b/mm/fadvise.c
> @@ -190,16 +190,12 @@ EXPORT_SYMBOL(vfs_fadvise);
>
> int ksys_fadvise64_64(int fd, loff_t offset, loff_t len, int advice)
> {
> - struct fd f = fdget(fd);
> - int ret;
> + CLASS(fd, f)(fd);
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> - ret = vfs_fadvise(fd_file(f), offset, len, advice);
> -
> - fdput(f);
> - return ret;
> + return vfs_fadvise(fd_file(f), offset, len, advice);
> }
>
> SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
> diff --git a/mm/filemap.c b/mm/filemap.c
> index c79c2c773171..13b2f133796d 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -4373,31 +4373,25 @@ SYSCALL_DEFINE4(cachestat, unsigned int, fd,
> struct cachestat_range __user *, cstat_range,
> struct cachestat __user *, cstat, unsigned int, flags)
> {
> - struct fd f = fdget(fd);
> + CLASS(fd, f)(fd);
> struct address_space *mapping;
> struct cachestat_range csr;
> struct cachestat cs;
> pgoff_t first_index, last_index;
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> if (copy_from_user(&csr, cstat_range,
> - sizeof(struct cachestat_range))) {
> - fdput(f);
> + sizeof(struct cachestat_range)))
> return -EFAULT;
> - }
>
> /* hugetlbfs is not supported */
> - if (is_file_hugepages(fd_file(f))) {
> - fdput(f);
> + if (is_file_hugepages(fd_file(f)))
> return -EOPNOTSUPP;
> - }
>
> - if (flags != 0) {
> - fdput(f);
> + if (flags != 0)
> return -EINVAL;
> - }
>
> first_index = csr.off >> PAGE_SHIFT;
> last_index =
> @@ -4405,7 +4399,6 @@ SYSCALL_DEFINE4(cachestat, unsigned int, fd,
> memset(&cs, 0, sizeof(struct cachestat));
> mapping = fd_file(f)->f_mapping;
> filemap_cachestat(mapping, first_index, last_index, &cs);
> - fdput(f);
>
> if (copy_to_user(cstat, &cs, sizeof(struct cachestat)))
> return -EFAULT;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 58d000013024..46240137fee3 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5266,8 +5266,6 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
> struct mem_cgroup_event *event;
> struct cgroup_subsys_state *cfile_css;
> unsigned int efd, cfd;
> - struct fd efile;
> - struct fd cfile;
> struct dentry *cdentry;
> const char *name;
> char *endp;
> @@ -5298,8 +5296,8 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
> init_waitqueue_func_entry(&event->wait, memcg_event_wake);
> INIT_WORK(&event->remove, memcg_event_remove);
>
> - efile = fdget(efd);
> - if (!fd_file(efile)) {
> + CLASS(fd, efile)(efd);
> + if (fd_empty(efile)) {
> ret = -EBADF;
> goto out_kfree;
> }
> @@ -5307,11 +5305,11 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
> event->eventfd = eventfd_ctx_fileget(fd_file(efile));
> if (IS_ERR(event->eventfd)) {
> ret = PTR_ERR(event->eventfd);
> - goto out_put_efile;
> + goto out_kfree;
> }
>
> - cfile = fdget(cfd);
> - if (!fd_file(cfile)) {
> + CLASS(fd, cfile)(cfd);
> + if (fd_empty(cfile)) {
> ret = -EBADF;
> goto out_put_eventfd;
> }
> @@ -5320,7 +5318,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
> /* AV: shouldn't we check that it's been opened for read instead? */
> ret = file_permission(fd_file(cfile), MAY_READ);
> if (ret < 0)
> - goto out_put_cfile;
> + goto out_put_eventfd;
>
> /*
> * The control file must be a regular cgroup1 file. As a regular cgroup
> @@ -5329,7 +5327,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
> cdentry = fd_file(cfile)->f_path.dentry;
> if (cdentry->d_sb->s_type != &cgroup_fs_type || !d_is_reg(cdentry)) {
> ret = -EINVAL;
> - goto out_put_cfile;
> + goto out_put_eventfd;
> }
>
> /*
> @@ -5356,7 +5354,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
> event->unregister_event = memsw_cgroup_usage_unregister_event;
> } else {
> ret = -EINVAL;
> - goto out_put_cfile;
> + goto out_put_eventfd;
> }
>
> /*
> @@ -5368,10 +5366,10 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
> &memory_cgrp_subsys);
> ret = -EINVAL;
> if (IS_ERR(cfile_css))
> - goto out_put_cfile;
> + goto out_put_eventfd;
> if (cfile_css != css) {
> css_put(cfile_css);
> - goto out_put_cfile;
> + goto out_put_eventfd;
> }
>
> ret = event->register_event(memcg, event->eventfd, buf);
> @@ -5384,19 +5382,12 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
> list_add(&event->list, &memcg->event_list);
> spin_unlock_irq(&memcg->event_list_lock);
>
> - fdput(cfile);
> - fdput(efile);
> -
> return nbytes;
>
> out_put_css:
> css_put(css);
> -out_put_cfile:
> - fdput(cfile);
> out_put_eventfd:
> eventfd_ctx_put(event->eventfd);
> -out_put_efile:
> - fdput(efile);
> out_kfree:
> kfree(event);
>
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 2be4603488c5..3ce1269b972a 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -721,29 +721,22 @@ EXPORT_SYMBOL_GPL(page_cache_async_ra);
>
> ssize_t ksys_readahead(int fd, loff_t offset, size_t count)
> {
> - ssize_t ret;
> - struct fd f;
> + CLASS(fd, f)(fd);
>
> - ret = -EBADF;
> - f = fdget(fd);
> - if (!fd_file(f) || !(fd_file(f)->f_mode & FMODE_READ))
> - goto out;
> + if (fd_empty(f) || !(fd_file(f)->f_mode & FMODE_READ))
> + return -EBADF;
>
> /*
> * The readahead() syscall is intended to run only on files
> * that can execute readahead. If readahead is not possible
> * on this file, then we must return -EINVAL.
> */
> - ret = -EINVAL;
> if (!fd_file(f)->f_mapping || !fd_file(f)->f_mapping->a_ops ||
> (!S_ISREG(file_inode(fd_file(f))->i_mode) &&
> !S_ISBLK(file_inode(fd_file(f))->i_mode)))
> - goto out;
> + return -EINVAL;
>
> - ret = vfs_fadvise(fd_file(f), offset, count, POSIX_FADV_WILLNEED);
> -out:
> - fdput(f);
> - return ret;
> + return vfs_fadvise(fd_file(f), offset, count, POSIX_FADV_WILLNEED);
> }
>
> SYSCALL_DEFINE3(readahead, int, fd, loff_t, offset, size_t, count)
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index fd536e385b83..e1e56cbf5cbf 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -703,20 +703,18 @@ EXPORT_SYMBOL_GPL(get_net_ns);
>
> struct net *get_net_ns_by_fd(int fd)
> {
> - struct fd f = fdget(fd);
> - struct net *net = ERR_PTR(-EINVAL);
> + CLASS(fd, f)(fd);
>
> - if (!fd_file(f))
> + if (fd_empty(f))
> return ERR_PTR(-EBADF);
>
> if (proc_ns_file(fd_file(f))) {
> struct ns_common *ns = get_proc_ns(file_inode(fd_file(f)));
> if (ns->ops == &netns_operations)
> - net = get_net(container_of(ns, struct net, ns));
> + return get_net(container_of(ns, struct net, ns));
> }
> - fdput(f);
>
> - return net;
> + return ERR_PTR(-EINVAL);
> }
> EXPORT_SYMBOL_GPL(get_net_ns_by_fd);
> #endif
> diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> index e7c1d3ae33fe..e6b4cdd6e84c 100644
> --- a/security/integrity/ima/ima_main.c
> +++ b/security/integrity/ima/ima_main.c
> @@ -1062,19 +1062,16 @@ int process_buffer_measurement(struct mnt_idmap *idmap,
> */
> void ima_kexec_cmdline(int kernel_fd, const void *buf, int size)
> {
> - struct fd f;
> -
> if (!buf || !size)
> return;
>
> - f = fdget(kernel_fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(kernel_fd);
> + if (fd_empty(f))
> return;
>
> process_buffer_measurement(file_mnt_idmap(fd_file(f)), file_inode(fd_file(f)),
> buf, size, "kexec-cmdline", KEXEC_CMDLINE, 0,
> NULL, false, NULL, 0);
> - fdput(f);
> }
>
> /**
> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> index 97b3df540dc7..a661d373ea92 100644
> --- a/security/landlock/syscalls.c
> +++ b/security/landlock/syscalls.c
> @@ -234,31 +234,21 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
> static struct landlock_ruleset *get_ruleset_from_fd(const int fd,
> const fmode_t mode)
> {
> - struct fd ruleset_f;
> + CLASS(fd, ruleset_f)(fd);
> struct landlock_ruleset *ruleset;
>
> - ruleset_f = fdget(fd);
> - if (!fd_file(ruleset_f))
> + if (fd_empty(ruleset_f))
> return ERR_PTR(-EBADF);
>
> /* Checks FD type and access right. */
> - if (fd_file(ruleset_f)->f_op != &ruleset_fops) {
> - ruleset = ERR_PTR(-EBADFD);
> - goto out_fdput;
> - }
> - if (!(fd_file(ruleset_f)->f_mode & mode)) {
> - ruleset = ERR_PTR(-EPERM);
> - goto out_fdput;
> - }
> + if (fd_file(ruleset_f)->f_op != &ruleset_fops)
> + return ERR_PTR(-EBADFD);
> + if (!(fd_file(ruleset_f)->f_mode & mode))
> + return ERR_PTR(-EPERM);
> ruleset = fd_file(ruleset_f)->private_data;
> - if (WARN_ON_ONCE(ruleset->num_layers != 1)) {
> - ruleset = ERR_PTR(-EINVAL);
> - goto out_fdput;
> - }
> + if (WARN_ON_ONCE(ruleset->num_layers != 1))
> + return ERR_PTR(-EINVAL);
> landlock_get_ruleset(ruleset);
> -
> -out_fdput:
> - fdput(ruleset_f);
> return ruleset;
> }
>
> diff --git a/security/loadpin/loadpin.c b/security/loadpin/loadpin.c
> index 02144ec39f43..68252452b66c 100644
> --- a/security/loadpin/loadpin.c
> +++ b/security/loadpin/loadpin.c
> @@ -283,7 +283,6 @@ enum loadpin_securityfs_interface_index {
>
> static int read_trusted_verity_root_digests(unsigned int fd)
> {
> - struct fd f;
> void *data;
> int rc;
> char *p, *d;
> @@ -295,8 +294,8 @@ static int read_trusted_verity_root_digests(unsigned int fd)
> if (!list_empty(&dm_verity_loadpin_trusted_root_digests))
> return -EPERM;
>
> - f = fdget(fd);
> - if (!fd_file(f))
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> return -EINVAL;
>
> data = kzalloc(SZ_4K, GFP_KERNEL);
> @@ -359,7 +358,6 @@ static int read_trusted_verity_root_digests(unsigned int fd)
> }
>
> kfree(data);
> - fdput(f);
>
> return 0;
>
> @@ -379,8 +377,6 @@ static int read_trusted_verity_root_digests(unsigned int fd)
> /* disallow further attempts after reading a corrupt/invalid file */
> deny_reading_verity_digests = true;
>
> - fdput(f);
> -
> return rc;
> }
>
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index 65efb3735e79..70bc0d1f5f6a 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -303,7 +303,6 @@ static int
> kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
> {
> struct kvm_kernel_irqfd *irqfd, *tmp;
> - struct fd f;
> struct eventfd_ctx *eventfd = NULL, *resamplefd = NULL;
> int ret;
> __poll_t events;
> @@ -326,8 +325,8 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
> INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
> seqcount_spinlock_init(&irqfd->irq_entry_sc, &kvm->irqfds.lock);
>
> - f = fdget(args->fd);
> - if (!fd_file(f)) {
> + CLASS(fd, f)(args->fd);
> + if (fd_empty(f)) {
> ret = -EBADF;
> goto out;
> }
> @@ -335,7 +334,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
> eventfd = eventfd_ctx_fileget(fd_file(f));
> if (IS_ERR(eventfd)) {
> ret = PTR_ERR(eventfd);
> - goto fail;
> + goto out;
> }
>
> irqfd->eventfd = eventfd;
> @@ -439,12 +438,6 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
> #endif
>
> srcu_read_unlock(&kvm->irq_srcu, idx);
> -
> - /*
> - * do not drop the file until the irqfd is fully initialized, otherwise
> - * we might race against the EPOLLHUP
> - */
> - fdput(f);
> return 0;
>
> fail:
> @@ -457,8 +450,6 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
> if (eventfd && !IS_ERR(eventfd))
> eventfd_ctx_put(eventfd);
>
> - fdput(f);
> -
> out:
> kfree(irqfd);
> return ret;
> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
> index 388ae471d258..72aa1fdeb699 100644
> --- a/virt/kvm/vfio.c
> +++ b/virt/kvm/vfio.c
> @@ -190,11 +190,10 @@ static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
> {
> struct kvm_vfio *kv = dev->private;
> struct kvm_vfio_file *kvf;
> - struct fd f;
> + CLASS(fd, f)(fd);
> int ret;
>
> - f = fdget(fd);
> - if (!fd_file(f))
> + if (fd_empty(f))
> return -EBADF;
>
> ret = -ENOENT;
> @@ -220,9 +219,6 @@ static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
> kvm_vfio_update_coherency(dev);
>
> mutex_unlock(&kv->lock);
> -
> - fdput(f);
> -
> return ret;
> }
>
> @@ -233,14 +229,13 @@ static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
> struct kvm_vfio_spapr_tce param;
> struct kvm_vfio *kv = dev->private;
> struct kvm_vfio_file *kvf;
> - struct fd f;
> int ret;
>
> if (copy_from_user(&param, arg, sizeof(struct kvm_vfio_spapr_tce)))
> return -EFAULT;
>
> - f = fdget(param.groupfd);
> - if (!fd_file(f))
> + CLASS(fd, f)(param.groupfd);
> + if (fd_empty(f))
> return -EBADF;
>
> ret = -ENOENT;
> @@ -266,7 +261,6 @@ static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
>
> err_fdput:
> mutex_unlock(&kv->lock);
> - fdput(f);
> return ret;
> }
> #endif
> --
> 2.39.2
>
* Re: [PATCH 12/19] bpf: switch to CLASS(fd, ...)
2024-06-07 1:59 ` [PATCH 12/19] bpf: switch to CLASS(fd, ...) Al Viro
@ 2024-06-07 15:27 ` Christian Brauner
0 siblings, 0 replies; 41+ messages in thread
From: Christian Brauner @ 2024-06-07 15:27 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 02:59:50AM +0100, Al Viro wrote:
> Calling conventions for __bpf_map_get() would be more convenient
> if it left fdput() on failure to callers. Makes for simpler logic
> in the callers.
>
> Among other things, the proof of memory safety no longer has to
> rely upon file->private_data never being ERR_PTR(...) for bpffs files.
> Original calling conventions made it impossible for the caller to tell
> whether __bpf_map_get() has returned ERR_PTR(-EINVAL) because it has found
> the file not to be a bpf map one (in which case it would've done fdput())
> or because it found that ERR_PTR(-EINVAL) in file->private_data of a
> bpf map file (in which case fdput() would _not_ have been done).
>
> With that calling conventions change it's easy to switch all
> bpffs users to CLASS(fd, ...)
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
Reviewed-by: Christian Brauner <brauner@kernel.org>
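
The ambiguity described in the commit message can be reproduced in a small userspace sketch. Everything below is a hypothetical stand-in rather than kernel code: `struct toy_file`, the `toy_*` functions and the plain integer reference count are assumptions made for illustration, and `ERR_PTR()`/`IS_ERR()`/`PTR_ERR()` are re-implemented locally. The "old" helper drops the reference only when the file is of the wrong type, so a caller that sees ERR_PTR(-EINVAL) cannot tell whether the reference is still held.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical file object: a type flag, a payload and a reference count. */
struct toy_file {
	int is_map;
	void *private_data;
	int refs;
};

/* Old convention: the helper drops the reference, but only on type mismatch. */
static void *toy_map_get_old(struct toy_file *f)
{
	if (!f->is_map) {
		f->refs--;		/* plays the role of fdput() */
		return ERR_PTR(-EINVAL);
	}
	return f->private_data;		/* may itself be an error pointer */
}

/* New convention: the helper never touches the reference. */
static void *toy_map_get_new(struct toy_file *f)
{
	if (!f->is_map)
		return ERR_PTR(-EINVAL);
	return f->private_data;
}

/* Old-style caller: has to assume the helper cleaned up on any error. */
static int toy_lookup_old(struct toy_file *f)
{
	void *map;

	f->refs++;			/* plays the role of fdget() */
	map = toy_map_get_old(f);
	if (IS_ERR(map))
		return (int)PTR_ERR(map);
	f->refs--;
	return 0;
}

/* New-style caller: drops the reference unconditionally. */
static int toy_lookup_new(struct toy_file *f)
{
	void *map;
	int ret = 0;

	f->refs++;
	map = toy_map_get_new(f);
	if (IS_ERR(map))
		ret = (int)PTR_ERR(map);
	f->refs--;
	return ret;
}
```

With a map file whose private_data holds ERR_PTR(-EINVAL), toy_lookup_old() returns with the reference still held, while toy_lookup_new() balances it on every path. That unconditional release is what a scope-based cleanup can then take over.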
* Re: [PATCHES][RFC] rework of struct fd handling
2024-06-07 1:56 [PATCHES][RFC] rework of struct fd handling Al Viro
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
@ 2024-06-07 15:30 ` Christian Brauner
1 sibling, 0 replies; 41+ messages in thread
From: Christian Brauner @ 2024-06-07 15:30 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, Linus Torvalds
On Fri, Jun 07, 2024 at 02:56:56AM +0100, Al Viro wrote:
> Experimental series trying to sanitize the handling
> of struct fd. Lightly tested, in serious need of review.
>
> It's 6.10-rc1-based, lives in
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.fd
> Individual patches in followups, descriptions below.
Looks overall like a good approach to me and I really like that we're
not renaming struct fd.
(IMHO, you should just send the CLASS(fd_pos) cleanup helper addition
upstream for this cycle because it's the only thing we're missing. And
that'll likely make conversions in individual patches/per subsystem
easier as well.)
>
> Shortlog:
> Al Viro (19):
> powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
> lirc: rc_dev_get_from_fd(): fix file leak
> introduce fd_file(), convert all accessors to it.
> struct fd: representation change
> add struct fd constructors, get rid of __to_fd()
> net/socket.c: switch to CLASS(fd)
> introduce struct fderr, convert overlayfs uses to that
> fdget_raw() users: switch to CLASS(fd_raw, ...)
> css_set_fork(): switch to CLASS(fd_raw, ...)
> introduce "fd_pos" class
> switch simple users of fdget() to CLASS(fd, ...)
> bpf: switch to CLASS(fd, ...)
> convert vmsplice() to CLASS(fd, ...)
> finit_module(): convert to CLASS(fd, ...)
> timerfd: switch to CLASS(fd, ...)
> do_mq_notify(): switch to CLASS(fd, ...)
> simplify xfs_find_handle() a bit
> convert kernel/events/core.c
> deal with the last remaing boolean uses of fd_file()
>
> Diffstat:
> arch/alpha/kernel/osf_sys.c | 7 +-
> arch/arm/kernel/sys_oabi-compat.c | 18 +-
> arch/powerpc/kvm/book3s_64_vio.c | 9 +-
> arch/powerpc/kvm/powerpc.c | 26 +--
> arch/powerpc/platforms/cell/spu_syscalls.c | 17 +-
> arch/x86/kernel/cpu/sgx/main.c | 10 +-
> arch/x86/kvm/svm/sev.c | 43 ++--
> drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 27 +--
> drivers/gpu/drm/drm_syncobj.c | 11 +-
> drivers/infiniband/core/ucma.c | 21 +-
> drivers/infiniband/core/uverbs_cmd.c | 12 +-
> drivers/media/mc/mc-request.c | 22 +-
> drivers/media/rc/lirc_dev.c | 13 +-
> drivers/vfio/group.c | 10 +-
> drivers/vfio/virqfd.c | 20 +-
> drivers/virt/acrn/irqfd.c | 14 +-
> drivers/xen/privcmd.c | 35 ++--
> fs/btrfs/ioctl.c | 7 +-
> fs/coda/inode.c | 13 +-
> fs/eventfd.c | 9 +-
> fs/eventpoll.c | 62 ++----
> fs/ext4/ioctl.c | 23 +-
> fs/f2fs/file.c | 17 +-
> fs/fcntl.c | 74 +++----
> fs/fhandle.c | 7 +-
> fs/file.c | 26 +--
> fs/fsopen.c | 23 +-
> fs/fuse/dev.c | 10 +-
> fs/ioctl.c | 47 ++---
> fs/kernel_read_file.c | 12 +-
> fs/locks.c | 27 +--
> fs/namei.c | 19 +-
> fs/namespace.c | 53 ++---
> fs/notify/fanotify/fanotify_user.c | 50 ++---
> fs/notify/inotify/inotify_user.c | 44 ++--
> fs/ocfs2/cluster/heartbeat.c | 17 +-
> fs/open.c | 67 +++---
> fs/overlayfs/file.c | 187 +++++++----------
> fs/quota/quota.c | 18 +-
> fs/read_write.c | 227 +++++++++-----------
> fs/readdir.c | 38 ++--
> fs/remap_range.c | 11 +-
> fs/select.c | 17 +-
> fs/signalfd.c | 11 +-
> fs/smb/client/ioctl.c | 17 +-
> fs/splice.c | 82 +++-----
> fs/stat.c | 10 +-
> fs/statfs.c | 12 +-
> fs/sync.c | 33 ++-
> fs/timerfd.c | 42 ++--
> fs/utimes.c | 11 +-
> fs/xattr.c | 64 +++---
> fs/xfs/xfs_exchrange.c | 12 +-
> fs/xfs/xfs_handle.c | 16 +-
> fs/xfs/xfs_ioctl.c | 85 +++-----
> include/linux/bpf.h | 9 +-
> include/linux/cleanup.h | 2 +-
> include/linux/file.h | 89 +++++---
> io_uring/sqpoll.c | 31 +--
> ipc/mqueue.c | 126 +++++------
> kernel/bpf/bpf_inode_storage.c | 29 +--
> kernel/bpf/btf.c | 13 +-
> kernel/bpf/map_in_map.c | 37 +---
> kernel/bpf/syscall.c | 197 ++++++-----------
> kernel/bpf/token.c | 19 +-
> kernel/bpf/verifier.c | 20 +-
> kernel/cgroup/cgroup.c | 21 +-
> kernel/events/core.c | 67 +++---
> kernel/module/main.c | 15 +-
> kernel/nsproxy.c | 15 +-
> kernel/pid.c | 26 +--
> kernel/signal.c | 33 ++-
> kernel/sys.c | 21 +-
> kernel/taskstats.c | 20 +-
> kernel/watch_queue.c | 8 +-
> mm/fadvise.c | 10 +-
> mm/filemap.c | 19 +-
> mm/memcontrol.c | 37 ++--
> mm/readahead.c | 25 +--
> net/core/net_namespace.c | 14 +-
> net/core/sock_map.c | 23 +-
> net/socket.c | 325 +++++++++++++----------------
> security/integrity/ima/ima_main.c | 9 +-
> security/landlock/syscalls.c | 57 ++---
> security/loadpin/loadpin.c | 10 +-
> sound/core/pcm_native.c | 6 +-
> virt/kvm/eventfd.c | 19 +-
> virt/kvm/vfio.c | 18 +-
> 88 files changed, 1234 insertions(+), 1951 deletions(-)
>
> 01/19) powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
> 02/19) lirc: rc_dev_get_from_fd(): fix file leak
>
> First two patches are obvious leak fixes - missing fdput()
> on one of the failure exits.
>
> 03/19) introduce fd_file(), convert all accessors to it.
>
> For any changes of struct fd representation we need to
> turn existing accesses to fields into calls of wrappers.
> Accesses to struct fd::flags are very few (3 in linux/file.h,
> 1 in net/socket.c, 3 in fs/overlayfs/file.c and 3 more in
> explicit initializers).
> Those can be dealt with in the commit converting to
> new layout; accesses to struct fd::file are too many for that.
> This commit converts (almost) all of f.file to
> fd_file(f). It's not entirely mechanical ('file' is used as
> a member name more than just in struct fd) and it does not
> even attempt to distinguish the uses in pointer context from
> those in boolean context; the latter will be eventually turned
> into a separate helper (fd_empty()).
>
> NB: this commit is where I'd expect arseloads of conflicts
> through the cycle, simply because of the breadth of area being
> touched. The biggest one, as well (500 lines modified).
> Might be worth splitting - not sure.
>
> 04/19) struct fd: representation change
>
> The absolute majority of instances comes from fdget() and its
> relatives; the underlying primitives actually return a struct file
> reference and a couple of flags encoded into an unsigned long - the lower
> two bits of file address are always zero, so we can stash the flags
> into those. On the way out we use __to_fd() to unpack that unsigned
> long into struct fd.
> Let's use that representation for struct fd itself - make it
> a structure with a single unsigned long member (.word), with the value
> equal either to (unsigned long)p | flags, p being an address of some
> struct file instance, or to 0 for an empty fd.
> Note that we never used a struct fd instance with NULL ->file
> and non-zero ->flags; the emptiness had been checked as (!f.file) and
> we expected e.g. fdput(empty) to be a no-op. With new representation
> we can use (!f.word) for emptiness check; that is enough for compiler
> to figure out that (f.word & FDPUT_FPUT) will be false and that fdput(f)
> will be a no-op in such case.
> For now the new predicate (fd_empty(f)) has no users; all the
> existing checks have form (!fd_file(f)). We will convert to fd_empty()
> use later; here we only define it (and tell the compiler that it's
> unlikely to return true).
> This commit only deals with representation change; there will
> be followups.
> NOTE: overlayfs part is _not_ in the final form - it will be
> massaged shortly.
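A minimal userspace sketch of the packing described for 04/19, in case it
helps to see it spelled out. This is not the kernel code: struct model_file,
struct model_fd, the MODEL_* flag values and the helper names are all made up
for illustration; only the idea (pointer and flags in one word, zero meaning
empty) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct file; aligned so the low two bits are free. */
struct model_file { long refcount; };

#define MODEL_FPUT	((uintptr_t)1)	/* reference has to be dropped on put */
#define MODEL_FLAGS	((uintptr_t)3)	/* the two low bits reserved for flags */

/* One word: (address of the file) | flags, or 0 for an empty instance. */
struct model_fd { uintptr_t word; };

static inline struct model_fd model_fd_make(struct model_file *p, uintptr_t flags)
{
	/* The address must leave the flag bits clear, and flags must fit there. */
	assert(p != NULL && ((uintptr_t)p & MODEL_FLAGS) == 0);
	assert((flags & ~MODEL_FLAGS) == 0);
	return (struct model_fd){ (uintptr_t)p | flags };
}

static inline bool model_fd_empty(struct model_fd f)
{
	return f.word == 0;
}

static inline struct model_file *model_fd_file(struct model_fd f)
{
	return (struct model_file *)(f.word & ~MODEL_FLAGS);
}

static inline void model_fdput(struct model_fd f)
{
	/* For an empty instance word is 0, so this test is false: a no-op. */
	if (f.word & MODEL_FPUT)
		model_fd_file(f)->refcount--;
}
```

With this layout the emptiness check is a single comparison against zero, and
model_fdput() on an empty value falls through without touching anything.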
>
> 05/19) add struct fd constructors, get rid of __to_fd()
>
> Make __fdget() et.al. return struct fd directly.
> New helpers: BORROWED_FD(file) and CLONED_FD(file), for
> borrowed and cloned file references resp.
> NOTE: this might need tuning; in particular, inline on
> __fget_light() is there to keep the code generation same as
> before - we probably want to keep it inlined in fdget() et.al.
> (especially so in fdget_pos()), but that needs profiling.
>
>
> Next two commits deal with the worst irregularities in struct fd use:
> in net/socket.c we have fdget() without matching fdput() - fdget() is
> done in sockfd_lookup_light(), then the results are passed (in modified
> form) to caller, which deals with conditional fput(). And in
> overlayfs we have an almost-but-not-quite struct fd shoehorned into
> struct fd, with ugly calling conventions as the result of that.
> I'm not sure what order would be better for these two commits.
>
> 06/19) net/socket.c: switch to CLASS(fd)
>
> I strongly suspect that important part in sockfd_lookup_light() is
> avoiding needless file refcount operations, not the marginal reduction of
> the register pressure from not keeping a struct file pointer in the caller.
> If that's true, we should get the same benefits from straight
> fdget()/fdput(). And AFAICS with sane use of CLASS(fd) we can get a better
> code generation...
> Would be nice if somebody tested it on networking test suites
> (including benchmarks)...
>
> sockfd_lookup_light() does fdget(), uses sock_from_file() to
> get the associated socket and returns the struct socket reference to
> the caller, along with "do we need to fput()" flag. No matching fdput(),
> the caller does its equivalent manually, using the fact that sock->file
> points to the struct file the socket has come from.
> Get rid of that - have the callers do fdget()/fdput() and
> use sock_from_file() directly. That kills sockfd_lookup_light()
> and fput_light() (no users left).
> What's more, we can get rid of explicit fdget()/fdput() by
> switching to CLASS(fd, ...) - code generation does not suffer, since
> now fdput() inserted on "descriptor is not opened" failure exit
> is recognized to be a no-op by compiler.
> We could split that commit in two (getting rid of sockfd_lookup_light()
> and switch to CLASS(fd, ...)), but AFAICS it ends up being harder to read
> that way.
>
> 07/19) introduce struct fderr, convert overlayfs uses to that
>
> Similar to struct fd; unlike struct fd, it can represent
> error values.
> Accessors:
> * fd_empty(f): true if f represents an error
> * fd_file(f): just as for struct fd it yields a pointer to
> struct file if fd_empty(f) is false. If
> fd_empty(f) is true, fd_file(f) is guaranteed
> _not_ to be an address of any object (IS_ERR()
> will be true in that case)
> * fd_error(f): if f represents an error, returns that error,
> otherwise the return value is junk.
> Constructors:
> * ERR_FD(-E...): an instance encoding given error [ERR_FDERR, perhaps?]
> * BORROWED_FDERR(file): if file points to a struct file instance,
> return a struct fderr representing that file
> reference with no flags set.
> if file is an ERR_PTR(-E...), return a struct
> fderr representing that error.
> file MUST NOT be NULL.
> * CLONED_FDERR(file): similar, but in case when file points to
> a struct file instance, set FDPUT_FPUT in flags.
> fdput_err() serves as a destructor.
> See fs/overlayfs/file.c for example of use.
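The accessor/constructor list for 07/19 can be modelled in userspace roughly
as below. Everything here is a hypothetical sketch: the model_* names, the
4095 error range (borrowed from the usual ERR_PTR convention) and the choice
to return the raw word from the file accessor on error are assumptions, not
the code in the series. Unlike the BORROWED_FDERR() described above, the
constructors in this model accept only real pointers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_file { long refcount; };	/* stand-in for struct file */
struct model_fderr { uintptr_t word; };	/* file | flags, or a negative errno */

#define MODEL_FPUT	((uintptr_t)1)
#define MODEL_FLAGS	((uintptr_t)3)
#define MODEL_MAX_ERRNO	((uintptr_t)4095)

/* True if the word lies in the top MODEL_MAX_ERRNO values, i.e. an error. */
static inline bool model_fderr_empty(struct model_fderr f)
{
	return f.word >= (uintptr_t)0 - MODEL_MAX_ERRNO;
}

/* Constructor for an error value; err is a negative errno such as -EBADF. */
static inline struct model_fderr model_err_fd(intptr_t err)
{
	assert(err < 0 && err >= -4095);
	return (struct model_fderr){ (uintptr_t)err };
}

static inline struct model_fderr model_borrowed_fderr(struct model_file *file)
{
	assert(file != NULL && ((uintptr_t)file & MODEL_FLAGS) == 0);
	return (struct model_fderr){ (uintptr_t)file };
}

static inline struct model_fderr model_cloned_fderr(struct model_file *file)
{
	struct model_fderr f = model_borrowed_fderr(file);

	f.word |= MODEL_FPUT;	/* the destructor will drop this reference */
	return f;
}

static inline struct model_file *model_fderr_file(struct model_fderr f)
{
	if (model_fderr_empty(f))
		return (struct model_file *)f.word;	/* never a real object */
	return (struct model_file *)(f.word & ~MODEL_FLAGS);
}

/* Meaningful only when model_fderr_empty(f) is true. */
static inline long model_fderr_error(struct model_fderr f)
{
	return (long)(intptr_t)f.word;
}

static inline void model_fdput_err(struct model_fderr f)
{
	if (!model_fderr_empty(f) && (f.word & MODEL_FPUT))
		model_fderr_file(f)->refcount--;
}
```

The point of the model is only that one word can carry either "file plus
flags" or "error", and that the destructor can tell the two apart without any
extra state.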
>
> 08/19) fdget_raw() users: switch to CLASS(fd_raw, ...)
> all convert trivially
> 09/19) css_set_fork(): switch to CLASS(fd_raw, ...)
> reference acquired there by fget_raw() is not stashed anywhere -
> we could as well borrow instead.
>
> 10/19) introduce "fd_pos" class
> fdget_pos() for constructor, fdput_pos() for cleanup, users of
> fd..._pos() converted.
>
> 11/19) switch simple users of fdget() to CLASS(fd, ...)
> low-hanging fruit; that's another likely source of conflicts
> over the cycle and it might make a lot of sense to split; fortunately,
> it can be split pretty much on per-function basis - chunks are independent
> from each other.
>
> 12/19) bpf: switch to CLASS(fd, ...)
> Calling conventions for __bpf_map_get() would be more convenient
> if it left fdput() on failure to callers. Makes for simpler logics
> in the callers.
> Among other things, the proof of memory safety no longer has to
> rely upon file->private_data never being ERR_PTR(...) for bpffs files.
> Original calling conventions made it impossible for the caller to tell
> whether __bpf_map_get() has returned ERR_PTR(-EINVAL) because it has found
> the file not be a bpf map one (in which case it would've done fdput())
> or because it found that ERR_PTR(-EINVAL) in file->private_data of a
> bpf map file (in which case fdput() would _not_ have been done).
> With that calling conventions change it's easy to switch all
> bpffs users to CLASS(fd, ...).
>
> 13/19) convert vmsplice() to CLASS(fd, ...)
> Irregularity here is fdput() not in the same scope as fdget();
> we could just lift it out vmsplice_type() in vmsplice(2), but there's
> no much point keeping vmsplice_type() separate after that...
>
> 14/19) finit_module(): convert to CLASS(fd, ...)
> Slightly unidiomatic emptiness check; just lift it out of
> idempotent_init_module() and into finit_module(2) and that's it.
>
> 15/19) timerfd: switch to CLASS(fd, ...)
> Fold timerfd_fget() into both callers to have fdget() and fdput()
> in the same scope. Could be done in different ways, but this is probably
> the smallest solution.
>
> 16/19) do_mq_notify(): switch to CLASS(fd, ...)
> a minor twist is the reuse of struct fd instance in there
>
> 17/19) simplify xfs_find_handle() a bit
> XFS_IOC_FD_TO_HANDLE can grab a reference to copied ->f_path and
> let the file go; results in simpler control flow - cleanup is
> the same for both "by descriptor" and "by pathname" cases.
> NOTE: grabbing f->f_path to pin file_inode(f) is valid, since
> we are dealing with XFS files here - no reassignments of file_inode().
>
> 18/19) convert kernel/events/core.c
> a questionable trick in perf_event_open(2) - deliberate call of
> fdget(-1), expecting it to yield empty
>
> 19/19) deal with the last remaing boolean uses of fd_file()
> ... replacing them with uses of fd_empty()
* Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)
2024-06-07 15:26 ` Christian Brauner
@ 2024-06-07 16:10 ` Al Viro
2024-06-07 16:11 ` Al Viro
2024-06-07 21:08 ` Al Viro
0 siblings, 2 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 16:10 UTC (permalink / raw)
To: Christian Brauner; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 05:26:53PM +0200, Christian Brauner wrote:
> On Fri, Jun 07, 2024 at 02:59:49AM +0100, Al Viro wrote:
> > low-hanging fruit; that's another likely source of conflicts
> > over the cycle and it might make a lot of sense to split;
> > fortunately, it can be split pretty much on per-function
> > basis - chunks are independent from each other.
> >
> > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > ---
>
> I can pick conversions from you for files where I already have changes
> in the tree anyway or already have done conversion as part of other
> patches.
Some notes:

This kind of conversion depends upon fdput(empty) being not just
a no-op, but a no-op visible to compiler.  Representation change arranges
for that.

CLASS(...) has some misfeatures of nearly C++ level of tastelessness;
for this series I decided to try it and see how it goes, but... AFAICS in
a lot of cases it's the wrong answer.

1.  Variable declarations belong in the beginning of block.
CLASS use invites violating that; to make things worse, in-block goto
bypassing such declaration is only caught by current clang.  Two
examples got caught by Intel buildbot in this patch, actually - one
in fs/select.c, another in ocfs2.

2.  The loss of control over the location of destructor call can
be dangerous in some cases.  Delaying it is not a problem for struct
file references (well, except for goto problems above), but there's
an opposite issue.  Example from later in this series:
	if (arg != -1) {
		struct perf_event *output_event;
		struct fd output;
		ret = perf_fget_light(arg, &output);
		if (ret)
			return ret;
		output_event = output.file->private_data;
		ret = perf_event_set_output(event, output_event);
		fdput(output);
	} else {
		ret = perf_event_set_output(event, NULL);
	}

gets converted to

	if (arg != -1) {
		struct perf_event *output_event;
		CLASS(fd, output)(arg);
		if (!is_perf_file(output))
			return -EBADF;
		output_event = fd_file(output)->private_data;
		return perf_event_set_output(event, output_event);
	} else {
		return perf_event_set_output(event, NULL);
	}

Nice, but that invites the next step, doesn't it?  Like this:

	struct perf_event *output_event = NULL;
	if (arg != -1) {
		CLASS(fd, output)(arg);
		if (!is_perf_file(output))
			return -EBADF;
		output_event = fd_file(output)->private_data;
	}
	return perf_event_set_output(event, output_event);
See the trouble here?  The last variant would be broken - the value of
file->private_data escapes the scope and can end up used after
the file gets closed.

In the original variant it's easy to see - we have an explicit
fdput() marking the place where protection disappears.  After
the conversion to implicit cleanups we lose that.

Yes, use of __cleanup can nicely shrink the source and make
some leaks less likely.  But my impression from this experiment
is that we should be very cautious with using that technique ;-/
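The scope-escape hazard can be reproduced in a small userspace model built on
the compiler's cleanup attribute (a GCC/clang extension). All names here are
hypothetical, MODEL_CLASS_FD() is only a rough imitation of CLASS(fd, ...),
and the model assumes the guard holds the last reference, as if the
descriptor had already been closed elsewhere. The object is never actually
freed, so the "use after scope" case is observable without undefined
behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_file {
	int refcount;
	bool private_alive;	/* stands in for ->private_data still being valid */
};

static struct model_file *model_get(struct model_file *f)
{
	assert(f != NULL);
	f->refcount++;
	return f;
}

/* Cleanup handler: receives a pointer to the guarded variable. */
static void model_put(struct model_file **fp)
{
	struct model_file *f = *fp;

	if (f && --f->refcount == 0)
		f->private_alive = false;	/* last reference gone */
}

/* Rough imitation of CLASS(): the reference is dropped at end of scope. */
#define MODEL_CLASS_FD(name, file) \
	struct model_file *name __attribute__((cleanup(model_put))) = model_get(file)

/* The data is used while the guard is still in scope. */
static bool use_inside_scope(struct model_file *file)
{
	MODEL_CLASS_FD(ref, file);

	return ref->private_alive;
}

/* The pointer escapes the block; the cleanup has run by the time it is used. */
static bool use_after_scope(struct model_file *file)
{
	struct model_file *escaped = NULL;

	{
		MODEL_CLASS_FD(ref, file);

		escaped = ref;
	}	/* model_put() runs here, with nothing in the source to mark it */
	return escaped->private_alive;
}
```

In the model the second function observes the data as already gone; in real
code the same shape would be a use-after-free with no explicit put in the
source to point at.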
* Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)
2024-06-07 16:10 ` Al Viro
@ 2024-06-07 16:11 ` Al Viro
2024-06-07 21:08 ` Al Viro
1 sibling, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 16:11 UTC (permalink / raw)
To: Christian Brauner; +Cc: linux-fsdevel, torvalds
On Fri, Jun 07, 2024 at 05:10:43PM +0100, Al Viro wrote:
> On Fri, Jun 07, 2024 at 05:26:53PM +0200, Christian Brauner wrote:
> > On Fri, Jun 07, 2024 at 02:59:49AM +0100, Al Viro wrote:
> > > low-hanging fruit; that's another likely source of conflicts
> > > over the cycle and it might make a lot of sense to split;
> > > fortunately, it can be split pretty much on per-function
> > > basis - chunks are independent from each other.
> > >
> > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > > ---
> >
> > I can pick conversions from you for files where I already have changes
> > in the tree anyway or already have done conversion as part of other
> > patches.
>
> Some notes:
>
> This kind of conversions depends upon fdput(empty) being not just
> a no-op, but a no-op visible to compiler. Representation change arranges
> for that.
>
> CLASS(...) has some misfeatures or nearly C++ level of tastelessness;
of, sorry.
* Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)
2024-06-07 16:10 ` Al Viro
2024-06-07 16:11 ` Al Viro
@ 2024-06-07 21:08 ` Al Viro
2024-06-10 2:44 ` [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)) Al Viro
2024-06-10 5:12 ` [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable() Al Viro
1 sibling, 2 replies; 41+ messages in thread
From: Al Viro @ 2024-06-07 21:08 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-fsdevel, Christian Brauner
On Fri, Jun 07, 2024 at 05:10:43PM +0100, Al Viro wrote:
> On Fri, Jun 07, 2024 at 05:26:53PM +0200, Christian Brauner wrote:
> > On Fri, Jun 07, 2024 at 02:59:49AM +0100, Al Viro wrote:
> > > low-hanging fruit; that's another likely source of conflicts
> > > over the cycle and it might make a lot of sense to split;
> > > fortunately, it can be split pretty much on per-function
> > > basis - chunks are independent from each other.
> > >
> > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > > ---
> >
> > I can pick conversions from you for files where I already have changes
> > in the tree anyway or already have done conversion as part of other
> > patches.
>
> Some notes:
>
> This kind of conversions depends upon fdput(empty) being not just
> a no-op, but a no-op visible to compiler. Representation change arranges
> for that.
>
> CLASS(...) has some misfeatures or nearly C++ level of tastelessness;
> for this series I decided to try it and see how it goes, but... AFAICS in
> a lot of cases it's the wrong answer.
>
> 1. Variable declarations belong in the beginning of block.
> CLASS use invites violating that; to make things worse, in-block goto
> bypassing such declaration is only caught by current clang. Two
> examples got caught by Intel buildbot in this patch, actually - one
> in fs/select.c, another in ocfs2.
Considerably more than two, actually.  For example, this

	inode_lock(inode);
	/* Update mode */
	ovl_copyattr(inode);
	ret = file_remove_privs(file);
	if (ret)
		goto out_unlock;
	CLASS(fd_real, real)(file);
	if (fd_empty(real)) {
		ret = fd_error(real);
		goto out_unlock;
	}
	old_cred = ovl_override_creds(file_inode(file)->i_sb);
	ret = vfs_fallocate(fd_file(real), mode, offset, len);
	revert_creds(old_cred);
	/* Update size */
	ovl_file_modified(file);
out_unlock:
	inode_unlock(inode);

steps into the same problem.
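One possible shape that avoids a goto crossing the guarded declaration is to
give the guard a block of its own. The sketch below is a userspace model, not
something taken from the series: the model_* names are hypothetical, the
counter stands in for inode_lock()/inode_unlock(), and the cleanup attribute
(a GCC/clang extension) stands in for CLASS().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_res { int users; };

static int model_lock_depth;	/* stands in for the explicitly handled lock */

/* Cleanup handler: receives a pointer to the guarded variable. */
static void model_release(struct model_res **rp)
{
	if (*rp)
		(*rp)->users--;
}

static int model_operation(struct model_res *res, bool fail_early)
{
	int ret;

	model_lock_depth++;
	if (fail_early) {
		ret = -1;
		goto out_unlock;	/* jumps over the whole block below */
	}
	{
		/*
		 * The guard is declared at the beginning of its own block,
		 * so no goto enters its scope, and its cleanup runs at the
		 * closing brace, before the explicit unlock.
		 */
		struct model_res *held __attribute__((cleanup(model_release))) = res;

		held->users++;
		assert(held->users > 0);
		ret = held->users;
	}
out_unlock:
	model_lock_depth--;
	return ret;
}
```

The price is an extra level of nesting, and the explicit unlock and the
implicit release still live side by side, which is the mixing complained
about here.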
Hell knows - it feels like mixing __cleanup-based stuff with anything
explicit leads to massive headache. And I *really* hate to have
e.g. inode_unlock() hidden in __cleanup in a random subset of places.
Unlike dropping file references (if we do that a bit later, nothing
would really care), the loss of explicit control over the places where
inode lock is dropped is asking for serious trouble.

Any suggestions?  Linus, what's your opinion on the use of CLASS...
stuff?
* [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...))
2024-06-07 21:08 ` Al Viro
@ 2024-06-10 2:44 ` Al Viro
2024-06-12 16:36 ` Linus Torvalds
2024-06-10 5:12 ` [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable() Al Viro
1 sibling, 1 reply; 41+ messages in thread
From: Al Viro @ 2024-06-10 2:44 UTC (permalink / raw)
To: Michael Ellerman
Cc: linux-fsdevel, Christian Brauner, Alexey Kardashevskiy,
Paul Mackerras, Linus Torvalds
On Fri, Jun 07, 2024 at 10:08:14PM +0100, Al Viro wrote:
> Hell knows - it feels like mixing __cleanup-based stuff with anything
> explicit leads to massive headache. And I *really* hate to have
> e.g. inode_unlock() hidden in __cleanup in a random subset of places.
> Unlike dropping file references (if we do that a bit later, nothing
> would really care), the loss of explicit control over the places where
> inode lock is dropped is asking for serious trouble.
>
> Any suggestions? Linus, what's your opinion on the use of CLASS...
> stuff?
While looking through the converted fdget() users, some interesting
stuff got caught. Example:
kvm_device_fops.unlocked_ioctl() is equal to kvm_device_ioctl() and it
gets called (by ioctl(2)) without any locks.
kvm_device_ioctl() calls kvm_device_ioctl_attr(), passing it dev->ops->set_attr.
kvm_device_ioctl_attr() calls the callback passed to it, still without any
locks.
->set_attr() can be kvm_vfio_set_attr(), which calls kvm_vfio_set_file(), which
calls kvm_vfio_file_set_spapr_tce(), which takes dev->private.lock and
calls kvm_spapr_tce_attach_iommu_group(). No kvm->lock held.
Now, in kvm_spapr_tce_attach_iommu_group() we have (in mainline)
	f = fdget(tablefd);
	if (!f.file)
		return -EBADF;
	rcu_read_lock();
	list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
		if (stt == f.file->private_data) {
			found = true;
			break;
		}
	}
	rcu_read_unlock();
	fdput(f);
	if (!found)
		return -EINVAL;
	....
	list_add_rcu(&stit->next, &stt->iommu_tables);

What happens if another thread closes the damn descriptor just as we'd
done fdput()?  This:

static int kvm_spapr_tce_release(struct inode *inode, struct file *filp)
{
	struct kvmppc_spapr_tce_table *stt = filp->private_data;
	struct kvmppc_spapr_tce_iommu_table *stit, *tmp;
	struct kvm *kvm = stt->kvm;
	mutex_lock(&kvm->lock);
	list_del_rcu(&stt->list);
	mutex_unlock(&kvm->lock);
	list_for_each_entry_safe(stit, tmp, &stt->iommu_tables, next) {
		WARN_ON(!kref_read(&stit->kref));
		while (1) {
			if (kref_put(&stit->kref, kvm_spapr_tce_liobn_put))
				break;
		}
	}
	account_locked_vm(kvm->mm,
		kvmppc_stt_pages(kvmppc_tce_pages(stt->size)), false);
	kvm_put_kvm(stt->kvm);
	call_rcu(&stt->rcu, release_spapr_tce_table);
	return 0;
}
Leaving aside the question of sanity of that while (!kref_put()) loop,
that function will *NOT* block on kvm->lock (the only lock being held
by the caller of kvm_spapr_tce_attach_iommu_group() is struct kvm_vfio::lock,
not struct kvm::lock) and it will arrange for RCU-delayed call of
release_spapr_tce_table(), which will kfree stt.
Recall that in kvm_spapr_tce_attach_iommu_group() we are not holding
rcu_read_lock() between fdput() and list_add_rcu() (we couldn't - not
with the blocking allocations we have there), so call_rcu() might as
well have been a direct call.
What's there to protect stt from being freed right after fdput()?
Unless I'm misreading that code (entirely possible), this fdput() shouldn't
be done until we are done with stt.
* [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable()
2024-06-07 21:08 ` Al Viro
2024-06-10 2:44 ` [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)) Al Viro
@ 2024-06-10 5:12 ` Al Viro
2024-06-10 17:03 ` Al Viro
2024-06-10 20:09 ` Alex Williamson
1 sibling, 2 replies; 41+ messages in thread
From: Al Viro @ 2024-06-10 5:12 UTC (permalink / raw)
To: Fei Li, Alex Williamson; +Cc: kvm, linux-fsdevel
In acrn_irqfd_assign():
	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
	...
	set it up
	...
	mutex_lock(&vm->irqfds_lock);
	list_for_each_entry(tmp, &vm->irqfds, list) {
		if (irqfd->eventfd != tmp->eventfd)
			continue;
		ret = -EBUSY;
		mutex_unlock(&vm->irqfds_lock);
		goto fail;
	}
	list_add_tail(&irqfd->list, &vm->irqfds);
	mutex_unlock(&vm->irqfds_lock);

Now irqfd is visible in vm->irqfds.

	/* Check the pending event in this stage */
	events = vfs_poll(f.file, &irqfd->pt);

	if (events & EPOLLIN)
		acrn_irqfd_inject(irqfd);

OTOH, in

static int acrn_irqfd_deassign(struct acrn_vm *vm,
			       struct acrn_irqfd *args)
{
	struct hsm_irqfd *irqfd, *tmp;
	struct eventfd_ctx *eventfd;

	eventfd = eventfd_ctx_fdget(args->fd);
	if (IS_ERR(eventfd))
		return PTR_ERR(eventfd);

	mutex_lock(&vm->irqfds_lock);
	list_for_each_entry_safe(irqfd, tmp, &vm->irqfds, list) {
		if (irqfd->eventfd == eventfd) {
			hsm_irqfd_shutdown(irqfd);

and

static void hsm_irqfd_shutdown(struct hsm_irqfd *irqfd)
{
	u64 cnt;

	lockdep_assert_held(&irqfd->vm->irqfds_lock);

	/* remove from wait queue */
	list_del_init(&irqfd->list);
	eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt);
	eventfd_ctx_put(irqfd->eventfd);
	kfree(irqfd);
}
Both acrn_irqfd_assign() and acrn_irqfd_deassign() are callable via
ioctl(2), with no serialization whatsoever.  Suppose deassign hits
as soon as we'd inserted the damn thing into the list.  By the
time we call vfs_poll() irqfd might have been freed.  The same
can happen if hsm_irqfd_wakeup() gets called with EPOLLHUP as a key
(incidentally, it ought to do
	__poll_t poll_bits = key_to_poll(key);
instead of
	unsigned long poll_bits = (unsigned long)key;
and check for EPOLLIN and EPOLLHUP instead of POLLIN and POLLHUP).

AFAICS, that's a UAF...

We could move vfs_poll() under vm->irqfds_lock, but that smells
like asking for deadlocks ;-/

vfio_virqfd_enable() has the same problem, except that there we
definitely can't move vfs_poll() under the lock - it's a spinlock.

Could we move vfs_poll() + inject to _before_ making the thing
public?  We'd need to delay POLLHUP handling there, but then
we need it until the moment we do inject anyway.  Something
like replacing
	if (!list_empty(&irqfd->list))
		hsm_irqfd_shutdown(irqfd);
in hsm_irqfd_shutdown_work() with
	if (!list_empty(&irqfd->list))
		hsm_irqfd_shutdown(irqfd);
	else
		irqfd->need_shutdown = true;
and doing
	if (unlikely(irqfd->need_shutdown))
		hsm_irqfd_shutdown(irqfd);
	else
		list_add_tail(&irqfd->list, &vm->irqfds);
when the sucker is made visible.

I'm *not* familiar with the area, though, so that might be unfeasible
for any number of reasons.

Suggestions?
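The need_shutdown idea above can be written down as a tiny state model. This
is a single-threaded userspace sketch with hypothetical names; it leaves out
all locking and the real list, and only shows the two orderings of "hangup"
against "publish" that the proposal has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_irqfd {
	bool on_list;		/* stands in for !list_empty(&irqfd->list) */
	bool need_shutdown;	/* hangup arrived before the entry was published */
	bool shut_down;		/* stands in for the teardown having been done */
};

/* Hangup handling: tear down a published entry, otherwise leave a note. */
static void model_shutdown_work(struct model_irqfd *irqfd)
{
	assert(irqfd != NULL);
	if (irqfd->on_list) {
		irqfd->on_list = false;
		irqfd->shut_down = true;
	} else {
		irqfd->need_shutdown = true;
	}
}

/* Publishing step: honour a hangup that was noted while still private. */
static void model_publish(struct model_irqfd *irqfd)
{
	assert(irqfd != NULL);
	if (irqfd->need_shutdown)
		irqfd->shut_down = true;
	else
		irqfd->on_list = true;
}
```

Either way round the entry ends up torn down exactly once, and it is never
published after a hangup has already been seen.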
* Re: [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable()
2024-06-10 5:12 ` [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable() Al Viro
@ 2024-06-10 17:03 ` Al Viro
2024-06-10 20:09 ` Alex Williamson
1 sibling, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-10 17:03 UTC (permalink / raw)
To: Fei Li, Alex Williamson; +Cc: kvm, linux-fsdevel
On Mon, Jun 10, 2024 at 06:12:06AM +0100, Al Viro wrote:
> vfio_virqfd_enable() has the same problem, except that there we
> definitely can't move vfs_poll() under the lock - it's a spinlock.
>
> Could we move vfs_poll() + inject to _before_ making the thing
> public? We'd need to delay POLLHUP handling there, but then
> we need it until the moment with do inject anyway. Something
> like replacing
> if (!list_empty(&irqfd->list))
> hsm_irqfd_shutdown(irqfd);
> in hsm_irqfd_shutdown_work() with
> if (!list_empty(&irqfd->list))
> hsm_irqfd_shutdown(irqfd);
> else
> irqfd->need_shutdown = true;
> and doing
> if (unlikely(irqfd->need_shutdown))
> hsm_irqfd_shutdown(irqfd);
> else
> list_add_tail(&irqfd->list, &vm->irqfds);
> when the sucker is made visible.
>
> I'm *not* familiar with the area, though, so that might be unfeasible
> for any number of reasons.
Hmm... OK, so we rely upon EPOLLHUP being generated only upon the final
close of eventfd file. And vfio seems to have an exclusion in all callers
of vfio_virqfd_{en,dis}able(), which ought to be enough.
For drivers/virt/acrn/irqfd.c EPOLLHUP is not a problem for the same
reasons, but there's no exclusion between acrn_irqfd_assign() and
acrn_irqfd_deassign() calls. So the scenario with explicit deassign
racing with assign and leading to vfs_poll(file, <freed memory>) is
possible.
And it looks like drivers/xen/privcmd.c:privcmd_irqfd_assign() has
a similar problem...
How about the following for acrn side of things? Does anybody see
a problem with that "do vfs_poll() before making the thing visible"
approach?
diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c
index d4ad211dce7a..71c431506a9b 100644
--- a/drivers/virt/acrn/irqfd.c
+++ b/drivers/virt/acrn/irqfd.c
@@ -133,7 +133,7 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
 	eventfd = eventfd_ctx_fileget(f.file);
 	if (IS_ERR(eventfd)) {
 		ret = PTR_ERR(eventfd);
-		goto fail;
+		goto out_file;
 	}
 	irqfd->eventfd = eventfd;
@@ -145,29 +145,26 @@ static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args)
 	init_waitqueue_func_entry(&irqfd->wait, hsm_irqfd_wakeup);
 	init_poll_funcptr(&irqfd->pt, hsm_irqfd_poll_func);
+	/* Check the pending event in this stage */
+	events = vfs_poll(f.file, &irqfd->pt);
+
+	if (events & EPOLLIN)
+		acrn_irqfd_inject(irqfd);
+
 	mutex_lock(&vm->irqfds_lock);
 	list_for_each_entry(tmp, &vm->irqfds, list) {
 		if (irqfd->eventfd != tmp->eventfd)
 			continue;
-		ret = -EBUSY;
+		hsm_irqfd_shutdown(irqfd);
 		mutex_unlock(&vm->irqfds_lock);
-		goto fail;
+		irqfd = NULL; // consumed by hsm_irqfd_shutdown()
+		ret = -EBUSY;
+		goto out_file;
 	}
 	list_add_tail(&irqfd->list, &vm->irqfds);
+	irqfd = NULL; // not for us to free...
 	mutex_unlock(&vm->irqfds_lock);
-
-	/* Check the pending event in this stage */
-	events = vfs_poll(f.file, &irqfd->pt);
-
-	if (events & EPOLLIN)
-		acrn_irqfd_inject(irqfd);
-
-	fdput(f);
-	return 0;
-fail:
-	if (eventfd && !IS_ERR(eventfd))
-		eventfd_ctx_put(eventfd);
-
+out_file:
 	fdput(f);
 out:
 	kfree(irqfd);
* Re: [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable()
2024-06-10 5:12 ` [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable() Al Viro
2024-06-10 17:03 ` Al Viro
@ 2024-06-10 20:09 ` Alex Williamson
2024-06-10 20:53 ` Al Viro
1 sibling, 1 reply; 41+ messages in thread
From: Alex Williamson @ 2024-06-10 20:09 UTC (permalink / raw)
To: Al Viro; +Cc: Fei Li, kvm, linux-fsdevel
On Mon, 10 Jun 2024 06:12:06 +0100
Al Viro <viro@zeniv.linux.org.uk> wrote:
> In acrn_irqfd_assign():
> irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
> ...
> set it up
> ...
> mutex_lock(&vm->irqfds_lock);
> list_for_each_entry(tmp, &vm->irqfds, list) {
> if (irqfd->eventfd != tmp->eventfd)
> continue;
> ret = -EBUSY;
> mutex_unlock(&vm->irqfds_lock);
> goto fail;
> }
> list_add_tail(&irqfd->list, &vm->irqfds);
> mutex_unlock(&vm->irqfds_lock);
> Now irqfd is visible in vm->irqfds.
>
> /* Check the pending event in this stage */
> events = vfs_poll(f.file, &irqfd->pt);
>
> if (events & EPOLLIN)
> acrn_irqfd_inject(irqfd);
>
> OTOH, in
>
> static int acrn_irqfd_deassign(struct acrn_vm *vm,
> struct acrn_irqfd *args)
> {
> struct hsm_irqfd *irqfd, *tmp;
> struct eventfd_ctx *eventfd;
>
> eventfd = eventfd_ctx_fdget(args->fd);
> if (IS_ERR(eventfd))
> return PTR_ERR(eventfd);
>
> mutex_lock(&vm->irqfds_lock);
> list_for_each_entry_safe(irqfd, tmp, &vm->irqfds, list) {
> if (irqfd->eventfd == eventfd) {
> hsm_irqfd_shutdown(irqfd);
>
> and
>
> static void hsm_irqfd_shutdown(struct hsm_irqfd *irqfd)
> {
> u64 cnt;
>
> lockdep_assert_held(&irqfd->vm->irqfds_lock);
>
> /* remove from wait queue */
> list_del_init(&irqfd->list);
> eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt);
> eventfd_ctx_put(irqfd->eventfd);
> kfree(irqfd);
> }
>
> Both acrn_irqfd_assign() and acrn_irqfd_deassign() are callable via
> ioctl(2), with no serialization whatsoever. Suppose deassign hits
> as soon as we'd inserted the damn thing into the list. By the
> time we call vfs_poll() irqfd might have been freed. The same
> can happen if hsm_irqfd_wakeup() gets called with EPOLLHUP as a key
> (incidentally, it ought to do
> __poll_t poll_bits = key_to_poll(key);
> instead of
> unsigned long poll_bits = (unsigned long)key;
> and check for EPOLLIN and EPOLLHUP instead of POLLIN and POLLHUP).
>
> AFAICS, that's a UAF...
>
> We could move vfs_poll() under vm->irqfds_lock, but that smells
> like asking for deadlocks ;-/
>
> vfio_virqfd_enable() has the same problem, except that there we
> definitely can't move vfs_poll() under the lock - it's a spinlock.
vfio_virqfd_enable() and vfio_virqfd_disable() are serialized by their
callers, I don't see that they have a UAF problem. Thanks,
Alex
> Could we move vfs_poll() + inject to _before_ making the thing
> public? We'd need to delay POLLHUP handling there, but then
> we need it until the moment with do inject anyway. Something
> like replacing
> if (!list_empty(&irqfd->list))
> hsm_irqfd_shutdown(irqfd);
> in hsm_irqfd_shutdown_work() with
> if (!list_empty(&irqfd->list))
> hsm_irqfd_shutdown(irqfd);
> else
> irqfd->need_shutdown = true;
> and doing
> if (unlikely(irqfd->need_shutdown))
> hsm_irqfd_shutdown(irqfd);
> else
> list_add_tail(&irqfd->list, &vm->irqfds);
> when the sucker is made visible.
>
> I'm *not* familiar with the area, though, so that might be unfeasible
> for any number of reasons.
>
> Suggestions?
>
* Re: [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable()
2024-06-10 20:09 ` Alex Williamson
@ 2024-06-10 20:53 ` Al Viro
2024-06-11 23:04 ` Alex Williamson
0 siblings, 1 reply; 41+ messages in thread
From: Al Viro @ 2024-06-10 20:53 UTC (permalink / raw)
To: Alex Williamson; +Cc: Fei Li, kvm, linux-fsdevel
On Mon, Jun 10, 2024 at 02:09:06PM -0600, Alex Williamson wrote:
> >
> > We could move vfs_poll() under vm->irqfds_lock, but that smells
> > like asking for deadlocks ;-/
> >
> > vfio_virqfd_enable() has the same problem, except that there we
> > definitely can't move vfs_poll() under the lock - it's a spinlock.
>
> vfio_virqfd_enable() and vfio_virqfd_disable() are serialized by their
> callers, I don't see that they have a UAF problem. Thanks,
>
> Alex
Umm... I agree that there's no UAF on vfio side; acrn and xen/privcmd
counterparts, OTOH, look like they do have that...
OK, so the memory safety in there depends upon
* external exclusion wrt vfio_virqfd_disable() on caller-specific
locks (vfio_pci_core_device::ioeventfds_lock for vfio_pci_rdwr.c,
vfio_pci_core_device::igate for the rest? What about the path via
vfio_pci_core_disable()?)
* no EPOLLHUP on eventfd while the file is pinned. That's what
/*
* Do not drop the file until the irqfd is fully initialized,
* otherwise we might race against the EPOLLHUP.
*/
in there (that "irqfd" is a typo for "kirqfd", right?) refers to.
* Re: [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable()
2024-06-10 20:53 ` Al Viro
@ 2024-06-11 23:04 ` Alex Williamson
2024-06-12 2:16 ` Al Viro
0 siblings, 1 reply; 41+ messages in thread
From: Alex Williamson @ 2024-06-11 23:04 UTC (permalink / raw)
To: Al Viro; +Cc: Fei Li, kvm, linux-fsdevel
On Mon, 10 Jun 2024 21:53:05 +0100
Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Mon, Jun 10, 2024 at 02:09:06PM -0600, Alex Williamson wrote:
> > >
> > > We could move vfs_poll() under vm->irqfds_lock, but that smells
> > > like asking for deadlocks ;-/
> > >
> > > vfio_virqfd_enable() has the same problem, except that there we
> > > definitely can't move vfs_poll() under the lock - it's a spinlock.
> >
> > vfio_virqfd_enable() and vfio_virqfd_disable() are serialized by their
> > callers, I don't see that they have a UAF problem. Thanks,
> >
> > Alex
>
> Umm... I agree that there's no UAF on vfio side; acrn and xen/privcmd
> counterparts, OTOH, look like they do have that...
>
> OK, so the memory safety in there depends upon
> * external exclusion wrt vfio_virqfd_disable() on caller-specific
> locks (vfio_pci_core_device::ioeventfds_lock for vfio_pci_rdwr.c,
> vfio_pci_core_device::igate for the rest? What about the path via
> vfio_pci_core_disable()?)
This is only called when the device is closed, therefore there's no
userspace access to generate a race.
> * no EPOLLHUP on eventfd while the file is pinned. That's what
> /*
> * Do not drop the file until the irqfd is fully initialized,
> * otherwise we might race against the EPOLLHUP.
> */
> in there (that "irqfd" is a typo for "kirqfd", right?) refers to.
Sorry, I'm not fully grasping your comment. "irqfd" is not a typo
here, "kirqfd" seems to be a Xen thing. I believe the comment is
referring to holding a reference to the fd until everything is in place
to clean up correctly if the user process is killed mid-setup. Thanks,
Alex
* Re: [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable()
2024-06-11 23:04 ` Alex Williamson
@ 2024-06-12 2:16 ` Al Viro
0 siblings, 0 replies; 41+ messages in thread
From: Al Viro @ 2024-06-12 2:16 UTC (permalink / raw)
To: Alex Williamson; +Cc: Fei Li, kvm, linux-fsdevel
On Tue, Jun 11, 2024 at 05:04:38PM -0600, Alex Williamson wrote:
> > OK, so the memory safety in there depends upon
> > * external exclusion wrt vfio_virqfd_disable() on caller-specific
> > locks (vfio_pci_core_device::ioeventfds_lock for vfio_pci_rdwr.c,
> > vfio_pci_core_device::igate for the rest? What about the path via
> > vfio_pci_core_disable()?)
>
> This is only called when the device is closed, therefore there's no
> userspace access to generate a race.
Umm... Let's see if I got confused in RTFS:
1. Calls of vfio_pci_core_disable() come from assorted ->close_device()
instances and failure exits in ->open_device() ones.
2. ->open_device() is called by vfio_df_device_first_open() from
vfio_df_open(). That's done under device->dev_set->lock.
3. ->close_device() is called by vfio_df_device_last_close() from
vfio_df_close(), under the same lock.
4. vfio_df_open() comes from vfio_df_ioctl_bind_iommufd() or from
vfio_df_group_open(). vfio_df_close() is done by the failure exits in those
two, as well as by vfio_df_unbind_iommufd() and vfio_df_group_close().
5. vfio_df_bind_iommufd() handles VFIO_DEVICE_BIND_IOMMUFD in
vfio_device_fops->unlocked_ioctl(); only works for !df->group and only
once, unless I'm misreading vfio_df_open(). No other ioctls are allowed
until that's done and vfio_df_unbind_iommufd() is done in ->release(),
in case of !df->group. vfio_df_close() is done there, in case we had
a successful BIND_IOMMUFD done at some point. Multiple files can be opened
for the same device; once one of them has done BIND_IOMMUFD, BIND_IOMMUFD
on any of them will fail until the first caller gets closed. Once that
happens, others can get BIND_IOMMUFD; until then ioctls don't work for
them at all (IOW, BIND_IOMMUFD grants the ability to do ioctls only for
the opened file it had been done on).
6. vfio_df_group_open() and vfio_df_close() are about the other
way to get files with such ->f_op - VFIO_GROUP_GET_DEVICE_FD handling
in vfio_group_fops.unlocked_ioctl(). That gets an anon-inode file
with vfio_device_fops and shoves it into descriptor table. Presumably
vfio_device_get_from_name() always returns a device with non-NULL
device->group (it had better, or things would get really confused).
vfio_df_group_open() is done first, with vfio_df_group_close() on failure
exit *and* in ->release() of those suckers (again, assuming we do get
non-NULL ->group).
OK, that seems to be enough - anything done in ->ioctl() would
be completed between vfio_df_open() and vfio_df_close(), so we do have
the exclusion. Is the above correct?
FWIW, this was a confusing bit:
vfio_pci_core_disable(vdev);
mutex_lock(&vdev->igate);
if (vdev->err_trigger) {
eventfd_ctx_put(vdev->err_trigger);
vdev->err_trigger = NULL;
}
if (vdev->req_trigger) {
eventfd_ctx_put(vdev->req_trigger);
vdev->req_trigger = NULL;
}
mutex_unlock(&vdev->igate);
in vfio_pci_core_close_device(). Since we need an exclusion on ->igate
there for something, and since it's one of the locks used to serialize
vfio_virqfd_enable()...
> > * no EPOLLHUP on eventfd while the file is pinned. That's what
> > /*
> > * Do not drop the file until the irqfd is fully initialized,
> > * otherwise we might race against the EPOLLHUP.
> > */
> > in there (that "irqfd" is a typo for "kirqfd", right?) refers to.
>
> Sorry, I'm not fully grasping your comment. "irqfd" is not a typo
> here, "kirqfd" seems to be a Xen thing. I believe the comment is
> referring to holding a reference to the fd until everything is in place
> to cleanup correctly if the user process is killed mid-setup. Thanks,
*blink*
s/kirqfd/virqfd/, sorry. In the comment earlier in the same function:
* virqfds can be released by closing the eventfd or directly
^^^^^^^
* through ioctl. These are both done through a workqueue, so
* we update the pointer to the virqfd under lock to avoid
^^^ ^^^^^^
* pushing multiple jobs to release the same virqfd.
^^^ ^^^^ ^^^^^^
In the second comment you have
* Do not drop the file until the irqfd is fully initialized,
^^^ ^^^^^
* otherwise we might race against the EPOLLHUP.
If that refers to the same object, the comment makes sense - once
you've called vfs_poll(), EPOLLHUP wakeup would have your new instance of
struct virqfd freed, so accessing it (e.g. in
schedule_work(&virqfd->inject);
) is only safe because EPOLLHUP won't come until eventfd_release(), which
will not happen as long as you don't drop the file reference that sits in
irqfd.file. That's the reason why struct fd instance can't be released
until you are done with setting the struct virqfd instance up.
If that reading is what you intended, "irqfd" in the second
comment ought to be "virqfd", to be consistent with the reference to the
same thing in the earlier comment. If that's not what you meant... what
is that comment really about? Killing the user process mid-setup won't
actually do anything until your thread is out of whatever syscall it
had been in (ioctl(2), usually); the dangerous scenario would be having
another thread close the same descriptor after you've done fdput().
The thing you need to avoid is having all references to
eventfd file dropped. For that the reference in descriptor table must
be gone. Sure, killing the process might do that - once all threads
get to exit_files() and drop their references to descriptor table.
Then put_files_struct() from the last of them will call close_files()
and drop all file references you had in the table.
But that can happen only when all threads have gotten through
the signal delivery. Including the one that was in the middle of
vfio_virqfd_enable(). And _that_ won't happen until return from that
function.
So having the caller killed mid-setup is not an issue. Another
thread sharing the same descriptor table and calling close(fd) (or
dup2(something, fd), or anything else that would close that descriptor)
would be. _That_ is what is prevented by the fdget()/fdput() pair -
between those we are guaranteed that file reference will stay pinned.
If descriptor table is shared, fdget() will clone the reference and store
it in irqfd.file, so the file remains pinned no matter what happens
to descriptor table, until fdput() drops the reference. If the table
is _not_ shared, the reference in it won't go away until we are done,
so we can borrow that into irqfd.file (and do nothing on fdput()).
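The borrow-versus-clone rule described above can be modelled in userspace. What follows is only a rough sketch, not the kernel implementation: every toy_* name is made up for illustration, the descriptor table holds a single slot, and plain ints stand in for atomic reference counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; every name here is hypothetical. */
struct toy_file {
	int refcount;			/* references to the open file */
};

struct toy_files {
	int users;			/* tasks sharing this descriptor table */
	struct toy_file *slot;		/* a single descriptor, for brevity */
};

struct toy_fd {
	struct toy_file *file;
	bool need_put;			/* true if we took our own reference */
};

/* Shared table: clone the reference. Unshared table: borrow it. */
static struct toy_fd toy_fdget(struct toy_files *files)
{
	struct toy_fd f = { .file = files->slot, .need_put = false };

	if (f.file && files->users > 1) {
		f.file->refcount++;
		f.need_put = true;
	}
	return f;
}

/* Drop the reference only if toy_fdget() actually took one. */
static void toy_fdput(struct toy_fd f)
{
	if (f.need_put)
		f.file->refcount--;
}

/* close(fd) from any sharer: the table gives up its reference. */
static void toy_close(struct toy_files *files)
{
	if (files->slot) {
		files->slot->refcount--;
		files->slot = NULL;
	}
}

/* Shared table: a close() between fdget and fdput must not free the file. */
int toy_shared_close_race(void)
{
	struct toy_file file = { .refcount = 1 };
	struct toy_files files = { .users = 2, .slot = &file };
	struct toy_fd f = toy_fdget(&files);
	int pinned;

	toy_close(&files);		/* "another thread" closes the fd */
	pinned = file.refcount;		/* our cloned reference is still there */
	toy_fdput(f);
	assert(file.refcount == 0);	/* last reference gone only now */
	return pinned;
}

/* Unshared table: nobody else can close, so the reference is borrowed. */
int toy_unshared_borrow(void)
{
	struct toy_file file = { .refcount = 1 };
	struct toy_files files = { .users = 1, .slot = &file };
	struct toy_fd f = toy_fdget(&files);
	int took_ref = f.need_put;

	toy_fdput(f);			/* no-op in the borrowed case */
	assert(file.refcount == 1);
	return took_ref;
}
```

In this model toy_shared_close_race() sees one remaining reference after the close, which is the pinning guarantee discussed above, while toy_unshared_borrow() never touches the count at all.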
Anyway, I believe that what you have there is actually safe.
Analysis could be less convoluted, but then I might've missed simpler
reasons why everything works.
It really needs comments in there - as it is, two drivers have
copied that scheme without bothering with exclusion (commit f8941e6c4c71
"xen: privcmd: Add support for irqfd" last year and commit aa3b483ff1d7
"virt: acrn: Introduce irqfd" three years ago) with, AFAICT, real UAF
in each ;-/
* Re: [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...))
2024-06-10 2:44 ` [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)) Al Viro
@ 2024-06-12 16:36 ` Linus Torvalds
2024-06-13 10:56 ` Michael Ellerman
0 siblings, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2024-06-12 16:36 UTC (permalink / raw)
To: Al Viro
Cc: Michael Ellerman, linux-fsdevel, Christian Brauner,
Alexey Kardashevskiy, Paul Mackerras
On Sun, 9 Jun 2024 at 19:45, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Unless I'm misreading that code (entirely possible), this fdput() shouldn't
> be done until we are done with stt.
Ack. That looks right to me.
If I follow it right, the lifetime of stt is tied to the lifetime of
the file (plus RCU), so doing fdput early and then dropping the RCU
lock means that stt may not be valid any more later.
Making it use the auto-release of a fd class sounds like a good fix,
but I don't know this code.
Linus
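For readers unfamiliar with the auto-release idea mentioned above: the kernel's CLASS() machinery is built on the compiler's cleanup attribute, and the effect can be sketched in userspace roughly as follows. This is an illustration under assumptions, not the kernel macro: TOY_CLASS_FD and the toy_* helpers are hypothetical names, and it needs GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a file whose private data we want to use. */
struct toy_file {
	int refcount;
	int payload;		/* data whose lifetime is tied to the file */
};

static int toy_puts;		/* counts automatic releases, for the demo */

static struct toy_file *toy_get(struct toy_file *file)
{
	file->refcount++;
	return file;
}

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void toy_put(struct toy_file **fp)
{
	if (*fp) {
		(*fp)->refcount--;
		toy_puts++;
	}
}

/* Declare a reference that is dropped when it goes out of scope. */
#define TOY_CLASS_FD(name, file) \
	struct toy_file *name __attribute__((cleanup(toy_put))) = toy_get(file)

static int toy_use(struct toy_file *file, int fail_early)
{
	TOY_CLASS_FD(f, file);

	if (fail_early)
		return -1;	/* reference dropped on this path too */

	return f->payload;	/* read while still pinned; put runs after */
}

/* Exercise both exits; returns the number of automatic releases. */
int toy_class_demo(void)
{
	struct toy_file file = { .refcount = 1, .payload = 42 };

	toy_puts = 0;
	assert(toy_use(&file, 0) == 42);
	assert(file.refcount == 1);
	assert(toy_use(&file, 1) == -1);
	assert(file.refcount == 1);
	return toy_puts;
}
```

The point of the pattern is that the release cannot be placed too early by hand: it happens at scope exit, after the last use of anything reached through the reference, on every return path.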
* Re: [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...))
2024-06-12 16:36 ` Linus Torvalds
@ 2024-06-13 10:56 ` Michael Ellerman
0 siblings, 0 replies; 41+ messages in thread
From: Michael Ellerman @ 2024-06-13 10:56 UTC (permalink / raw)
To: Linus Torvalds, Al Viro
Cc: linux-fsdevel, Christian Brauner, Alexey Kardashevskiy,
Paul Mackerras, linuxppc-dev
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Sun, 9 Jun 2024 at 19:45, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> Unless I'm misreading that code (entirely possible), this fdput() shouldn't
>> be done until we are done with stt.
>
> Ack. That looks right to me.
>
> If I follow it right, the lifetime of stt is tied to the lifetime of
> the file (plus RCU), so doing fdput early and then dropping the RCU
> lock means that stt may not be valid any more later.
Yep. I added a sleep after the fdput and was able to get KASAN to catch
it (below).
I'll send a fix patch tomorrow, just using fdput(), and then the CLASS
conversion can go on top later.
cheers
==================================================================
BUG: KASAN: slab-use-after-free in kvm_spapr_tce_attach_iommu_group+0x298/0x720 [kvm]
Read of size 4 at addr c000200027552c30 by task kvm-vfio/2505
CPU: 54 PID: 2505 Comm: kvm-vfio Not tainted 6.10.0-rc3-next-20240612-dirty #1
Hardware name: 8335-GTH POWER9 0x4e1202 opal:skiboot-v6.5.3-35-g1851b2a06 PowerNV
Call Trace:
[c00020008c2a7860] [c0000000027d4d50] dump_stack_lvl+0xb4/0x108 (unreliable)
[c00020008c2a78a0] [c00000000072dfa8] print_report+0x2b4/0x6ec
[c00020008c2a7990] [c00000000072d898] kasan_report+0x118/0x2b0
[c00020008c2a7aa0] [c00000000072ff38] __asan_load4+0xb8/0xd0
[c00020008c2a7ac0] [c00800001b343140] kvm_spapr_tce_attach_iommu_group+0x298/0x720 [kvm]
[c00020008c2a7b90] [c00800001b31d61c] kvm_vfio_set_attr+0x524/0xac0 [kvm]
[c00020008c2a7c60] [c00800001b3083ec] kvm_device_ioctl+0x144/0x240 [kvm]
[c00020008c2a7cd0] [c0000000007e052c] sys_ioctl+0x62c/0x1810
[c00020008c2a7df0] [c000000000038d90] system_call_exception+0x190/0x440
[c00020008c2a7e50] [c00000000000d15c] system_call_vectored_common+0x15c/0x2ec
--- interrupt: 3000 at 0x7fff8af5bedc
NIP: 00007fff8af5bedc LR: 00007fff8af5bedc CTR: 0000000000000000
REGS: c00020008c2a7e80 TRAP: 3000 Not tainted (6.10.0-rc3-next-20240612-dirty)
MSR: 900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44002482 XER: 00000000
IRQMASK: 0
GPR00: 0000000000000036 00007fffda53b1f0 00007fff8b066d00 0000000000000006
GPR04: 000000008018aee1 00007fffda53b270 0000000000000008 00007fff8ac0e9e0
GPR08: 0000000000000006 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00007fff8b2ca540 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 00000000100101c0
GPR24: 00007fff8b2bf840 00007fff8b2c0000 00007fffda53b728 0000000000000001
GPR28: 00007fffda53b838 0000000000000006 0000000000000001 0000000000000005
NIP [00007fff8af5bedc] 0x7fff8af5bedc
LR [00007fff8af5bedc] 0x7fff8af5bedc
--- interrupt: 3000
Allocated by task 2505:
kasan_save_stack+0x48/0x80
kasan_save_track+0x2c/0x50
kasan_save_alloc_info+0x44/0x60
__kasan_kmalloc+0xd0/0x120
__kmalloc_noprof+0x214/0x670
kvm_vm_ioctl_create_spapr_tce+0x10c/0x420 [kvm]
kvm_arch_vm_ioctl+0x5fc/0x890 [kvm]
kvm_vm_ioctl+0xa54/0x13d0 [kvm]
sys_ioctl+0x62c/0x1810
system_call_exception+0x190/0x440
system_call_vectored_common+0x15c/0x2ec
Freed by task 0:
kasan_save_stack+0x48/0x80
kasan_save_track+0x2c/0x50
kasan_save_free_info+0xac/0xd0
__kasan_slab_free+0x120/0x210
kfree+0xec/0x3e0
release_spapr_tce_table+0xd4/0x11c [kvm]
rcu_core+0x568/0x16a0
handle_softirqs+0x23c/0x920
do_softirq_own_stack+0x6c/0x90
do_softirq_own_stack+0x58/0x90
__irq_exit_rcu+0x218/0x2d0
irq_exit+0x30/0x80
arch_local_irq_restore+0x128/0x230
arch_local_irq_enable+0x1c/0x30
cpuidle_enter_state+0x134/0x5cc
cpuidle_enter+0x6c/0xb0
call_cpuidle+0x7c/0x100
do_idle+0x394/0x410
cpu_startup_entry+0x60/0x70
start_secondary+0x3fc/0x410
start_secondary_prolog+0x10/0x14
Last potentially related work creation:
kasan_save_stack+0x48/0x80
__kasan_record_aux_stack+0xcc/0x130
__call_rcu_common.constprop.0+0x8c/0x8e0
kvm_spapr_tce_release+0x29c/0xbc10 [kvm]
__fput+0x22c/0x630
sys_close+0x70/0xe0
system_call_exception+0x190/0x440
system_call_vectored_common+0x15c/0x2ec
The buggy address belongs to the object at c000200027552c00
which belongs to the cache kmalloc-256 of size 256
The buggy address is located 48 bytes inside of
freed 256-byte region [c000200027552c00, c000200027552d00)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xc000200027551800 pfn:0x20002755
flags: 0x83ffff800000000(node=8|zone=0|lastcpupid=0x7ffff)
page_type: 0xfdffffff(slab)
raw: 083ffff800000000 c000000007010d80 5deadbeef0000122 0000000000000000
raw: c000200027551800 0000000080800078 00000001fdffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
c000200027552b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
c000200027552b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>c000200027552c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
c000200027552c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
c000200027552d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
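The ordering problem behind the report above can be modelled deterministically in userspace. This is a sketch under assumptions: the toy_* names are invented, a "freed" flag replaces the real kfree() so the access can be observed safely, and the RCU grace period seen in the report is collapsed into an immediate free.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_table { bool freed; };	/* stands in for stt */

struct toy_file {
	int refcount;
	struct toy_table *priv;		/* table lives as long as the file */
};

/* Dropping the last reference releases the table (no RCU in this model). */
static void toy_fput(struct toy_file *file)
{
	if (--file->refcount == 0)
		file->priv->freed = true;
}

/* Returns true if the table was used after it had been freed. */
bool toy_attach(bool put_early)
{
	struct toy_table table = { .freed = false };
	struct toy_file file = { .refcount = 1, .priv = &table };
	struct toy_table *stt;
	bool used_after_free;

	file.refcount++;		/* models fdget() on a shared table */
	stt = file.priv;
	if (put_early)
		toy_fput(&file);	/* the buggy ordering */

	toy_fput(&file);		/* another thread closes the descriptor */

	used_after_free = stt->freed;	/* the later "use" of stt */

	if (!put_early)
		toy_fput(&file);	/* correct: put only when done with stt */
	return used_after_free;
}
```

With put_early set, the close in the middle drops the last reference and the later access hits a freed table; with the put moved to the end, the same close is harmless.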
Thread overview: 41+ messages
2024-06-07 1:56 [PATCHES][RFC] rework of struct fd handling Al Viro
2024-06-07 1:59 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Al Viro
2024-06-07 1:59 ` [PATCH 02/19] lirc: rc_dev_get_from_fd(): fix file leak Al Viro
2024-06-07 15:17 ` Christian Brauner
2024-06-07 1:59 ` [PATCH 03/19] introduce fd_file(), convert all accessors to it Al Viro
2024-06-07 1:59 ` [PATCH 04/19] struct fd: representation change Al Viro
2024-06-07 5:55 ` Amir Goldstein
2024-06-07 1:59 ` [PATCH 05/19] add struct fd constructors, get rid of __to_fd() Al Viro
2024-06-07 1:59 ` [PATCH 06/19] net/socket.c: switch to CLASS(fd) Al Viro
2024-06-07 1:59 ` [PATCH 07/19] introduce struct fderr, convert overlayfs uses to that Al Viro
2024-06-07 1:59 ` [PATCH 08/19] fdget_raw() users: switch to CLASS(fd_raw, ...) Al Viro
2024-06-07 15:20 ` Christian Brauner
2024-06-07 1:59 ` [PATCH 09/19] css_set_fork(): " Al Viro
2024-06-07 15:21 ` Christian Brauner
2024-06-07 1:59 ` [PATCH 10/19] introduce "fd_pos" class Al Viro
2024-06-07 15:21 ` Christian Brauner
2024-06-07 1:59 ` [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...) Al Viro
2024-06-07 15:26 ` Christian Brauner
2024-06-07 16:10 ` Al Viro
2024-06-07 16:11 ` Al Viro
2024-06-07 21:08 ` Al Viro
2024-06-10 2:44 ` [RFC] potential UAF in kvm_spapr_tce_attach_iommu_group() (was Re: [PATCH 11/19] switch simple users of fdget() to CLASS(fd, ...)) Al Viro
2024-06-12 16:36 ` Linus Torvalds
2024-06-13 10:56 ` Michael Ellerman
2024-06-10 5:12 ` [RFC] UAF in acrn_irqfd_assign() and vfio_virqfd_enable() Al Viro
2024-06-10 17:03 ` Al Viro
2024-06-10 20:09 ` Alex Williamson
2024-06-10 20:53 ` Al Viro
2024-06-11 23:04 ` Alex Williamson
2024-06-12 2:16 ` Al Viro
2024-06-07 1:59 ` [PATCH 12/19] bpf: switch to CLASS(fd, ...) Al Viro
2024-06-07 15:27 ` Christian Brauner
2024-06-07 1:59 ` [PATCH 13/19] convert vmsplice() " Al Viro
2024-06-07 1:59 ` [PATCH 14/19] finit_module(): convert " Al Viro
2024-06-07 1:59 ` [PATCH 15/19] timerfd: switch " Al Viro
2024-06-07 1:59 ` [PATCH 16/19] do_mq_notify(): " Al Viro
2024-06-07 1:59 ` [PATCH 17/19] simplify xfs_find_handle() a bit Al Viro
2024-06-07 1:59 ` [PATCH 18/19] convert kernel/events/core.c Al Viro
2024-06-07 1:59 ` [PATCH 19/19] deal with the last remaing boolean uses of fd_file() Al Viro
2024-06-07 15:16 ` [PATCH 01/19] powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap() Christian Brauner
2024-06-07 15:30 ` [PATCHES][RFC] rework of struct fd handling Christian Brauner