From: Aleksa Sarai <cyphar@cyphar.com>
To: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Arnd Bergmann <arnd@arndb.de>, Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <kees@kernel.org>, Florian Weimer <fweimer@redhat.com>,
Arnd Bergmann <arnd@arndb.de>,
Mark Rutland <mark.rutland@arm.com>,
linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-kselftest@vger.kernel.org,
Aleksa Sarai <cyphar@cyphar.com>
Subject: [PATCH RFC 4/8] openat2: add CHECK_FIELDS flag to usize argument
Date: Mon, 02 Sep 2024 17:06:26 +1000
Message-ID: <20240902-extensible-structs-check_fields-v1-4-545e93ede2f2@cyphar.com>
In-Reply-To: <20240902-extensible-structs-check_fields-v1-0-545e93ede2f2@cyphar.com>

In order for userspace to know which flags and fields the kernel
supports, programs currently have to do a series of fairly subtle
self-checks: for each flag they wish to use, they need to get a syscall
to return a non-EINVAL error or perform a no-op. If the syscall returns
-EINVAL the flag is unsupported, otherwise it is supported.

This doesn't scale well for programs that need to check many flags, and
not all syscalls can be easily checked (how would you check for new
flags for umount2 or clone3 without side-effects?). To solve this
problem, we can take advantage of the extensible struct API used by
copy_struct_from_user() by providing a special CHECK_FIELDS flag to
extensible struct syscalls (like openat2 and clone3) which will:

 1. Cause the syscall to fill the structure with every valid bit the
    kernel understands. For flag arguments, this is the set of all valid
    flag bits. For pointer and file descriptor arguments, this would be
    all 0xFF bits (to indicate that any bits are valid). Userspace can
    then easily check whether the flag they wanted is supported (by
    doing a simple bitwise AND) or if a field itself is supported (by
    checking if it is non-zero / all 0xFF).

 2. Return a specific no-op error (-EEXTSYS_NOOP) that is not used as an
    error by any other kernel code, so that userspace can be absolutely
    sure that the kernel supports CHECK_FIELDS.

Rather than being passed through the standard flags argument of the
syscall, CHECK_FIELDS is the highest bit of the provided struct size.
The high bits of the size will never legitimately be set (we currently
only allow size to be up to PAGE_SIZE, and it seems very unlikely we
will ever allow several-exabyte structure arguments).

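As a rough illustration of the size encoding described above, here is a
minimal sketch. The helper names (check_fields_size and friends) are
hypothetical and not part of this patch, and the sketch assumes the size
argument is 64 bits wide so that bit 63 fits:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the definition added by this patch, repeated so the sketch
 * is self-contained. */
#define CHECK_FIELDS (1ULL << 63)

/* Hypothetical helper: tag a struct size with the CHECK_FIELDS bit. */
static inline uint64_t check_fields_size(size_t size)
{
	/* A real struct size never has bit 63 set. */
	assert(((uint64_t)size & CHECK_FIELDS) == 0);
	return CHECK_FIELDS | (uint64_t)size;
}

/* Hypothetical helper: was CHECK_FIELDS requested in this size? */
static inline bool size_has_check_fields(uint64_t usize)
{
	return (usize & CHECK_FIELDS) != 0;
}

/* Hypothetical helper: recover the plain struct size. */
static inline uint64_t size_without_check_fields(uint64_t usize)
{
	return usize & ~CHECK_FIELDS;
}
```

Because the flag lives in a bit that no valid size can use, the plain
size is recovered by masking, with no need for a separate argument.
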
By passing the flag in the structure size, we can be sure that old
kernels will return a consistent error code (-EFAULT in openat2's case)
and that seccomp can properly filter this syscall mode (which is
guaranteed to be a no-op on all kernels -- a seccomp filter could even
force -EEXTSYS_NOOP to make the userspace program think the kernel
doesn't support any syscall features).

The intended way of using this interface to get feature information
looks something like the following (imagine that openat2 has gained a
new field and a new flag in the future):

  static bool openat2_no_automount_supported;
  static bool openat2_cwd_fd_supported;

  int check_openat2_support(void)
  {
      int err;
      struct open_how how = {};

      err = openat2(AT_FDCWD, ".", &how, CHECK_FIELDS | sizeof(how));
      assert(err < 0);
      switch (errno) {
      case EFAULT: case E2BIG:
          /* Old kernel... */
          check_support_the_old_way();
          break;
      case EEXTSYS_NOOP:
          /* RESOLVE_* flags are reported in how.resolve. */
          openat2_no_automount_supported = (how.resolve & RESOLVE_NO_AUTOMOUNT);
          openat2_cwd_fd_supported = (how.cwd_fd != 0);
          break;
      default:
          /* Unexpected error. */
          return -errno;
      }
      return 0;
  }

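For programs that need many flags at once, the reported mask can be
compared against the whole wanted set in a single step. A minimal
sketch, where unsupported_bits() and field_supported() are hypothetical
helper names and not part of this patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return the subset of `wanted` bits that the
 * kernel did not report as valid in a CHECK_FIELDS reply. */
static inline uint64_t unsupported_bits(uint64_t reported, uint64_t wanted)
{
	return wanted & ~reported;
}

/* Hypothetical helper: a field counts as supported if the kernel wrote
 * any non-zero bit into it (all 0xFF for pointer and fd fields). */
static inline bool field_supported(uint64_t reported_field)
{
	return reported_field != 0;
}
```

A zero result from unsupported_bits() means every wanted flag is
available. A non-zero result names exactly the bits to fall back on.
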
Link: https://youtu.be/ggD-eb3yPVs
Link: https://lwn.net/Articles/830666/
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
fs/open.c | 15 +++++++++++++++
include/uapi/asm-generic/errno.h | 3 +++
include/uapi/linux/openat2.h | 2 ++
3 files changed, 20 insertions(+)
diff --git a/fs/open.c b/fs/open.c
index 30bfcddd505d..10bfc8d6555c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1458,6 +1458,21 @@ SYSCALL_DEFINE4(openat2, int, dfd, const char __user *, filename,
 
 	if (unlikely(usize < OPEN_HOW_SIZE_VER0))
 		return -EINVAL;
+
+	if (unlikely(usize & CHECK_FIELDS)) {
+		usize &= ~CHECK_FIELDS;
+
+		memset(&tmp, 0, sizeof(tmp));
+		tmp = (struct open_how) {
+			.flags = VALID_OPEN_FLAGS,
+			.mode = S_IALLUGO,
+			.resolve = VALID_RESOLVE_FLAGS,
+		};
+
+		err = copy_struct_to_user(how, usize, &tmp, sizeof(tmp), NULL);
+		return err ?: -EEXTSYS_NOOP;
+	}
+
 	if (unlikely(usize > PAGE_SIZE))
 		return -E2BIG;
 
diff --git a/include/uapi/asm-generic/errno.h b/include/uapi/asm-generic/errno.h
index cf9c51ac49f9..f5bfe081e73a 100644
--- a/include/uapi/asm-generic/errno.h
+++ b/include/uapi/asm-generic/errno.h
@@ -120,4 +120,7 @@
#define EHWPOISON 133 /* Memory page has hardware error */
+/* For extensible syscalls. */
+#define EEXTSYS_NOOP 134 /* Extensible syscall performed no operation */
+
#endif
diff --git a/include/uapi/linux/openat2.h b/include/uapi/linux/openat2.h
index a5feb7604948..6052a504cfa4 100644
--- a/include/uapi/linux/openat2.h
+++ b/include/uapi/linux/openat2.h
@@ -4,6 +4,8 @@
#include <linux/types.h>
+#define CHECK_FIELDS (1ULL << 63)
+
/*
* Arguments for how openat2(2) should open the target path. If only @flags and
* @mode are non-zero, then openat2(2) operates very similarly to openat(2).
--
2.46.0
Thread overview: 14+ messages
2024-09-02 7:06 [PATCH RFC 0/8] extensible syscalls: CHECK_FIELDS to allow for easier feature detection Aleksa Sarai
2024-09-02 7:06 ` [PATCH RFC 1/8] uaccess: add copy_struct_to_user helper Aleksa Sarai
2024-09-02 8:55 ` Arnd Bergmann
2024-09-02 16:02 ` Aleksa Sarai
2024-09-02 7:06 ` [PATCH RFC 2/8] sched_getattr: port to copy_struct_to_user Aleksa Sarai
2024-09-02 7:06 ` [PATCH RFC 3/8] openat2: explicitly return -E2BIG for (usize > PAGE_SIZE) Aleksa Sarai
2024-09-02 9:09 ` Arnd Bergmann
2024-09-02 16:08 ` Aleksa Sarai
2024-09-02 19:23 ` Arnd Bergmann
2024-09-02 7:06 ` Aleksa Sarai [this message]
2024-09-02 7:06 ` [PATCH RFC 5/8] clone3: add CHECK_FIELDS flag to usize argument Aleksa Sarai
2024-09-02 7:06 ` [PATCH RFC 6/8] selftests: openat2: add 0xFF poisoned data after misaligned struct Aleksa Sarai
2024-09-02 7:06 ` [PATCH RFC 7/8] selftests: openat2: add CHECK_FIELDS selftests Aleksa Sarai
2024-09-02 7:06 ` [PATCH RFC 8/8] selftests: clone3: " Aleksa Sarai