* Re: [PATCH RESEND v3 1/4] lib: introduce copy_struct_from_user() helper
From: Kees Cook @ 2019-09-30 23:37 UTC (permalink / raw)
To: Aleksa Sarai
Cc: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Aleksa Sarai, Rasmus Villemoes,
Al Viro, Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <20190930191526.19544-2-asarai@suse.de>
On Tue, Oct 01, 2019 at 05:15:23AM +1000, Aleksa Sarai wrote:
> From: Aleksa Sarai <cyphar@cyphar.com>
>
> A common pattern for syscall extensions is increasing the size of a
> struct passed from userspace, such that the zero-value of the new fields
> result in the old kernel behaviour (allowing for a mix of userspace and
> kernel vintages to operate on one another in most cases).
>
> While this interface exists for communication in both directions, only
> one interface is straightforward to have reasonable semantics for
> (userspace passing a struct to the kernel). For kernel returns to
> userspace, what the correct semantics are (whether there should be an
> error if userspace is unaware of a new extension) is very
> syscall-dependent and thus probably cannot be unified between syscalls
> (a good example of this problem is [1]).
>
> Previously there was no common lib/ function that implemented
> the necessary extension-checking semantics (and different syscalls
> implemented them slightly differently or incompletely[2]). Future
> patches replace common uses of this pattern to make use of
> copy_struct_from_user().
>
> Some in-kernel selftests that insure that the handling of alignment and
> various byte patterns are all handled identically to memchr_inv() usage.
>
> [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> robustify sched_read_attr() ABI logic and code")
>
> [2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
> similar checks to copy_struct_from_user() while rt_sigprocmask(2)
> always rejects differently-sized struct arguments.
>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
> include/linux/bitops.h | 7 +++
> include/linux/uaccess.h | 4 ++
> lib/strnlen_user.c | 8 +--
> lib/test_user_copy.c | 133 ++++++++++++++++++++++++++++++++++++++--
> lib/usercopy.c | 123 +++++++++++++++++++++++++++++++++++++
> 5 files changed, 262 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index cf074bce3eb3..c94a9ff9f082 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -4,6 +4,13 @@
> #include <asm/types.h>
> #include <linux/bits.h>
>
> +/* Set bits in the first 'n' bytes when loaded from memory */
> +#ifdef __LITTLE_ENDIAN
> +# define aligned_byte_mask(n) ((1UL << 8*(n))-1)
> +#else
> +# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
> +#endif
> +
> #define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
>
> diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> index 70bbdc38dc37..94f20e6ec6ab 100644
> --- a/include/linux/uaccess.h
> +++ b/include/linux/uaccess.h
> @@ -231,6 +231,10 @@ __copy_from_user_inatomic_nocache(void *to, const void __user *from,
>
> #endif /* ARCH_HAS_NOCACHE_UACCESS */
>
> +extern int check_zeroed_user(const void __user *from, size_t size);
> +extern int copy_struct_from_user(void *dst, size_t ksize,
> + const void __user *src, size_t usize);
> +
> /*
> * probe_kernel_read(): safely attempt to read from a location
> * @dst: pointer to the buffer that shall take the data
> diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
> index 28ff554a1be8..6c0005d5dd5c 100644
> --- a/lib/strnlen_user.c
> +++ b/lib/strnlen_user.c
> @@ -3,16 +3,10 @@
> #include <linux/export.h>
> #include <linux/uaccess.h>
> #include <linux/mm.h>
> +#include <linux/bitops.h>
>
> #include <asm/word-at-a-time.h>
>
> -/* Set bits in the first 'n' bytes when loaded from memory */
> -#ifdef __LITTLE_ENDIAN
> -# define aligned_byte_mask(n) ((1ul << 8*(n))-1)
> -#else
> -# define aligned_byte_mask(n) (~0xfful << (BITS_PER_LONG - 8 - 8*(n)))
> -#endif
> -
> /*
> * Do a strnlen, return length of string *with* final '\0'.
> * 'count' is the user-supplied count, while 'max' is the
> diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> index 67bcd5dfd847..3a17f71029bb 100644
> --- a/lib/test_user_copy.c
> +++ b/lib/test_user_copy.c
> @@ -16,6 +16,7 @@
> #include <linux/slab.h>
> #include <linux/uaccess.h>
> #include <linux/vmalloc.h>
> +#include <linux/random.h>
>
> /*
> * Several 32-bit architectures support 64-bit {get,put}_user() calls.
> @@ -31,14 +32,129 @@
> # define TEST_U64
> #endif
>
> -#define test(condition, msg) \
> -({ \
> - int cond = (condition); \
> - if (cond) \
> - pr_warn("%s\n", msg); \
> - cond; \
> +#define test(condition, msg, ...) \
> +({ \
> + int cond = (condition); \
> + if (cond) \
> + pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \
> + cond; \
> })
>
> +static bool is_zeroed(void *from, size_t size)
> +{
> + return memchr_inv(from, 0x0, size) == NULL;
> +}
> +
> +static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> +{
> + int ret = 0;
> + size_t start, end, i;
> + size_t zero_start = size / 4;
> + size_t zero_end = size - zero_start;
> +
> + /*
> + * We conduct a series of check_nonzero_user() tests on a block of memory
> + * with the following byte-pattern (trying every possible [start,end]
> + * pair):
> + *
> + * [ 00 ff 00 ff ... 00 00 00 00 ... ff 00 ff 00 ]
> + *
> + * And we verify that check_nonzero_user() acts identically to memchr_inv().
> + */
> +
> + memset(kmem, 0x0, size);
> + for (i = 1; i < zero_start; i += 2)
> + kmem[i] = 0xff;
> + for (i = zero_end; i < size; i += 2)
> + kmem[i] = 0xff;
> +
> + ret |= test(copy_to_user(umem, kmem, size),
> + "legitimate copy_to_user failed");
> +
> + for (start = 0; start <= size; start++) {
> + for (end = start; end <= size; end++) {
> + size_t len = end - start;
> + int retval = check_zeroed_user(umem + start, len);
> + int expected = is_zeroed(kmem + start, len);
> +
> + ret |= test(retval != expected,
> + "check_nonzero_user(=%d) != memchr_inv(=%d) mismatch (start=%zu, end=%zu)",
> + retval, expected, start, end);
> + }
> + }
> +
> + return ret;
> +}
> +
> +static int test_copy_struct_from_user(char *kmem, char __user *umem,
> + size_t size)
> +{
> + int ret = 0;
> + char *rand = NULL, *expected = NULL;
> + size_t ksize, usize;
> +
> + rand = kmalloc(size, GFP_KERNEL);
> + if (ret |= test(rand == NULL, "kmalloc failed"))
> + goto out_free;
> +
> + expected = kmalloc(size, GFP_KERNEL);
> + if (ret |= test(expected == NULL, "kmalloc failed"))
> + goto out_free;
> +
> + /* Fill umem with random bytes. */
> + memset(kmem, 0x0, size);
> + prandom_bytes(rand, size);
I don't really like using random() in tests on the chance that we get
failures we can't reproduced. If you want to do this (instead of using a
byte-fill pattern), you need to dump the entire state of the memory
region. Why not just memset(rand, 0xff, ...)? (And obviously rename
"rand")
> + ret |= test(copy_to_user(umem, rand, size),
> + "legitimate copy_to_user failed");
> +
> + /* Check basic case -- (usize == ksize). */
> + ksize = size;
> + usize = size;
I'd move the memset(kmem, 0x0, size); down to here.
> + memcpy(expected, rand, ksize);
> +
> + ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
> + "copy_struct_from_user(usize == ksize) failed");
> + ret |= test(memcmp(kmem, expected, ksize),
> + "copy_struct_from_user(usize == ksize) gives unexpected copy");
> +
> + /* Old userspace case -- (usize < ksize). */
> + ksize = size;
> + usize = ksize / 2;
> +
I would expect memset(kmem, 0x0, size); again here since a new test of
that region is starting.
> + memcpy(expected, rand, usize);
> + memset(expected + usize, 0x0, ksize - usize);
> +
> + ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
> + "copy_struct_from_user(usize < ksize) failed");
> + ret |= test(memcmp(kmem, expected, ksize),
> + "copy_struct_from_user(usize < ksize) gives unexpected copy");
> +
> + /* New userspace (-E2BIG) case -- (usize > ksize). */
> + usize = size;
> + ksize = usize / 2;
and here?
> +
> + ret |= test(copy_struct_from_user(kmem, ksize, umem, usize) != -E2BIG,
> + "copy_struct_from_user(usize > ksize) didn't give E2BIG");
> +
> + /* New userspace (success) case -- (usize > ksize). */
> + usize = size;
> + ksize = usize / 2;
> +
and here?
> + memcpy(expected, rand, ksize);
> +
> + ret |= test(clear_user(umem + ksize, usize - ksize),
> + "legitimate clear_user failed");
> + ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
> + "copy_struct_from_user(usize > ksize) failed");
> + ret |= test(memcmp(kmem, expected, ksize),
> + "copy_struct_from_user(usize > ksize) gives unexpected copy");
> +
> +out_free:
> + kfree(expected);
> + kfree(rand);
> + return ret;
> +}
> +
> static int __init test_user_copy_init(void)
> {
> int ret = 0;
> @@ -106,6 +222,11 @@ static int __init test_user_copy_init(void)
> #endif
> #undef test_legit
>
> + /* Test usage of check_nonzero_user(). */
> + ret |= test_check_nonzero_user(kmem, usermem, 2 * PAGE_SIZE);
> + /* Test usage of copy_struct_from_user(). */
> + ret |= test_copy_struct_from_user(kmem, usermem, 2 * PAGE_SIZE);
> +
> /*
> * Invalid usage: none of these copies should succeed.
> */
> diff --git a/lib/usercopy.c b/lib/usercopy.c
> index c2bfbcaeb3dc..cf7f854ed9c8 100644
> --- a/lib/usercopy.c
> +++ b/lib/usercopy.c
> @@ -1,5 +1,6 @@
> // SPDX-License-Identifier: GPL-2.0
> #include <linux/uaccess.h>
> +#include <linux/bitops.h>
>
> /* out-of-line parts */
>
> @@ -31,3 +32,125 @@ unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
> }
> EXPORT_SYMBOL(_copy_to_user);
> #endif
> +
> +/**
> + * check_zeroed_user: check if a userspace buffer only contains zero bytes
> + * @from: Source address, in userspace.
> + * @size: Size of buffer.
> + *
> + * This is effectively shorthand for "memchr_inv(from, 0, size) == NULL" for
> + * userspace addresses (and is more efficient because we don't care where the
> + * first non-zero byte is).
> + *
> + * Returns:
> + * * 0: There were non-zero bytes present in the buffer.
> + * * 1: The buffer was full of zero bytes.
> + * * -EFAULT: access to userspace failed.
> + */
> +int check_zeroed_user(const void __user *from, size_t size)
> +{
> + unsigned long val;
> + uintptr_t align = (uintptr_t) from % sizeof(unsigned long);
> +
> + if (unlikely(size == 0))
> + return 1;
> +
> + from -= align;
> + size += align;
> +
> + if (!user_access_begin(from, size))
> + return -EFAULT;
> +
> + unsafe_get_user(val, (unsigned long __user *) from, err_fault);
> + if (align)
> + val &= ~aligned_byte_mask(align);
> +
> + while (size > sizeof(unsigned long)) {
> + if (unlikely(val))
> + goto done;
> +
> + from += sizeof(unsigned long);
> + size -= sizeof(unsigned long);
> +
> + unsafe_get_user(val, (unsigned long __user *) from, err_fault);
> + }
> +
> + if (size < sizeof(unsigned long))
> + val &= aligned_byte_mask(size);
> +
> +done:
> + user_access_end();
> + return (val == 0);
> +err_fault:
> + user_access_end();
> + return -EFAULT;
> +}
> +EXPORT_SYMBOL(check_zeroed_user);
> +
> +/**
> + * copy_struct_from_user: copy a struct from userspace
> + * @dst: Destination address, in kernel space. This buffer must be @ksize
> + * bytes long.
> + * @ksize: Size of @dst struct.
> + * @src: Source address, in userspace.
> + * @usize: (Alleged) size of @src struct.
> + *
> + * Copies a struct from userspace to kernel space, in a way that guarantees
> + * backwards-compatibility for struct syscall arguments (as long as future
> + * struct extensions are made such that all new fields are *appended* to the
> + * old struct, and zeroed-out new fields have the same meaning as the old
> + * struct).
> + *
> + * @ksize is just sizeof(*dst), and @usize should've been passed by userspace.
> + * The recommended usage is something like the following:
> + *
> + * SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
> + * {
> + * int err;
> + * struct foo karg = {};
> + *
> + * if (usize > PAGE_SIZE)
> + * return -E2BIG;
> + * if (usize < FOO_SIZE_VER0)
> + * return -EINVAL;
> + *
> + * err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
> + * if (err)
> + * return err;
> + *
> + * // ...
> + * }
> + *
> + * There are three cases to consider:
> + * * If @usize == @ksize, then it's copied verbatim.
> + * * If @usize < @ksize, then the userspace has passed an old struct to a
> + * newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
> + * are to be zero-filled.
> + * * If @usize > @ksize, then the userspace has passed a new struct to an
> + * older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
> + * are checked to ensure they are zeroed, otherwise -E2BIG is returned.
> + *
> + * Returns (in all cases, some data may have been copied):
> + * * -E2BIG: (@usize > @ksize) and there are non-zero trailing bytes in @src.
> + * * -EFAULT: access to userspace failed.
> + */
> +int copy_struct_from_user(void *dst, size_t ksize,
> + const void __user *src, size_t usize)
I'd like to see this be marked __always_inline so the dst type is known
to the compiler during the copy_from_user() size sanity checks, etc.
> +{
> + size_t size = min(ksize, usize);
> + size_t rest = max(ksize, usize) - size;
> +
> + /* Deal with trailing bytes. */
> + if (usize < ksize) {
> + memset(dst + size, 0, rest);
> + } else if (usize > ksize) {
> + int ret = check_zeroed_user(src + size, rest);
> + if (ret <= 0)
> + return ret ?: -E2BIG;
> + }
> + /* Copy the interoperable parts of the struct. */
> + if (copy_from_user(dst, src, size))
> + return -EFAULT;
> + return 0;
> +}
> +EXPORT_SYMBOL(copy_struct_from_user);
> --
> 2.23.0
>
But besides those things, yes please. :)
--
Kees Cook
^ permalink raw reply
* Re: [PATCH RESEND v3 2/4] clone3: switch to copy_struct_from_user()
From: Kees Cook @ 2019-09-30 23:42 UTC (permalink / raw)
To: Aleksa Sarai
Cc: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Aleksa Sarai, Rasmus Villemoes,
Al Viro, Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <20190930191526.19544-3-asarai@suse.de>
On Tue, Oct 01, 2019 at 05:15:24AM +1000, Aleksa Sarai wrote:
> From: Aleksa Sarai <cyphar@cyphar.com>
>
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls. Additionally, explicitly
> define CLONE_ARGS_SIZE_VER0 to match the other users of the
> struct-extension pattern.
>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
> include/uapi/linux/sched.h | 2 ++
> kernel/fork.c | 34 +++++++---------------------------
> 2 files changed, 9 insertions(+), 27 deletions(-)
>
> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> index b3105ac1381a..0945805982b4 100644
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -47,6 +47,8 @@ struct clone_args {
> __aligned_u64 tls;
> };
>
> +#define CLONE_ARGS_SIZE_VER0 64 /* sizeof first published struct */
> +
> /*
> * Scheduling policies
> */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index f9572f416126..2ef529869c64 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2525,39 +2525,19 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
> #ifdef __ARCH_WANT_SYS_CLONE3
> noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
> struct clone_args __user *uargs,
> - size_t size)
> + size_t usize)
> {
> + int err;
> struct clone_args args;
>
> - if (unlikely(size > PAGE_SIZE))
> + if (unlikely(usize > PAGE_SIZE))
> return -E2BIG;
I quickly looked through the earlier threads and couldn't find it, but
I have a memory of some discussion about moving this test into the
copy_struct_from_user() function itself? That would seems like a
reasonable idea? ("4k should be enough for any structure!")
Either way:
Reviewed-by: Kees Cook <keescook@chromium.org>
> -
> - if (unlikely(size < sizeof(struct clone_args)))
> + if (unlikely(usize < CLONE_ARGS_SIZE_VER0))
> return -EINVAL;
>
> - if (unlikely(!access_ok(uargs, size)))
> - return -EFAULT;
> -
> - if (size > sizeof(struct clone_args)) {
> - unsigned char __user *addr;
> - unsigned char __user *end;
> - unsigned char val;
> -
> - addr = (void __user *)uargs + sizeof(struct clone_args);
> - end = (void __user *)uargs + size;
> -
> - for (; addr < end; addr++) {
> - if (get_user(val, addr))
> - return -EFAULT;
> - if (val)
> - return -E2BIG;
> - }
> -
> - size = sizeof(struct clone_args);
> - }
> -
> - if (copy_from_user(&args, uargs, size))
> - return -EFAULT;
> + err = copy_struct_from_user(&args, sizeof(args), uargs, usize);
> + if (err)
> + return err;
>
> /*
> * Verify that higher 32bits of exit_signal are unset and that
> --
> 2.23.0
>
--
Kees Cook
^ permalink raw reply
* Re: [PATCH RESEND v3 3/4] sched_setattr: switch to copy_struct_from_user()
From: Kees Cook @ 2019-09-30 23:43 UTC (permalink / raw)
To: Aleksa Sarai
Cc: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Aleksa Sarai, Rasmus Villemoes,
Al Viro, Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <20190930191526.19544-4-asarai@suse.de>
On Tue, Oct 01, 2019 at 05:15:25AM +1000, Aleksa Sarai wrote:
> From: Aleksa Sarai <cyphar@cyphar.com>
>
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls. Ideally we could also
> unify sched_getattr(2)-style syscalls as well, but unfortunately the
> correct semantics for such syscalls are much less clear (see [1] for
> more detail). In future we could come up with a more sane idea for how
> the syscall interface should look.
>
> [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> robustify sched_read_attr() ABI logic and code")
>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
-Kees
> ---
> kernel/sched/core.c | 43 +++++++------------------------------------
> 1 file changed, 7 insertions(+), 36 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7880f4f64d0e..dd05a378631a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5106,9 +5106,6 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
> u32 size;
> int ret;
>
> - if (!access_ok(uattr, SCHED_ATTR_SIZE_VER0))
> - return -EFAULT;
> -
> /* Zero the full structure, so that a short copy will be nice: */
> memset(attr, 0, sizeof(*attr));
>
> @@ -5116,45 +5113,19 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
> if (ret)
> return ret;
>
> - /* Bail out on silly large: */
> - if (size > PAGE_SIZE)
> - goto err_size;
> -
> /* ABI compatibility quirk: */
> if (!size)
> size = SCHED_ATTR_SIZE_VER0;
> -
> - if (size < SCHED_ATTR_SIZE_VER0)
> + if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
> goto err_size;
>
> - /*
> - * If we're handed a bigger struct than we know of,
> - * ensure all the unknown bits are 0 - i.e. new
> - * user-space does not rely on any kernel feature
> - * extensions we dont know about yet.
> - */
> - if (size > sizeof(*attr)) {
> - unsigned char __user *addr;
> - unsigned char __user *end;
> - unsigned char val;
> -
> - addr = (void __user *)uattr + sizeof(*attr);
> - end = (void __user *)uattr + size;
> -
> - for (; addr < end; addr++) {
> - ret = get_user(val, addr);
> - if (ret)
> - return ret;
> - if (val)
> - goto err_size;
> - }
> - size = sizeof(*attr);
> + ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
> + if (ret) {
> + if (ret == -E2BIG)
> + goto err_size;
> + return ret;
> }
>
> - ret = copy_from_user(attr, uattr, size);
> - if (ret)
> - return -EFAULT;
> -
> if ((attr->sched_flags & SCHED_FLAG_UTIL_CLAMP) &&
> size < SCHED_ATTR_SIZE_VER1)
> return -EINVAL;
> @@ -5354,7 +5325,7 @@ sched_attr_copy_to_user(struct sched_attr __user *uattr,
> * sys_sched_getattr - similar to sched_getparam, but with sched_attr
> * @pid: the pid in question.
> * @uattr: structure containing the extended parameters.
> - * @usize: sizeof(attr) that user-space knows about, for forwards and backwards compatibility.
> + * @usize: sizeof(attr) for fwd/bwd comp.
> * @flags: for future extension.
> */
> SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
> --
> 2.23.0
>
--
Kees Cook
^ permalink raw reply
* Re: [PATCH RESEND v3 4/4] perf_event_open: switch to copy_struct_from_user()
From: Kees Cook @ 2019-09-30 23:44 UTC (permalink / raw)
To: Aleksa Sarai
Cc: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Aleksa Sarai, Rasmus Villemoes,
Al Viro, Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <20190930191526.19544-5-asarai@suse.de>
On Tue, Oct 01, 2019 at 05:15:26AM +1000, Aleksa Sarai wrote:
> From: Aleksa Sarai <cyphar@cyphar.com>
>
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls.
>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
-Kees
> ---
> kernel/events/core.c | 47 +++++++++-----------------------------------
> 1 file changed, 9 insertions(+), 38 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4655adbbae10..3f0cb82e4fbc 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10586,55 +10586,26 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
> u32 size;
> int ret;
>
> - if (!access_ok(uattr, PERF_ATTR_SIZE_VER0))
> - return -EFAULT;
> -
> - /*
> - * zero the full structure, so that a short copy will be nice.
> - */
> + /* Zero the full structure, so that a short copy will be nice. */
> memset(attr, 0, sizeof(*attr));
>
> ret = get_user(size, &uattr->size);
> if (ret)
> return ret;
>
> - if (size > PAGE_SIZE) /* silly large */
> - goto err_size;
> -
> - if (!size) /* abi compat */
> + /* ABI compatibility quirk: */
> + if (!size)
> size = PERF_ATTR_SIZE_VER0;
> -
> - if (size < PERF_ATTR_SIZE_VER0)
> + if (size < PERF_ATTR_SIZE_VER0 || size > PAGE_SIZE)
> goto err_size;
>
> - /*
> - * If we're handed a bigger struct than we know of,
> - * ensure all the unknown bits are 0 - i.e. new
> - * user-space does not rely on any kernel feature
> - * extensions we dont know about yet.
> - */
> - if (size > sizeof(*attr)) {
> - unsigned char __user *addr;
> - unsigned char __user *end;
> - unsigned char val;
> -
> - addr = (void __user *)uattr + sizeof(*attr);
> - end = (void __user *)uattr + size;
> -
> - for (; addr < end; addr++) {
> - ret = get_user(val, addr);
> - if (ret)
> - return ret;
> - if (val)
> - goto err_size;
> - }
> - size = sizeof(*attr);
> + ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
> + if (ret) {
> + if (ret == -E2BIG)
> + goto err_size;
> + return ret;
> }
>
> - ret = copy_from_user(attr, uattr, size);
> - if (ret)
> - return -EFAULT;
> -
> attr->size = size;
>
> if (attr->__reserved_1)
> --
> 2.23.0
>
--
Kees Cook
^ permalink raw reply
* Re: [PATCH v13 7/9] open: openat2(2) syscall
From: kbuild test robot @ 2019-10-01 0:22 UTC (permalink / raw)
To: Aleksa Sarai
Cc: kbuild-all, Al Viro, Jeff Layton, J. Bruce Fields, Arnd Bergmann,
David Howells, Shuah Khan, Shuah Khan, Ingo Molnar,
Peter Zijlstra, linux-ia64, linux-sh, Alexander Shishkin,
Rasmus Villemoes, Alexei Starovoitov, linux-kernel,
linux-kselftest, sparclinux, Jiri Olsa, linux-arch, linux-s390,
Tycho Andersen, Aleksa Sarai
In-Reply-To: <20190930183316.10190-8-cyphar@cyphar.com>
[-- Attachment #1: Type: text/plain, Size: 1895 bytes --]
Hi Aleksa,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[cannot apply to v5.4-rc1 next-20190930]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Aleksa-Sarai/namei-openat2-2-path-resolution-restrictions/20191001-025628
config: x86_64-randconfig-s0-201939 (attached as .config)
compiler: gcc-5 (Ubuntu 5.5.0-12ubuntu1) 5.5.0 20171010
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
fs/open.c: In function '__do_sys_openat2':
>> fs/open.c:1173:8: error: implicit declaration of function 'copy_struct_from_user' [-Werror=implicit-function-declaration]
err = copy_struct_from_user(&tmp, sizeof(tmp), how, usize);
^
cc1: some warnings being treated as errors
vim +/copy_struct_from_user +1173 fs/open.c
1163
1164 SYSCALL_DEFINE4(openat2, int, dfd, const char __user *, filename,
1165 const struct open_how __user *, how, size_t, usize)
1166 {
1167 int err;
1168 struct open_how tmp;
1169
1170 if (unlikely(usize < OPEN_HOW_SIZE_VER0))
1171 return -EINVAL;
1172
> 1173 err = copy_struct_from_user(&tmp, sizeof(tmp), how, usize);
1174 if (err)
1175 return err;
1176
1177 if (force_o_largefile())
1178 tmp.flags |= O_LARGEFILE;
1179
1180 return do_sys_open(dfd, filename, &tmp);
1181 }
1182
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30783 bytes --]
^ permalink raw reply
* Re: [PATCH RESEND v3 1/4] lib: introduce copy_struct_from_user() helper
From: Aleksa Sarai @ 2019-10-01 0:26 UTC (permalink / raw)
To: Kees Cook
Cc: Aleksa Sarai, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Christian Brauner, Rasmus Villemoes,
Al Viro, Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <201909301622.90B70079@keescook>
[-- Attachment #1: Type: text/plain, Size: 15430 bytes --]
On 2019-09-30, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Oct 01, 2019 at 05:15:23AM +1000, Aleksa Sarai wrote:
> > From: Aleksa Sarai <cyphar@cyphar.com>
> >
> > A common pattern for syscall extensions is increasing the size of a
> > struct passed from userspace, such that the zero-value of the new fields
> > result in the old kernel behaviour (allowing for a mix of userspace and
> > kernel vintages to operate on one another in most cases).
> >
> > While this interface exists for communication in both directions, only
> > one interface is straightforward to have reasonable semantics for
> > (userspace passing a struct to the kernel). For kernel returns to
> > userspace, what the correct semantics are (whether there should be an
> > error if userspace is unaware of a new extension) is very
> > syscall-dependent and thus probably cannot be unified between syscalls
> > (a good example of this problem is [1]).
> >
> > Previously there was no common lib/ function that implemented
> > the necessary extension-checking semantics (and different syscalls
> > implemented them slightly differently or incompletely[2]). Future
> > patches replace common uses of this pattern to make use of
> > copy_struct_from_user().
> >
> > Some in-kernel selftests that insure that the handling of alignment and
> > various byte patterns are all handled identically to memchr_inv() usage.
> >
> > [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> > robustify sched_read_attr() ABI logic and code")
> >
> > [2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
> > similar checks to copy_struct_from_user() while rt_sigprocmask(2)
> > always rejects differently-sized struct arguments.
> >
> > Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> > ---
> > include/linux/bitops.h | 7 +++
> > include/linux/uaccess.h | 4 ++
> > lib/strnlen_user.c | 8 +--
> > lib/test_user_copy.c | 133 ++++++++++++++++++++++++++++++++++++++--
> > lib/usercopy.c | 123 +++++++++++++++++++++++++++++++++++++
> > 5 files changed, 262 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> > index cf074bce3eb3..c94a9ff9f082 100644
> > --- a/include/linux/bitops.h
> > +++ b/include/linux/bitops.h
> > @@ -4,6 +4,13 @@
> > #include <asm/types.h>
> > #include <linux/bits.h>
> >
> > +/* Set bits in the first 'n' bytes when loaded from memory */
> > +#ifdef __LITTLE_ENDIAN
> > +# define aligned_byte_mask(n) ((1UL << 8*(n))-1)
> > +#else
> > +# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
> > +#endif
> > +
> > #define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> > #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
> >
> > diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> > index 70bbdc38dc37..94f20e6ec6ab 100644
> > --- a/include/linux/uaccess.h
> > +++ b/include/linux/uaccess.h
> > @@ -231,6 +231,10 @@ __copy_from_user_inatomic_nocache(void *to, const void __user *from,
> >
> > #endif /* ARCH_HAS_NOCACHE_UACCESS */
> >
> > +extern int check_zeroed_user(const void __user *from, size_t size);
> > +extern int copy_struct_from_user(void *dst, size_t ksize,
> > + const void __user *src, size_t usize);
> > +
> > /*
> > * probe_kernel_read(): safely attempt to read from a location
> > * @dst: pointer to the buffer that shall take the data
> > diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
> > index 28ff554a1be8..6c0005d5dd5c 100644
> > --- a/lib/strnlen_user.c
> > +++ b/lib/strnlen_user.c
> > @@ -3,16 +3,10 @@
> > #include <linux/export.h>
> > #include <linux/uaccess.h>
> > #include <linux/mm.h>
> > +#include <linux/bitops.h>
> >
> > #include <asm/word-at-a-time.h>
> >
> > -/* Set bits in the first 'n' bytes when loaded from memory */
> > -#ifdef __LITTLE_ENDIAN
> > -# define aligned_byte_mask(n) ((1ul << 8*(n))-1)
> > -#else
> > -# define aligned_byte_mask(n) (~0xfful << (BITS_PER_LONG - 8 - 8*(n)))
> > -#endif
> > -
> > /*
> > * Do a strnlen, return length of string *with* final '\0'.
> > * 'count' is the user-supplied count, while 'max' is the
> > diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
> > index 67bcd5dfd847..3a17f71029bb 100644
> > --- a/lib/test_user_copy.c
> > +++ b/lib/test_user_copy.c
> > @@ -16,6 +16,7 @@
> > #include <linux/slab.h>
> > #include <linux/uaccess.h>
> > #include <linux/vmalloc.h>
> > +#include <linux/random.h>
> >
> > /*
> > * Several 32-bit architectures support 64-bit {get,put}_user() calls.
> > @@ -31,14 +32,129 @@
> > # define TEST_U64
> > #endif
> >
> > -#define test(condition, msg) \
> > -({ \
> > - int cond = (condition); \
> > - if (cond) \
> > - pr_warn("%s\n", msg); \
> > - cond; \
> > +#define test(condition, msg, ...) \
> > +({ \
> > + int cond = (condition); \
> > + if (cond) \
> > + pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \
> > + cond; \
> > })
> >
> > +static bool is_zeroed(void *from, size_t size)
> > +{
> > + return memchr_inv(from, 0x0, size) == NULL;
> > +}
> > +
> > +static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
> > +{
> > + int ret = 0;
> > + size_t start, end, i;
> > + size_t zero_start = size / 4;
> > + size_t zero_end = size - zero_start;
> > +
> > + /*
> > + * We conduct a series of check_nonzero_user() tests on a block of memory
> > + * with the following byte-pattern (trying every possible [start,end]
> > + * pair):
> > + *
> > + * [ 00 ff 00 ff ... 00 00 00 00 ... ff 00 ff 00 ]
> > + *
> > + * And we verify that check_nonzero_user() acts identically to memchr_inv().
> > + */
> > +
> > + memset(kmem, 0x0, size);
> > + for (i = 1; i < zero_start; i += 2)
> > + kmem[i] = 0xff;
> > + for (i = zero_end; i < size; i += 2)
> > + kmem[i] = 0xff;
> > +
> > + ret |= test(copy_to_user(umem, kmem, size),
> > + "legitimate copy_to_user failed");
> > +
> > + for (start = 0; start <= size; start++) {
> > + for (end = start; end <= size; end++) {
> > + size_t len = end - start;
> > + int retval = check_zeroed_user(umem + start, len);
> > + int expected = is_zeroed(kmem + start, len);
> > +
> > + ret |= test(retval != expected,
> > + "check_nonzero_user(=%d) != memchr_inv(=%d) mismatch (start=%zu, end=%zu)",
> > + retval, expected, start, end);
> > + }
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +static int test_copy_struct_from_user(char *kmem, char __user *umem,
> > + size_t size)
> > +{
> > + int ret = 0;
> > + char *rand = NULL, *expected = NULL;
> > + size_t ksize, usize;
> > +
> > + rand = kmalloc(size, GFP_KERNEL);
> > + if (ret |= test(rand == NULL, "kmalloc failed"))
> > + goto out_free;
> > +
> > + expected = kmalloc(size, GFP_KERNEL);
> > + if (ret |= test(expected == NULL, "kmalloc failed"))
> > + goto out_free;
> > +
> > + /* Fill umem with random bytes. */
> > + memset(kmem, 0x0, size);
> > + prandom_bytes(rand, size);
>
> I don't really like using random() in tests on the chance that we get
> failures we can't reproduced. If you want to do this (instead of using a
> byte-fill pattern), you need to dump the entire state of the memory
> region. Why not just memset(rand, 0xff, ...)? (And obviously rename
> "rand")
Fair enough.
> > + ret |= test(copy_to_user(umem, rand, size),
> > + "legitimate copy_to_user failed");
> > +
> > + /* Check basic case -- (usize == ksize). */
> > + ksize = size;
> > + usize = size;
>
> I'd move the memset(kmem, 0x0, size); down to here.
>
> > + memcpy(expected, rand, ksize);
> > +
> > + ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
> > + "copy_struct_from_user(usize == ksize) failed");
> > + ret |= test(memcmp(kmem, expected, ksize),
> > + "copy_struct_from_user(usize == ksize) gives unexpected copy");
> > +
> > + /* Old userspace case -- (usize < ksize). */
> > + ksize = size;
> > + usize = ksize / 2;
> > +
>
> I would expect memset(kmem, 0x0, size); again here since a new test of
> that region is starting.
>
> > + memcpy(expected, rand, usize);
> > + memset(expected + usize, 0x0, ksize - usize);
> > +
> > + ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
> > + "copy_struct_from_user(usize < ksize) failed");
> > + ret |= test(memcmp(kmem, expected, ksize),
> > + "copy_struct_from_user(usize < ksize) gives unexpected copy");
> > +
> > + /* New userspace (-E2BIG) case -- (usize > ksize). */
> > + usize = size;
> > + ksize = usize / 2;
>
> and here?
>
> > +
> > + ret |= test(copy_struct_from_user(kmem, ksize, umem, usize) != -E2BIG,
> > + "copy_struct_from_user(usize > ksize) didn't give E2BIG");
> > +
> > + /* New userspace (success) case -- (usize > ksize). */
> > + usize = size;
> > + ksize = usize / 2;
> > +
>
> and here?
Will do all of the above.
> > + memcpy(expected, rand, ksize);
> > +
> > + ret |= test(clear_user(umem + ksize, usize - ksize),
> > + "legitimate clear_user failed");
> > + ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
> > + "copy_struct_from_user(usize > ksize) failed");
> > + ret |= test(memcmp(kmem, expected, ksize),
> > + "copy_struct_from_user(usize > ksize) gives unexpected copy");
> > +
> > +out_free:
> > + kfree(expected);
> > + kfree(rand);
> > + return ret;
> > +}
> > +
> > static int __init test_user_copy_init(void)
> > {
> > int ret = 0;
> > @@ -106,6 +222,11 @@ static int __init test_user_copy_init(void)
> > #endif
> > #undef test_legit
> >
> > + /* Test usage of check_nonzero_user(). */
> > + ret |= test_check_nonzero_user(kmem, usermem, 2 * PAGE_SIZE);
> > + /* Test usage of copy_struct_from_user(). */
> > + ret |= test_copy_struct_from_user(kmem, usermem, 2 * PAGE_SIZE);
> > +
> > /*
> > * Invalid usage: none of these copies should succeed.
> > */
> > diff --git a/lib/usercopy.c b/lib/usercopy.c
> > index c2bfbcaeb3dc..cf7f854ed9c8 100644
> > --- a/lib/usercopy.c
> > +++ b/lib/usercopy.c
> > @@ -1,5 +1,6 @@
> > // SPDX-License-Identifier: GPL-2.0
> > #include <linux/uaccess.h>
> > +#include <linux/bitops.h>
> >
> > /* out-of-line parts */
> >
> > @@ -31,3 +32,125 @@ unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
> > }
> > EXPORT_SYMBOL(_copy_to_user);
> > #endif
> > +
> > +/**
> > + * check_zeroed_user: check if a userspace buffer only contains zero bytes
> > + * @from: Source address, in userspace.
> > + * @size: Size of buffer.
> > + *
> > + * This is effectively shorthand for "memchr_inv(from, 0, size) == NULL" for
> > + * userspace addresses (and is more efficient because we don't care where the
> > + * first non-zero byte is).
> > + *
> > + * Returns:
> > + * * 0: There were non-zero bytes present in the buffer.
> > + * * 1: The buffer was full of zero bytes.
> > + * * -EFAULT: access to userspace failed.
> > + */
> > +int check_zeroed_user(const void __user *from, size_t size)
> > +{
> > + unsigned long val;
> > + uintptr_t align = (uintptr_t) from % sizeof(unsigned long);
> > +
> > + if (unlikely(size == 0))
> > + return 1;
> > +
> > + from -= align;
> > + size += align;
> > +
> > + if (!user_access_begin(from, size))
> > + return -EFAULT;
> > +
> > + unsafe_get_user(val, (unsigned long __user *) from, err_fault);
> > + if (align)
> > + val &= ~aligned_byte_mask(align);
> > +
> > + while (size > sizeof(unsigned long)) {
> > + if (unlikely(val))
> > + goto done;
> > +
> > + from += sizeof(unsigned long);
> > + size -= sizeof(unsigned long);
> > +
> > + unsafe_get_user(val, (unsigned long __user *) from, err_fault);
> > + }
> > +
> > + if (size < sizeof(unsigned long))
> > + val &= aligned_byte_mask(size);
> > +
> > +done:
> > + user_access_end();
> > + return (val == 0);
> > +err_fault:
> > + user_access_end();
> > + return -EFAULT;
> > +}
> > +EXPORT_SYMBOL(check_zeroed_user);
> > +
> > +/**
> > + * copy_struct_from_user: copy a struct from userspace
> > + * @dst: Destination address, in kernel space. This buffer must be @ksize
> > + * bytes long.
> > + * @ksize: Size of @dst struct.
> > + * @src: Source address, in userspace.
> > + * @usize: (Alleged) size of @src struct.
> > + *
> > + * Copies a struct from userspace to kernel space, in a way that guarantees
> > + * backwards-compatibility for struct syscall arguments (as long as future
> > + * struct extensions are made such that all new fields are *appended* to the
> > + * old struct, and zeroed-out new fields have the same meaning as the old
> > + * struct).
> > + *
> > + * @ksize is just sizeof(*dst), and @usize should've been passed by userspace.
> > + * The recommended usage is something like the following:
> > + *
> > + * SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
> > + * {
> > + * int err;
> > + * struct foo karg = {};
> > + *
> > + * if (usize > PAGE_SIZE)
> > + * return -E2BIG;
> > + * if (usize < FOO_SIZE_VER0)
> > + * return -EINVAL;
> > + *
> > + * err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
> > + * if (err)
> > + * return err;
> > + *
> > + * // ...
> > + * }
> > + *
> > + * There are three cases to consider:
> > + * * If @usize == @ksize, then it's copied verbatim.
> > + * * If @usize < @ksize, then the userspace has passed an old struct to a
> > + * newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
> > + * are to be zero-filled.
> > + * * If @usize > @ksize, then the userspace has passed a new struct to an
> > + * older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
> > + * are checked to ensure they are zeroed, otherwise -E2BIG is returned.
> > + *
> > + * Returns (in all cases, some data may have been copied):
> > + * * -E2BIG: (@usize > @ksize) and there are non-zero trailing bytes in @src.
> > + * * -EFAULT: access to userspace failed.
> > + */
> > +int copy_struct_from_user(void *dst, size_t ksize,
> > + const void __user *src, size_t usize)
>
> I'd like to see this be marked __always_inline so the dst type is known
> to the compiler during the copy_from_user() size sanity checks, etc.
Sure, will do. I'll put it in uaccess.h.
> > +{
> > + size_t size = min(ksize, usize);
> > + size_t rest = max(ksize, usize) - size;
> > +
> > + /* Deal with trailing bytes. */
> > + if (usize < ksize) {
> > + memset(dst + size, 0, rest);
> > + } else if (usize > ksize) {
> > + int ret = check_zeroed_user(src + size, rest);
> > + if (ret <= 0)
> > + return ret ?: -E2BIG;
> > + }
> > + /* Copy the interoperable parts of the struct. */
> > + if (copy_from_user(dst, src, size))
> > + return -EFAULT;
> > + return 0;
> > +}
> > +EXPORT_SYMBOL(copy_struct_from_user);
> > --
> > 2.23.0
--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply
* Re: [PATCH RESEND v3 2/4] clone3: switch to copy_struct_from_user()
From: Aleksa Sarai @ 2019-10-01 0:40 UTC (permalink / raw)
To: Kees Cook
Cc: Aleksa Sarai, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Christian Brauner, Rasmus Villemoes,
Al Viro, Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <201909301640.4FC92294FF@keescook>
[-- Attachment #1: Type: text/plain, Size: 3931 bytes --]
On 2019-09-30, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Oct 01, 2019 at 05:15:24AM +1000, Aleksa Sarai wrote:
> > From: Aleksa Sarai <cyphar@cyphar.com>
> >
> > The change is very straightforward, and helps unify the syscall
> > interface for struct-from-userspace syscalls. Additionally, explicitly
> > define CLONE_ARGS_SIZE_VER0 to match the other users of the
> > struct-extension pattern.
> >
> > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> > ---
> > include/uapi/linux/sched.h | 2 ++
> > kernel/fork.c | 34 +++++++---------------------------
> > 2 files changed, 9 insertions(+), 27 deletions(-)
> >
> > diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> > index b3105ac1381a..0945805982b4 100644
> > --- a/include/uapi/linux/sched.h
> > +++ b/include/uapi/linux/sched.h
> > @@ -47,6 +47,8 @@ struct clone_args {
> > __aligned_u64 tls;
> > };
> >
> > +#define CLONE_ARGS_SIZE_VER0 64 /* sizeof first published struct */
> > +
> > /*
> > * Scheduling policies
> > */
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index f9572f416126..2ef529869c64 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -2525,39 +2525,19 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
> > #ifdef __ARCH_WANT_SYS_CLONE3
> > noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
> > struct clone_args __user *uargs,
> > - size_t size)
> > + size_t usize)
> > {
> > + int err;
> > struct clone_args args;
> >
> > - if (unlikely(size > PAGE_SIZE))
> > + if (unlikely(usize > PAGE_SIZE))
> > return -E2BIG;
>
> I quickly looked through the earlier threads and couldn't find it, but
> I have a memory of some discussion about moving this test into the
> copy_struct_from_user() function itself? That would seems like a
> reasonable idea? ("4k should be enough for any structure!")
Yes (and this also seemed the most reasonable way to do it to me), but
the main counter-arguments which swayed me were:
1. Putting it in the hands of the caller allows them to decide if they
want to have a limit, because if you institute a limit in one kernel
vintage then expanding it later will be less-than-ideally-smooth.
2. There is no amplification, so doing copy_struct_from_user() for a
really big usize boils down to the userspace program blocking for
the kernel to check if some of your memory is zeroed. Thus there
doesn't seem to be much DoS potential.
Not to mention that users of copy_struct_from_user() will end up doing
some kind of usize comparison anyway (to check if it's smaller than
the version-0 size).
> Either way:
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
>
>
> > -
> > - if (unlikely(size < sizeof(struct clone_args)))
> > + if (unlikely(usize < CLONE_ARGS_SIZE_VER0))
> > return -EINVAL;
> >
> > - if (unlikely(!access_ok(uargs, size)))
> > - return -EFAULT;
> > -
> > - if (size > sizeof(struct clone_args)) {
> > - unsigned char __user *addr;
> > - unsigned char __user *end;
> > - unsigned char val;
> > -
> > - addr = (void __user *)uargs + sizeof(struct clone_args);
> > - end = (void __user *)uargs + size;
> > -
> > - for (; addr < end; addr++) {
> > - if (get_user(val, addr))
> > - return -EFAULT;
> > - if (val)
> > - return -E2BIG;
> > - }
> > -
> > - size = sizeof(struct clone_args);
> > - }
> > -
> > - if (copy_from_user(&args, uargs, size))
> > - return -EFAULT;
> > + err = copy_struct_from_user(&args, sizeof(args), uargs, usize);
> > + if (err)
> > + return err;
> >
> > /*
> > * Verify that higher 32bits of exit_signal are unset and that
> > --
> > 2.23.0
> >
--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply
* [PATCH v4 0/4] lib: introduce copy_struct_from_user() helper
From: Aleksa Sarai @ 2019-10-01 1:10 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Kees Cook
Cc: Aleksa Sarai, Rasmus Villemoes, Al Viro, Linus Torvalds,
libc-alpha, linux-api, linux-kernel
Patch changelog:
v4:
* __always_inline copy_struct_from_user(). [Kees Cook]
* Rework test_user_copy.ko changes. [Kees Cook]
v3: <https://lore.kernel.org/lkml/20190930182810.6090-1-cyphar@cyphar.com/>
<https://lore.kernel.org/lkml/20190930191526.19544-1-asarai@suse.de/>
v2: <https://lore.kernel.org/lkml/20190925230332.18690-1-cyphar@cyphar.com/>
v1: <https://lore.kernel.org/lkml/20190925165915.8135-1-cyphar@cyphar.com/>
This series was split off from the openat2(2) syscall discussion[1].
However, the copy_struct_to_user() helper has been dropped, because
after some discussion it appears that there is no really obvious
semantics for how copy_struct_to_user() should work on mixed-vintages
(for instance, whether [2] is the correct semantics for all syscalls).
A common pattern for syscall extensions is increasing the size of a
struct passed from userspace, such that the zero-value of the new fields
result in the old kernel behaviour (allowing for a mix of userspace and
kernel vintages to operate on one another in most cases).
Previously there was no common lib/ function that implemented
the necessary extension-checking semantics (and different syscalls
implemented them slightly differently or incompletely[3]). This series
implements the helper and ports several syscalls to use it.
Some in-kernel selftests are included in this patch. More complete
self-tests for copy_struct_from_user() are included in the openat2()
patchset.
[1]: https://lore.kernel.org/lkml/20190904201933.10736-1-cyphar@cyphar.com/
[2]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
robustify sched_read_attr() ABI logic and code")
[3]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
similar checks to copy_struct_from_user() while rt_sigprocmask(2)
always rejects differently-sized struct arguments.
Aleksa Sarai (4):
lib: introduce copy_struct_from_user() helper
clone3: switch to copy_struct_from_user()
sched_setattr: switch to copy_struct_from_user()
perf_event_open: switch to copy_struct_from_user()
include/linux/bitops.h | 7 ++
include/linux/uaccess.h | 70 +++++++++++++++++++
include/uapi/linux/sched.h | 2 +
kernel/events/core.c | 47 +++----------
kernel/fork.c | 34 ++--------
kernel/sched/core.c | 43 ++----------
lib/strnlen_user.c | 8 +--
lib/test_user_copy.c | 136 +++++++++++++++++++++++++++++++++++--
lib/usercopy.c | 55 +++++++++++++++
9 files changed, 288 insertions(+), 114 deletions(-)
--
2.23.0
^ permalink raw reply
* [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper
From: Aleksa Sarai @ 2019-10-01 1:10 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Kees Cook
Cc: Aleksa Sarai, Rasmus Villemoes, Al Viro, Linus Torvalds,
libc-alpha, linux-api, linux-kernel
In-Reply-To: <20191001011055.19283-1-cyphar@cyphar.com>
A common pattern for syscall extensions is increasing the size of a
struct passed from userspace, such that the zero-value of the new fields
result in the old kernel behaviour (allowing for a mix of userspace and
kernel vintages to operate on one another in most cases).
While this interface exists for communication in both directions, only
one interface is straightforward to have reasonable semantics for
(userspace passing a struct to the kernel). For kernel returns to
userspace, what the correct semantics are (whether there should be an
error if userspace is unaware of a new extension) is very
syscall-dependent and thus probably cannot be unified between syscalls
(a good example of this problem is [1]).
Previously there was no common lib/ function that implemented
the necessary extension-checking semantics (and different syscalls
implemented them slightly differently or incompletely[2]). Future
patches replace common uses of this pattern to make use of
copy_struct_from_user().
Some in-kernel selftests that insure that the handling of alignment and
various byte patterns are all handled identically to memchr_inv() usage.
[1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
robustify sched_read_attr() ABI logic and code")
[2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
similar checks to copy_struct_from_user() while rt_sigprocmask(2)
always rejects differently-sized struct arguments.
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
include/linux/bitops.h | 7 +++
include/linux/uaccess.h | 70 +++++++++++++++++++++
lib/strnlen_user.c | 8 +--
lib/test_user_copy.c | 136 ++++++++++++++++++++++++++++++++++++++--
lib/usercopy.c | 55 ++++++++++++++++
5 files changed, 263 insertions(+), 13 deletions(-)
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index cf074bce3eb3..c94a9ff9f082 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -4,6 +4,13 @@
#include <asm/types.h>
#include <linux/bits.h>
+/* Set bits in the first 'n' bytes when loaded from memory */
+#ifdef __LITTLE_ENDIAN
+# define aligned_byte_mask(n) ((1UL << 8*(n))-1)
+#else
+# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
+#endif
+
#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 70bbdc38dc37..8abbc713f7fb 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -231,6 +231,76 @@ __copy_from_user_inatomic_nocache(void *to, const void __user *from,
#endif /* ARCH_HAS_NOCACHE_UACCESS */
+extern int check_zeroed_user(const void __user *from, size_t size);
+
+/**
+ * copy_struct_from_user: copy a struct from userspace
+ * @dst: Destination address, in kernel space. This buffer must be @ksize
+ * bytes long.
+ * @ksize: Size of @dst struct.
+ * @src: Source address, in userspace.
+ * @usize: (Alleged) size of @src struct.
+ *
+ * Copies a struct from userspace to kernel space, in a way that guarantees
+ * backwards-compatibility for struct syscall arguments (as long as future
+ * struct extensions are made such that all new fields are *appended* to the
+ * old struct, and zeroed-out new fields have the same meaning as the old
+ * struct).
+ *
+ * @ksize is just sizeof(*dst), and @usize should've been passed by userspace.
+ * The recommended usage is something like the following:
+ *
+ * SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
+ * {
+ * int err;
+ * struct foo karg = {};
+ *
+ * if (usize > PAGE_SIZE)
+ * return -E2BIG;
+ * if (usize < FOO_SIZE_VER0)
+ * return -EINVAL;
+ *
+ * err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
+ * if (err)
+ * return err;
+ *
+ * // ...
+ * }
+ *
+ * There are three cases to consider:
+ * * If @usize == @ksize, then it's copied verbatim.
+ * * If @usize < @ksize, then the userspace has passed an old struct to a
+ * newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
+ * are to be zero-filled.
+ * * If @usize > @ksize, then the userspace has passed a new struct to an
+ * older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
+ * are checked to ensure they are zeroed, otherwise -E2BIG is returned.
+ *
+ * Returns (in all cases, some data may have been copied):
+ * * -E2BIG: (@usize > @ksize) and there are non-zero trailing bytes in @src.
+ * * -EFAULT: access to userspace failed.
+ */
+static __always_inline
+int copy_struct_from_user(void *dst, size_t ksize,
+ const void __user *src, size_t usize)
+{
+ size_t size = min(ksize, usize);
+ size_t rest = max(ksize, usize) - size;
+
+ /* Deal with trailing bytes. */
+ if (usize < ksize) {
+ memset(dst + size, 0, rest);
+ } else if (usize > ksize) {
+ int ret = check_zeroed_user(src + size, rest);
+ if (ret <= 0)
+ return ret ?: -E2BIG;
+ }
+ /* Copy the interoperable parts of the struct. */
+ if (copy_from_user(dst, src, size))
+ return -EFAULT;
+ return 0;
+}
+
/*
* probe_kernel_read(): safely attempt to read from a location
* @dst: pointer to the buffer that shall take the data
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 28ff554a1be8..6c0005d5dd5c 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -3,16 +3,10 @@
#include <linux/export.h>
#include <linux/uaccess.h>
#include <linux/mm.h>
+#include <linux/bitops.h>
#include <asm/word-at-a-time.h>
-/* Set bits in the first 'n' bytes when loaded from memory */
-#ifdef __LITTLE_ENDIAN
-# define aligned_byte_mask(n) ((1ul << 8*(n))-1)
-#else
-# define aligned_byte_mask(n) (~0xfful << (BITS_PER_LONG - 8 - 8*(n)))
-#endif
-
/*
* Do a strnlen, return length of string *with* final '\0'.
* 'count' is the user-supplied count, while 'max' is the
diff --git a/lib/test_user_copy.c b/lib/test_user_copy.c
index 67bcd5dfd847..950ee88cd6ac 100644
--- a/lib/test_user_copy.c
+++ b/lib/test_user_copy.c
@@ -31,14 +31,133 @@
# define TEST_U64
#endif
-#define test(condition, msg) \
-({ \
- int cond = (condition); \
- if (cond) \
- pr_warn("%s\n", msg); \
- cond; \
+#define test(condition, msg, ...) \
+({ \
+ int cond = (condition); \
+ if (cond) \
+ pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \
+ cond; \
})
+static bool is_zeroed(void *from, size_t size)
+{
+ return memchr_inv(from, 0x0, size) == NULL;
+}
+
+static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
+{
+ int ret = 0;
+ size_t start, end, i;
+ size_t zero_start = size / 4;
+ size_t zero_end = size - zero_start;
+
+ /*
+ * We conduct a series of check_nonzero_user() tests on a block of memory
+ * with the following byte-pattern (trying every possible [start,end]
+ * pair):
+ *
+ * [ 00 ff 00 ff ... 00 00 00 00 ... ff 00 ff 00 ]
+ *
+ * And we verify that check_nonzero_user() acts identically to memchr_inv().
+ */
+
+ memset(kmem, 0x0, size);
+ for (i = 1; i < zero_start; i += 2)
+ kmem[i] = 0xff;
+ for (i = zero_end; i < size; i += 2)
+ kmem[i] = 0xff;
+
+ ret |= test(copy_to_user(umem, kmem, size),
+ "legitimate copy_to_user failed");
+
+ for (start = 0; start <= size; start++) {
+ for (end = start; end <= size; end++) {
+ size_t len = end - start;
+ int retval = check_zeroed_user(umem + start, len);
+ int expected = is_zeroed(kmem + start, len);
+
+ ret |= test(retval != expected,
+ "check_nonzero_user(=%d) != memchr_inv(=%d) mismatch (start=%zu, end=%zu)",
+ retval, expected, start, end);
+ }
+ }
+
+ return ret;
+}
+
+static int test_copy_struct_from_user(char *kmem, char __user *umem,
+ size_t size)
+{
+ int ret = 0;
+ char *umem_src = NULL, *expected = NULL;
+ size_t ksize, usize;
+
+ umem_src = kmalloc(size, GFP_KERNEL);
+ if (ret |= test(umem_src == NULL, "kmalloc failed"))
+ goto out_free;
+
+ expected = kmalloc(size, GFP_KERNEL);
+ if (ret |= test(expected == NULL, "kmalloc failed"))
+ goto out_free;
+
+ /* Fill umem with a fixed byte pattern. */
+ memset(umem_src, 0x3e, size);
+ ret |= test(copy_to_user(umem, umem_src, size),
+ "legitimate copy_to_user failed");
+
+ /* Check basic case -- (usize == ksize). */
+ ksize = size;
+ usize = size;
+
+ memcpy(expected, umem_src, ksize);
+
+ memset(kmem, 0x0, size);
+ ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
+ "copy_struct_from_user(usize == ksize) failed");
+ ret |= test(memcmp(kmem, expected, ksize),
+ "copy_struct_from_user(usize == ksize) gives unexpected copy");
+
+ /* Old userspace case -- (usize < ksize). */
+ ksize = size;
+ usize = size / 2;
+
+ memcpy(expected, umem_src, usize);
+ memset(expected + usize, 0x0, ksize - usize);
+
+ memset(kmem, 0x0, size);
+ ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
+ "copy_struct_from_user(usize < ksize) failed");
+ ret |= test(memcmp(kmem, expected, ksize),
+ "copy_struct_from_user(usize < ksize) gives unexpected copy");
+
+ /* New userspace (-E2BIG) case -- (usize > ksize). */
+ ksize = size / 2;
+ usize = size;
+
+ memset(kmem, 0x0, size);
+ ret |= test(copy_struct_from_user(kmem, ksize, umem, usize) != -E2BIG,
+ "copy_struct_from_user(usize > ksize) didn't give E2BIG");
+
+ /* New userspace (success) case -- (usize > ksize). */
+ ksize = size / 2;
+ usize = size;
+
+ memcpy(expected, umem_src, ksize);
+ ret |= test(clear_user(umem + ksize, usize - ksize),
+ "legitimate clear_user failed");
+
+ memset(kmem, 0x0, size);
+ ret |= test(copy_struct_from_user(kmem, ksize, umem, usize),
+ "copy_struct_from_user(usize > ksize) failed");
+ ret |= test(memcmp(kmem, expected, ksize),
+ "copy_struct_from_user(usize > ksize) gives unexpected copy");
+
+out_free:
+ kfree(expected);
+ kfree(umem_src);
+ return ret;
+}
+
static int __init test_user_copy_init(void)
{
int ret = 0;
@@ -106,6 +225,11 @@ static int __init test_user_copy_init(void)
#endif
#undef test_legit
+ /* Test usage of check_nonzero_user(). */
+ ret |= test_check_nonzero_user(kmem, usermem, 2 * PAGE_SIZE);
+ /* Test usage of copy_struct_from_user(). */
+ ret |= test_copy_struct_from_user(kmem, usermem, 2 * PAGE_SIZE);
+
/*
* Invalid usage: none of these copies should succeed.
*/
diff --git a/lib/usercopy.c b/lib/usercopy.c
index c2bfbcaeb3dc..cbb4d9ec00f2 100644
--- a/lib/usercopy.c
+++ b/lib/usercopy.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/uaccess.h>
+#include <linux/bitops.h>
/* out-of-line parts */
@@ -31,3 +32,57 @@ unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
}
EXPORT_SYMBOL(_copy_to_user);
#endif
+
+/**
+ * check_zeroed_user: check if a userspace buffer only contains zero bytes
+ * @from: Source address, in userspace.
+ * @size: Size of buffer.
+ *
+ * This is effectively shorthand for "memchr_inv(from, 0, size) == NULL" for
+ * userspace addresses (and is more efficient because we don't care where the
+ * first non-zero byte is).
+ *
+ * Returns:
+ * * 0: There were non-zero bytes present in the buffer.
+ * * 1: The buffer was full of zero bytes.
+ * * -EFAULT: access to userspace failed.
+ */
+int check_zeroed_user(const void __user *from, size_t size)
+{
+ unsigned long val;
+ uintptr_t align = (uintptr_t) from % sizeof(unsigned long);
+
+ if (unlikely(size == 0))
+ return 1;
+
+ from -= align;
+ size += align;
+
+ if (!user_access_begin(from, size))
+ return -EFAULT;
+
+ unsafe_get_user(val, (unsigned long __user *) from, err_fault);
+ if (align)
+ val &= ~aligned_byte_mask(align);
+
+ while (size > sizeof(unsigned long)) {
+ if (unlikely(val))
+ goto done;
+
+ from += sizeof(unsigned long);
+ size -= sizeof(unsigned long);
+
+ unsafe_get_user(val, (unsigned long __user *) from, err_fault);
+ }
+
+ if (size < sizeof(unsigned long))
+ val &= aligned_byte_mask(size);
+
+done:
+ user_access_end();
+ return (val == 0);
+err_fault:
+ user_access_end();
+ return -EFAULT;
+}
+EXPORT_SYMBOL(check_zeroed_user);
--
2.23.0
^ permalink raw reply related
* [PATCH v4 2/4] clone3: switch to copy_struct_from_user()
From: Aleksa Sarai @ 2019-10-01 1:10 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Kees Cook
Cc: Aleksa Sarai, Rasmus Villemoes, Al Viro, Linus Torvalds,
libc-alpha, linux-api, linux-kernel
In-Reply-To: <20191001011055.19283-1-cyphar@cyphar.com>
The change is very straightforward, and helps unify the syscall
interface for struct-from-userspace syscalls. Additionally, explicitly
define CLONE_ARGS_SIZE_VER0 to match the other users of the
struct-extension pattern.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
include/uapi/linux/sched.h | 2 ++
kernel/fork.c | 34 +++++++---------------------------
2 files changed, 9 insertions(+), 27 deletions(-)
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index b3105ac1381a..0945805982b4 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -47,6 +47,8 @@ struct clone_args {
__aligned_u64 tls;
};
+#define CLONE_ARGS_SIZE_VER0 64 /* sizeof first published struct */
+
/*
* Scheduling policies
*/
diff --git a/kernel/fork.c b/kernel/fork.c
index f9572f416126..2ef529869c64 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2525,39 +2525,19 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
#ifdef __ARCH_WANT_SYS_CLONE3
noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
struct clone_args __user *uargs,
- size_t size)
+ size_t usize)
{
+ int err;
struct clone_args args;
- if (unlikely(size > PAGE_SIZE))
+ if (unlikely(usize > PAGE_SIZE))
return -E2BIG;
-
- if (unlikely(size < sizeof(struct clone_args)))
+ if (unlikely(usize < CLONE_ARGS_SIZE_VER0))
return -EINVAL;
- if (unlikely(!access_ok(uargs, size)))
- return -EFAULT;
-
- if (size > sizeof(struct clone_args)) {
- unsigned char __user *addr;
- unsigned char __user *end;
- unsigned char val;
-
- addr = (void __user *)uargs + sizeof(struct clone_args);
- end = (void __user *)uargs + size;
-
- for (; addr < end; addr++) {
- if (get_user(val, addr))
- return -EFAULT;
- if (val)
- return -E2BIG;
- }
-
- size = sizeof(struct clone_args);
- }
-
- if (copy_from_user(&args, uargs, size))
- return -EFAULT;
+ err = copy_struct_from_user(&args, sizeof(args), uargs, usize);
+ if (err)
+ return err;
/*
* Verify that higher 32bits of exit_signal are unset and that
--
2.23.0
^ permalink raw reply related
* [PATCH v4 3/4] sched_setattr: switch to copy_struct_from_user()
From: Aleksa Sarai @ 2019-10-01 1:10 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Kees Cook
Cc: Aleksa Sarai, Rasmus Villemoes, Al Viro, Linus Torvalds,
libc-alpha, linux-api, linux-kernel
In-Reply-To: <20191001011055.19283-1-cyphar@cyphar.com>
The change is very straightforward, and helps unify the syscall
interface for struct-from-userspace syscalls. Ideally we could also
unify sched_getattr(2)-style syscalls as well, but unfortunately the
correct semantics for such syscalls are much less clear (see [1] for
more detail). In future we could come up with a more sane idea for how
the syscall interface should look.
[1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
robustify sched_read_attr() ABI logic and code")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
kernel/sched/core.c | 43 +++++++------------------------------------
1 file changed, 7 insertions(+), 36 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7880f4f64d0e..dd05a378631a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5106,9 +5106,6 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
u32 size;
int ret;
- if (!access_ok(uattr, SCHED_ATTR_SIZE_VER0))
- return -EFAULT;
-
/* Zero the full structure, so that a short copy will be nice: */
memset(attr, 0, sizeof(*attr));
@@ -5116,45 +5113,19 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
if (ret)
return ret;
- /* Bail out on silly large: */
- if (size > PAGE_SIZE)
- goto err_size;
-
/* ABI compatibility quirk: */
if (!size)
size = SCHED_ATTR_SIZE_VER0;
-
- if (size < SCHED_ATTR_SIZE_VER0)
+ if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
goto err_size;
- /*
- * If we're handed a bigger struct than we know of,
- * ensure all the unknown bits are 0 - i.e. new
- * user-space does not rely on any kernel feature
- * extensions we dont know about yet.
- */
- if (size > sizeof(*attr)) {
- unsigned char __user *addr;
- unsigned char __user *end;
- unsigned char val;
-
- addr = (void __user *)uattr + sizeof(*attr);
- end = (void __user *)uattr + size;
-
- for (; addr < end; addr++) {
- ret = get_user(val, addr);
- if (ret)
- return ret;
- if (val)
- goto err_size;
- }
- size = sizeof(*attr);
+ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
+ if (ret) {
+ if (ret == -E2BIG)
+ goto err_size;
+ return ret;
}
- ret = copy_from_user(attr, uattr, size);
- if (ret)
- return -EFAULT;
-
if ((attr->sched_flags & SCHED_FLAG_UTIL_CLAMP) &&
size < SCHED_ATTR_SIZE_VER1)
return -EINVAL;
@@ -5354,7 +5325,7 @@ sched_attr_copy_to_user(struct sched_attr __user *uattr,
* sys_sched_getattr - similar to sched_getparam, but with sched_attr
* @pid: the pid in question.
* @uattr: structure containing the extended parameters.
- * @usize: sizeof(attr) that user-space knows about, for forwards and backwards compatibility.
+ * @usize: sizeof(attr) for fwd/bwd comp.
* @flags: for future extension.
*/
SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
--
2.23.0
^ permalink raw reply related
* [PATCH v4 4/4] perf_event_open: switch to copy_struct_from_user()
From: Aleksa Sarai @ 2019-10-01 1:10 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Kees Cook
Cc: Aleksa Sarai, Rasmus Villemoes, Al Viro, Linus Torvalds,
libc-alpha, linux-api, linux-kernel
In-Reply-To: <20191001011055.19283-1-cyphar@cyphar.com>
The change is very straightforward, and helps unify the syscall
interface for struct-from-userspace syscalls.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
kernel/events/core.c | 47 +++++++++-----------------------------------
1 file changed, 9 insertions(+), 38 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4655adbbae10..3f0cb82e4fbc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10586,55 +10586,26 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
u32 size;
int ret;
- if (!access_ok(uattr, PERF_ATTR_SIZE_VER0))
- return -EFAULT;
-
- /*
- * zero the full structure, so that a short copy will be nice.
- */
+ /* Zero the full structure, so that a short copy will be nice. */
memset(attr, 0, sizeof(*attr));
ret = get_user(size, &uattr->size);
if (ret)
return ret;
- if (size > PAGE_SIZE) /* silly large */
- goto err_size;
-
- if (!size) /* abi compat */
+ /* ABI compatibility quirk: */
+ if (!size)
size = PERF_ATTR_SIZE_VER0;
-
- if (size < PERF_ATTR_SIZE_VER0)
+ if (size < PERF_ATTR_SIZE_VER0 || size > PAGE_SIZE)
goto err_size;
- /*
- * If we're handed a bigger struct than we know of,
- * ensure all the unknown bits are 0 - i.e. new
- * user-space does not rely on any kernel feature
- * extensions we dont know about yet.
- */
- if (size > sizeof(*attr)) {
- unsigned char __user *addr;
- unsigned char __user *end;
- unsigned char val;
-
- addr = (void __user *)uattr + sizeof(*attr);
- end = (void __user *)uattr + size;
-
- for (; addr < end; addr++) {
- ret = get_user(val, addr);
- if (ret)
- return ret;
- if (val)
- goto err_size;
- }
- size = sizeof(*attr);
+ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
+ if (ret) {
+ if (ret == -E2BIG)
+ goto err_size;
+ return ret;
}
- ret = copy_from_user(attr, uattr, size);
- if (ret)
- return -EFAULT;
-
attr->size = size;
if (attr->__reserved_1)
--
2.23.0
^ permalink raw reply related
* Re: [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF
From: Alexei Starovoitov @ 2019-10-01 1:22 UTC (permalink / raw)
To: Kees Cook
Cc: Steven Rostedt, Andy Lutomirski, Andy Lutomirski,
Alexei Starovoitov, LSM List, James Morris, Jann Horn,
Peter Zijlstra, Masami Hiramatsu, David S. Miller,
Daniel Borkmann, Network Development, bpf, kernel-team, Linux API
In-Reply-To: <201909301129.5A1129C@keescook>
On Mon, Sep 30, 2019 at 11:31:29AM -0700, Kees Cook wrote:
> On Sat, Sep 28, 2019 at 07:37:27PM -0400, Steven Rostedt wrote:
> > On Wed, 28 Aug 2019 21:07:24 -0700
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > This won’t make me much more comfortable, since CAP_BPF lets it do an ever-growing set of nasty things. I’d much rather one or both of two things happen:
> > > >
> > > > 1. Give it CAP_TRACING only. It can leak my data, but it’s rather hard for it to crash my laptop, lose data, or cause other shenanigans.
> > > >
> > > > 2. Improve it a bit do all the privileged ops are wrapped by capset().
> > > >
> > > > Does this make sense? I’m a security person on occasion. I find
> > > > vulnerabilities and exploit them deliberately and I break things by
> > > > accident on a regular basis. In my considered opinion, CAP_TRACING
> > > > alone, even extended to cover part of BPF as I’ve described, is
> > > > decently safe. Getting root with just CAP_TRACING will be decently
> > > > challenging, especially if I don’t get to read things like sshd’s
> > > > memory, and improvements to mitigate even that could be added. I
> > > > am quite confident that attacks starting with CAP_TRACING will have
> > > > clear audit signatures if auditing is on. I am also confident that
> > > > CAP_BPF *will* allow DoS and likely privilege escalation, and this
> > > > will only get more likely as BPF gets more widely used. And, if
> > > > BPF-based auditing ever becomes a thing, writing to the audit
> > > > daemon’s maps will be a great way to cover one’s tracks.
> > >
> > > CAP_TRACING, as I'm proposing it, will allow full tracefs access.
> > > I think Steven and Massami prefer that as well.
> > > That includes kprobe with probe_kernel_read.
> > > That also means mini-DoS by installing kprobes everywhere or running
> > > too much ftrace.
> >
> > I was talking with Kees at Plumbers about this, and we were talking
> > about just using simple file permissions. I started playing with some
> > patches to allow the tracefs be visible but by default it would only be
> > visible by root.
> >
> > rwx------
> >
> > Then a start up script (or perhaps mount options) could change the
> > group owner, and change this to:
> >
> > rwxrwx---
> >
> > Where anyone in the group assigned (say "tracing") gets full access to
> > the file system.
> >
> > The more I was playing with this, the less I see the need for
> > CAP_TRACING for ftrace and reading the format files.
>
> Nice! Thanks for playing with this. I like it because it gives us a way
> to push policy into userspace (group membership, etc), and provides a
> clean way (hopefully) do separate "read" (kernel memory confidentiality)
> from "write" (kernel memory integrity), which wouldn't have been possible
> with a single new CAP_...
tracefs is a file system, so clearly file based acls are much better fit
for all tracefs operations.
But that is not the case for ftrace overall.
bpf_trace_printk() calls trace_printk() that dumps into trace pipe.
Technically it's ftrace operation, but it cannot be controlled by tracefs
and by file permissions. That's the motivation to guard bpf_trace_printk()
usage from bpf program with CAP_TRACING.
Both 'trace' and 'trace_pipe' have quirky side effects.
Like opening 'trace' file will make all parallel trace_printk() to be ignored.
While reading 'trace_pipe' file will clear it.
The point that traditional 'read' and 'write' ACLs don't map as-is
to tracefs, so I would be careful categorizing things into
confidentiality vs integrity only based on access type.
^ permalink raw reply
* Re: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper
From: Kees Cook @ 2019-10-01 1:58 UTC (permalink / raw)
To: Aleksa Sarai
Cc: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Christian Brauner, Rasmus Villemoes, Al Viro,
Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <20191001011055.19283-2-cyphar@cyphar.com>
On Tue, Oct 01, 2019 at 11:10:52AM +1000, Aleksa Sarai wrote:
> A common pattern for syscall extensions is increasing the size of a
> struct passed from userspace, such that the zero-value of the new fields
> result in the old kernel behaviour (allowing for a mix of userspace and
> kernel vintages to operate on one another in most cases).
>
> While this interface exists for communication in both directions, only
> one interface is straightforward to have reasonable semantics for
> (userspace passing a struct to the kernel). For kernel returns to
> userspace, what the correct semantics are (whether there should be an
> error if userspace is unaware of a new extension) is very
> syscall-dependent and thus probably cannot be unified between syscalls
> (a good example of this problem is [1]).
>
> Previously there was no common lib/ function that implemented
> the necessary extension-checking semantics (and different syscalls
> implemented them slightly differently or incompletely[2]). Future
> patches replace common uses of this pattern to make use of
> copy_struct_from_user().
>
> Some in-kernel selftests that insure that the handling of alignment and
> various byte patterns are all handled identically to memchr_inv() usage.
>
> [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> robustify sched_read_attr() ABI logic and code")
>
> [2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
> similar checks to copy_struct_from_user() while rt_sigprocmask(2)
> always rejects differently-sized struct arguments.
>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
> include/linux/bitops.h | 7 +++
> include/linux/uaccess.h | 70 +++++++++++++++++++++
> lib/strnlen_user.c | 8 +--
> lib/test_user_copy.c | 136 ++++++++++++++++++++++++++++++++++++++--
> lib/usercopy.c | 55 ++++++++++++++++
> 5 files changed, 263 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index cf074bce3eb3..c94a9ff9f082 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -4,6 +4,13 @@
> #include <asm/types.h>
> #include <linux/bits.h>
>
> +/* Set bits in the first 'n' bytes when loaded from memory */
> +#ifdef __LITTLE_ENDIAN
> +# define aligned_byte_mask(n) ((1UL << 8*(n))-1)
> +#else
> +# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
> +#endif
> +
> #define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
>
> diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> index 70bbdc38dc37..8abbc713f7fb 100644
> --- a/include/linux/uaccess.h
> +++ b/include/linux/uaccess.h
> @@ -231,6 +231,76 @@ __copy_from_user_inatomic_nocache(void *to, const void __user *from,
>
> #endif /* ARCH_HAS_NOCACHE_UACCESS */
>
> +extern int check_zeroed_user(const void __user *from, size_t size);
> +
> +/**
> + * copy_struct_from_user: copy a struct from userspace
> + * @dst: Destination address, in kernel space. This buffer must be @ksize
> + * bytes long.
> + * @ksize: Size of @dst struct.
> + * @src: Source address, in userspace.
> + * @usize: (Alleged) size of @src struct.
> + *
> + * Copies a struct from userspace to kernel space, in a way that guarantees
> + * backwards-compatibility for struct syscall arguments (as long as future
> + * struct extensions are made such that all new fields are *appended* to the
> + * old struct, and zeroed-out new fields have the same meaning as the old
> + * struct).
> + *
> + * @ksize is just sizeof(*dst), and @usize should've been passed by userspace.
> + * The recommended usage is something like the following:
> + *
> + * SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
> + * {
> + * int err;
> + * struct foo karg = {};
> + *
> + * if (usize > PAGE_SIZE)
> + * return -E2BIG;
> + * if (usize < FOO_SIZE_VER0)
> + * return -EINVAL;
> + *
> + * err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
> + * if (err)
> + * return err;
> + *
> + * // ...
> + * }
> + *
> + * There are three cases to consider:
> + * * If @usize == @ksize, then it's copied verbatim.
> + * * If @usize < @ksize, then the userspace has passed an old struct to a
> + * newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
> + * are to be zero-filled.
> + * * If @usize > @ksize, then the userspace has passed a new struct to an
> + * older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
> + * are checked to ensure they are zeroed, otherwise -E2BIG is returned.
> + *
> + * Returns (in all cases, some data may have been copied):
> + * * -E2BIG: (@usize > @ksize) and there are non-zero trailing bytes in @src.
> + * * -EFAULT: access to userspace failed.
> + */
> +static __always_inline
> +int copy_struct_from_user(void *dst, size_t ksize,
> + const void __user *src, size_t usize)
And of course I forgot to realize both this and check_zeroed_user()
should also have the __must_check attribute. Sorry for forgetting that
earlier!
With that, please consider it:
Reviewed-by: Kees Cook <keescook@chromium.org>
Thanks for working on this!
--
Kees Cook
^ permalink raw reply
* Re: [PATCH v4 1/4] lib: introduce copy_struct_from_user() helper
From: Christian Brauner @ 2019-10-01 2:31 UTC (permalink / raw)
To: Kees Cook
Cc: Aleksa Sarai, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Rasmus Villemoes, Al Viro,
Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <201909301856.01255535BD@keescook>
On Mon, Sep 30, 2019 at 06:58:39PM -0700, Kees Cook wrote:
> On Tue, Oct 01, 2019 at 11:10:52AM +1000, Aleksa Sarai wrote:
> > A common pattern for syscall extensions is increasing the size of a
> > struct passed from userspace, such that the zero-value of the new fields
> > result in the old kernel behaviour (allowing for a mix of userspace and
> > kernel vintages to operate on one another in most cases).
> >
> > While this interface exists for communication in both directions, only
> > one interface is straightforward to have reasonable semantics for
> > (userspace passing a struct to the kernel). For kernel returns to
> > userspace, what the correct semantics are (whether there should be an
> > error if userspace is unaware of a new extension) is very
> > syscall-dependent and thus probably cannot be unified between syscalls
> > (a good example of this problem is [1]).
> >
> > Previously there was no common lib/ function that implemented
> > the necessary extension-checking semantics (and different syscalls
> > implemented them slightly differently or incompletely[2]). Future
> > patches replace common uses of this pattern to make use of
> > copy_struct_from_user().
> >
> > Some in-kernel selftests that insure that the handling of alignment and
> > various byte patterns are all handled identically to memchr_inv() usage.
> >
> > [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> > robustify sched_read_attr() ABI logic and code")
> >
> > [2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do
> > similar checks to copy_struct_from_user() while rt_sigprocmask(2)
> > always rejects differently-sized struct arguments.
> >
> > Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> > ---
> > include/linux/bitops.h | 7 +++
> > include/linux/uaccess.h | 70 +++++++++++++++++++++
> > lib/strnlen_user.c | 8 +--
> > lib/test_user_copy.c | 136 ++++++++++++++++++++++++++++++++++++++--
> > lib/usercopy.c | 55 ++++++++++++++++
> > 5 files changed, 263 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> > index cf074bce3eb3..c94a9ff9f082 100644
> > --- a/include/linux/bitops.h
> > +++ b/include/linux/bitops.h
> > @@ -4,6 +4,13 @@
> > #include <asm/types.h>
> > #include <linux/bits.h>
> >
> > +/* Set bits in the first 'n' bytes when loaded from memory */
> > +#ifdef __LITTLE_ENDIAN
> > +# define aligned_byte_mask(n) ((1UL << 8*(n))-1)
> > +#else
> > +# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
> > +#endif
> > +
> > #define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> > #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
> >
> > diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> > index 70bbdc38dc37..8abbc713f7fb 100644
> > --- a/include/linux/uaccess.h
> > +++ b/include/linux/uaccess.h
> > @@ -231,6 +231,76 @@ __copy_from_user_inatomic_nocache(void *to, const void __user *from,
> >
> > #endif /* ARCH_HAS_NOCACHE_UACCESS */
> >
> > +extern int check_zeroed_user(const void __user *from, size_t size);
> > +
> > +/**
> > + * copy_struct_from_user: copy a struct from userspace
> > + * @dst: Destination address, in kernel space. This buffer must be @ksize
> > + * bytes long.
> > + * @ksize: Size of @dst struct.
> > + * @src: Source address, in userspace.
> > + * @usize: (Alleged) size of @src struct.
> > + *
> > + * Copies a struct from userspace to kernel space, in a way that guarantees
> > + * backwards-compatibility for struct syscall arguments (as long as future
> > + * struct extensions are made such that all new fields are *appended* to the
> > + * old struct, and zeroed-out new fields have the same meaning as the old
> > + * struct).
> > + *
> > + * @ksize is just sizeof(*dst), and @usize should've been passed by userspace.
> > + * The recommended usage is something like the following:
> > + *
> > + * SYSCALL_DEFINE2(foobar, const struct foo __user *, uarg, size_t, usize)
> > + * {
> > + * int err;
> > + * struct foo karg = {};
> > + *
> > + * if (usize > PAGE_SIZE)
> > + * return -E2BIG;
> > + * if (usize < FOO_SIZE_VER0)
> > + * return -EINVAL;
> > + *
> > + * err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize);
> > + * if (err)
> > + * return err;
> > + *
> > + * // ...
> > + * }
> > + *
> > + * There are three cases to consider:
> > + * * If @usize == @ksize, then it's copied verbatim.
> > + * * If @usize < @ksize, then the userspace has passed an old struct to a
> > + * newer kernel. The rest of the trailing bytes in @dst (@ksize - @usize)
> > + * are to be zero-filled.
> > + * * If @usize > @ksize, then the userspace has passed a new struct to an
> > + * older kernel. The trailing bytes unknown to the kernel (@usize - @ksize)
> > + * are checked to ensure they are zeroed, otherwise -E2BIG is returned.
> > + *
> > + * Returns (in all cases, some data may have been copied):
> > + * * -E2BIG: (@usize > @ksize) and there are non-zero trailing bytes in @src.
> > + * * -EFAULT: access to userspace failed.
> > + */
> > +static __always_inline
> > +int copy_struct_from_user(void *dst, size_t ksize,
> > + const void __user *src, size_t usize)
>
> And of course I forgot to realize both this and check_zeroed_user()
> should also have the __must_check attribute. Sorry for forgetting that
> earlier!
Just said to Aleksa that I'll just fix this up when I apply so he
doesn't have to resend. You ok with this, Kees?
>
> With that, please consider it:
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
^ permalink raw reply
* Re: [PATCH v4 2/4] clone3: switch to copy_struct_from_user()
From: Christian Brauner @ 2019-10-01 2:32 UTC (permalink / raw)
To: Aleksa Sarai
Cc: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Kees Cook, Rasmus Villemoes, Al Viro,
Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <20191001011055.19283-3-cyphar@cyphar.com>
On Tue, Oct 01, 2019 at 11:10:53AM +1000, Aleksa Sarai wrote:
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls. Additionally, explicitly
> define CLONE_ARGS_SIZE_VER0 to match the other users of the
> struct-extension pattern.
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
^ permalink raw reply
* Re: [PATCH v4 3/4] sched_setattr: switch to copy_struct_from_user()
From: Christian Brauner @ 2019-10-01 2:33 UTC (permalink / raw)
To: Aleksa Sarai
Cc: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Kees Cook, Rasmus Villemoes, Al Viro,
Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <20191001011055.19283-4-cyphar@cyphar.com>
On Tue, Oct 01, 2019 at 11:10:54AM +1000, Aleksa Sarai wrote:
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls. Ideally we could also
> unify sched_getattr(2)-style syscalls as well, but unfortunately the
> correct semantics for such syscalls are much less clear (see [1] for
> more detail). In future we could come up with a more sane idea for how
> the syscall interface should look.
>
> [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and
> robustify sched_read_attr() ABI logic and code")
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
^ permalink raw reply
* Re: [PATCH v4 4/4] perf_event_open: switch to copy_struct_from_user()
From: Christian Brauner @ 2019-10-01 2:36 UTC (permalink / raw)
To: Aleksa Sarai
Cc: Ingo Molnar, Peter Zijlstra, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Kees Cook, Rasmus Villemoes, Al Viro,
Linus Torvalds, libc-alpha, linux-api, linux-kernel
In-Reply-To: <20191001011055.19283-5-cyphar@cyphar.com>
On Tue, Oct 01, 2019 at 11:10:55AM +1000, Aleksa Sarai wrote:
> The change is very straightforward, and helps unify the syscall
> interface for struct-from-userspace syscalls.
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
^ permalink raw reply
* Re: [PATCH v13 7/9] open: openat2(2) syscall
From: kbuild test robot @ 2019-10-01 5:06 UTC (permalink / raw)
Cc: linux-ia64, linux-sh, Peter Zijlstra, Rasmus Villemoes,
Alexei Starovoitov, linux-mips, David Howells, linux-kselftest,
sparclinux, Jiri Olsa, linux-arch, linux-s390, Tycho Andersen,
Aleksa Sarai, Shuah Khan, Alexander Shishkin, Ingo Molnar,
Christian Brauner, Eric Biederman, linux-xtensa, Kees Cook,
Arnd Bergmann, Jann Horn, Oleg Nesterov, Aleksa Sarai, Al Viro
In-Reply-To: <20190930183316.10190-8-cyphar@cyphar.com>
Hi Aleksa,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on linus/master]
[cannot apply to v5.4-rc1 next-20191001]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Aleksa-Sarai/namei-openat2-2-path-resolution-restrictions/20191001-025628
reproduce:
# apt-get install sparse
# sparse version: v0.6.1-rc1-37-gd466a02-dirty
make ARCH=x86_64 allmodconfig
make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'
If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
sparse warnings: (new ones prefixed by >>)
fs/open.c:757:13: sparse: sparse: restricted fmode_t degrades to integer
fs/open.c:983:18: sparse: sparse: restricted fmode_t degrades to integer
>> fs/open.c:1011:36: sparse: sparse: invalid assignment: |=
>> fs/open.c:1011:36: sparse: left side has type int
>> fs/open.c:1011:36: sparse: right side has type restricted fmode_t
fs/open.c:1013:36: sparse: sparse: invalid assignment: |=
fs/open.c:1013:36: sparse: left side has type int
fs/open.c:1013:36: sparse: right side has type restricted fmode_t
>> fs/open.c:1029:24: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted fmode_t [usertype] opath_mask @@ got pe] opath_mask @@
>> fs/open.c:1029:24: sparse: expected restricted fmode_t [usertype] opath_mask
>> fs/open.c:1029:24: sparse: got int opath_mask
>> fs/open.c:1011:36: sparse: sparse: invalid assignment: |=
>> fs/open.c:1011:36: sparse: left side has type int
>> fs/open.c:1011:36: sparse: right side has type restricted fmode_t
fs/open.c:1013:36: sparse: sparse: invalid assignment: |=
fs/open.c:1013:36: sparse: left side has type int
fs/open.c:1013:36: sparse: right side has type restricted fmode_t
>> fs/open.c:1029:24: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted fmode_t [usertype] opath_mask @@ got pe] opath_mask @@
>> fs/open.c:1029:24: sparse: expected restricted fmode_t [usertype] opath_mask
>> fs/open.c:1029:24: sparse: got int opath_mask
>> fs/open.c:1011:36: sparse: sparse: invalid assignment: |=
>> fs/open.c:1011:36: sparse: left side has type int
>> fs/open.c:1011:36: sparse: right side has type restricted fmode_t
fs/open.c:1013:36: sparse: sparse: invalid assignment: |=
fs/open.c:1013:36: sparse: left side has type int
fs/open.c:1013:36: sparse: right side has type restricted fmode_t
>> fs/open.c:1029:24: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted fmode_t [usertype] opath_mask @@ got pe] opath_mask @@
>> fs/open.c:1029:24: sparse: expected restricted fmode_t [usertype] opath_mask
>> fs/open.c:1029:24: sparse: got int opath_mask
fs/open.c:1173:15: sparse: sparse: undefined identifier 'copy_struct_from_user'
vim +1011 fs/open.c
957
958 static inline int build_open_flags(const struct open_how *how,
959 struct open_flags *op)
960 {
961 int flags = how->flags;
962 int lookup_flags = 0;
963 int opath_mask = 0;
964 int acc_mode = ACC_MODE(flags);
965
966 /*
967 * Older syscalls still clear these bits before calling
968 * build_open_flags(), but openat2(2) checks all its arguments.
969 */
970 if (flags & ~VALID_OPEN_FLAGS)
971 return -EINVAL;
972 if (how->resolve & ~VALID_RESOLVE_FLAGS)
973 return -EINVAL;
974 if (!(how->flags & (O_PATH | O_CREAT | __O_TMPFILE)) && how->mode != 0)
975 return -EINVAL;
976
977 if (flags & (O_CREAT | __O_TMPFILE))
978 op->mode = (how->mode & S_IALLUGO) | S_IFREG;
979 else
980 op->mode = 0;
981
982 /* Must never be set by userspace */
> 983 flags &= ~FMODE_NONOTIFY & ~O_CLOEXEC;
984
985 /*
986 * O_SYNC is implemented as __O_SYNC|O_DSYNC. As many places only
987 * check for O_DSYNC if the need any syncing at all we enforce it's
988 * always set instead of having to deal with possibly weird behaviour
989 * for malicious applications setting only __O_SYNC.
990 */
991 if (flags & __O_SYNC)
992 flags |= O_DSYNC;
993
994 if (flags & __O_TMPFILE) {
995 if ((flags & O_TMPFILE_MASK) != O_TMPFILE)
996 return -EINVAL;
997 if (!(acc_mode & MAY_WRITE))
998 return -EINVAL;
999 } else if (flags & O_PATH) {
1000 /*
1001 * If we have O_PATH in the open flag. Then we
1002 * cannot have anything other than the below set of flags
1003 */
1004 flags &= O_DIRECTORY | O_NOFOLLOW | O_PATH;
1005 acc_mode = 0;
1006
1007 /* Allow userspace to restrict the re-opening of O_PATH fds. */
1008 if (how->upgrade_mask & ~VALID_UPGRADE_FLAGS)
1009 return -EINVAL;
1010 if (!(how->upgrade_mask & UPGRADE_NOREAD))
> 1011 opath_mask |= FMODE_PATH_READ;
1012 if (!(how->upgrade_mask & UPGRADE_NOWRITE))
1013 opath_mask |= FMODE_PATH_WRITE;
1014 }
1015
1016 op->open_flag = flags;
1017
1018 /* O_TRUNC implies we need access checks for write permissions */
1019 if (flags & O_TRUNC)
1020 acc_mode |= MAY_WRITE;
1021
1022 /* Allow the LSM permission hook to distinguish append
1023 access from general write access. */
1024 if (flags & O_APPEND)
1025 acc_mode |= MAY_APPEND;
1026
1027 op->acc_mode = acc_mode;
1028 op->intent = flags & O_PATH ? 0 : LOOKUP_OPEN;
> 1029 op->opath_mask = opath_mask;
1030
1031 if (flags & O_CREAT) {
1032 op->intent |= LOOKUP_CREATE;
1033 if (flags & O_EXCL)
1034 op->intent |= LOOKUP_EXCL;
1035 }
1036
1037 if (flags & O_DIRECTORY)
1038 lookup_flags |= LOOKUP_DIRECTORY;
1039 if (!(flags & O_NOFOLLOW))
1040 lookup_flags |= LOOKUP_FOLLOW;
1041 if (flags & O_EMPTYPATH)
1042 lookup_flags |= LOOKUP_EMPTY;
1043
1044 if (how->resolve & RESOLVE_NO_XDEV)
1045 lookup_flags |= LOOKUP_NO_XDEV;
1046 if (how->resolve & RESOLVE_NO_MAGICLINKS)
1047 lookup_flags |= LOOKUP_NO_MAGICLINKS;
1048 if (how->resolve & RESOLVE_NO_SYMLINKS)
1049 lookup_flags |= LOOKUP_NO_SYMLINKS;
1050 if (how->resolve & RESOLVE_BENEATH)
1051 lookup_flags |= LOOKUP_BENEATH;
1052 if (how->resolve & RESOLVE_IN_ROOT)
1053 lookup_flags |= LOOKUP_IN_ROOT;
1054
1055 op->lookup_flags = lookup_flags;
1056 return 0;
1057 }
1058
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
^ permalink raw reply
* Re: [PATCH -next] treewide: remove unused argument in lock_release()
From: Qian Cai @ 2019-10-01 12:46 UTC (permalink / raw)
To: Peter Zijlstra
Cc: akpm, mingo, will, linux-kernel, linux-api, maarten.lankhorst,
mripard, sean, airlied, daniel, dri-devel, gregkh, jslaby, viro,
linux-fsdevel, joonas.lahtinen, rodrigo.vivi, intel-gfx, tytso,
jack, linux-ext4, tj, mark, jlbec, joseph.qi, ocfs2-devel, davem,
st, daniel, netdev, bpf, duyuyang, juri.lelli, vincent.guittot
In-Reply-To: <20190930072938.GK4553@hirez.programming.kicks-ass.net>
On Mon, 2019-09-30 at 09:29 +0200, Peter Zijlstra wrote:
> On Thu, Sep 19, 2019 at 12:09:40PM -0400, Qian Cai wrote:
> > Since the commit b4adfe8e05f1 ("locking/lockdep: Remove unused argument
> > in __lock_release"), @nested is no longer used in lock_release(), so
> > remove it from all lock_release() calls and friends.
>
> Right; I never did this cleanup for not wanting the churn, but as long
> as it applies I'll take it.
Not sure when you would like to merge this. As you know, the longer it is
pending, the more churn it could have. If you could give me rough timeline
(i.e., aim for v5.4-rc2 or v5.5), I'll double-check for breakage and rebase it
if necessary.
^ permalink raw reply
* Re: [PATCH] io_uring: use __kernel_timespec in timeout ABI
From: Jens Axboe @ 2019-10-01 14:09 UTC (permalink / raw)
To: Arnd Bergmann
Cc: y2038, linux-api, Alexander Viro, Stefan Bühler,
Hannes Reinecke, Jackie Liu, Andrew Morton, Hristo Venev,
linux-block, linux-fsdevel, linux-kernel
In-Reply-To: <20190930202055.1748710-1-arnd@arndb.de>
On 9/30/19 2:20 PM, Arnd Bergmann wrote:
> All system calls use struct __kernel_timespec instead of the old struct
> timespec, but this one was just added with the old-style ABI. Change it
> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
> the need for compat handlers on 32-bit architectures.
>
> Any user space caller will have to use __kernel_timespec now, but this
> is unambiguous and works for any C library regardless of the time_t
> definition. A nicer way to specify the timeout would have been a less
> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
> change that as this would impact both 32-bit and 64-bit users.
Thanks for catching that, Arnd. Applied.
--
Jens Axboe
^ permalink raw reply
* Re: [PATCH] io_uring: use __kernel_timespec in timeout ABI
From: Jens Axboe @ 2019-10-01 15:38 UTC (permalink / raw)
To: Arnd Bergmann
Cc: y2038, linux-api, Alexander Viro, Stefan Bühler,
Hannes Reinecke, Jackie Liu, Andrew Morton, Hristo Venev,
linux-block, linux-fsdevel, linux-kernel
In-Reply-To: <8d5d34da-e1f0-1ab5-461e-f3145e52c48a@kernel.dk>
On 10/1/19 8:09 AM, Jens Axboe wrote:
> On 9/30/19 2:20 PM, Arnd Bergmann wrote:
>> All system calls use struct __kernel_timespec instead of the old struct
>> timespec, but this one was just added with the old-style ABI. Change it
>> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
>> the need for compat handlers on 32-bit architectures.
>>
>> Any user space caller will have to use __kernel_timespec now, but this
>> is unambiguous and works for any C library regardless of the time_t
>> definition. A nicer way to specify the timeout would have been a less
>> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
>> change that as this would impact both 32-bit and 64-bit users.
>
> Thanks for catching that, Arnd. Applied.
On second thought - since there appears to be no good 64-bit timespec
available to userspace, the alternative here is including on in liburing.
That seems kinda crappy in terms of API, so why not just use a 64-bit nsec
value as you suggest? There's on released kernel with this feature yet, so
there's nothing stopping us from just changing the API to be based on
a single 64-bit nanosecond timeout.
diff --git a/fs/io_uring.c b/fs/io_uring.c
index dd094b387cab..de3d14fe3025 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1892,16 +1892,13 @@ static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
unsigned count, req_dist, tail_index;
struct io_ring_ctx *ctx = req->ctx;
struct list_head *entry;
- struct timespec ts;
+ u64 timeout;
if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
return -EINVAL;
if (sqe->flags || sqe->ioprio || sqe->buf_index || sqe->timeout_flags ||
sqe->len != 1)
return -EINVAL;
- if (copy_from_user(&ts, (void __user *) (unsigned long) sqe->addr,
- sizeof(ts)))
- return -EFAULT;
/*
* sqe->off holds how many events that need to occur for this
@@ -1932,9 +1929,10 @@ static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
list_add(&req->list, entry);
spin_unlock_irq(&ctx->completion_lock);
+ timeout = READ_ONCE(sqe->addr);
hrtimer_init(&req->timeout.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
req->timeout.timer.function = io_timeout_fn;
- hrtimer_start(&req->timeout.timer, timespec_to_ktime(ts),
+ hrtimer_start(&req->timeout.timer, ns_to_ktime(timeout),
HRTIMER_MODE_REL);
return 0;
}
--
Jens Axboe
^ permalink raw reply related
* Re: [PATCH] io_uring: use __kernel_timespec in timeout ABI
From: Arnd Bergmann @ 2019-10-01 15:49 UTC (permalink / raw)
To: Jens Axboe
Cc: y2038 Mailman List, Linux API, Alexander Viro, Stefan Bühler,
Hannes Reinecke, Jackie Liu, Andrew Morton, Hristo Venev,
linux-block, Linux FS-devel Mailing List,
linux-kernel@vger.kernel.org
In-Reply-To: <623e1d27-d3b1-3241-bfd4-eb94ce70da14@kernel.dk>
On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 10/1/19 8:09 AM, Jens Axboe wrote:
> > On 9/30/19 2:20 PM, Arnd Bergmann wrote:
> >> All system calls use struct __kernel_timespec instead of the old struct
> >> timespec, but this one was just added with the old-style ABI. Change it
> >> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
> >> the need for compat handlers on 32-bit architectures.
> >>
> >> Any user space caller will have to use __kernel_timespec now, but this
> >> is unambiguous and works for any C library regardless of the time_t
> >> definition. A nicer way to specify the timeout would have been a less
> >> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
> >> change that as this would impact both 32-bit and 64-bit users.
> >
> > Thanks for catching that, Arnd. Applied.
>
> On second thought - since there appears to be no good 64-bit timespec
> available to userspace, the alternative here is including on in liburing.
What's wrong with using __kernel_timespec? Just the name?
I suppose liburing could add a macro to give it a different name
for its users.
> That seems kinda crappy in terms of API, so why not just use a 64-bit nsec
> value as you suggest? There's on released kernel with this feature yet, so
> there's nothing stopping us from just changing the API to be based on
> a single 64-bit nanosecond timeout.
Certainly fine with me.
> + timeout = READ_ONCE(sqe->addr);
> hrtimer_init(&req->timeout.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> req->timeout.timer.function = io_timeout_fn;
> - hrtimer_start(&req->timeout.timer, timespec_to_ktime(ts),
> + hrtimer_start(&req->timeout.timer, ns_to_ktime(timeout),
It seems a little odd to use the 'addr' field as something that's not
an address,
and I'm not sure I understand the logic behind when you use a READ_ONCE()
as opposed to simply accessing the sqe the way it is done a few lines
earlier.
The time handling definitely looks good to me.
Arnd
^ permalink raw reply
* Re: [PATCH] io_uring: use __kernel_timespec in timeout ABI
From: Jens Axboe @ 2019-10-01 15:52 UTC (permalink / raw)
To: Arnd Bergmann
Cc: y2038 Mailman List, Linux API, Alexander Viro, Stefan Bühler,
Hannes Reinecke, Jackie Liu, Andrew Morton, Hristo Venev,
linux-block, Linux FS-devel Mailing List,
linux-kernel@vger.kernel.org
In-Reply-To: <CAK8P3a3AAFXNmpQwuirzM+jgEQGj9tMC_5oaSs4CfiEVGmTkZg@mail.gmail.com>
On 10/1/19 9:49 AM, Arnd Bergmann wrote:
> On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 10/1/19 8:09 AM, Jens Axboe wrote:
>>> On 9/30/19 2:20 PM, Arnd Bergmann wrote:
>>>> All system calls use struct __kernel_timespec instead of the old struct
>>>> timespec, but this one was just added with the old-style ABI. Change it
>>>> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
>>>> the need for compat handlers on 32-bit architectures.
>>>>
>>>> Any user space caller will have to use __kernel_timespec now, but this
>>>> is unambiguous and works for any C library regardless of the time_t
>>>> definition. A nicer way to specify the timeout would have been a less
>>>> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
>>>> change that as this would impact both 32-bit and 64-bit users.
>>>
>>> Thanks for catching that, Arnd. Applied.
>>
>> On second thought - since there appears to be no good 64-bit timespec
>> available to userspace, the alternative here is including on in liburing.
>
> What's wrong with using __kernel_timespec? Just the name?
> I suppose liburing could add a macro to give it a different name
> for its users.
Just that it seems I need to make it available through liburing on
systems that don't have it yet. Not a big deal, though.
>> That seems kinda crappy in terms of API, so why not just use a 64-bit nsec
>> value as you suggest? There's on released kernel with this feature yet, so
>> there's nothing stopping us from just changing the API to be based on
>> a single 64-bit nanosecond timeout.
>
> Certainly fine with me.
>
>> + timeout = READ_ONCE(sqe->addr);
>> hrtimer_init(&req->timeout.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>> req->timeout.timer.function = io_timeout_fn;
>> - hrtimer_start(&req->timeout.timer, timespec_to_ktime(ts),
>> + hrtimer_start(&req->timeout.timer, ns_to_ktime(timeout),
>
> It seems a little odd to use the 'addr' field as something that's not
> an address,
> and I'm not sure I understand the logic behind when you use a READ_ONCE()
> as opposed to simply accessing the sqe the way it is done a few lines
> earlier.
>
> The time handling definitely looks good to me.
One thing that struck me about this approach - we then lose the ability to
differentiate between "don't want a timed timeout" with ts == NULL, vs
tv_sec and tv_nsec both being 0.
I think I'll stuck with that you had and just use __kernel_timespec in
liburing.
--
Jens Axboe
^ permalink raw reply
* Re: [PATCH] io_uring: use __kernel_timespec in timeout ABI
From: Arnd Bergmann @ 2019-10-01 15:57 UTC (permalink / raw)
To: Jens Axboe
Cc: Hannes Reinecke, y2038 Mailman List, Linux API, Jackie Liu,
linux-kernel@vger.kernel.org, linux-block, Stefan Bühler,
Alexander Viro, Linux FS-devel Mailing List, Andrew Morton,
Hristo Venev
In-Reply-To: <ca0a5bbe-c20e-d5be-110e-942c604ad2d7@kernel.dk>
On Tue, Oct 1, 2019 at 5:52 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 10/1/19 9:49 AM, Arnd Bergmann wrote:
> > On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:
> > What's wrong with using __kernel_timespec? Just the name?
> > I suppose liburing could add a macro to give it a different name
> > for its users.
>
> Just that it seems I need to make it available through liburing on
> systems that don't have it yet. Not a big deal, though.
Ah, right. I t would not cover the case of building against kernel
headers earlier than linux-5.1 but running on a 5.4+ kernel.
I assumed that that you would require new kernel headers anyway,
but if you have a copy of the io_uring header, that is not necessary.
> One thing that struck me about this approach - we then lose the ability to
> differentiate between "don't want a timed timeout" with ts == NULL, vs
> tv_sec and tv_nsec both being 0.
You could always define a special constant such as
'#define IO_URING_TIMEOUT_NEVER -1ull' if you want to
support for 'never wait if it's not already done' and 'wait indefinitely'.
> I think I'll stuck with that you had and just use __kernel_timespec in
> liburing.
Ok.
Arnd
_______________________________________________
Y2038 mailing list
Y2038@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/y2038
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox