Linux userland API discussions
* Re: [PATCH V34 23/29] bpf: Restrict bpf when kernel lockdown is in confidentiality mode
From: Kees Cook @ 2019-06-23  0:09 UTC
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	David Howells, Alexei Starovoitov, Matthew Garrett, netdev,
	Chun-Yi Lee, Daniel Borkmann
In-Reply-To: <20190622000358.19895-24-matthewgarrett@google.com>

On Fri, Jun 21, 2019 at 05:03:52PM -0700, Matthew Garrett wrote:
> From: David Howells <dhowells@redhat.com>
> 
> There are some bpf functions can be used to read kernel memory:
> bpf_probe_read, bpf_probe_write_user and bpf_trace_printk.  These allow
> private keys in kernel memory (e.g. the hibernation image signing key) to
> be read by an eBPF program and kernel memory to be altered without
> restriction. Disable them if the kernel has been locked down in
> confidentiality mode.
> 
> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: David Howells <dhowells@redhat.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> Signed-off-by: Matthew Garrett <mjg59@google.com>
> cc: netdev@vger.kernel.org
> cc: Chun-Yi Lee <jlee@suse.com>
> cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> ---
>  include/linux/security.h     |  1 +
>  kernel/trace/bpf_trace.c     | 20 +++++++++++++++++++-
>  security/lockdown/lockdown.c |  1 +
>  3 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/security.h b/include/linux/security.h
> index e6e3e2403474..de0d37b1fe79 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -97,6 +97,7 @@ enum lockdown_reason {
>  	LOCKDOWN_INTEGRITY_MAX,
>  	LOCKDOWN_KCORE,
>  	LOCKDOWN_KPROBES,
> +	LOCKDOWN_BPF_READ,
>  	LOCKDOWN_CONFIDENTIALITY_MAX,
>  };
>  
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index d64c00afceb5..638f9b00a8df 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -137,6 +137,10 @@ BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr)
>  {
>  	int ret;
>  
> +	ret = security_locked_down(LOCKDOWN_BPF_READ);
> +	if (ret)
> +		return ret;
> +
>  	ret = probe_kernel_read(dst, unsafe_ptr, size);
>  	if (unlikely(ret < 0))
>  		memset(dst, 0, size);
> @@ -156,6 +160,12 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
>  BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
>  	   u32, size)
>  {
> +	int ret;
> +
> +	ret = security_locked_down(LOCKDOWN_BPF_READ);
> +	if (ret)
> +		return ret;
> +
>  	/*
>  	 * Ensure we're in user context which is safe for the helper to
>  	 * run. This helper has no business in a kthread.
> @@ -205,7 +215,11 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
>  	int fmt_cnt = 0;
>  	u64 unsafe_addr;
>  	char buf[64];
> -	int i;
> +	int i, ret;
> +
> +	ret = security_locked_down(LOCKDOWN_BPF_READ);
> +	if (ret)
> +		return ret;
>  
>  	/*
>  	 * bpf_check()->check_func_arg()->check_stack_boundary()
> @@ -534,6 +548,10 @@ BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
>  {
>  	int ret;
>  
> +	ret = security_locked_down(LOCKDOWN_BPF_READ);
> +	if (ret)
> +		return ret;
> +
>  	/*
>  	 * The strncpy_from_unsafe() call will likely not fill the entire
>  	 * buffer, but that's okay in this circumstance as we're probing
> diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
> index 5a08c17f224d..2eea2cc13117 100644
> --- a/security/lockdown/lockdown.c
> +++ b/security/lockdown/lockdown.c
> @@ -33,6 +33,7 @@ static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
>  	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
>  	[LOCKDOWN_KCORE] = "/proc/kcore access",
>  	[LOCKDOWN_KPROBES] = "use of kprobes",
> +	[LOCKDOWN_BPF_READ] = "use of bpf to read kernel RAM",
>  	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
>  };
>  
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

-- 
Kees Cook


* Re: [PATCH V34 24/29] Lock down perf when in confidentiality mode
From: Kees Cook @ 2019-06-23  0:12 UTC
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	David Howells, Matthew Garrett, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo
In-Reply-To: <20190622000358.19895-25-matthewgarrett@google.com>

On Fri, Jun 21, 2019 at 05:03:53PM -0700, Matthew Garrett wrote:
> From: David Howells <dhowells@redhat.com>
> 
> Disallow the use of certain perf facilities that might allow userspace to
> access kernel data.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> ---
>  include/linux/security.h     | 1 +
>  kernel/events/core.c         | 7 +++++++
>  security/lockdown/lockdown.c | 1 +
>  3 files changed, 9 insertions(+)
> 
> diff --git a/include/linux/security.h b/include/linux/security.h
> index de0d37b1fe79..53ea85889a48 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -98,6 +98,7 @@ enum lockdown_reason {
>  	LOCKDOWN_KCORE,
>  	LOCKDOWN_KPROBES,
>  	LOCKDOWN_BPF_READ,
> +	LOCKDOWN_PERF,
>  	LOCKDOWN_CONFIDENTIALITY_MAX,
>  };
>  
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 72d06e302e99..77f36551756e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10731,6 +10731,13 @@ SYSCALL_DEFINE5(perf_event_open,
>  			return -EINVAL;
>  	}
>  
> +	err = security_locked_down(LOCKDOWN_PERF);
> +	if (err && (attr.sample_type & PERF_SAMPLE_REGS_INTR))
> +		/* REGS_INTR can leak data, lockdown must prevent this */
> +		return err;
> +	else
> +		err = 0;
> +
>  	/* Only privileged users can get physical addresses */
>  	if ((attr.sample_type & PERF_SAMPLE_PHYS_ADDR) &&
>  	    perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))

With moar capable() ordering fixed...

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
> index 2eea2cc13117..a7e75c614416 100644
> --- a/security/lockdown/lockdown.c
> +++ b/security/lockdown/lockdown.c
> @@ -34,6 +34,7 @@ static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
>  	[LOCKDOWN_KCORE] = "/proc/kcore access",
>  	[LOCKDOWN_KPROBES] = "use of kprobes",
>  	[LOCKDOWN_BPF_READ] = "use of bpf to read kernel RAM",
> +	[LOCKDOWN_PERF] = "unsafe use of perf",
>  	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
>  };
>  
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

-- 
Kees Cook


* Re: [PATCH V34 28/29] efi: Restrict efivar_ssdt_load when the kernel is locked down
From: Kees Cook @ 2019-06-23  0:14 UTC
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	Matthew Garrett, Ard Biesheuvel, linux-efi
In-Reply-To: <20190622000358.19895-29-matthewgarrett@google.com>

On Fri, Jun 21, 2019 at 05:03:57PM -0700, Matthew Garrett wrote:
> efivar_ssdt_load allows the kernel to import arbitrary ACPI code from an
> EFI variable, which gives arbitrary code execution in ring 0. Prevent
> that when the kernel is locked down.
> 
> Signed-off-by: Matthew Garrett <mjg59@google.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: linux-efi@vger.kernel.org
> ---
>  drivers/firmware/efi/efi.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 55b77c576c42..9f92a013ab27 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -31,6 +31,7 @@
>  #include <linux/acpi.h>
>  #include <linux/ucs2_string.h>
>  #include <linux/memblock.h>
> +#include <linux/security.h>
>  
>  #include <asm/early_ioremap.h>
>  
> @@ -242,6 +243,11 @@ static void generic_ops_unregister(void)
>  static char efivar_ssdt[EFIVAR_SSDT_NAME_MAX] __initdata;
>  static int __init efivar_ssdt_setup(char *str)
>  {
> +	int ret = security_locked_down(LOCKDOWN_ACPI_TABLES);
> +
> +	if (ret)
> +		return ret;
> +
>  	if (strlen(str) < sizeof(efivar_ssdt))
>  		memcpy(efivar_ssdt, str, strlen(str));
>  	else
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

-- 
Kees Cook


* Re: [PATCH V34 29/29] lockdown: Print current->comm in restriction messages
From: Kees Cook @ 2019-06-23  0:25 UTC
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	David Howells, Matthew Garrett
In-Reply-To: <20190622000358.19895-30-matthewgarrett@google.com>

On Fri, Jun 21, 2019 at 05:03:58PM -0700, Matthew Garrett wrote:
> Print the content of current->comm in messages generated by lockdown to
> indicate a restriction that was hit.  This makes it a bit easier to find
> out what caused the message.
> 
> The message now patterned something like:
> 
>         Lockdown: <comm>: <what> is restricted; see man kernel_lockdown.7
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> ---
>  security/lockdown/lockdown.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
> index 98f9ee0026d5..9ca6f442fbc7 100644
> --- a/security/lockdown/lockdown.c
> +++ b/security/lockdown/lockdown.c
> @@ -83,8 +83,8 @@ static int lockdown_is_locked_down(enum lockdown_reason what)
>  {	
>  	if ((kernel_locked_down >= what)) {

To satisfy my paranoia, can you just add here:

		if (WARN(what > LOCKDOWN_..._MAX))
			return -EPERM;

With that:

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees
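
The requested range check can be tried out in userspace. What follows
is only a sketch under stated assumptions: the enum is a shortened copy,
LOCKDOWN_NONE and the _model suffix are assumptions made for
illustration, fprintf() stands in for WARN() and pr_notice(), and comm
is passed in explicitly instead of being read from current->comm. The
sketch also performs the range check before the level comparison so
that an out-of-range value is always refused, whereas the comment above
places it inside the if body.

```c
#include <errno.h>
#include <stdio.h>

/* Shortened, hypothetical userspace copy of enum lockdown_reason. */
enum lockdown_reason {
	LOCKDOWN_NONE,			/* assumed "not locked down" level */
	LOCKDOWN_MMIOTRACE,
	LOCKDOWN_INTEGRITY_MAX,
	LOCKDOWN_KCORE,
	LOCKDOWN_KPROBES,
	LOCKDOWN_CONFIDENTIALITY_MAX,
};

static const char *const lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX + 1] = {
	[LOCKDOWN_MMIOTRACE] = "unsafe mmio",
	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
	[LOCKDOWN_KCORE] = "/proc/kcore access",
	[LOCKDOWN_KPROBES] = "use of kprobes",
	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
};

/* Stand-in for the kernel_locked_down variable in lockdown.c. */
enum lockdown_reason kernel_locked_down = LOCKDOWN_NONE;

/* Returns -EPERM if 'what' is restricted at the current level, else 0. */
int lockdown_is_locked_down_model(const char *comm, enum lockdown_reason what)
{
	/* Refuse anything that would index past the end of lockdown_reasons[]. */
	if ((int)what < 0 || what > LOCKDOWN_CONFIDENTIALITY_MAX) {
		fprintf(stderr, "WARN: invalid lockdown reason %d\n", (int)what);
		return -EPERM;
	}

	if (kernel_locked_down >= what) {
		if (lockdown_reasons[what])
			fprintf(stderr,
				"Lockdown: %s: %s is restricted; see man kernel_lockdown.7\n",
				comm, lockdown_reasons[what]);
		return -EPERM;
	}

	return 0;
}
```

With kernel_locked_down set to LOCKDOWN_INTEGRITY_MAX, a query for
LOCKDOWN_MMIOTRACE is refused while LOCKDOWN_KCORE is still allowed; a
bogus value such as 99 is refused at any level.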

>  		if (lockdown_reasons[what])
> -			pr_notice("Lockdown: %s is restricted; see man kernel_lockdown.7\n",
> -				  lockdown_reasons[what]);
> +			pr_notice("Lockdown: %s: %s is restricted; see man kernel_lockdown.7\n",
> +				  current->comm, lockdown_reasons[what]);
>  		return -EPERM;
>  	}
>  
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

-- 
Kees Cook


* Re: [PATCH V34 22/29] Lock down tracing and perf kprobes when in confidentiality mode
From: Masami Hiramatsu @ 2019-06-23  1:57 UTC
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	David Howells, Alexei Starovoitov, Matthew Garrett,
	Naveen N . Rao, Anil S Keshavamurthy, davem, Masami Hiramatsu
In-Reply-To: <20190622000358.19895-23-matthewgarrett@google.com>

On Fri, 21 Jun 2019 17:03:51 -0700
Matthew Garrett <matthewgarrett@google.com> wrote:

> From: David Howells <dhowells@redhat.com>
> 
> Disallow the creation of perf and ftrace kprobes when the kernel is
> locked down in confidentiality mode by preventing their registration.
> This prevents kprobes from being used to access kernel memory to steal
> crypto data, but continues to allow the use of kprobes from signed
> modules.

Looks (and sounds) good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you,

> 
> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> Cc: davem@davemloft.net
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> ---
>  include/linux/security.h     | 1 +
>  kernel/trace/trace_kprobe.c  | 5 +++++
>  security/lockdown/lockdown.c | 1 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 3875f6df2ecc..e6e3e2403474 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -96,6 +96,7 @@ enum lockdown_reason {
>  	LOCKDOWN_MMIOTRACE,
>  	LOCKDOWN_INTEGRITY_MAX,
>  	LOCKDOWN_KCORE,
> +	LOCKDOWN_KPROBES,
>  	LOCKDOWN_CONFIDENTIALITY_MAX,
>  };
>  
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 5d5129b05df7..5a76a0f79d48 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -11,6 +11,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/rculist.h>
>  #include <linux/error-injection.h>
> +#include <linux/security.h>
>  
>  #include "trace_dynevent.h"
>  #include "trace_kprobe_selftest.h"
> @@ -415,6 +416,10 @@ static int __register_trace_kprobe(struct trace_kprobe *tk)
>  {
>  	int i, ret;
>  
> +	ret = security_locked_down(LOCKDOWN_KPROBES);
> +	if (ret)
> +		return ret;
> +
>  	if (trace_probe_is_registered(&tk->tp))
>  		return -EINVAL;
>  
> diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
> index 4c9b324dfc55..5a08c17f224d 100644
> --- a/security/lockdown/lockdown.c
> +++ b/security/lockdown/lockdown.c
> @@ -32,6 +32,7 @@ static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
>  	[LOCKDOWN_MMIOTRACE] = "unsafe mmio",
>  	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
>  	[LOCKDOWN_KCORE] = "/proc/kcore access",
> +	[LOCKDOWN_KPROBES] = "use of kprobes",
>  	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
>  };
>  
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCHv4 26/28] x86/vdso: Align VDSO functions by CPU L1 cache line
From: Andrei Vagin @ 2019-06-23  5:26 UTC
  To: Thomas Gleixner
  Cc: Dmitry Safonov, linux-kernel, Adrian Reber, Andrei Vagin,
	Andy Lutomirski, Arnd Bergmann, Christian Brauner,
	Cyrill Gorcunov, Dmitry Safonov, Eric W. Biederman,
	H. Peter Anvin, Ingo Molnar, Jann Horn, Jeff Dike, Oleg Nesterov,
	Pavel Emelyanov, Shuah Khan, Vincenzo Frascino, containers, criu,
	linux-api, x86
In-Reply-To: <alpine.DEB.2.21.1906141610060.1722@nanos.tec.linutronix.de>

On Fri, Jun 14, 2019 at 04:13:31PM +0200, Thomas Gleixner wrote:
> On Wed, 12 Jun 2019, Dmitry Safonov wrote:
> 
> > From: Andrei Vagin <avagin@gmail.com>
> > 
> > After performance testing VDSO patches a noticeable 20% regression was
> > found on gettime_perf selftest with a cold cache.
> > As it turns to be, before time namespaces introduction, VDSO functions
> > were quite aligned to cache lines, but adding a new code to adjust
> > timens offset inside namespace created a small shift and vdso functions
> > become unaligned on cache lines.
> > 
> > Add align to vdso functions with gcc option to fix performance drop.
> > 
> > Coping the resulting numbers from cover letter:
> > 
> > Hot CPU cache (more gettime_perf.c cycles - the better):
> >         | before     | CONFIG_TIME_NS=n | host        | inside timens
> > --------|------------|------------------|-------------|-------------
> > cycles  | 139887013  | 139453003        | 139899785   | 128792458
> > diff (%)| 100        | 99.7             | 100         | 92
> 
> Why is CONFIG_TIME_NS=n behaving worse than current mainline and
> worse than 'host' mode?

We should have specified the precision of these numbers; the noise is
larger than this 0.3%, so at that time I decided that there was nothing
to worry about. I did these measurements a few months ago for the
second version of this series. I repeated the measurements for this set
of patches:

        | before    | CONFIG_TIME_NS=n | host      | inside timens
--------------------------------------------------------------
        | 144645498 | 142916801        | 140364862 | 132378440
        | 143440633 | 141545739        | 140540053 | 132714190
        | 144876395 | 144650599        | 140026814 | 131843318
        | 143984551 | 144595770        | 140359260 | 131683544
        | 144875682 | 143799788        | 140692618 | 131300332
--------------------------------------------------------------
avg     | 144364551 | 143501739        | 140396721 | 131983964
diff %  | 100       | 99.4             | 97.2      | 91.4
-------------------------------------------------------------
stdev % | 0.4       | 0.9              | 0.1       | 0.4
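
The avg and diff % rows can be rechecked from the five raw runs. A
minimal sketch in plain C, assuming diff % is the column average
expressed as a percentage of the "before" average; the helper and array
names are illustrative and not part of the selftest:

```c
#include <stddef.h>

/* Arithmetic mean of n samples. */
double mean(const double *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Average of a column as a percentage of the baseline column's average. */
double percent_of(const double *v, const double *base, size_t n)
{
	return 100.0 * mean(v, n) / mean(base, n);
}

/* Raw values copied from the "before" and "inside timens" columns. */
const double runs_before[5] = {
	144645498, 143440633, 144876395, 143984551, 144875682,
};
const double runs_timens[5] = {
	132378440, 132714190, 131843318, 131683544, 131300332,
};
```

Calling mean(runs_before, 5) reproduces the avg row for that column,
and percent_of(runs_timens, runs_before, 5) gives the diff % value for
"inside timens".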

> 
> > Cold cache (lesser tsc per gettime_perf_cold.c cycle - the better):
> >         | before     | CONFIG_TIME_NS=n | host        | inside timens
> > --------|------------|------------------|-------------|-------------
> > tsc     | 6748       | 6718             | 6862        | 12682
> > diff (%)| 100        | 99.6             | 101.7       | 188
> 
> Weird, now CONFIG_TIME_NS=n is better than current mainline and 'host' mode
> drops.

These numbers are much less precise than the previous set. They are
from the second version of this series, so I decided to repeat the
measurements for this version. When I ran the test, I found some
degradation compared with v5.0. I bisected it and found that the
problem is in 2b539aefe9e4 ("mm/resource: Let walk_system_ram_range()
search child resources"). At this point, I realized that my test isn't
quite right. On each iteration, the test starts a new process, then
does start=rdtsc();clock_gettime();end=rdtsc() and prints (end-start).
The problem is that when clock_gettime() is called for the first time,
the vdso pages are not yet mapped into the process address space, so
the test measures how fast the vdso pages are mapped rather than how
fast clock_gettime() runs. I modified the test so that it now uses the
clflush instruction to drop CPU caches. Here are the results:

           | before    | CONFIG_TIME_NS=n | host      | inside timens
--------------------------------------------------------------
tsc        | 434       | 433              | 437       | 477
stdev(tsc) | 5         | 5                | 5         | 3
diff (%)   | 1         | 1                | 100.1     | 109

Here is the source code for the modified test:
https://github.com/avagin/linux-task-diag/blob/wip/timens-rfc-v4/tools/testing/selftests/timens/gettime_perf_cold.c

This test does 10K iterations. At first glance, the numbers look
noisy, so I sort them and take only the 8K numbers in the middle:

$ ./gettime_perf_cold > raw
$ cat raw | sort -n | tail -n 9000 | head -n 8000 > results
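
The same trimming can be done in C instead of the shell pipeline. The
sketch below sorts the samples, drops a given number of the smallest
and largest values and averages the rest; with 10000 samples and 1000
dropped at each end it keeps the same 8000 values as the commands
above. The function name is illustrative, and taking the mean of the
kept values is an assumption about how the results file is reduced:

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort() comparator for unsigned long long, ascending order. */
static int cmp_ull(const void *a, const void *b)
{
	unsigned long long x = *(const unsigned long long *)a;
	unsigned long long y = *(const unsigned long long *)b;

	return (x > y) - (x < y);
}

/*
 * Sort samples in place, discard the 'low' smallest and 'high' largest
 * values and return the mean of what is left. Returns 0.0 if the trim
 * would leave no samples.
 */
double trimmed_mean(unsigned long long *samples, size_t n,
		    size_t low, size_t high)
{
	double sum = 0.0;

	if (low + high >= n)
		return 0.0;

	qsort(samples, n, sizeof(*samples), cmp_ull);

	for (size_t i = low; i < n - high; i++)
		sum += (double)samples[i];

	return sum / (double)(n - low - high);
}
```

For the 10K-iteration run this would be called as
trimmed_mean(samples, 10000, 1000, 1000).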

> 
> Either I'm misreading the numbers or missing something or I'm just confused
> as usual :)
> 
> Thanks,
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
> 	tglx

^ permalink raw reply

* Re: [PATCH V34 20/29] x86/mmiotrace: Lock down the testmmiotrace module
From: Thomas Gleixner @ 2019-06-23 11:08 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	David Howells, Matthew Garrett, Steven Rostedt, Ingo Molnar,
	H. Peter Anvin, x86
In-Reply-To: <20190622000358.19895-21-matthewgarrett@google.com>



On Fri, 21 Jun 2019, Matthew Garrett wrote:

> From: David Howells <dhowells@redhat.com>
> 
> The testmmiotrace module shouldn't be permitted when the kernel is locked
> down as it can be used to arbitrarily read and write MMIO space. This is
> a runtime check rather than buildtime in order to allow configurations
> where the same kernel may be run in both locked down or permissive modes
> depending on local policy.
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> cc: Thomas Gleixner <tglx@linutronix.de>
> cc: Steven Rostedt <rostedt@goodmis.org>
> cc: Ingo Molnar <mingo@kernel.org>
> cc: "H. Peter Anvin" <hpa@zytor.com>
> cc: x86@kernel.org

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply

* [PATCH 1/2] CLONE_PIDFD: do not use the value pointed by parent_tidptr
From: Dmitry V. Levin @ 2019-06-23 11:27 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190621221339.6yj4vg4zexv4y2j7@brauner.io>

Userspace needs a cheap and reliable way to tell whether CLONE_PIDFD
is supported by the kernel.

Older kernels without CLONE_PIDFD support leave the value pointed to
by parent_tidptr unchanged, whereas the current implementation fails
with EINVAL if that value is non-zero.

If CLONE_PIDFD is supported and fd 0 is closed, the returned pidfd is 0,
so the mandatory initial value 0 pointed to by parent_tidptr appears
unchanged as well.  In effect, userspace must either check for
CLONE_PIDFD support beforehand or ensure that fd 0 is not closed when
invoking CLONE_PIDFD.

The check for pidfd == 0 was introduced during the v5.2 release cycle
by commit b3e583825266 ("clone: add CLONE_PIDFD") so that CLONE_PIDFD
could potentially be extended by passing in flags through the return
argument.

However, that extension would look horrendous, and with the introduction
of the clone3 syscall in v5.3 there is no need to extend the legacy
clone syscall this way.

So remove the pidfd == 0 check.  Userspace that needs to be portable
to kernels without CLONE_PIDFD support is advised to initialize pidfd
with -1 and check the pidfd value returned by CLONE_PIDFD.

Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
---
 kernel/fork.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 75675b9bf6df..39a3adaa4ad1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1822,8 +1822,6 @@ static __latent_entropy struct task_struct *copy_process(
 	}
 
 	if (clone_flags & CLONE_PIDFD) {
-		int reserved;
-
 		/*
 		 * - CLONE_PARENT_SETTID is useless for pidfds and also
 		 *   parent_tidptr is used to return pidfds.
@@ -1834,16 +1832,6 @@ static __latent_entropy struct task_struct *copy_process(
 		if (clone_flags &
 		    (CLONE_DETACHED | CLONE_PARENT_SETTID | CLONE_THREAD))
 			return ERR_PTR(-EINVAL);
-
-		/*
-		 * Verify that parent_tidptr is sane so we can potentially
-		 * reuse it later.
-		 */
-		if (get_user(reserved, parent_tidptr))
-			return ERR_PTR(-EFAULT);
-
-		if (reserved != 0)
-			return ERR_PTR(-EINVAL);
 	}
 
 	/*
-- 
ldv

^ permalink raw reply related
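
A minimal userspace sketch of the detection scheme the commit message
above recommends: preset pidfd to -1, call clone with CLONE_PIDFD, and
see whether the value changed.  The helper names are hypothetical, the
raw argument order assumes x86-64, and on a kernel that still carries
the pidfd == 0 check the -1 preset is rejected with EINVAL, which this
sketch reports as -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000	/* value from the v5.2 uapi headers */
#endif

/*
 * Fork-like clone that asks for a pidfd.  *pidfd is preset to -1 so that
 * a kernel which ignores CLONE_PIDFD leaves a recognisable value behind.
 * Assumption: x86-64 raw clone argument order (flags, stack, parent_tid,
 * child_tid, tls); other architectures may order these differently.
 */
static pid_t clone_with_pidfd(int *pidfd)
{
	assert(pidfd != NULL);
	*pidfd = -1;
	return (pid_t)syscall(__NR_clone,
			      (unsigned long)(CLONE_PIDFD | SIGCHLD),
			      0UL, pidfd, 0UL, 0UL);
}

/* Returns 1 if CLONE_PIDFD works, 0 if it was ignored, -1 if clone failed. */
static int clone_pidfd_supported(void)
{
	int pidfd;
	pid_t pid = clone_with_pidfd(&pidfd);

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);		/* child: nothing to do */

	waitpid(pid, NULL, 0);		/* parent: reap the child */
	if (pidfd == -1)
		return 0;		/* kernel left the preset untouched */
	close(pidfd);
	return 1;
}
```

A caller would treat 0 as "fall back to plain pids" and -1 as an
ordinary clone failure to be diagnosed through errno.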

* [PATCH 2/2] samples: make pidfd-metadata fail gracefully on older kernels
From: Dmitry V. Levin @ 2019-06-23 11:28 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190623112717.GA20697@altlinux.org>

Initialize pidfd to an invalid descriptor, to fail gracefully on
those kernels that do not implement CLONE_PIDFD and leave pidfd
unchanged.

Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
---
 samples/pidfd/pidfd-metadata.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/samples/pidfd/pidfd-metadata.c b/samples/pidfd/pidfd-metadata.c
index 14b454448429..c459155daf9a 100644
--- a/samples/pidfd/pidfd-metadata.c
+++ b/samples/pidfd/pidfd-metadata.c
@@ -83,7 +83,7 @@ static int pidfd_metadata_fd(pid_t pid, int pidfd)
 
 int main(int argc, char *argv[])
 {
-	int pidfd = 0, ret = EXIT_FAILURE;
+	int pidfd = -1, ret = EXIT_FAILURE;
 	char buf[4096] = { 0 };
 	pid_t pid;
 	int procfd, statusfd;
@@ -91,7 +91,11 @@ int main(int argc, char *argv[])
 
 	pid = pidfd_clone(CLONE_PIDFD, &pidfd);
 	if (pid < 0)
-		exit(ret);
+		err(ret, "CLONE_PIDFD");
+	if (pidfd == -1) {
+		warnx("CLONE_PIDFD is not supported by the kernel");
+		goto out;
+	}
 
 	procfd = pidfd_metadata_fd(pid, pidfd);
 	close(pidfd);
-- 
ldv

^ permalink raw reply related

* Re: [PATCH] samples: make pidfd-metadata fail gracefully on older kernels
From: Dmitry V. Levin @ 2019-06-23 11:32 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190621221339.6yj4vg4zexv4y2j7@brauner.io>

[-- Attachment #1: Type: text/plain, Size: 505 bytes --]

On Sat, Jun 22, 2019 at 12:13:39AM +0200, Christian Brauner wrote:
[...]
> Out of curiosity: what makes the new flag different than say
> CLONE_NEWCGROUP or any new clone flag that got introduced?
> CLONE_NEWCGROUP too would not be detectable apart from the method I gave
> you above; same for other clone flags. Why are you so keen on being able
> to detect this flag when other flags didn't seem to matter that much.

I wasn't following uapi changes closely enough those days. ;)


-- 
ldv

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply

* Re: [PATCH V31 07/25] kexec_file: Restrict at runtime if the kernel is locked down
From: Dave Young @ 2019-06-24  1:52 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: James Morris, Jiri Bohac, Linux API, kexec,
	Linux Kernel Mailing List, David Howells, LSM List,
	Andy Lutomirski
In-Reply-To: <CACdnJut=J1YTpM4s6g5XWCEs+=X0Jvf8otfMg+w=_oqSZmf01Q@mail.gmail.com>

On 06/21/19 at 01:18pm, Matthew Garrett wrote:
> On Thu, Jun 20, 2019 at 11:43 PM Dave Young <dyoung@redhat.com> wrote:
> >
> > On 03/26/19 at 11:27am, Matthew Garrett wrote:
> > > From: Jiri Bohac <jbohac@suse.cz>
> > >
> > > When KEXEC_SIG is not enabled, kernel should not load images through
> > > kexec_file systemcall if the kernel is locked down.
> > >
> > > [Modified by David Howells to fit with modifications to the previous patch
> > >  and to return -EPERM if the kernel is locked down for consistency with
> > >  other lockdowns. Modified by Matthew Garrett to remove the IMA
> > >  integration, which will be replaced by integrating with the IMA
> > >  architecture policy patches.]
> > >
> > > Signed-off-by: Jiri Bohac <jbohac@suse.cz>
> > > Signed-off-by: David Howells <dhowells@redhat.com>
> > > Signed-off-by: Matthew Garrett <mjg59@google.com>
> > > Reviewed-by: Jiri Bohac <jbohac@suse.cz>
> > > cc: kexec@lists.infradead.org
> > > ---
> > >  kernel/kexec_file.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > > index 67f3a866eabe..a1cc37c8b43b 100644
> > > --- a/kernel/kexec_file.c
> > > +++ b/kernel/kexec_file.c
> > > @@ -239,6 +239,12 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
> > >               }
> > >
> > >               ret = 0;
> > > +
> > > +             if (kernel_is_locked_down(reason, LOCKDOWN_INTEGRITY)) {
> > > +                     ret = -EPERM;
> > > +                     goto out;
> > > +             }
> > > +
> >
> > Checking here is late, it would be good to move the check to earlier
> > code around below code:
> >         /* We only trust the superuser with rebooting the system. */
> >         if (!capable(CAP_SYS_BOOT) || kexec_load_disabled)
> >                 return -EPERM;
> 
> I don't think so - we want it to be possible to load images if they
> have a valid signature.

I know it works this way because of the previous patch.  But going by
the patch log, "When KEXEC_SIG is not enabled, kernel should not load
images", it would be simpler to check early for
!IS_ENABLED(CONFIG_KEXEC_SIG) &&
kernel_is_locked_down(reason, LOCKDOWN_INTEGRITY) instead of depending
on the later code to verify the signature.  That way the logic is
easier to understand, no?

Thanks
Dave

^ permalink raw reply
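
The early check suggested above can be modelled in userspace roughly as
follows.  This is only a sketch of the proposed ordering, not kernel
code: struct kexec_policy and kexec_file_early_check() are hypothetical
stand-ins for capable(CAP_SYS_BOOT), kexec_load_disabled,
IS_ENABLED(CONFIG_KEXEC_SIG) and kernel_is_locked_down().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel state; all names here are hypothetical. */
struct kexec_policy {
	bool capable_sys_boot;		/* capable(CAP_SYS_BOOT) */
	bool kexec_load_disabled;	/* the kexec_load_disabled switch */
	bool kexec_sig_enabled;		/* IS_ENABLED(CONFIG_KEXEC_SIG) */
	bool locked_down;		/* kernel_is_locked_down(...) */
};

/* Returns 0 if loading may go on to signature verification, else -EPERM. */
static int kexec_file_early_check(const struct kexec_policy *p)
{
	assert(p != NULL);

	/* Existing check: only a privileged caller may load an image. */
	if (!p->capable_sys_boot || p->kexec_load_disabled)
		return -EPERM;

	/* Suggested early check: signatures cannot be verified at all. */
	if (!p->kexec_sig_enabled && p->locked_down)
		return -EPERM;

	return 0;
}
```

With KEXEC_SIG enabled the function returns 0 even under lockdown, so
the later signature verification still decides whether a signed image
is accepted, which is the behaviour Matthew asks to preserve.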

* Re: [PATCH V34 08/29] kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE
From: Dave Young @ 2019-06-24  2:01 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	Jiri Bohac, David Howells, Matthew Garrett, kexec
In-Reply-To: <20190622000358.19895-9-matthewgarrett@google.com>

On 06/21/19 at 05:03pm, Matthew Garrett wrote:
> From: Jiri Bohac <jbohac@suse.cz>
> 
> This is a preparatory patch for kexec_file_load() lockdown.  A locked down
> kernel needs to prevent unsigned kernel images from being loaded with
> kexec_file_load().  Currently, the only way to force the signature
> verification is compiling with KEXEC_VERIFY_SIG.  This prevents loading
> unsigned images even when the kernel is not locked down at runtime.
> 
> This patch splits KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE.
> Analogous to the MODULE_SIG and MODULE_SIG_FORCE for modules, KEXEC_SIG
> turns on the signature verification but allows unsigned images to be
> loaded.  KEXEC_SIG_FORCE disallows images without a valid signature.
> 
> [Modified by David Howells such that:
> 
>  (1) verify_pefile_signature() differentiates between no-signature and
>      sig-didn't-match in its returned errors.
> 
>  (2) kexec fails with EKEYREJECTED if there is a signature for which we
>      have a key, but signature doesn't match - even if in non-forcing mode.
> 
>  (3) kexec fails with EBADMSG or some other error if there is a signature
>      which cannot be parsed - even if in non-forcing mode.
> 
>  (4) kexec fails with ELIBBAD if the PE file cannot be parsed to extract
>      the signature - even if in non-forcing mode.
> 
> ]

I do not see EBADMSG and ELIBBAD in this patch; also, kexec fails with
the proper errno rather than only EKEYREJECTED.

Did I miss something?  Other than the patch log issue:

Reviewed-by: Dave Young <dyoung@redhat.com>

> 
> Signed-off-by: Jiri Bohac <jbohac@suse.cz>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> Reviewed-by: Jiri Bohac <jbohac@suse.cz>
> cc: kexec@lists.infradead.org
> ---
>  arch/x86/Kconfig                       | 20 ++++++++---
>  crypto/asymmetric_keys/verify_pefile.c |  4 ++-
>  include/linux/kexec.h                  |  4 +--
>  kernel/kexec_file.c                    | 47 ++++++++++++++++++++++----
>  4 files changed, 60 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index c1f9b3cf437c..84381dd60760 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2012,20 +2012,30 @@ config KEXEC_FILE
>  config ARCH_HAS_KEXEC_PURGATORY
>  	def_bool KEXEC_FILE
>  
> -config KEXEC_VERIFY_SIG
> +config KEXEC_SIG
>  	bool "Verify kernel signature during kexec_file_load() syscall"
>  	depends on KEXEC_FILE
>  	---help---
> -	  This option makes kernel signature verification mandatory for
> -	  the kexec_file_load() syscall.
>  
> -	  In addition to that option, you need to enable signature
> +	  This option makes the kexec_file_load() syscall check for a valid
> +	  signature of the kernel image.  The image can still be loaded without
> +	  a valid signature unless you also enable KEXEC_SIG_FORCE, though if
> +	  there's a signature that we can check, then it must be valid.
> +
> +	  In addition to this option, you need to enable signature
>  	  verification for the corresponding kernel image type being
>  	  loaded in order for this to work.
>  
> +config KEXEC_SIG_FORCE
> +	bool "Require a valid signature in kexec_file_load() syscall"
> +	depends on KEXEC_SIG
> +	---help---
> +	  This option makes kernel signature verification mandatory for
> +	  the kexec_file_load() syscall.
> +
>  config KEXEC_BZIMAGE_VERIFY_SIG
>  	bool "Enable bzImage signature verification support"
> -	depends on KEXEC_VERIFY_SIG
> +	depends on KEXEC_SIG
>  	depends on SIGNED_PE_FILE_VERIFICATION
>  	select SYSTEM_TRUSTED_KEYRING
>  	---help---
> diff --git a/crypto/asymmetric_keys/verify_pefile.c b/crypto/asymmetric_keys/verify_pefile.c
> index d178650fd524..4473cea1e877 100644
> --- a/crypto/asymmetric_keys/verify_pefile.c
> +++ b/crypto/asymmetric_keys/verify_pefile.c
> @@ -100,7 +100,7 @@ static int pefile_parse_binary(const void *pebuf, unsigned int pelen,
>  
>  	if (!ddir->certs.virtual_address || !ddir->certs.size) {
>  		pr_debug("Unsigned PE binary\n");
> -		return -EKEYREJECTED;
> +		return -ENODATA;
>  	}
>  
>  	chkaddr(ctx->header_size, ddir->certs.virtual_address,
> @@ -408,6 +408,8 @@ static int pefile_digest_pe(const void *pebuf, unsigned int pelen,
>   *  (*) 0 if at least one signature chain intersects with the keys in the trust
>   *	keyring, or:
>   *
> + *  (*) -ENODATA if there is no signature present.
> + *
>   *  (*) -ENOPKG if a suitable crypto module couldn't be found for a check on a
>   *	chain.
>   *
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index b9b1bc5f9669..58b27c7bdc2b 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -125,7 +125,7 @@ typedef void *(kexec_load_t)(struct kimage *image, char *kernel_buf,
>  			     unsigned long cmdline_len);
>  typedef int (kexec_cleanup_t)(void *loader_data);
>  
> -#ifdef CONFIG_KEXEC_VERIFY_SIG
> +#ifdef CONFIG_KEXEC_SIG
>  typedef int (kexec_verify_sig_t)(const char *kernel_buf,
>  				 unsigned long kernel_len);
>  #endif
> @@ -134,7 +134,7 @@ struct kexec_file_ops {
>  	kexec_probe_t *probe;
>  	kexec_load_t *load;
>  	kexec_cleanup_t *cleanup;
> -#ifdef CONFIG_KEXEC_VERIFY_SIG
> +#ifdef CONFIG_KEXEC_SIG
>  	kexec_verify_sig_t *verify_sig;
>  #endif
>  };
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index f1d0e00a3971..eec7e5bb2a08 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -90,7 +90,7 @@ int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
>  	return kexec_image_post_load_cleanup_default(image);
>  }
>  
> -#ifdef CONFIG_KEXEC_VERIFY_SIG
> +#ifdef CONFIG_KEXEC_SIG
>  static int kexec_image_verify_sig_default(struct kimage *image, void *buf,
>  					  unsigned long buf_len)
>  {
> @@ -188,7 +188,8 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
>  			     const char __user *cmdline_ptr,
>  			     unsigned long cmdline_len, unsigned flags)
>  {
> -	int ret = 0;
> +	const char *reason;
> +	int ret;
>  	void *ldata;
>  	loff_t size;
>  
> @@ -207,15 +208,47 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
>  	if (ret)
>  		goto out;
>  
> -#ifdef CONFIG_KEXEC_VERIFY_SIG
> +#ifdef CONFIG_KEXEC_SIG
>  	ret = arch_kexec_kernel_verify_sig(image, image->kernel_buf,
>  					   image->kernel_buf_len);
> -	if (ret) {
> -		pr_debug("kernel signature verification failed.\n");
> +#else
> +	ret = -ENODATA;
> +#endif
> +
> +	switch (ret) {
> +	case 0:
> +		break;
> +
> +		/* Certain verification errors are non-fatal if we're not
> +		 * checking errors, provided we aren't mandating that there
> +		 * must be a valid signature.
> +		 */
> +	case -ENODATA:
> +		reason = "kexec of unsigned image";
> +		goto decide;
> +	case -ENOPKG:
> +		reason = "kexec of image with unsupported crypto";
> +		goto decide;
> +	case -ENOKEY:
> +		reason = "kexec of image with unavailable key";
> +	decide:
> +		if (IS_ENABLED(CONFIG_KEXEC_SIG_FORCE)) {
> +			pr_notice("%s rejected\n", reason);
> +			goto out;
> +		}
> +
> +		ret = 0;
> +		break;
> +
> +		/* All other errors are fatal, including nomem, unparseable
> +		 * signatures and signature check failures - even if signatures
> +		 * aren't required.
> +		 */
> +	default:
> +		pr_notice("kernel signature verification failed (%d).\n", ret);
>  		goto out;
>  	}
> -	pr_debug("kernel signature verification successful.\n");
> -#endif
> +
>  	/* It is possible that there no initramfs is being loaded */
>  	if (!(flags & KEXEC_FILE_NO_INITRAMFS)) {
>  		ret = kernel_read_file_from_fd(initrd_fd, &image->initrd_buf,
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

^ permalink raw reply
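
The switch statement in the quoted kernel/kexec_file.c hunk boils down
to a small decision table.  The sketch below restates it as a
standalone userspace function so the non-forcing and forcing cases can
be compared side by side; kexec_sig_decide() is a hypothetical name and
sig_force stands in for IS_ENABLED(CONFIG_KEXEC_SIG_FORCE).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * verify_ret: 0 or a negative errno from signature verification.
 * sig_force:  stands in for IS_ENABLED(CONFIG_KEXEC_SIG_FORCE).
 * Returns 0 if the image may be loaded, otherwise the error to report.
 */
static int kexec_sig_decide(int verify_ret, bool sig_force)
{
	assert(verify_ret <= 0);

	switch (verify_ret) {
	case 0:
		return 0;
	case -ENODATA:	/* unsigned image */
	case -ENOPKG:	/* unsupported crypto */
	case -ENOKEY:	/* key not available */
		/* Tolerated unless a valid signature is mandatory. */
		return sig_force ? verify_ret : 0;
	default:	/* bad signature, unparseable signature, -ENOMEM, ... */
		return verify_ret;
	}
}
```

For example, kexec_sig_decide(-ENODATA, false) yields 0, while
kexec_sig_decide(-EKEYREJECTED, false) still yields -EKEYREJECTED: a
signature that is present but wrong is fatal in both modes.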

* Re: [PATCH 05/13] vfs: don't parse "silent" option
From: Miklos Szeredi @ 2019-06-24  8:25 UTC (permalink / raw)
  To: Ian Kent; +Cc: David Howells, Al Viro, Linux API, linux-fsdevel, lkml
In-Reply-To: <1ea8ec52ce19499f021510b5c9e38be8d8ebe38f.camel@themaw.net>

On Thu, Jun 20, 2019 at 6:40 AM Ian Kent <raven@themaw.net> wrote:
>
> On Wed, 2019-06-19 at 14:30 +0200, Miklos Szeredi wrote:
> > While this is a standard option as documented in mount(8), it is ignored by
> > most filesystems.  So reject, unless filesystem explicitly wants to handle
> > it.
> >
> > The exception is unconverted filesystems, where it is unknown if the
> > filesystem handles this or not.
> >
> > Any implementation, such as mount(8) that needs to parse this option
> > without failing should simply ignore the return value from fsconfig().
>
> In theory this is fine but every time someone has attempted
> to change the handling of this in the past autofs has had
> problems so I'm a bit wary of the change.
>
> It was originally meant to tell the file system to ignore
> invalid options such as could be found in automount maps that
> are used with multiple OS implementations that have differences
> in their options.
>
> That was, IIRC, primarily NFS although NFS should handle most
> (if not all of those) cases these days.
>
> Nevertheless I'm a bit nervous about it, ;)

What I'm saying is that with a new interface the rules need not follow
the rules of the old interface, because at the start no one is using
the new interface, so no chance of breaking anything.

Yes, there's a chance of making the interface difficult to use, but I
don't think this is one of those things.

For one, "silent" should not be needed on the new interface at all,
because error messages relating to the setup of the filesystem can be
redirected to a log buffer dedicated to the setup instance,
effectively enabling silent operation by default.

Thanks,
Miklos

^ permalink raw reply

* Re: [PATCH 1/2] CLONE_PIDFD: do not use the value pointed by parent_tidptr
From: Christian Brauner @ 2019-06-24  9:49 UTC (permalink / raw)
  To: Dmitry V. Levin
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190623112717.GA20697@altlinux.org>

On Sun, Jun 23, 2019 at 02:27:17PM +0300, Dmitry V. Levin wrote:
> Userspace needs a cheap and reliable way to tell whether CLONE_PIDFD
> is supported by the kernel or not.
> 
> While older kernels without CLONE_PIDFD support just leave unchanged
> the value pointed by parent_tidptr, current implementation fails with
> EINVAL if that value is non-zero.
> 
> If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
> pointed by parent_tidptr also remains unchanged, which effectively
> means that userspace must either check CLONE_PIDFD support beforehand
> or ensure that fd 0 is not closed when invoking CLONE_PIDFD.
> 
> The check for pidfd == 0 was introduced during v5.2 release cycle
> by commit b3e583825266 ("clone: add CLONE_PIDFD") to ensure that
> CLONE_PIDFD could be potentially extended by passing in flags through
> the return argument.
> 
> However, that extension would look horrendous, and with introduction of
> clone3 syscall in v5.3 there is no need to extend legacy clone syscall
> this way.
> 
> So remove the pidfd == 0 check.  Userspace that needs to be portable
> to kernels without CLONE_PIDFD support is advised to initialize pidfd
> with -1 and check the pidfd value returned by CLONE_PIDFD.
> 
> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>

Reviewed-by: Christian Brauner <christian@brauner.io>

Thank you Dmitry, queueing this up for rc7.

> ---
>  kernel/fork.c | 12 ------------
>  1 file changed, 12 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 75675b9bf6df..39a3adaa4ad1 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1822,8 +1822,6 @@ static __latent_entropy struct task_struct *copy_process(
>  	}
>  
>  	if (clone_flags & CLONE_PIDFD) {
> -		int reserved;
> -
>  		/*
>  		 * - CLONE_PARENT_SETTID is useless for pidfds and also
>  		 *   parent_tidptr is used to return pidfds.
> @@ -1834,16 +1832,6 @@ static __latent_entropy struct task_struct *copy_process(
>  		if (clone_flags &
>  		    (CLONE_DETACHED | CLONE_PARENT_SETTID | CLONE_THREAD))
>  			return ERR_PTR(-EINVAL);
> -
> -		/*
> -		 * Verify that parent_tidptr is sane so we can potentially
> -		 * reuse it later.
> -		 */
> -		if (get_user(reserved, parent_tidptr))
> -			return ERR_PTR(-EFAULT);
> -
> -		if (reserved != 0)
> -			return ERR_PTR(-EINVAL);
>  	}
>  
>  	/*
> -- 
> ldv

^ permalink raw reply

* Re: [PATCH 2/2] samples: make pidfd-metadata fail gracefully on older kernels
From: Christian Brauner @ 2019-06-24  9:50 UTC (permalink / raw)
  To: Dmitry V. Levin
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190623112800.GB20697@altlinux.org>

On Sun, Jun 23, 2019 at 02:28:00PM +0300, Dmitry V. Levin wrote:
> Initialize pidfd to an invalid descriptor, to fail gracefully on
> those kernels that do not implement CLONE_PIDFD and leave pidfd
> unchanged.
> 
> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>

Reviewed-by: Christian Brauner <christian@brauner.io>

Thank you Dmitry, queueing this up for rc7.

> ---
>  samples/pidfd/pidfd-metadata.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/samples/pidfd/pidfd-metadata.c b/samples/pidfd/pidfd-metadata.c
> index 14b454448429..c459155daf9a 100644
> --- a/samples/pidfd/pidfd-metadata.c
> +++ b/samples/pidfd/pidfd-metadata.c
> @@ -83,7 +83,7 @@ static int pidfd_metadata_fd(pid_t pid, int pidfd)
>  
>  int main(int argc, char *argv[])
>  {
> -	int pidfd = 0, ret = EXIT_FAILURE;
> +	int pidfd = -1, ret = EXIT_FAILURE;
>  	char buf[4096] = { 0 };
>  	pid_t pid;
>  	int procfd, statusfd;
> @@ -91,7 +91,11 @@ int main(int argc, char *argv[])
>  
>  	pid = pidfd_clone(CLONE_PIDFD, &pidfd);
>  	if (pid < 0)
> -		exit(ret);
> +		err(ret, "CLONE_PIDFD");
> +	if (pidfd == -1) {
> +		warnx("CLONE_PIDFD is not supported by the kernel");
> +		goto out;
> +	}
>  
>  	procfd = pidfd_metadata_fd(pid, pidfd);
>  	close(pidfd);
> -- 
> ldv

^ permalink raw reply

* Re: [PATCH] samples: make pidfd-metadata fail gracefully on older kernels
From: Christian Brauner @ 2019-06-24  9:52 UTC (permalink / raw)
  To: Dmitry V. Levin
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190623113230.GC20697@altlinux.org>

On Sun, Jun 23, 2019 at 02:32:30PM +0300, Dmitry V. Levin wrote:
> On Sat, Jun 22, 2019 at 12:13:39AM +0200, Christian Brauner wrote:
> [...]
> > Out of curiosity: what makes the new flag different than say
> > CLONE_NEWCGROUP or any new clone flag that got introduced?
> > CLONE_NEWCGROUP too would not be detectable apart from the method I gave
> > you above; same for other clone flags. Why are you so keen on being able
> > to detect this flag when other flags didn't seem to matter that much.
> 
> I wasn't following uapi changes closely enough those days. ;)

(Seriously, you had one job. :) I'm joking of course.)

What you want makes sense to me overall. This way userspace can decide
more easily whether to manage a process through a pidfd or whether it
needs to fall back to a pid.

Christian

^ permalink raw reply
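
The pidfd-or-pid decision mentioned above might look like this in
userspace.  It is a sketch under assumptions: struct proc_handle and
proc_signal() are hypothetical, pidfd is taken to be -1 when the kernel
returned none, and the fallback syscall number 424 is assumed only for
headers that predate pidfd_send_signal().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424	/* assumption: generic syscall number */
#endif

/* Hypothetical handle: pidfd is -1 when the kernel gave us none. */
struct proc_handle {
	pid_t pid;
	int pidfd;
};

/* Signal via the pidfd when there is one, otherwise fall back to the pid. */
static int proc_signal(const struct proc_handle *h, int sig)
{
	assert(h != NULL);

	if (h->pidfd >= 0)
		return (int)syscall(__NR_pidfd_send_signal,
				    (long)h->pidfd, (long)sig, NULL, 0UL);
	return kill(h->pid, sig);
}
```

Sending signal 0 through either path only checks that the target
exists, which makes it a convenient smoke test for the handle.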

* Re: [PATCH 05/13] vfs: don't parse "silent" option
From: David Howells @ 2019-06-24 10:36 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: dhowells, Ian Kent, Al Viro, Linux API, linux-fsdevel, lkml
In-Reply-To: <CAOssrKcU2JKDYMDbW7V6jpM7_4WFSMA91h9AjpjoYmX=H4ybeg@mail.gmail.com>

Miklos Szeredi <mszeredi@redhat.com> wrote:

> What I'm saying is that with a new interface the rules need not follow
> the rules of the old interface, because at the start no one is using
> the new interface, so no chance of breaking anything.

Er. No.  That's not true, since the old interface comes through the new one.

David

^ permalink raw reply

* Re: [PATCH 05/13] vfs: don't parse "silent" option
From: Miklos Szeredi @ 2019-06-24 10:44 UTC (permalink / raw)
  To: David Howells; +Cc: Ian Kent, Al Viro, Linux API, linux-fsdevel, lkml
In-Reply-To: <30205.1561372589@warthog.procyon.org.uk>

On Mon, Jun 24, 2019 at 12:36 PM David Howells <dhowells@redhat.com> wrote:
>
> Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> > What I'm saying is that with a new interface the rules need not follow
> > the rules of the old interface, because at the start no one is using
> > the new interface, so no chance of breaking anything.
>
> Er. No.  That's not true, since the old interface comes through the new one.

No, old interface sets SB_* directly from arg 4 of mount(2) and not
via parsing arg 5.

Thanks,
Miklos

^ permalink raw reply
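
To illustrate the distinction drawn above between arg 4 (mountflags) and
arg 5 (data) of mount(2): a mount helper on the old interface can turn a
"silent" option into the MS_SILENT flag before the remaining options
are passed as the data string.  The helper below is a hypothetical
sketch of that split, not the code mount(8) actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: scan a comma-separated option string, turn
 * "silent" into MS_SILENT and copy every other option into data.
 * Options that do not fit into data are dropped in this sketch.
 */
static unsigned long extract_silent(const char *opts, char *data, size_t size)
{
	unsigned long flags = 0;
	size_t used = 0;

	assert(opts != NULL && data != NULL);
	if (size)
		data[0] = '\0';

	while (*opts) {
		size_t len = strcspn(opts, ",");

		if (len == 6 && memcmp(opts, "silent", 6) == 0) {
			flags |= MS_SILENT;	/* goes into mount(2) arg 4 */
		} else if (len && used + len + 2 <= size) {
			if (used)
				data[used++] = ',';
			memcpy(data + used, opts, len);	/* stays in arg 5 */
			used += len;
			data[used] = '\0';
		}
		opts += len;
		if (*opts == ',')
			opts++;
	}
	return flags;
}
```

The returned flags would be OR-ed into the mountflags argument and data
passed as the last argument of mount(2), so the filesystem's option
parser never sees "silent" on this path.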

* Re: [PATCH 05/13] vfs: don't parse "silent" option
From: Miklos Szeredi @ 2019-06-24 10:53 UTC (permalink / raw)
  To: David Howells; +Cc: Ian Kent, Al Viro, Linux API, linux-fsdevel, lkml
In-Reply-To: <CAOssrKdGSRVSc38X1J0zCQQN+tUhiwPA4bCL0rHCZ-O8iVzzeQ@mail.gmail.com>

On Mon, Jun 24, 2019 at 12:44 PM Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> On Mon, Jun 24, 2019 at 12:36 PM David Howells <dhowells@redhat.com> wrote:
> >
> > Miklos Szeredi <mszeredi@redhat.com> wrote:
> >
> > > What I'm saying is that with a new interface the rules need not follow
> > > the rules of the old interface, because at the start no one is using
> > > the new interface, so no chance of breaking anything.
> >
> > Er. No.  That's not true, since the old interface comes through the new one.
>
> No, old interface sets SB_* directly from arg 4 of mount(2) and not
> via parsing arg 5.

See also the 9p mess-up of "posixacl" handling *on the old interface*,
which happened precisely because the internal API doesn't differentiate
between options coming from the old interface and ones coming from the new.

So you are right that there's breakage, but it's due to the fact that
common code parses anything, and not because it doesn't.

Thanks,
Miklos

^ permalink raw reply

* Re: [PATCH 1/2] CLONE_PIDFD: do not use the value pointed by parent_tidptr
From: Christian Brauner @ 2019-06-24 11:59 UTC (permalink / raw)
  To: Dmitry V. Levin
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190624094940.24qrteybbcp25wq7@brauner.io>

On Mon, Jun 24, 2019 at 11:49:40AM +0200, Christian Brauner wrote:
> On Sun, Jun 23, 2019 at 02:27:17PM +0300, Dmitry V. Levin wrote:
> > Userspace needs a cheap and reliable way to tell whether CLONE_PIDFD
> > is supported by the kernel or not.
> > 
> > While older kernels without CLONE_PIDFD support just leave unchanged
> > the value pointed by parent_tidptr, current implementation fails with
> > EINVAL if that value is non-zero.
> > 
> > If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
> > pointed by parent_tidptr also remains unchanged, which effectively
> > means that userspace must either check CLONE_PIDFD support beforehand
> > or ensure that fd 0 is not closed when invoking CLONE_PIDFD.
> > 
> > The check for pidfd == 0 was introduced during v5.2 release cycle
> > by commit b3e583825266 ("clone: add CLONE_PIDFD") to ensure that
> > CLONE_PIDFD could be potentially extended by passing in flags through
> > the return argument.
> > 
> > However, that extension would look horrendous, and with introduction of
> > clone3 syscall in v5.3 there is no need to extend legacy clone syscall
> > this way.
> > 
> > So remove the pidfd == 0 check.  Userspace that needs to be portable
> > to kernels without CLONE_PIDFD support is advised to initialize pidfd
> > with -1 and check the pidfd value returned by CLONE_PIDFD.
> > 
> > Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
> 
> Reviewed-by: Christian Brauner <christian@brauner.io>
> 
> Thank you Dmitry, queueing this up for rc7.

This is now sitting in

https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=fixes&id=43754d05f235dd1b6c7f8ab9f42007770d721f10

I reformulated the commit message a bit and gave it a Fixes tag. Dmitry,
if you want you can take a look and tell me if that's acceptable to you.

Thanks!
Christian

^ permalink raw reply

* Re: [PATCH V34 10/29] hibernate: Disable when the kernel is locked down
From: Jiri Kosina @ 2019-06-24 13:21 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Matthew Garrett, jmorris, linux-security-module, linux-kernel,
	linux-api, Josh Boyer, David Howells, Matthew Garrett, rjw,
	Joey Lee, linux-pm
In-Reply-To: <20190622175208.GB30317@amd>

On Sat, 22 Jun 2019, Pavel Machek wrote:

> > There is currently no way to verify the resume image when returning
> > from hibernate.  This might compromise the signed modules trust model,
> > so until we can work with signed hibernate images we disable it when the
> > kernel is locked down.
> 
> I keep getting these...
> 
> IIRC suse has patches to verify the images.

Yeah, Joey Lee is taking care of those. CCing.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply

* [bug report] read-ahead can't work properly
From: Weijie Yang @ 2019-06-24 13:30 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: axboe, fengguang.wu, linux-api, weijie.yang@samsung.com


When trying file readahead via posix_fadvise(), I find that it doesn't work
properly.

For example, when calling posix_fadvise(POSIX_FADV_WILLNEED) on a 10MB file,
the kernel actually reads ahead only 512KB of data into the page cache, even
if there is enough free memory in the machine.

Tracing into the kernel, I find the issue is in force_page_cache_readahead():

        max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
        nr_to_read = min(nr_to_read, max_pages);

No matter what the input nr_to_read is, it is limited to a very small size,
such as 128 pages.

I think the min() code is meant to limit the per-disk-I/O size, not the total
nr_to_read.  Tracing through the git log, this issue was introduced by
6d2be915e589.  After that commit, nr_to_read is capped at a small value even
if there is enough free memory.  Before it, users could read ahead a very
large file if they had enough memory.

The posix_fadvise() man page says that the amount of data read ahead depends
on virtual memory load.  So if there is enough memory, it should read as much
data as the user expects.

I hope someone can clarify and/or fix it.

Thanks





^ permalink raw reply

* Re: [PATCH 1/2] CLONE_PIDFD: do not use the value pointed by parent_tidptr
From: Dmitry V. Levin @ 2019-06-24 13:45 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190624115942.g6vyis3zy4ptt3fc@brauner.io>

[-- Attachment #1: Type: text/plain, Size: 2126 bytes --]

On Mon, Jun 24, 2019 at 01:59:43PM +0200, Christian Brauner wrote:
> On Mon, Jun 24, 2019 at 11:49:40AM +0200, Christian Brauner wrote:
> > On Sun, Jun 23, 2019 at 02:27:17PM +0300, Dmitry V. Levin wrote:
> > > Userspace needs a cheap and reliable way to tell whether CLONE_PIDFD
> > > is supported by the kernel or not.
> > > 
> > > While older kernels without CLONE_PIDFD support just leave unchanged
> > > the value pointed by parent_tidptr, current implementation fails with
> > > EINVAL if that value is non-zero.
> > > 
> > > If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
> > > pointed by parent_tidptr also remains unchanged, which effectively
> > > means that userspace must either check CLONE_PIDFD support beforehand
> > > or ensure that fd 0 is not closed when invoking CLONE_PIDFD.
> > > 
> > > The check for pidfd == 0 was introduced during v5.2 release cycle
> > > by commit b3e583825266 ("clone: add CLONE_PIDFD") to ensure that
> > > CLONE_PIDFD could be potentially extended by passing in flags through
> > > the return argument.
> > > 
> > > However, that extension would look horrendous, and with introduction of
> > > clone3 syscall in v5.3 there is no need to extend legacy clone syscall
> > > this way.
> > > 
> > > So remove the pidfd == 0 check.  Userspace that needs to be portable
> > > to kernels without CLONE_PIDFD support is advised to initialize pidfd
> > > with -1 and check the pidfd value returned by CLONE_PIDFD.
> > > 
> > > Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
> > 
> > Reviewed-by: Christian Brauner <christian@brauner.io>
> > 
> > Thank you Dmitry, queueing this up for rc7.
> 
> This is now sitting in
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=fixes&id=43754d05f235dd1b6c7f8ab9f42007770d721f10
> 
> I reformulated the commit message a bit and gave it a Fixes tag. Dmitry,
> if you want you can take a look and tell me if that's acceptable to you.

s/Old kernel that only support/Old kernels that only support/

Besides that, fine with me.  Thanks.
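
For reference, the userspace check recommended in the commit message
(initialize pidfd with -1, then look at what CLONE_PIDFD stored) could look
roughly like the sketch below. It assumes the glibc clone() wrapper; the
function names are hypothetical, and the fallback CLONE_PIDFD definition is
only for libc headers that do not yet provide the flag.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000	/* value from the kernel uapi header */
#endif

/* The child does nothing and exits immediately. */
static int child_fn(void *arg)
{
	(void)arg;
	return 0;
}

/*
 * Hypothetical probe: returns 1 if the kernel stored a pidfd, 0 if the
 * value was left untouched (no CLONE_PIDFD support), -1 if clone() or
 * the stack allocation failed.
 */
static int probe_clone_pidfd(void)
{
	const size_t stack_size = 64 * 1024;
	char *stack = malloc(stack_size);
	int pidfd = -1;		/* sentinel: no valid fd is negative */
	pid_t pid;

	if (!stack)
		return -1;

	/* With CLONE_PIDFD the pidfd is returned via parent_tidptr. */
	pid = clone(child_fn, stack + stack_size,
		    CLONE_PIDFD | SIGCHLD, NULL, &pidfd);
	if (pid < 0) {
		free(stack);
		return -1;
	}

	waitpid(pid, NULL, 0);
	free(stack);

	if (pidfd < 0)
		return 0;
	close(pidfd);
	return 1;
}
```

On a kernel that still has the pidfd == 0 check, the non-zero sentinel makes
clone() fail with EINVAL, so the probe returns -1 there.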


-- 
ldv



* Re: [PATCH 1/2] CLONE_PIDFD: do not use the value pointed by parent_tidptr
From: Christian Brauner @ 2019-06-24 13:49 UTC (permalink / raw)
  To: Dmitry V. Levin
  Cc: Jann Horn, Oleg Nesterov, Arnd Bergmann, linux-api, linux-kernel
In-Reply-To: <20190624134531.GB6010@altlinux.org>

On Mon, Jun 24, 2019 at 04:45:31PM +0300, Dmitry V. Levin wrote:
> On Mon, Jun 24, 2019 at 01:59:43PM +0200, Christian Brauner wrote:
> > On Mon, Jun 24, 2019 at 11:49:40AM +0200, Christian Brauner wrote:
> > > On Sun, Jun 23, 2019 at 02:27:17PM +0300, Dmitry V. Levin wrote:
> > > > Userspace needs a cheap and reliable way to tell whether CLONE_PIDFD
> > > > is supported by the kernel or not.
> > > > 
> > > > While older kernels without CLONE_PIDFD support just leave unchanged
> > > > the value pointed by parent_tidptr, current implementation fails with
> > > > EINVAL if that value is non-zero.
> > > > 
> > > > If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
> > > > pointed by parent_tidptr also remains unchanged, which effectively
> > > > means that userspace must either check CLONE_PIDFD support beforehand
> > > > or ensure that fd 0 is not closed when invoking CLONE_PIDFD.
> > > > 
> > > > The check for pidfd == 0 was introduced during v5.2 release cycle
> > > > by commit b3e583825266 ("clone: add CLONE_PIDFD") to ensure that
> > > > CLONE_PIDFD could be potentially extended by passing in flags through
> > > > the return argument.
> > > > 
> > > > However, that extension would look horrendous, and with introduction of
> > > > clone3 syscall in v5.3 there is no need to extend legacy clone syscall
> > > > this way.
> > > > 
> > > > So remove the pidfd == 0 check.  Userspace that needs to be portable
> > > > to kernels without CLONE_PIDFD support is advised to initialize pidfd
> > > > with -1 and check the pidfd value returned by CLONE_PIDFD.
> > > > 
> > > > Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
> > > 
> > > Reviewed-by: Christian Brauner <christian@brauner.io>
> > > 
> > > Thank you Dmitry, queueing this up for rc7.
> > 
> > This is now sitting in
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=fixes&id=43754d05f235dd1b6c7f8ab9f42007770d721f10
> > 
> > I reformulated the commit message a bit and gave it a Fixes tag. Dmitry,
> > if you want you can take a look and tell me if that's acceptable to you.
> 
> s/Old kernel that only support/Old kernels that only support/

Fixed.

Thanks!
Christian


* [PATCH 00/25] VFS: Introduce filesystem information query syscall [ver #14]
From: David Howells @ 2019-06-24 14:08 UTC (permalink / raw)
  To: viro; +Cc: dhowells, raven, mszeredi, linux-api, linux-fsdevel, linux-kernel


Hi Al,

Here is a set of patches that adds a syscall, fsinfo(), that allows
attributes of a filesystem/superblock to be queried.  Attribute values are
of four basic types:

 (1) Version dependent-length structure (size defined by type).

 (2) Variable-length string (up to PAGE_SIZE).

 (3) Array of fixed-length structures (up to INT_MAX size).

 (4) Opaque blob (up to INT_MAX size).

Attributes can have multiple values in up to two dimensions and all the
values of a particular attribute must have the same type.

Note that the attribute values *are* allowed to vary between dentries
within a single superblock, depending on the specific dentry that you're
looking at.

I've tried to make the interface as light as possible, so integer/enum
attribute selector rather than string and the core does all the allocation
and extensibility support work rather than leaving that to the filesystems.
That means that for the first two attribute types, sb->s_op->fsinfo() may
assume that the provided buffer is always present and always big enough.

Further, this removes the possibility of the filesystem gaining access to the
userspace buffer.


fsinfo() allows a variety of information to be retrieved about a filesystem
and the mount topology:

 (1) General superblock attributes:

      - The amount of space/free space in a filesystem (as statfs()).
      - Filesystem identifiers (UUID, volume label, device numbers, ...)
      - The limits on a filesystem's capabilities
      - Information on supported statx fields and attributes and IOC flags.
      - A variety of single-bit flags indicating supported capabilities.
      - Timestamp resolution and range.
      - Sources (as per mount(2), but fsconfig() allows multiple sources).
      - In-filesystem filename format information.
      - Filesystem parameters ("mount -o xxx"-type things).
      - LSM parameters (again "mount -o xxx"-type things).

 (2) Filesystem-specific superblock attributes:

      - Server names and addresses.
      - Cell name.

 (3) Filesystem configuration metadata attributes:

      - Filesystem parameter type descriptions.
      - Name -> parameter mappings.
      - Simple enumeration name -> value mappings.

 (4) Mount topology:

      - General information about a mount object.
      - Mount device name(s).
      - Children of a mount object and their relative paths.

 (5) Information about what the fsinfo() syscall itself supports, including
     the number of attributes supported and the number of capability bits
     supported.


The system is extensible:

 (1) New attributes can be added.  There is no requirement that a
     filesystem implement every attribute.  Note that the core VFS keeps a
     table of types and sizes so it can handle future extensibility rather
     than delegating this to the filesystems.

 (2) Version length-dependent structure attributes can be made larger and
     have additional information tacked on the end, provided it keeps the
     layout of the existing fields.  If an older process asks for a shorter
     structure, it will only be given the bits it asks for.  If a newer
     process asks for a longer structure on an older kernel, the extra
     space will be set to 0.  In all cases, the size of the data actually
     available is returned.

     In essence, the size of a structure is that structure's version: a
     smaller size is an earlier version and a later version includes
     everything that the earlier version did.

 (3) New single-bit capability flags can be added.  This is a structure-typed
     attribute and, as such, (2) applies.  Any bits you wanted but the kernel
     doesn't support are automatically set to 0.
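
The size-as-version rule in (2) can be sketched as a small copy routine. This
is a userspace illustration under stated assumptions, not the code in
fs/fsinfo.c: the function name and signature are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a versioned structure into a caller's buffer.
 * - A shorter (older) caller gets only the bytes it asked for.
 * - A longer (newer) caller has the extra space zero-filled.
 * - The size of the data actually available is always returned.
 */
static size_t copy_versioned_struct(void *user_buf, size_t user_size,
				    const void *kernel_data, size_t kernel_size)
{
	size_t n = user_size < kernel_size ? user_size : kernel_size;

	memcpy(user_buf, kernel_data, n);
	if (user_size > n)
		memset((char *)user_buf + n, 0, user_size - n);

	return kernel_size;
}
```

A caller can compare the return value with its own structure size to tell
whether the kernel provided fewer fields than it knows about, or more.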

If a filesystem-specific attribute is added, it should just take up the next
number in the enumeration.  Currently, I do not intend that the number space
should be subdivided between interested parties.


fsinfo() may be called like the following, for example:

	struct fsinfo_params params = {
		.at_flags	= AT_SYMLINK_NOFOLLOW,
		.request	= FSINFO_ATTR_SERVER_ADDRESS,
		.Nth		= 2,
		.Mth		= 1,
	};
	struct fsinfo_server_address address;

	len = fsinfo(AT_FDCWD, "/afs/grand.central.org/doc", &params,
		     &address, sizeof(address));

The above example would query a network filesystem, such as AFS or NFS, and
ask what the 2nd address (Mth) of the 3rd server (Nth) that the superblock is
using is.  Whereas:

	struct fsinfo_params params = {
		.at_flags	= AT_SYMLINK_NOFOLLOW,
		.request	= FSINFO_ATTR_CELL_NAME,
	};
	char cell_name[256];

	len = fsinfo(AT_FDCWD, "/afs/grand.central.org/doc", &params,
		     &cell_name, sizeof(cell_name));

would retrieve the name of an AFS cell as a string.

fsinfo() can also be used to query a context from fsopen() or fspick():

	fd = fsopen("ext4", 0);
	struct fsinfo_params params = {
		.request	= FSINFO_ATTR_PARAM_DESCRIPTION,
	};
	struct fsinfo_param_description desc;
	fsinfo(fd, NULL, &params, &desc, sizeof(desc));

even if that context doesn't currently have a superblock attached (though if
there's no superblock attached, only filesystem-specific things like parameter
descriptions can be accessed).

The patches can be found here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git

on branch:

	fsinfo


===================
SIGNIFICANT CHANGES
===================

 ver #14:

 (*) Increase to 128-bit the fields for number of blocks and files in the
     filesystem and also the max file size and max inode number fields.

 (*) Increase to 64-bit the fields for max hard links and max xattr body
     length.

 (*) Provide struct fsinfo_timestamp_one to represent the characteristics
     of a single timestamp and move the range into it.  FAT, for example,
     has different ranges for different timestamps.  Each timestamp is then
     represented by one of these structs.

 (*) Don't expose MS_* flags (such as MS_RDONLY) through this interface as
     they ought to be considered deprecated; instead anyone who wants them
     should parse FSINFO_ATTR_PARAMETERS for the string equivalents.

 (*) Add a flag, AT_FSINFO_FROM_FSOPEN, to indicate that the fd being
     accessed is from fsopen()/fspick() and that fsinfo() should look
     inside and access the filesystem referred to by the fs_context.

 (*) If the filesystem implements FSINFO_ATTR_PARAMETERS for itself, don't
     automatically include flags for the SB_* bits that are normally
     rendered by, say, /proc/mounts (such as SB_RDONLY).  Rather, a helper
     is provided that the filesystem must call with an appropriately
     wangled s_flags.

 (*) Drop the NFS fsinfo patch for now as NFS fs_context support is
     unlikely to get upstream in the upcoming merge window.

 ver #13:

 (*) Provided a "fixed-struct array" type so that the list of children of a
     mount and all their change counters can be read atomically.

 (*) Additional filesystem examples.

 (*) Documented the API.

 ver #12:

 (*) Rename ->get_fsinfo() to ->fsinfo().

 (*) Pass the path through to ->fsinfo() as it's needed for NFS to
     retrocalculate the source name.

 (*) Indicated which is the source parameter in the param-description
     attribute.

 (*) Dropped the realm attribute.

David
---
David Howells (17):
      vfs: syscall: Add fsinfo() to query filesystem information
      fsinfo: Add syscalls to other arches
      vfs: Allow fsinfo() to query what's in an fs_context
      vfs: Allow fsinfo() to be used to query an fs parameter description
      vfs: Implement parameter value retrieval with fsinfo()
      fsinfo: Implement retrieval of LSM parameters with fsinfo()
      vfs: Introduce a non-repeating system-unique superblock ID
      vfs: Allow fsinfo() to look up a mount object by ID
      vfs: Add mount notification count
      vfs: Allow mount information to be queried by fsinfo()
      vfs: fsinfo sample: Mount listing program
      fsinfo: Add API documentation
      hugetlbfs: Add support for fsinfo()
      kernfs, cgroup: Add fsinfo support
      fsinfo: Support SELinux superblock parameter retrieval
      fsinfo: Support Smack superblock parameter retrieval
      afs: Support fsinfo()

Ian Kent (8):
      fsinfo: proc - add sb operation fsinfo()
      fsinfo: autofs - add sb operation fsinfo()
      fsinfo: shmem - add tmpfs sb operation fsinfo()
      fsinfo: devpts - add sb operation fsinfo()
      fsinfo: pstore - add sb operation fsinfo()
      fsinfo: debugfs - add sb operation fsinfo()
      fsinfo: bpf - add sb operation fsinfo()
      fsinfo: ufs - add sb operation fsinfo()


 Documentation/filesystems/fsinfo.rst        |  596 ++++++++++++++++++
 arch/alpha/kernel/syscalls/syscall.tbl      |    1 
 arch/arm/tools/syscall.tbl                  |    1 
 arch/arm64/include/asm/unistd.h             |    2 
 arch/ia64/kernel/syscalls/syscall.tbl       |    1 
 arch/m68k/kernel/syscalls/syscall.tbl       |    1 
 arch/microblaze/kernel/syscalls/syscall.tbl |    1 
 arch/mips/kernel/syscalls/syscall_n32.tbl   |    1 
 arch/mips/kernel/syscalls/syscall_n64.tbl   |    1 
 arch/mips/kernel/syscalls/syscall_o32.tbl   |    1 
 arch/parisc/kernel/syscalls/syscall.tbl     |    1 
 arch/powerpc/kernel/syscalls/syscall.tbl    |    1 
 arch/s390/kernel/syscalls/syscall.tbl       |    1 
 arch/sh/kernel/syscalls/syscall.tbl         |    1 
 arch/sparc/kernel/syscalls/syscall.tbl      |    1 
 arch/x86/entry/syscalls/syscall_32.tbl      |    1 
 arch/x86/entry/syscalls/syscall_64.tbl      |    1 
 arch/xtensa/kernel/syscalls/syscall.tbl     |    1 
 fs/Kconfig                                  |    7 
 fs/Makefile                                 |    1 
 fs/afs/internal.h                           |    1 
 fs/afs/super.c                              |  180 +++++-
 fs/autofs/inode.c                           |   64 ++
 fs/d_path.c                                 |    2 
 fs/debugfs/inode.c                          |   38 +
 fs/devpts/inode.c                           |   43 +
 fs/fsinfo.c                                 |  877 +++++++++++++++++++++++++++
 fs/hugetlbfs/inode.c                        |   57 ++
 fs/internal.h                               |   11 
 fs/kernfs/mount.c                           |   20 +
 fs/mount.h                                  |   22 +
 fs/namespace.c                              |  307 +++++++++
 fs/proc/inode.c                             |   37 +
 fs/pstore/inode.c                           |   32 +
 fs/statfs.c                                 |    2 
 fs/super.c                                  |   24 +
 fs/ufs/super.c                              |   58 ++
 include/linux/fs.h                          |    8 
 include/linux/fsinfo.h                      |   70 ++
 include/linux/kernfs.h                      |    4 
 include/linux/lsm_hooks.h                   |   13 
 include/linux/security.h                    |   11 
 include/linux/syscalls.h                    |    4 
 include/uapi/asm-generic/unistd.h           |    4 
 include/uapi/linux/fcntl.h                  |    3 
 include/uapi/linux/fsinfo.h                 |  319 ++++++++++
 kernel/bpf/inode.c                          |   25 +
 kernel/cgroup/cgroup-v1.c                   |   44 +
 kernel/cgroup/cgroup.c                      |   19 +
 kernel/sys_ni.c                             |    1 
 mm/shmem.c                                  |   72 ++
 samples/vfs/Makefile                        |    9 
 samples/vfs/test-fs-query.c                 |  138 ++++
 samples/vfs/test-fsinfo.c                   |  682 +++++++++++++++++++++
 samples/vfs/test-mntinfo.c                  |  241 +++++++
 security/security.c                         |   12 
 security/selinux/hooks.c                    |   41 +
 security/selinux/include/security.h         |    2 
 security/selinux/ss/services.c              |   49 ++
 security/smack/smack_lsm.c                  |   43 +
 60 files changed, 4202 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/filesystems/fsinfo.rst
 create mode 100644 fs/fsinfo.c
 create mode 100644 include/linux/fsinfo.h
 create mode 100644 include/uapi/linux/fsinfo.h
 create mode 100644 samples/vfs/test-fs-query.c
 create mode 100644 samples/vfs/test-fsinfo.c
 create mode 100644 samples/vfs/test-mntinfo.c


