Linux Security Modules development

Linux Security Modules development
 help / color / mirror / Atom feed

* [PATCH bpf-next v10 04/10] seccomp,landlock: Enforce Landlock programs per process hierarchy
From: Mickaël Salaün @ 2019-07-21 21:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Andy Lutomirski, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	John Johansen, Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Stephen Smalley, Tejun Heo,
	Tetsuo Handa, Thomas Graf, Tycho Andersen, Will Drewry,
	kernel-hardening, linux-api, linux-fsdevel, linux-security-module,
	netdev
In-Reply-To: <20190721213116.23476-1-mic@digikod.net>

The seccomp(2) syscall can be used by a task to apply a Landlock program
to itself. As a seccomp filter, a Landlock program is enforced for the
current task and all its future children. A program is immutable and a
task can only add new restricting programs to itself, forming a list of
programss.

A Landlock program is tied to a Landlock hook. If the action on a kernel
object is allowed by the other Linux security mechanisms (e.g. DAC,
capabilities, other LSM), then a Landlock hook related to this kind of
object is triggered. The list of programs for this hook is then
evaluated. Each program return a binary value which can deny the action
on a kernel object with a non-zero value. If every programs of the list
return zero, then the action on the object is allowed.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Will Drewry <wad@chromium.org>
Link: https://lkml.kernel.org/r/c10a503d-5e35-7785-2f3d-25ed8dd63fab@digikod.net
---

Changes since v9:
* replace subtype with expected_attach_type and expected_attach_triggers

Changes since v8:
* Remove the chaining concept from the eBPF program contexts (chain and
  cookie). We need to keep these subtypes this way to be able to make
  them evolve, though.

Changes since v7:
* handle and verify program chains
* split and rename providers.c to enforce.c and enforce_seccomp.c
* rename LANDLOCK_SUBTYPE_* to LANDLOCK_*

Changes since v6:
* rename some functions with more accurate names to reflect that an eBPF
  program for Landlock could be used for something else than a rule
* reword rule "appending" to "prepending" and explain it
* remove the superfluous no_new_privs check, only check global
  CAP_SYS_ADMIN when prepending a Landlock rule (needed for containers)
* create and use {get,put}_seccomp_landlock() (suggested by Kees Cook)
* replace ifdef with static inlined function (suggested by Kees Cook)
* use get_user() (suggested by Kees Cook)
* replace atomic_t with refcount_t (requested by Kees Cook)
* move struct landlock_{rule,events} from landlock.h to common.h
* cleanup headers

Changes since v5:
* remove struct landlock_node and use a similar inheritance mechanisme
  as seccomp-bpf (requested by Andy Lutomirski)
* rename SECCOMP_ADD_LANDLOCK_RULE to SECCOMP_APPEND_LANDLOCK_RULE
* rename file manager.c to providers.c
* add comments
* typo and cosmetic fixes

Changes since v4:
* merge manager and seccomp patches
* return -EFAULT in seccomp(2) when user_bpf_fd is null to easely check
  if Landlock is supported
* only allow a process with the global CAP_SYS_ADMIN to use Landlock
  (will be lifted in the future)
* add an early check to exit as soon as possible if the current process
  does not have Landlock rules

Changes since v3:
* remove the hard link with seccomp (suggested by Andy Lutomirski and
  Kees Cook):
  * remove the cookie which could imply multiple evaluation of Landlock
    rules
  * remove the origin field in struct landlock_data
* remove documentation fix (merged upstream)
* rename the new seccomp command to SECCOMP_ADD_LANDLOCK_RULE
* internal renaming
* split commit
* new design to be able to inherit on the fly the parent rules

Changes since v2:
* Landlock programs can now be run without seccomp filter but for any
  syscall (from the process) or interruption
* move Landlock related functions and structs into security/landlock/*
  (to manage cgroups as well)
* fix seccomp filter handling: run Landlock programs for each of their
  legitimate seccomp filter
* properly clean up all seccomp results
* cosmetic changes to ease the understanding
* fix some ifdef
---
 include/linux/landlock.h            |  34 ++++
 include/linux/seccomp.h             |   5 +
 include/uapi/linux/seccomp.h        |   1 +
 kernel/fork.c                       |   8 +-
 kernel/seccomp.c                    |   4 +
 security/landlock/Makefile          |   3 +-
 security/landlock/common.h          |  38 ++++
 security/landlock/enforce.c         | 272 ++++++++++++++++++++++++++++
 security/landlock/enforce.h         |  18 ++
 security/landlock/enforce_seccomp.c |  92 ++++++++++
 10 files changed, 473 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/landlock.h
 create mode 100644 security/landlock/enforce.c
 create mode 100644 security/landlock/enforce.h
 create mode 100644 security/landlock/enforce_seccomp.c

diff --git a/include/linux/landlock.h b/include/linux/landlock.h
new file mode 100644
index 000000000000..8ac7942f50fc
--- /dev/null
+++ b/include/linux/landlock.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Landlock LSM - public kernel headers
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _LINUX_LANDLOCK_H
+#define _LINUX_LANDLOCK_H
+
+#include <linux/errno.h>
+#include <linux/sched.h> /* task_struct */
+
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+extern int landlock_seccomp_prepend_prog(unsigned int flags,
+		const int __user *user_bpf_fd);
+extern void put_seccomp_landlock(struct task_struct *tsk);
+extern void get_seccomp_landlock(struct task_struct *tsk);
+#else /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
+static inline int landlock_seccomp_prepend_prog(unsigned int flags,
+		const int __user *user_bpf_fd)
+{
+		return -EINVAL;
+}
+static inline void put_seccomp_landlock(struct task_struct *tsk)
+{
+}
+static inline void get_seccomp_landlock(struct task_struct *tsk)
+{
+}
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
+
+#endif /* _LINUX_LANDLOCK_H */
diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index 84868d37b35d..106a0ceff3d7 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -11,6 +11,7 @@
 
 #ifdef CONFIG_SECCOMP
 
+#include <linux/landlock.h>
 #include <linux/thread_info.h>
 #include <asm/seccomp.h>
 
@@ -22,6 +23,7 @@ struct seccomp_filter;
  *         system calls available to a process.
  * @filter: must always point to a valid seccomp-filter or NULL as it is
  *          accessed without locking during system call entry.
+ * @landlock_prog_set: contains a set of Landlock programs.
  *
  *          @filter must only be accessed from the context of current as there
  *          is no read locking.
@@ -29,6 +31,9 @@ struct seccomp_filter;
 struct seccomp {
 	int mode;
 	struct seccomp_filter *filter;
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+	struct landlock_prog_set *landlock_prog_set;
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
 };
 
 #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 90734aa5aa36..bce6534e7feb 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -16,6 +16,7 @@
 #define SECCOMP_SET_MODE_FILTER		1
 #define SECCOMP_GET_ACTION_AVAIL	2
 #define SECCOMP_GET_NOTIF_SIZES		3
+#define SECCOMP_PREPEND_LANDLOCK_PROG	4
 
 /* Valid flags for SECCOMP_SET_MODE_FILTER */
 #define SECCOMP_FILTER_FLAG_TSYNC		(1UL << 0)
diff --git a/kernel/fork.c b/kernel/fork.c
index 8f3e2d97d771..6c43517abdb9 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -51,6 +51,7 @@
 #include <linux/security.h>
 #include <linux/hugetlb.h>
 #include <linux/seccomp.h>
+#include <linux/landlock.h>
 #include <linux/swap.h>
 #include <linux/syscalls.h>
 #include <linux/jiffies.h>
@@ -458,6 +459,7 @@ void free_task(struct task_struct *tsk)
 	rt_mutex_debug_task_free(tsk);
 	ftrace_graph_exit_task(tsk);
 	put_seccomp_filter(tsk);
+	put_seccomp_landlock(tsk);
 	arch_release_task_struct(tsk);
 	if (tsk->flags & PF_KTHREAD)
 		free_kthread_struct(tsk);
@@ -888,7 +890,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	 * the usage counts on the error path calling free_task.
 	 */
 	tsk->seccomp.filter = NULL;
-#endif
+#ifdef CONFIG_SECURITY_LANDLOCK
+	tsk->seccomp.landlock_prog_set = NULL;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+#endif /* CONFIG_SECCOMP */
 
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
@@ -1604,6 +1609,7 @@ static void copy_seccomp(struct task_struct *p)
 
 	/* Ref-count the new filter user, and assign it. */
 	get_seccomp_filter(current);
+	get_seccomp_landlock(current);
 	p->seccomp = current->seccomp;
 
 	/*
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index dba52a7db5e8..af542a2d21e7 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -41,6 +41,7 @@
 #include <linux/tracehook.h>
 #include <linux/uaccess.h>
 #include <linux/anon_inodes.h>
+#include <linux/landlock.h>
 
 enum notify_state {
 	SECCOMP_NOTIFY_INIT,
@@ -1397,6 +1398,9 @@ static long do_seccomp(unsigned int op, unsigned int flags,
 			return -EINVAL;
 
 		return seccomp_get_notif_sizes(uargs);
+	case SECCOMP_PREPEND_LANDLOCK_PROG:
+		return landlock_seccomp_prepend_prog(flags,
+				(const int __user *)uargs);
 	default:
 		return -EINVAL;
 	}
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 7205f9a7a2ee..2a1a7082a365 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
-landlock-y := init.o
+landlock-y := init.o \
+	enforce.o enforce_seccomp.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
index 80dc36f4d0ac..2cf36dbf4560 100644
--- a/security/landlock/common.h
+++ b/security/landlock/common.h
@@ -28,6 +28,44 @@ enum landlock_hook_type {
 	LANDLOCK_HOOK_FS_WALK,
 };
 
+struct landlock_prog_list {
+	struct landlock_prog_list *prev;
+	struct bpf_prog *prog;
+	refcount_t usage;
+};
+
+/**
+ * struct landlock_prog_set - Landlock programs enforced on a thread
+ *
+ * This is used for low performance impact when forking a process. Instead of
+ * copying the full array and incrementing the usage of each entries, only
+ * create a pointer to &struct landlock_prog_set and increments its usage. When
+ * prepending a new program, if &struct landlock_prog_set is shared with other
+ * tasks, then duplicate it and prepend the program to this new &struct
+ * landlock_prog_set.
+ *
+ * @usage: reference count to manage the object lifetime. When a thread need to
+ *	   add Landlock programs and if @usage is greater than 1, then the
+ *	   thread must duplicate &struct landlock_prog_set to not change the
+ *	   children's programs as well.
+ * @programs: array of non-NULL &struct landlock_prog_list pointers
+ */
+struct landlock_prog_set {
+	struct landlock_prog_list *programs[_LANDLOCK_HOOK_LAST];
+	refcount_t usage;
+};
+
+/**
+ * get_hook_index - get an index for the programs of struct landlock_prog_set
+ *
+ * @type: a Landlock hook type
+ */
+static inline int get_hook_index(enum landlock_hook_type type)
+{
+	/* type ID > 0 for loaded programs */
+	return type - 1;
+}
+
 static inline enum landlock_hook_type get_hook_type(const struct bpf_prog *prog)
 {
 	switch (prog->expected_attach_type) {
diff --git a/security/landlock/enforce.c b/security/landlock/enforce.c
new file mode 100644
index 000000000000..b6979de69d3e
--- /dev/null
+++ b/security/landlock/enforce.c
@@ -0,0 +1,272 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - enforcing helpers
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#include <asm/barrier.h> /* smp_store_release() */
+#include <asm/page.h> /* PAGE_SIZE */
+#include <linux/bpf.h> /* bpf_prog_put() */
+#include <linux/compiler.h> /* READ_ONCE() */
+#include <linux/err.h> /* PTR_ERR() */
+#include <linux/errno.h>
+#include <linux/filter.h> /* struct bpf_prog */
+#include <linux/refcount.h>
+#include <linux/slab.h> /* alloc(), kfree() */
+
+#include "common.h" /* struct landlock_prog_list */
+
+/* TODO: use a dedicated kmem_cache_alloc() instead of k*alloc() */
+
+static void put_landlock_prog_list(struct landlock_prog_list *prog_list)
+{
+	struct landlock_prog_list *orig = prog_list;
+
+	/* clean up single-reference branches iteratively */
+	while (orig && refcount_dec_and_test(&orig->usage)) {
+		struct landlock_prog_list *freeme = orig;
+
+		if (orig->prog)
+			bpf_prog_put(orig->prog);
+		orig = orig->prev;
+		kfree(freeme);
+	}
+}
+
+void landlock_put_prog_set(struct landlock_prog_set *prog_set)
+{
+	if (prog_set && refcount_dec_and_test(&prog_set->usage)) {
+		size_t i;
+
+		for (i = 0; i < ARRAY_SIZE(prog_set->programs); i++)
+			put_landlock_prog_list(prog_set->programs[i]);
+		kfree(prog_set);
+	}
+}
+
+void landlock_get_prog_set(struct landlock_prog_set *prog_set)
+{
+	if (!prog_set)
+		return;
+	refcount_inc(&prog_set->usage);
+}
+
+static struct landlock_prog_set *new_landlock_prog_set(void)
+{
+	struct landlock_prog_set *ret;
+
+	/* array filled with NULL values */
+	ret = kzalloc(sizeof(*ret), GFP_KERNEL);
+	if (!ret)
+		return ERR_PTR(-ENOMEM);
+	refcount_set(&ret->usage, 1);
+	return ret;
+}
+
+/**
+ * store_landlock_prog - prepend and deduplicate a Landlock prog_list
+ *
+ * Prepend @prog to @init_prog_set while ignoring @prog
+ * if they are already in @ref_prog_set.  Whatever is the result of this
+ * function call, you can call bpf_prog_put(@prog) after.
+ *
+ * @init_prog_set: empty prog_set to prepend to
+ * @ref_prog_set: prog_set to check for duplicate programs
+ * @prog: program to prepend
+ *
+ * Return -errno on error or 0 if @prog was successfully stored.
+ */
+static int store_landlock_prog(struct landlock_prog_set *init_prog_set,
+		const struct landlock_prog_set *ref_prog_set,
+		struct bpf_prog *prog)
+{
+	struct landlock_prog_list *tmp_list = NULL;
+	int err;
+	u32 hook_idx;
+	enum landlock_hook_type last_type;
+	struct bpf_prog *new = prog;
+
+	/* allocate all the memory we need */
+	struct landlock_prog_list *new_list;
+
+	last_type = get_hook_type(new);
+
+	/* ignore duplicate programs */
+	if (ref_prog_set) {
+		struct landlock_prog_list *ref;
+
+		hook_idx = get_hook_index(get_hook_type(new));
+		for (ref = ref_prog_set->programs[hook_idx];
+				ref; ref = ref->prev) {
+			if (ref->prog == new)
+				return -EINVAL;
+		}
+	}
+
+	new = bpf_prog_inc(new);
+	if (IS_ERR(new)) {
+		err = PTR_ERR(new);
+		goto put_tmp_list;
+	}
+	new_list = kzalloc(sizeof(*new_list), GFP_KERNEL);
+	if (!new_list) {
+		bpf_prog_put(new);
+		err = -ENOMEM;
+		goto put_tmp_list;
+	}
+	/* ignore Landlock types in this tmp_list */
+	new_list->prog = new;
+	new_list->prev = tmp_list;
+	refcount_set(&new_list->usage, 1);
+	tmp_list = new_list;
+
+	if (!tmp_list)
+		/* inform user space that this program was already added */
+		return -EEXIST;
+
+	/* properly store the list (without error cases) */
+	while (tmp_list) {
+		struct landlock_prog_list *new_list;
+
+		new_list = tmp_list;
+		tmp_list = tmp_list->prev;
+		/* do not increment the previous prog list usage */
+		hook_idx = get_hook_index(get_hook_type(new_list->prog));
+		new_list->prev = init_prog_set->programs[hook_idx];
+		/* no need to add from the last program to the first because
+		 * each of them are a different Landlock type */
+		smp_store_release(&init_prog_set->programs[hook_idx], new_list);
+	}
+	return 0;
+
+put_tmp_list:
+	put_landlock_prog_list(tmp_list);
+	return err;
+}
+
+/* limit Landlock programs set to 256KB */
+#define LANDLOCK_PROGRAMS_MAX_PAGES (1 << 6)
+
+/**
+ * landlock_prepend_prog - attach a Landlock prog_list to @current_prog_set
+ *
+ * Whatever is the result of this function call, you can call
+ * bpf_prog_put(@prog) after.
+ *
+ * @current_prog_set: landlock_prog_set pointer, must be locked (if needed) to
+ *                    prevent a concurrent put/free. This pointer must not be
+ *                    freed after the call.
+ * @prog: non-NULL Landlock prog_list to prepend to @current_prog_set. @prog
+ *	  will be owned by landlock_prepend_prog() and freed if an error
+ *	  happened.
+ *
+ * Return @current_prog_set or a new pointer when OK. Return a pointer error
+ * otherwise.
+ */
+struct landlock_prog_set *landlock_prepend_prog(
+		struct landlock_prog_set *current_prog_set,
+		struct bpf_prog *prog)
+{
+	struct landlock_prog_set *new_prog_set = current_prog_set;
+	unsigned long pages;
+	int err;
+	size_t i;
+	struct landlock_prog_set tmp_prog_set = {};
+
+	if (prog->type != BPF_PROG_TYPE_LANDLOCK_HOOK)
+		return ERR_PTR(-EINVAL);
+
+	/* validate memory size allocation */
+	pages = prog->pages;
+	if (current_prog_set) {
+		size_t i;
+
+		for (i = 0; i < ARRAY_SIZE(current_prog_set->programs); i++) {
+			struct landlock_prog_list *walker_p;
+
+			for (walker_p = current_prog_set->programs[i];
+					walker_p; walker_p = walker_p->prev)
+				pages += walker_p->prog->pages;
+		}
+		/* count a struct landlock_prog_set if we need to allocate one */
+		if (refcount_read(&current_prog_set->usage) != 1)
+			pages += round_up(sizeof(*current_prog_set), PAGE_SIZE)
+				/ PAGE_SIZE;
+	}
+	if (pages > LANDLOCK_PROGRAMS_MAX_PAGES)
+		return ERR_PTR(-E2BIG);
+
+	/* ensure early that we can allocate enough memory for the new
+	 * prog_lists */
+	err = store_landlock_prog(&tmp_prog_set, current_prog_set, prog);
+	if (err)
+		return ERR_PTR(err);
+
+	/*
+	 * Each task_struct points to an array of prog list pointers.  These
+	 * tables are duplicated when additions are made (which means each
+	 * table needs to be refcounted for the processes using it). When a new
+	 * table is created, all the refcounters on the prog_list are bumped (to
+	 * track each table that references the prog). When a new prog is
+	 * added, it's just prepended to the list for the new table to point
+	 * at.
+	 *
+	 * Manage all the possible errors before this step to not uselessly
+	 * duplicate current_prog_set and avoid a rollback.
+	 */
+	if (!new_prog_set) {
+		/*
+		 * If there is no Landlock program set used by the current task,
+		 * then create a new one.
+		 */
+		new_prog_set = new_landlock_prog_set();
+		if (IS_ERR(new_prog_set))
+			goto put_tmp_lists;
+	} else if (refcount_read(&current_prog_set->usage) > 1) {
+		/*
+		 * If the current task is not the sole user of its Landlock
+		 * program set, then duplicate them.
+		 */
+		new_prog_set = new_landlock_prog_set();
+		if (IS_ERR(new_prog_set))
+			goto put_tmp_lists;
+		for (i = 0; i < ARRAY_SIZE(new_prog_set->programs); i++) {
+			new_prog_set->programs[i] =
+				READ_ONCE(current_prog_set->programs[i]);
+			if (new_prog_set->programs[i])
+				refcount_inc(&new_prog_set->programs[i]->usage);
+		}
+
+		/*
+		 * Landlock program set from the current task will not be freed
+		 * here because the usage is strictly greater than 1. It is
+		 * only prevented to be freed by another task thanks to the
+		 * caller of landlock_prepend_prog() which should be locked if
+		 * needed.
+		 */
+		landlock_put_prog_set(current_prog_set);
+	}
+
+	/* prepend tmp_prog_set to new_prog_set */
+	for (i = 0; i < ARRAY_SIZE(tmp_prog_set.programs); i++) {
+		/* get the last new list */
+		struct landlock_prog_list *last_list =
+			tmp_prog_set.programs[i];
+
+		if (last_list) {
+			while (last_list->prev)
+				last_list = last_list->prev;
+			/* no need to increment usage (pointer replacement) */
+			last_list->prev = new_prog_set->programs[i];
+			new_prog_set->programs[i] = tmp_prog_set.programs[i];
+		}
+	}
+	return new_prog_set;
+
+put_tmp_lists:
+	for (i = 0; i < ARRAY_SIZE(tmp_prog_set.programs); i++)
+		put_landlock_prog_list(tmp_prog_set.programs[i]);
+	return new_prog_set;
+}
diff --git a/security/landlock/enforce.h b/security/landlock/enforce.h
new file mode 100644
index 000000000000..39b800d9999f
--- /dev/null
+++ b/security/landlock/enforce.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - enforcing helpers headers
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_ENFORCE_H
+#define _SECURITY_LANDLOCK_ENFORCE_H
+
+struct landlock_prog_set *landlock_prepend_prog(
+		struct landlock_prog_set *current_prog_set,
+		struct bpf_prog *prog);
+void landlock_put_prog_set(struct landlock_prog_set *prog_set);
+void landlock_get_prog_set(struct landlock_prog_set *prog_set);
+
+#endif /* _SECURITY_LANDLOCK_ENFORCE_H */
diff --git a/security/landlock/enforce_seccomp.c b/security/landlock/enforce_seccomp.c
new file mode 100644
index 000000000000..c38c81e6b01a
--- /dev/null
+++ b/security/landlock/enforce_seccomp.c
@@ -0,0 +1,92 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - enforcing with seccomp
+ *
+ * Copyright © 2016-2018 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifdef CONFIG_SECCOMP_FILTER
+
+#include <linux/bpf.h> /* bpf_prog_put() */
+#include <linux/capability.h>
+#include <linux/err.h> /* PTR_ERR() */
+#include <linux/errno.h>
+#include <linux/filter.h> /* struct bpf_prog */
+#include <linux/landlock.h>
+#include <linux/refcount.h>
+#include <linux/sched.h> /* current */
+#include <linux/uaccess.h> /* get_user() */
+
+#include "enforce.h"
+
+/* headers in include/linux/landlock.h */
+
+/**
+ * landlock_seccomp_prepend_prog - attach a Landlock program to the current
+ *                                 process
+ *
+ * current->seccomp.landlock_state->prog_set is lazily allocated. When a
+ * process fork, only a pointer is copied.  When a new program is added by a
+ * process, if there is other references to this process' prog_set, then a new
+ * allocation is made to contain an array pointing to Landlock program lists.
+ * This design enable low-performance impact and is memory efficient while
+ * keeping the property of prepend-only programs.
+ *
+ * For now, installing a Landlock prog requires that the requesting task has
+ * the global CAP_SYS_ADMIN. We cannot force the use of no_new_privs to not
+ * exclude containers where a process may legitimately acquire more privileges
+ * thanks to an SUID binary.
+ *
+ * @flags: not used for now, but could be used for TSYNC
+ * @user_bpf_fd: file descriptor pointing to a loaded Landlock prog
+ */
+int landlock_seccomp_prepend_prog(unsigned int flags,
+		const int __user *user_bpf_fd)
+{
+	struct landlock_prog_set *new_prog_set;
+	struct bpf_prog *prog;
+	int bpf_fd, err;
+
+	/* planned to be replaced with a no_new_privs check to allow
+	 * unprivileged tasks */
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+	/* enable to check if Landlock is supported with early EFAULT */
+	if (!user_bpf_fd)
+		return -EFAULT;
+	if (flags)
+		return -EINVAL;
+	err = get_user(bpf_fd, user_bpf_fd);
+	if (err)
+		return err;
+
+	prog = bpf_prog_get(bpf_fd);
+	if (IS_ERR(prog))
+		return PTR_ERR(prog);
+
+	/*
+	 * We don't need to lock anything for the current process hierarchy,
+	 * everything is guarded by the atomic counters.
+	 */
+	new_prog_set = landlock_prepend_prog(
+			current->seccomp.landlock_prog_set, prog);
+	bpf_prog_put(prog);
+	/* @prog is managed/freed by landlock_prepend_prog() */
+	if (IS_ERR(new_prog_set))
+		return PTR_ERR(new_prog_set);
+	current->seccomp.landlock_prog_set = new_prog_set;
+	return 0;
+}
+
+void put_seccomp_landlock(struct task_struct *tsk)
+{
+	landlock_put_prog_set(tsk->seccomp.landlock_prog_set);
+}
+
+void get_seccomp_landlock(struct task_struct *tsk)
+{
+	landlock_get_prog_set(tsk->seccomp.landlock_prog_set);
+}
+
+#endif /* CONFIG_SECCOMP_FILTER */
-- 
2.22.0


^ permalink raw reply related

* [PATCH bpf-next v10 03/10] bpf,landlock: Define an eBPF program type for Landlock hooks
From: Mickaël Salaün @ 2019-07-21 21:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Andy Lutomirski, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	John Johansen, Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Stephen Smalley, Tejun Heo,
	Tetsuo Handa, Thomas Graf, Tycho Andersen, Will Drewry,
	kernel-hardening, linux-api, linux-fsdevel, linux-security-module,
	netdev
In-Reply-To: <20190721213116.23476-1-mic@digikod.net>

Add a new type of eBPF program used by Landlock hooks.  The goal of this
type of program is to accept or deny a requested access from userspace
to a kernel object (e.g. a file).

This new BPF program type will be registered with the Landlock LSM
initialization.

Add an initial Landlock Kconfig and update the MAINTAINERS file.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---

Changes since v9:
* handle inode put and map put, which fix unmount (reported by Al Viro)
* replace subtype with expected_attach_type and expected_attach_triggers
* check eBPF program return code

Changes since v8:
* Remove the chaining concept from the eBPF program contexts (chain and
  cookie). We need to keep these subtypes this way to be able to make
  them evolve, though.
* remove bpf_landlock_put_extra() because there is no more a "previous"
  field to free (for now)

Changes since v7:
* cosmetic fixes
* rename LANDLOCK_SUBTYPE_* to LANDLOCK_*
* cleanup UAPI definitions and move them from bpf.h to landlock.h
  (suggested by Alexei Starovoitov)
* disable Landlock by default (suggested by Alexei Starovoitov)
* rename BPF_PROG_TYPE_LANDLOCK_{RULE,HOOK}
* update the Kconfig
* update the MAINTAINERS file
* replace the IOCTL, LOCK and FCNTL events with FS_PICK, FS_WALK and
  FS_GET hook types
* add the ability to chain programs with an eBPF program file descriptor
  (i.e. the "previous" field in a Landlock subtype) and keep a state
  with a "cookie" value available from the context
* add a "triggers" subtype bitfield to match specific actions (e.g.
  append, chdir, read...)

Changes since v6:
* add 3 more sub-events: IOCTL, LOCK, FCNTL
  https://lkml.kernel.org/r/2fbc99a6-f190-f335-bd14-04bdeed35571@digikod.net
* rename LANDLOCK_VERSION to LANDLOCK_ABI to better reflect its purpose,
  and move it from landlock.h to common.h
* rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE: an eBPF
  program could be used for something else than a rule
* simplify struct landlock_context by removing the arch and syscall_nr fields
* remove all eBPF map functions call, remove ABILITY_WRITE
* refactor bpf_landlock_func_proto() (suggested by Kees Cook)
* constify pointers
* fix doc inclusion

Changes since v5:
* rename file hooks.c to init.c
* fix spelling

Changes since v4:
* merge a minimal (not enabled) LSM code and Kconfig in this commit

Changes since v3:
* split commit
* revamp the landlock_context:
  * add arch, syscall_nr and syscall_cmd (ioctl, fcntl…) to be able to
    cross-check action with the event type
  * replace args array with dedicated fields to ease the addition of new
    fields
---
 MAINTAINERS                         |  13 ++++
 include/uapi/linux/bpf.h            |   1 +
 include/uapi/linux/landlock.h       |  94 ++++++++++++++++++++++++
 kernel/bpf/verifier.c               |   1 +
 security/Kconfig                    |   1 +
 security/Makefile                   |   2 +
 security/landlock/Kconfig           |  18 +++++
 security/landlock/Makefile          |   3 +
 security/landlock/common.h          |  44 +++++++++++
 security/landlock/init.c            | 100 +++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h      |   1 +
 tools/include/uapi/linux/landlock.h | 109 ++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.c              |   1 +
 tools/lib/bpf/libbpf_probes.c       |   1 +
 14 files changed, 389 insertions(+)
 create mode 100644 include/uapi/linux/landlock.h
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/common.h
 create mode 100644 security/landlock/init.c
 create mode 100644 tools/include/uapi/linux/landlock.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 211ea3a199bd..d30b535a613a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8947,6 +8947,19 @@ F:	net/core/skmsg.c
 F:	net/core/sock_map.c
 F:	net/ipv4/tcp_bpf.c
 
+LANDLOCK SECURITY MODULE
+M:	Mickaël Salaün <mic@digikod.net>
+S:	Supported
+F:	Documentation/security/landlock/
+F:	include/linux/landlock.h
+F:	include/uapi/linux/landlock.h
+F:	samples/bpf/landlock*
+F:	security/landlock/
+F:	tools/include/uapi/linux/landlock.h
+F:	tools/testing/selftests/landlock/
+K:	landlock
+K:	LANDLOCK
+
 LANTIQ / INTEL Ethernet drivers
 M:	Hauke Mehrtens <hauke@hauke-m.de>
 L:	netdev@vger.kernel.org
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 1664d260861b..d68613f737f3 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -171,6 +171,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_CGROUP_SYSCTL,
 	BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
 	BPF_PROG_TYPE_CGROUP_SOCKOPT,
+	BPF_PROG_TYPE_LANDLOCK_HOOK,
 };
 
 enum bpf_attach_type {
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
new file mode 100644
index 000000000000..be3c4757a3ad
--- /dev/null
+++ b/include/uapi/linux/landlock.h
@@ -0,0 +1,94 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Landlock - UAPI headers
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _UAPI__LINUX_LANDLOCK_H__
+#define _UAPI__LINUX_LANDLOCK_H__
+
+#include <linux/types.h>
+
+#define LANDLOCK_RET_ALLOW	0
+#define LANDLOCK_RET_DENY	1
+
+/**
+ * DOC: landlock_triggers
+ *
+ * A landlock trigger is used as a bitmask in subtype.landlock_hook.triggers
+ * for a fs_pick program.  It defines a set of actions for which the program
+ * should verify an access request.
+ *
+ * - %LANDLOCK_TRIGGER_FS_PICK_APPEND
+ * - %LANDLOCK_TRIGGER_FS_PICK_CHDIR
+ * - %LANDLOCK_TRIGGER_FS_PICK_CHROOT
+ * - %LANDLOCK_TRIGGER_FS_PICK_CREATE
+ * - %LANDLOCK_TRIGGER_FS_PICK_EXECUTE
+ * - %LANDLOCK_TRIGGER_FS_PICK_FCNTL
+ * - %LANDLOCK_TRIGGER_FS_PICK_GETATTR
+ * - %LANDLOCK_TRIGGER_FS_PICK_IOCTL
+ * - %LANDLOCK_TRIGGER_FS_PICK_LINK
+ * - %LANDLOCK_TRIGGER_FS_PICK_LINKTO
+ * - %LANDLOCK_TRIGGER_FS_PICK_LOCK
+ * - %LANDLOCK_TRIGGER_FS_PICK_MAP
+ * - %LANDLOCK_TRIGGER_FS_PICK_MOUNTON
+ * - %LANDLOCK_TRIGGER_FS_PICK_OPEN
+ * - %LANDLOCK_TRIGGER_FS_PICK_READ
+ * - %LANDLOCK_TRIGGER_FS_PICK_READDIR
+ * - %LANDLOCK_TRIGGER_FS_PICK_RECEIVE
+ * - %LANDLOCK_TRIGGER_FS_PICK_RENAME
+ * - %LANDLOCK_TRIGGER_FS_PICK_RENAMETO
+ * - %LANDLOCK_TRIGGER_FS_PICK_RMDIR
+ * - %LANDLOCK_TRIGGER_FS_PICK_SETATTR
+ * - %LANDLOCK_TRIGGER_FS_PICK_TRANSFER
+ * - %LANDLOCK_TRIGGER_FS_PICK_UNLINK
+ * - %LANDLOCK_TRIGGER_FS_PICK_WRITE
+ */
+#define LANDLOCK_TRIGGER_FS_PICK_APPEND			(1ULL << 0)
+#define LANDLOCK_TRIGGER_FS_PICK_CHDIR			(1ULL << 1)
+#define LANDLOCK_TRIGGER_FS_PICK_CHROOT			(1ULL << 2)
+#define LANDLOCK_TRIGGER_FS_PICK_CREATE			(1ULL << 3)
+#define LANDLOCK_TRIGGER_FS_PICK_EXECUTE		(1ULL << 4)
+#define LANDLOCK_TRIGGER_FS_PICK_FCNTL			(1ULL << 5)
+#define LANDLOCK_TRIGGER_FS_PICK_GETATTR		(1ULL << 6)
+#define LANDLOCK_TRIGGER_FS_PICK_IOCTL			(1ULL << 7)
+#define LANDLOCK_TRIGGER_FS_PICK_LINK			(1ULL << 8)
+#define LANDLOCK_TRIGGER_FS_PICK_LINKTO			(1ULL << 9)
+#define LANDLOCK_TRIGGER_FS_PICK_LOCK			(1ULL << 10)
+#define LANDLOCK_TRIGGER_FS_PICK_MAP			(1ULL << 11)
+#define LANDLOCK_TRIGGER_FS_PICK_MOUNTON		(1ULL << 12)
+#define LANDLOCK_TRIGGER_FS_PICK_OPEN			(1ULL << 13)
+#define LANDLOCK_TRIGGER_FS_PICK_READ			(1ULL << 14)
+#define LANDLOCK_TRIGGER_FS_PICK_READDIR		(1ULL << 15)
+#define LANDLOCK_TRIGGER_FS_PICK_RECEIVE		(1ULL << 16)
+#define LANDLOCK_TRIGGER_FS_PICK_RENAME			(1ULL << 17)
+#define LANDLOCK_TRIGGER_FS_PICK_RENAMETO		(1ULL << 18)
+#define LANDLOCK_TRIGGER_FS_PICK_RMDIR			(1ULL << 19)
+#define LANDLOCK_TRIGGER_FS_PICK_SETATTR		(1ULL << 20)
+#define LANDLOCK_TRIGGER_FS_PICK_TRANSFER		(1ULL << 21)
+#define LANDLOCK_TRIGGER_FS_PICK_UNLINK			(1ULL << 22)
+#define LANDLOCK_TRIGGER_FS_PICK_WRITE			(1ULL << 23)
+
+/**
+ * struct landlock_ctx_fs_pick - context accessible to a fs_pick program
+ *
+ * @inode: pointer to the current kernel object that can be used to compare
+ *         inodes from an inode map.
+ */
+struct landlock_ctx_fs_pick {
+	__u64 inode;
+};
+
+/**
+ * struct landlock_ctx_fs_walk - context accessible to a fs_walk program
+ *
+ * @inode: pointer to the current kernel object that can be used to compare
+ *         inodes from an inode map.
+ */
+struct landlock_ctx_fs_walk {
+	__u64 inode;
+};
+
+#endif /* _UAPI__LINUX_LANDLOCK_H__ */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 94a43d7c8175..026c68cb9116 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6116,6 +6116,7 @@ static int check_return_code(struct bpf_verifier_env *env)
 	case BPF_PROG_TYPE_CGROUP_DEVICE:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
+	case BPF_PROG_TYPE_LANDLOCK_HOOK:
 		break;
 	default:
 		return 0;
diff --git a/security/Kconfig b/security/Kconfig
index 06a30851511a..549822ea60b7 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -237,6 +237,7 @@ source "security/apparmor/Kconfig"
 source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
+source "security/landlock/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index c598b904938f..396ff107f70d 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -11,6 +11,7 @@ subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
 subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
+subdir-$(CONFIG_SECURITY_LANDLOCK)		+= landlock
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -27,6 +28,7 @@ obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
 obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
+obj-$(CONFIG_SECURITY_LANDLOCK)	+= landlock/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
new file mode 100644
index 000000000000..8bd103102008
--- /dev/null
+++ b/security/landlock/Kconfig
@@ -0,0 +1,18 @@
+config SECURITY_LANDLOCK
+	bool "Landlock support"
+	depends on SECURITY
+	depends on BPF_SYSCALL
+	depends on SECCOMP_FILTER
+	default n
+	help
+	  This selects Landlock, a programmatic access control.  It enables to
+	  restrict processes on the fly (i.e. create a sandbox).  The security
+	  policy is a set of eBPF programs, dedicated to deny a list of actions
+	  on specific kernel objects (e.g. file).
+
+	  You need to enable seccomp filter to apply a security policy to a
+	  process hierarchy (e.g. application with built-in sandboxing).
+
+	  See Documentation/security/landlock/ for further information.
+
+	  If you are unsure how to answer this question, answer N.
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
new file mode 100644
index 000000000000..7205f9a7a2ee
--- /dev/null
+++ b/security/landlock/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
+
+landlock-y := init.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
new file mode 100644
index 000000000000..80dc36f4d0ac
--- /dev/null
+++ b/security/landlock/common.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - private headers
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_COMMON_H
+#define _SECURITY_LANDLOCK_COMMON_H
+
+#include <linux/bpf.h> /* enum bpf_attach_type */
+#include <linux/filter.h> /* bpf_prog */
+#include <linux/refcount.h> /* refcount_t */
+#include <uapi/linux/landlock.h> /* LANDLOCK_TRIGGER_* */
+
+#define LANDLOCK_NAME "landlock"
+
+/* UAPI bounds and bitmasks */
+
+#define _LANDLOCK_HOOK_LAST LANDLOCK_HOOK_FS_WALK
+
+#define _LANDLOCK_TRIGGER_FS_PICK_LAST	LANDLOCK_TRIGGER_FS_PICK_WRITE
+#define _LANDLOCK_TRIGGER_FS_PICK_MASK	((_LANDLOCK_TRIGGER_FS_PICK_LAST << 1ULL) - 1)
+
+enum landlock_hook_type {
+	LANDLOCK_HOOK_FS_PICK = 1,
+	LANDLOCK_HOOK_FS_WALK,
+};
+
+static inline enum landlock_hook_type get_hook_type(const struct bpf_prog *prog)
+{
+	switch (prog->expected_attach_type) {
+	case BPF_LANDLOCK_FS_PICK:
+		return LANDLOCK_HOOK_FS_PICK;
+	case BPF_LANDLOCK_FS_WALK:
+		return LANDLOCK_HOOK_FS_WALK;
+	default:
+		WARN_ON(1);
+		return BPF_LANDLOCK_FS_PICK;
+	}
+}
+
+#endif /* _SECURITY_LANDLOCK_COMMON_H */
diff --git a/security/landlock/init.c b/security/landlock/init.c
new file mode 100644
index 000000000000..8dfd5fea3c1f
--- /dev/null
+++ b/security/landlock/init.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - init
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#include <linux/bpf.h> /* enum bpf_access_type */
+#include <linux/capability.h> /* capable */
+#include <linux/filter.h> /* struct bpf_prog */
+
+#include "common.h" /* LANDLOCK_* */
+
+static bool bpf_landlock_is_valid_access(int off, int size,
+		enum bpf_access_type type, const struct bpf_prog *prog,
+		struct bpf_insn_access_aux *info)
+{
+	enum bpf_reg_type reg_type = NOT_INIT;
+	int max_size = 0;
+
+	if (WARN_ON(!prog->expected_attach_type))
+		return false;
+
+	if (off < 0)
+		return false;
+	if (size <= 0 || size > sizeof(__u64))
+		return false;
+
+	/* check memory range access */
+	switch (reg_type) {
+	case NOT_INIT:
+		return false;
+	case SCALAR_VALUE:
+		/* allow partial raw value */
+		if (size > max_size)
+			return false;
+		info->ctx_field_size = max_size;
+		break;
+	default:
+		/* deny partial pointer */
+		if (size != max_size)
+			return false;
+	}
+
+	info->reg_type = reg_type;
+	return true;
+}
+
+static bool bpf_landlock_is_valid_triggers(const struct bpf_prog *prog)
+{
+	u64 triggers;
+
+	if (!prog)
+		return false;
+	triggers = prog->aux->expected_attach_triggers;
+
+	switch (get_hook_type(prog)) {
+	case LANDLOCK_HOOK_FS_PICK:
+		if (!triggers || triggers & ~_LANDLOCK_TRIGGER_FS_PICK_MASK)
+			return false;
+		break;
+	case LANDLOCK_HOOK_FS_WALK:
+		if (triggers)
+			return false;
+		break;
+	}
+	return true;
+}
+
+static const struct bpf_func_proto *bpf_landlock_func_proto(
+		enum bpf_func_id func_id,
+		const struct bpf_prog *prog)
+{
+	if (WARN_ON(!prog->expected_attach_type))
+		return NULL;
+
+	/* generic functions */
+	/* TODO: do we need/want update/delete functions for every LL prog?
+	 * => impurity vs. audit */
+	switch (func_id) {
+	case BPF_FUNC_map_lookup_elem:
+		return &bpf_map_lookup_elem_proto;
+	case BPF_FUNC_map_update_elem:
+		return &bpf_map_update_elem_proto;
+	case BPF_FUNC_map_delete_elem:
+		return &bpf_map_delete_elem_proto;
+	default:
+		break;
+	}
+	return NULL;
+}
+
+const struct bpf_verifier_ops landlock_verifier_ops = {
+	.get_func_proto	= bpf_landlock_func_proto,
+	.is_valid_access = bpf_landlock_is_valid_access,
+	.is_valid_triggers = bpf_landlock_is_valid_triggers,
+};
+
+const struct bpf_prog_ops landlock_prog_ops = {};
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 232747393405..7b7a4f6c3104 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -171,6 +171,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_CGROUP_SYSCTL,
 	BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
 	BPF_PROG_TYPE_CGROUP_SOCKOPT,
+	BPF_PROG_TYPE_LANDLOCK_HOOK,
 };
 
 enum bpf_attach_type {
diff --git a/tools/include/uapi/linux/landlock.h b/tools/include/uapi/linux/landlock.h
new file mode 100644
index 000000000000..9e6d8e10ec2c
--- /dev/null
+++ b/tools/include/uapi/linux/landlock.h
@@ -0,0 +1,109 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Landlock - UAPI headers
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _UAPI__LINUX_LANDLOCK_H__
+#define _UAPI__LINUX_LANDLOCK_H__
+
+#include <linux/types.h>
+
+#define LANDLOCK_RET_ALLOW	0
+#define LANDLOCK_RET_DENY	1
+
+/**
+ * enum landlock_hook_type - hook type for which a Landlock program is called
+ *
+ * A hook is a policy decision point which exposes the same context type for
+ * each program evaluation.
+ *
+ * @LANDLOCK_HOOK_FS_PICK: called for the last element of a file path
+ * @LANDLOCK_HOOK_FS_WALK: called for each directory of a file path (excluding
+ *			   the directory passed to fs_pick, if any)
+ */
+enum landlock_hook_type {
+	LANDLOCK_HOOK_FS_PICK = 1,
+	LANDLOCK_HOOK_FS_WALK,
+};
+
+/**
+ * DOC: landlock_triggers
+ *
+ * A landlock trigger is used as a bitmask in subtype.landlock_hook.triggers
+ * for a fs_pick program.  It defines a set of actions for which the program
+ * should verify an access request.
+ *
+ * - %LANDLOCK_TRIGGER_FS_PICK_APPEND
+ * - %LANDLOCK_TRIGGER_FS_PICK_CHDIR
+ * - %LANDLOCK_TRIGGER_FS_PICK_CHROOT
+ * - %LANDLOCK_TRIGGER_FS_PICK_CREATE
+ * - %LANDLOCK_TRIGGER_FS_PICK_EXECUTE
+ * - %LANDLOCK_TRIGGER_FS_PICK_FCNTL
+ * - %LANDLOCK_TRIGGER_FS_PICK_GETATTR
+ * - %LANDLOCK_TRIGGER_FS_PICK_IOCTL
+ * - %LANDLOCK_TRIGGER_FS_PICK_LINK
+ * - %LANDLOCK_TRIGGER_FS_PICK_LINKTO
+ * - %LANDLOCK_TRIGGER_FS_PICK_LOCK
+ * - %LANDLOCK_TRIGGER_FS_PICK_MAP
+ * - %LANDLOCK_TRIGGER_FS_PICK_MOUNTON
+ * - %LANDLOCK_TRIGGER_FS_PICK_OPEN
+ * - %LANDLOCK_TRIGGER_FS_PICK_READ
+ * - %LANDLOCK_TRIGGER_FS_PICK_READDIR
+ * - %LANDLOCK_TRIGGER_FS_PICK_RECEIVE
+ * - %LANDLOCK_TRIGGER_FS_PICK_RENAME
+ * - %LANDLOCK_TRIGGER_FS_PICK_RENAMETO
+ * - %LANDLOCK_TRIGGER_FS_PICK_RMDIR
+ * - %LANDLOCK_TRIGGER_FS_PICK_SETATTR
+ * - %LANDLOCK_TRIGGER_FS_PICK_TRANSFER
+ * - %LANDLOCK_TRIGGER_FS_PICK_UNLINK
+ * - %LANDLOCK_TRIGGER_FS_PICK_WRITE
+ */
+#define LANDLOCK_TRIGGER_FS_PICK_APPEND			(1ULL << 0)
+#define LANDLOCK_TRIGGER_FS_PICK_CHDIR			(1ULL << 1)
+#define LANDLOCK_TRIGGER_FS_PICK_CHROOT			(1ULL << 2)
+#define LANDLOCK_TRIGGER_FS_PICK_CREATE			(1ULL << 3)
+#define LANDLOCK_TRIGGER_FS_PICK_EXECUTE		(1ULL << 4)
+#define LANDLOCK_TRIGGER_FS_PICK_FCNTL			(1ULL << 5)
+#define LANDLOCK_TRIGGER_FS_PICK_GETATTR		(1ULL << 6)
+#define LANDLOCK_TRIGGER_FS_PICK_IOCTL			(1ULL << 7)
+#define LANDLOCK_TRIGGER_FS_PICK_LINK			(1ULL << 8)
+#define LANDLOCK_TRIGGER_FS_PICK_LINKTO			(1ULL << 9)
+#define LANDLOCK_TRIGGER_FS_PICK_LOCK			(1ULL << 10)
+#define LANDLOCK_TRIGGER_FS_PICK_MAP			(1ULL << 11)
+#define LANDLOCK_TRIGGER_FS_PICK_MOUNTON		(1ULL << 12)
+#define LANDLOCK_TRIGGER_FS_PICK_OPEN			(1ULL << 13)
+#define LANDLOCK_TRIGGER_FS_PICK_READ			(1ULL << 14)
+#define LANDLOCK_TRIGGER_FS_PICK_READDIR		(1ULL << 15)
+#define LANDLOCK_TRIGGER_FS_PICK_RECEIVE		(1ULL << 16)
+#define LANDLOCK_TRIGGER_FS_PICK_RENAME			(1ULL << 17)
+#define LANDLOCK_TRIGGER_FS_PICK_RENAMETO		(1ULL << 18)
+#define LANDLOCK_TRIGGER_FS_PICK_RMDIR			(1ULL << 19)
+#define LANDLOCK_TRIGGER_FS_PICK_SETATTR		(1ULL << 20)
+#define LANDLOCK_TRIGGER_FS_PICK_TRANSFER		(1ULL << 21)
+#define LANDLOCK_TRIGGER_FS_PICK_UNLINK			(1ULL << 22)
+#define LANDLOCK_TRIGGER_FS_PICK_WRITE			(1ULL << 23)
+
+/**
+ * struct landlock_ctx_fs_pick - context accessible to a fs_pick program
+ *
+ * @inode: pointer to the current kernel object that can be used to compare
+ *         inodes from an inode map.
+ */
+struct landlock_ctx_fs_pick {
+	__u64 inode;
+};
+
+/**
+ * struct landlock_ctx_fs_walk - context accessible to a fs_walk program
+ *
+ * @inode: pointer to the current kernel object that can be used to compare
+ *         inodes from an inode map.
+ */
+struct landlock_ctx_fs_walk {
+	__u64 inode;
+};
+
+#endif /* _UAPI__LINUX_LANDLOCK_H__ */
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index ed07789b3e62..ab3b8b510b8a 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -2665,6 +2665,7 @@ static bool bpf_prog_type__needs_kver(enum bpf_prog_type type)
 	case BPF_PROG_TYPE_PERF_EVENT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
+	case BPF_PROG_TYPE_LANDLOCK_HOOK:
 		return false;
 	case BPF_PROG_TYPE_KPROBE:
 	default:
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index ace1a0708d99..03c910d1f84c 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -102,6 +102,7 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
 	case BPF_PROG_TYPE_FLOW_DISSECTOR:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
+	case BPF_PROG_TYPE_LANDLOCK_HOOK:
 	default:
 		break;
 	}
-- 
2.22.0


^ permalink raw reply related

* [PATCH bpf-next v10 00/10] Landlock LSM: Toward unprivileged sandboxing
From: Mickaël Salaün @ 2019-07-21 21:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Andy Lutomirski, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	John Johansen, Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Stephen Smalley, Tejun Heo,
	Tetsuo Handa, Thomas Graf, Tycho Andersen, Will Drewry,
	kernel-hardening, linux-api, linux-fsdevel, linux-security-module,
	netdev

Hi,

This tenth series mainly replace the previous [1] inode map
implementation with a hash map, which assure uniqueness of keys, improve
performance, and switch to arbitrary value size.  The inode and map
lifetime are now handled by LSM hooks.  The previous subtype is replaced
with the already existing expected attach type and a new expected attach
triggers field.

Landlock is a stackable LSM [4] intended to be used as a low-level
framework to build custom access-control systems or safe endpoint
security agents.  There is two types of Landlock hooks: FS_WALK and
FS_PICK.  Each of them accepts a dedicated eBPF program, called a
Landlock program.  The set of actions on a file is well defined (e.g.
read, write, ioctl, append, lock, mount...) taking inspiration from the
major Linux LSMs and some other access-controls like Capsicum.

The example patch show how a file system access control can be built
based on a list of denied files and directories.  From a security point
of view, it may be preferable to use a whitelist instead of a blacklist,
but this series only enable to match a specific list of files.  Bringing
back a way to evaluate a path is planned for a future dedicated series,
once this base Landlock framework is merged.  I may take inspiration
from the LOOKUP_BENEATH approach [5], but from an eBPF point of view.

The documentation patch contains some kernel documentation and
explanations on how to use Landlock.  The compiled documentation and
some talks can be found here: https://landlock.io
This patch series can be found in a Git repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v10

This is the first step of the roadmap discussed at LPC [2].  While the
intended final goal is to allow unprivileged users to use Landlock, this
series allows only a process with global CAP_SYS_ADMIN to load and
enforce a rule.  This may help to get feedback and avoid unexpected
behaviors.

This series can be applied on top of bpf-next, commit 88091ff56b71
("selftests, bpf: Add test for veth native XDP").  This can be tested
with CONFIG_SECCOMP_FILTER and CONFIG_SECURITY_LANDLOCK.  I would really
appreciate constructive comments on the design and the code.


# Landlock LSM

The goal of this new Linux Security Module (LSM) called Landlock is to
allow any process, including unprivileged ones, to create powerful
security sandboxes comparable to XNU Sandbox or OpenBSD Pledge (which
could be implemented with Landlock).  This kind of sandbox is expected
to help mitigate the security impact of bugs or unexpected/malicious
behaviors in user-space applications.

The approach taken is to add the minimum amount of code while still
allowing the user-space application to create quite complex access
rules.  A dedicated security policy language such as the one used by
SELinux, AppArmor and other major LSMs involves a lot of code and is
usually permitted to only a trusted user (i.e. root).  On the contrary,
eBPF programs already exist and are designed to be safely loaded by
unprivileged user-space.

This design does not seem too intrusive but is flexible enough to allow
a powerful sandbox mechanism accessible by any process on Linux. The use
of seccomp and Landlock is more suitable with the help of a user-space
library (e.g.  libseccomp) that could help to specify a high-level
language to express a security policy instead of raw eBPF programs.
Moreover, thanks to the LLVM front-end, it is quite easy to write an
eBPF program with a subset of the C language.


# Frequently asked questions

## Why is seccomp-bpf not enough?

A seccomp filter can access only raw syscall arguments (i.e. the
register values) which means that it is not possible to filter according
to the value pointed to by an argument, such as a file pathname. As an
embryonic Landlock version demonstrated, filtering at the syscall level
is complicated (e.g. need to take care of race conditions). This is
mainly because the access control checkpoints of the kernel are not at
this high-level but more underneath, at the LSM-hook level. The LSM
hooks are designed to handle this kind of checks.  Landlock abstracts
this approach to leverage the ability of unprivileged users to limit
themselves.

Cf. section "What it isn't?" in Documentation/prctl/seccomp_filter.txt


## Why use the seccomp(2) syscall?

Landlock use the same semantic as seccomp to apply access rule
restrictions. It add a new layer of security for the current process
which is inherited by its children. It makes sense to use an unique
access-restricting syscall (that should be allowed by seccomp filters)
which can only drop privileges. Moreover, a Landlock rule could come
from outside a process (e.g.  passed through a UNIX socket). It is then
useful to differentiate the creation/load of Landlock eBPF programs via
bpf(2), from rule enforcement via seccomp(2).


## Why a new LSM? Are SELinux, AppArmor, Smack and Tomoyo not good
   enough?

The current access control LSMs are fine for their purpose which is to
give the *root* the ability to enforce a security policy for the
*system*. What is missing is a way to enforce a security policy for any
application by its developer and *unprivileged user* as seccomp can do
for raw syscall filtering.

Differences from other (access control) LSMs:
* not only dedicated to administrators (i.e. no_new_priv);
* limited kernel attack surface (e.g. policy parsing);
* constrained policy rules (no DoS: deterministic execution time);
* do not leak more information than the loader process can legitimately
  have access to (minimize metadata inference).


# Changes since v9

* replace subtype with expected_attach_type and a new expected_attach_triggers
  and update libbpf accordingly
* handle inode and map lifetime with LSM hooks
* use a hash map for the inode map: integrate inodemap.c into hashtab.c
* allow arbitrary value size instead of 64-bits


# Changes since v8

* fit with the new LSM stacking framework (security blobs were tested
  but are not use in this series because of the code reduction)
* remove the Landlock program chaining and the file path evaluation
  feature to get a minimum viable product and ease the review
* replace the example with a simple blacklist policy
* rebase on bpf-next


# Changes since v7

* major revamp of the file system enforcement:
  * new eBPF map dedicated to tie an inode with an arbitrary 64-bits
    value, which can be used to tag files
  * three new Landlock hooks: FS_WALK, FS_PICK and FS_GET
  * add the ability to chain Landlock programs
  * add a new eBPF map type to compare inodes
  * don't use macros anymore
* replace subtype fields:
  * triggers: fine-grained bitfiel of actions on which a Landlock
    program may be called (if it comes from a sandbox process)
  * previous: a parent chained program
* upstreamed patches:
  * commit 369130b63178 ("selftests: Enhance kselftest_harness.h to
    print which assert failed")


# Changes since v6

* upstreamed patches:
  * commit 752ba56fb130 ("bpf: Extend check_uarg_tail_zero() checks")
  * commit 0b40808a1084 ("selftests: Make test_harness.h more generally
    available") and related ones
  * commit 3bb857e47e49 ("LSM: Enable multiple calls to
    security_add_hooks() for the same LSM")
* simplify the landlock_context (remove syscall_* fields) and add three
  FS sub-events: IOCTL, LOCK, FCNTL
* minimize the number of callable BPF functions from a Landlock rule
* do not split put_seccomp_filter() with put_seccomp()
* rename Landlock version to Landlock ABI
* miscellaneous fixes
* rebase on net-next


# Changes since v5

* eBPF program subtype:
  * use a prog_subtype pointer instead of inlining it into bpf_attr
  * enable a future-proof behavior (reject unhandled data/size)
  * add tests
* use a simple rule hierarchy (similar to seccomp-bpf)
* add a ptrace scope protection
* add more tests
* add more documentation
* rename some files
* miscellaneous fixes
* rebase on net-next


# Changes since v4

* upstreamed patches:
  * commit d498f8719a09 ("bpf: Rebuild bpf.o for any dependency update")
  * commit a734fb5d6006 ("samples/bpf: Reset global variables") and
    related ones
  * commit f4874d01beba ("bpf: Use bpf_create_map() from the library")
    and related ones
  * commit d02d8986a768 ("bpf: Always test unprivileged programs")
  * commit 640eb7e7b524 ("fs: Constify path_is_under()'s arguments")
  * commit 535e7b4b5ef2 ("bpf: Use u64_to_user_ptr()")
* revamp Landlock to not expose an LSM hook interface but wrap and
  abstract them with Landlock events (currently one for all filesystem
  related operations: LANDLOCK_SUBTYPE_EVENT_FS)
* wrap all filesystem kernel objects through the same FS handle (struct
  landlock_handle_fs): struct file, struct inode, struct path and struct
  dentry
* a rule don't return an errno code but only a boolean to allow or deny
  an access request
* handle all filesystem related LSM hooks
* add some tests and a sample:
  * BPF context tests
  * Landlock sandboxing tests and sample
  * write Landlock rules in C and compile them with LLVM
* change field names of eBPF program subtype
* remove arraymap of handles for now (will be replaced with a revamped
  map)
* remove cgroup handling for now
* add user and kernel documentation
* rebase on net-next


# Changes since v3

* upstreamed patch:
  * commit 1955351da41c ("bpf: Set register type according to
    is_valid_access()")
* use abstract LSM hook arguments with custom types (e.g.
  *_LANDLOCK_ARG_FS for struct file, struct inode and struct path)
* add more LSM hooks to support full filesystem access control
* improve the sandbox example
* fix races and RCU issues:
  * eBPF program execution and eBPF helpers
  * revamp the arraymap of handles to cleanly deal with update/delete
* eBPF program subtype for Landlock:
  * remove the "origin" field
  * add an "option" field
* rebase onto Daniel Mack's patches v7 [3]
* remove merged commit 1955351da41c ("bpf: Set register type according
  to is_valid_access()")
* fix spelling mistakes
* cleanup some type and variable names
* split patches
* for now, remove cgroup delegation handling for unprivileged user
* remove extra access check for cgroup_get_from_fd()
* remove unused example code dealing with skb
* remove seccomp-bpf link:
  * no more seccomp cookie
  * for now, it is no more possible to check the current syscall
    properties


# Changes since v2

* revamp cgroup handling:
  * use Daniel Mack's patches "Add eBPF hooks for cgroups" v5
  * remove bpf_landlock_cmp_cgroup_beneath()
  * make BPF_PROG_ATTACH usable with delegated cgroups
  * add a new CGRP_NO_NEW_PRIVS flag for safe cgroups
  * handle Landlock sandboxing for cgroups hierarchy
  * allow unprivileged processes to attach Landlock eBPF program to
    cgroups
* add subtype to eBPF programs:
  * replace Landlock hook identification by custom eBPF program types
    with a dedicated subtype field
  * manage fine-grained privileged Landlock programs
  * register Landlock programs for dedicated trigger origins (e.g.
    syscall, return from seccomp filter and/or interruption)
* performance and memory optimizations: use an array to access Landlock
  hooks directly but do not duplicated it for each thread
  (seccomp-based)
* allow running Landlock programs without seccomp filter
* fix seccomp-related issues
* remove extra errno bounding check for Landlock programs
* add some examples for optional eBPF functions or context access
  (network related) according to security checks to allow more features
  for privileged programs (e.g. Checmate)


# Changes since v1

* focus on the LSM hooks, not the syscalls:
  * much more simple implementation
  * does not need audit cache tricks to avoid race conditions
  * more simple to use and more generic because using the LSM hook
    abstraction directly
  * more efficient because only checking in LSM hooks
  * architecture agnostic
* switch from cBPF to eBPF:
  * new eBPF program types dedicated to Landlock
  * custom functions used by the eBPF program
  * gain some new features (e.g. 10 registers, can load values of
    different size, LLVM translator) but only a few functions allowed
    and a dedicated map type
  * new context: LSM hook ID, cookie and LSM hook arguments
  * need to set the sysctl kernel.unprivileged_bpf_disable to 0 (default
    value) to be able to load hook filters as unprivileged users
* smaller and simpler:
  * no more checker groups but dedicated arraymap of handles
  * simpler userland structs thanks to eBPF functions
* distinctive name: Landlock


[1] https://lore.kernel.org/linux-security-module/20190625215239.11136-1-mic@digikod.net/
[2] https://lore.kernel.org/lkml/5828776A.1010104@digikod.net/
[3] https://lore.kernel.org/netdev/1477390454-12553-1-git-send-email-daniel@zonque.org/
[4] https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/
[5] https://lore.kernel.org/lkml/20190520133305.11925-1-cyphar@cyphar.com/

Regards,

Mickaël Salaün (10):
  fs,security: Add a new file access type: MAY_CHROOT
  bpf: Add expected_attach_triggers and a is_valid_triggers() verifier
  bpf,landlock: Define an eBPF program type for Landlock hooks
  seccomp,landlock: Enforce Landlock programs per process hierarchy
  landlock: Handle filesystem access control
  bpf,landlock: Add a new map type: inode
  landlock: Add ptrace restrictions
  bpf: Add a Landlock sandbox example
  bpf,landlock: Add tests for Landlock
  landlock: Add user and kernel documentation for Landlock

 Documentation/security/index.rst              |   1 +
 Documentation/security/landlock/index.rst     |  20 +
 Documentation/security/landlock/kernel.rst    |  99 +++
 Documentation/security/landlock/user.rst      | 147 ++++
 MAINTAINERS                                   |  13 +
 fs/open.c                                     |   3 +-
 include/linux/bpf.h                           |  18 +
 include/linux/bpf_types.h                     |   6 +
 include/linux/fs.h                            |   1 +
 include/linux/landlock.h                      |  38 ++
 include/linux/lsm_hooks.h                     |   1 +
 include/linux/seccomp.h                       |   5 +
 include/uapi/linux/bpf.h                      |  16 +-
 include/uapi/linux/landlock.h                 |  94 +++
 include/uapi/linux/seccomp.h                  |   1 +
 kernel/bpf/core.c                             |   2 +
 kernel/bpf/hashtab.c                          | 253 +++++++
 kernel/bpf/syscall.c                          |  41 +-
 kernel/bpf/verifier.c                         |  26 +
 kernel/fork.c                                 |   8 +-
 kernel/seccomp.c                              |   4 +
 samples/bpf/.gitignore                        |   1 +
 samples/bpf/Makefile                          |   3 +
 samples/bpf/landlock1.h                       |   8 +
 samples/bpf/landlock1_kern.c                  |  55 ++
 samples/bpf/landlock1_user.c                  | 250 +++++++
 security/Kconfig                              |   1 +
 security/Makefile                             |   2 +
 security/landlock/Kconfig                     |  18 +
 security/landlock/Makefile                    |   5 +
 security/landlock/common.h                    | 105 +++
 security/landlock/enforce.c                   | 272 ++++++++
 security/landlock/enforce.h                   |  18 +
 security/landlock/enforce_seccomp.c           |  92 +++
 security/landlock/hooks.c                     |  94 +++
 security/landlock/hooks.h                     |  31 +
 security/landlock/hooks_fs.c                  | 639 ++++++++++++++++++
 security/landlock/hooks_fs.h                  |  31 +
 security/landlock/hooks_ptrace.c              | 121 ++++
 security/landlock/hooks_ptrace.h              |   8 +
 security/landlock/init.c                      | 148 ++++
 security/security.c                           |  15 +
 tools/include/uapi/linux/bpf.h                |  16 +-
 tools/include/uapi/linux/landlock.h           | 109 +++
 tools/lib/bpf/bpf.h                           |   1 +
 tools/lib/bpf/libbpf.c                        |  44 +-
 tools/lib/bpf/libbpf.h                        |   7 +-
 tools/lib/bpf/libbpf.map                      |   2 +
 tools/lib/bpf/libbpf_probes.c                 |   2 +
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/bpf/bpf_helpers.h     |   2 +
 .../selftests/bpf/test_section_names.c        |   2 +-
 .../selftests/bpf/test_sockopt_multi.c        |   4 +-
 tools/testing/selftests/bpf/test_sockopt_sk.c |   2 +-
 tools/testing/selftests/bpf/test_verifier.c   |   1 +
 .../testing/selftests/bpf/verifier/landlock.c |  24 +
 tools/testing/selftests/landlock/.gitignore   |   4 +
 tools/testing/selftests/landlock/Makefile     |  39 ++
 tools/testing/selftests/landlock/test.h       |  50 ++
 tools/testing/selftests/landlock/test_base.c  |  24 +
 tools/testing/selftests/landlock/test_fs.c    | 256 +++++++
 .../testing/selftests/landlock/test_ptrace.c  | 148 ++++
 62 files changed, 3432 insertions(+), 20 deletions(-)
 create mode 100644 Documentation/security/landlock/index.rst
 create mode 100644 Documentation/security/landlock/kernel.rst
 create mode 100644 Documentation/security/landlock/user.rst
 create mode 100644 include/linux/landlock.h
 create mode 100644 include/uapi/linux/landlock.h
 create mode 100644 samples/bpf/landlock1.h
 create mode 100644 samples/bpf/landlock1_kern.c
 create mode 100644 samples/bpf/landlock1_user.c
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/common.h
 create mode 100644 security/landlock/enforce.c
 create mode 100644 security/landlock/enforce.h
 create mode 100644 security/landlock/enforce_seccomp.c
 create mode 100644 security/landlock/hooks.c
 create mode 100644 security/landlock/hooks.h
 create mode 100644 security/landlock/hooks_fs.c
 create mode 100644 security/landlock/hooks_fs.h
 create mode 100644 security/landlock/hooks_ptrace.c
 create mode 100644 security/landlock/hooks_ptrace.h
 create mode 100644 security/landlock/init.c
 create mode 100644 tools/include/uapi/linux/landlock.h
 create mode 100644 tools/testing/selftests/bpf/verifier/landlock.c
 create mode 100644 tools/testing/selftests/landlock/.gitignore
 create mode 100644 tools/testing/selftests/landlock/Makefile
 create mode 100644 tools/testing/selftests/landlock/test.h
 create mode 100644 tools/testing/selftests/landlock/test_base.c
 create mode 100644 tools/testing/selftests/landlock/test_fs.c
 create mode 100644 tools/testing/selftests/landlock/test_ptrace.c

-- 
2.22.0


^ permalink raw reply

* [PATCH bpf-next v10 07/10] landlock: Add ptrace restrictions
From: Mickaël Salaün @ 2019-07-21 21:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Andy Lutomirski, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	John Johansen, Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Stephen Smalley, Tejun Heo,
	Tetsuo Handa, Thomas Graf, Tycho Andersen, Will Drewry,
	kernel-hardening, linux-api, linux-fsdevel, linux-security-module,
	netdev
In-Reply-To: <20190721213116.23476-1-mic@digikod.net>

A landlocked process has less privileges than a non-landlocked process
and must then be subject to additional restrictions when manipulating
processes. To be allowed to use ptrace(2) and related syscalls on a
target process, a landlocked process must have a subset of the target
process' rules.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---

Changes since v6:
* factor out ptrace check
* constify pointers
* cleanup headers
* use the new security_add_hooks()
---
 security/landlock/Makefile       |   2 +-
 security/landlock/hooks_ptrace.c | 121 +++++++++++++++++++++++++++++++
 security/landlock/hooks_ptrace.h |   8 ++
 security/landlock/init.c         |   2 +
 4 files changed, 132 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/hooks_ptrace.c
 create mode 100644 security/landlock/hooks_ptrace.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 270ece5d93de..4500ddb0767e 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -2,4 +2,4 @@ obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := init.o \
 	enforce.o enforce_seccomp.o \
-	hooks.o hooks_fs.o
+	hooks.o hooks_fs.o hooks_ptrace.o
diff --git a/security/landlock/hooks_ptrace.c b/security/landlock/hooks_ptrace.c
new file mode 100644
index 000000000000..7f5e8b994e93
--- /dev/null
+++ b/security/landlock/hooks_ptrace.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - ptrace hooks
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ */
+
+#include <asm/current.h>
+#include <linux/errno.h>
+#include <linux/kernel.h> /* ARRAY_SIZE */
+#include <linux/lsm_hooks.h>
+#include <linux/sched.h> /* struct task_struct */
+#include <linux/seccomp.h>
+
+#include "common.h" /* struct landlock_prog_set */
+#include "hooks.h" /* landlocked() */
+#include "hooks_ptrace.h"
+
+static bool progs_are_subset(const struct landlock_prog_set *parent,
+		const struct landlock_prog_set *child)
+{
+	size_t i;
+
+	if (!parent || !child)
+		return false;
+	if (parent == child)
+		return true;
+
+	for (i = 0; i < ARRAY_SIZE(child->programs); i++) {
+		struct landlock_prog_list *walker;
+		bool found_parent = false;
+
+		if (!parent->programs[i])
+			continue;
+		for (walker = child->programs[i]; walker;
+				walker = walker->prev) {
+			if (walker == parent->programs[i]) {
+				found_parent = true;
+				break;
+			}
+		}
+		if (!found_parent)
+			return false;
+	}
+	return true;
+}
+
+static bool task_has_subset_progs(const struct task_struct *parent,
+		const struct task_struct *child)
+{
+#ifdef CONFIG_SECCOMP_FILTER
+	if (progs_are_subset(parent->seccomp.landlock_prog_set,
+				child->seccomp.landlock_prog_set))
+		/* must be ANDed with other providers (i.e. cgroup) */
+		return true;
+#endif /* CONFIG_SECCOMP_FILTER */
+	return false;
+}
+
+static int task_ptrace(const struct task_struct *parent,
+		const struct task_struct *child)
+{
+	if (!landlocked(parent))
+		return 0;
+
+	if (!landlocked(child))
+		return -EPERM;
+
+	if (task_has_subset_progs(parent, child))
+		return 0;
+
+	return -EPERM;
+}
+
+/**
+ * hook_ptrace_access_check - determine whether the current process may access
+ *			      another
+ *
+ * @child: the process to be accessed
+ * @mode: the mode of attachment
+ *
+ * If the current task has Landlock programs, then the child must have at least
+ * the same programs.  Else denied.
+ *
+ * Determine whether a process may access another, returning 0 if permission
+ * granted, -errno if denied.
+ */
+static int hook_ptrace_access_check(struct task_struct *child,
+		unsigned int mode)
+{
+	return task_ptrace(current, child);
+}
+
+/**
+ * hook_ptrace_traceme - determine whether another process may trace the
+ *			 current one
+ *
+ * @parent: the task proposed to be the tracer
+ *
+ * If the parent has Landlock programs, then the current task must have the
+ * same or more programs.
+ * Else denied.
+ *
+ * Determine whether the nominated task is permitted to trace the current
+ * process, returning 0 if permission is granted, -errno if denied.
+ */
+static int hook_ptrace_traceme(struct task_struct *parent)
+{
+	return task_ptrace(parent, current);
+}
+
+static struct security_hook_list landlock_hooks[] = {
+	LSM_HOOK_INIT(ptrace_access_check, hook_ptrace_access_check),
+	LSM_HOOK_INIT(ptrace_traceme, hook_ptrace_traceme),
+};
+
+__init void landlock_add_hooks_ptrace(void)
+{
+	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+			LANDLOCK_NAME);
+}
diff --git a/security/landlock/hooks_ptrace.h b/security/landlock/hooks_ptrace.h
new file mode 100644
index 000000000000..2c2b8a13037f
--- /dev/null
+++ b/security/landlock/hooks_ptrace.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - ptrace hooks
+ *
+ * Copyright © 2017 Mickaël Salaün <mic@digikod.net>
+ */
+
+__init void landlock_add_hooks_ptrace(void);
diff --git a/security/landlock/init.c b/security/landlock/init.c
index eec4467cb5ee..35165fc8a595 100644
--- a/security/landlock/init.c
+++ b/security/landlock/init.c
@@ -13,6 +13,7 @@
 
 #include "common.h" /* LANDLOCK_* */
 #include "hooks_fs.h"
+#include "hooks_ptrace.h"
 
 static bool bpf_landlock_is_valid_access(int off, int size,
 		enum bpf_access_type type, const struct bpf_prog *prog,
@@ -130,6 +131,7 @@ const struct bpf_prog_ops landlock_prog_ops = {};
 static int __init landlock_init(void)
 {
 	pr_info(LANDLOCK_NAME ": Initializing (sandbox with seccomp)\n");
+	landlock_add_hooks_ptrace();
 	landlock_add_hooks_fs();
 	return 0;
 }
-- 
2.22.0


^ permalink raw reply related

* [PATCH bpf-next v10 08/10] bpf: Add a Landlock sandbox example
From: Mickaël Salaün @ 2019-07-21 21:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Andy Lutomirski, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	John Johansen, Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Stephen Smalley, Tejun Heo,
	Tetsuo Handa, Thomas Graf, Tycho Andersen, Will Drewry,
	kernel-hardening, linux-api, linux-fsdevel, linux-security-module,
	netdev
In-Reply-To: <20190721213116.23476-1-mic@digikod.net>

Add a basic sandbox tool to launch a command which is denied access to a
list of files and directories.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---

Changes since v9:
* replace subtype with expected_attach_type and expected_attach_triggers
* add the ability to parse Landlock programs and triggers to libbpf
* use the new bpf_inode_map_lookup_elem()
* use read-only inode map for Landlock programs
* remove bpf_load.c modifications

Changes since v8:
* rewrite the landlock1 sample which deny access to a set of files or
  directories (i.e. simple blacklist) to fit with the previous patches
* add "landlock1" to .gitignore
* in bpf_load.c, pass the subtype with a call to
  bpf_load_program_xattr()

Changes since v7:
* rewrite the example using an inode map
* add to bpf_load the ability to handle subtypes per program type

Changes since v6:
* check return value of load_and_attach()
* allow to write on pipes
* rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE
* rename Landlock version to ABI to better reflect its purpose
* use const variable (suggested by Kees Cook)
* remove useless definitions (suggested by Kees Cook)
* add detailed explanations (suggested by Kees Cook)

Changes since v5:
* cosmetic fixes
* rebase

Changes since v4:
* write Landlock rule in C and compiled it with LLVM
* remove cgroup handling
* remove path handling: only handle a read-only environment
* remove errno return codes

Changes since v3:
* remove seccomp and origin field: completely free from seccomp programs
* handle more FS-related hooks
* handle inode hooks and directory traversal
* add faked but consistent view thanks to ENOENT
* add /lib64 in the example
* fix spelling
* rename some types and definitions (e.g. SECCOMP_ADD_LANDLOCK_RULE)

Changes since v2:
* use BPF_PROG_ATTACH for cgroup handling
---
 samples/bpf/.gitignore                        |   1 +
 samples/bpf/Makefile                          |   3 +
 samples/bpf/landlock1.h                       |   8 +
 samples/bpf/landlock1_kern.c                  |  55 ++++
 samples/bpf/landlock1_user.c                  | 250 ++++++++++++++++++
 tools/lib/bpf/libbpf.c                        |  43 ++-
 tools/lib/bpf/libbpf.h                        |   7 +-
 tools/lib/bpf/libbpf.map                      |   1 +
 tools/testing/selftests/bpf/bpf_helpers.h     |   2 +
 .../selftests/bpf/test_section_names.c        |   2 +-
 .../selftests/bpf/test_sockopt_multi.c        |   4 +-
 tools/testing/selftests/bpf/test_sockopt_sk.c |   2 +-
 12 files changed, 364 insertions(+), 14 deletions(-)
 create mode 100644 samples/bpf/landlock1.h
 create mode 100644 samples/bpf/landlock1_kern.c
 create mode 100644 samples/bpf/landlock1_user.c

diff --git a/samples/bpf/.gitignore b/samples/bpf/.gitignore
index 74d31fd3c99c..a4c9c806f739 100644
--- a/samples/bpf/.gitignore
+++ b/samples/bpf/.gitignore
@@ -2,6 +2,7 @@ cpustat
 fds_example
 hbm
 ibumad
+landlock1
 lathist
 lwt_len_hist
 map_perf_test
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index f90daadfbc89..b0309ed7c1c9 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -53,6 +53,7 @@ hostprogs-y += task_fd_query
 hostprogs-y += xdp_sample_pkts
 hostprogs-y += ibumad
 hostprogs-y += hbm
+hostprogs-y += landlock1
 
 # Libbpf dependencies
 LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -109,6 +110,7 @@ task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
 xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
 ibumad-objs := bpf_load.o ibumad_user.o $(TRACE_HELPERS)
 hbm-objs := bpf_load.o hbm.o $(CGROUP_HELPERS)
+landlock1-objs := bpf_load.o landlock1_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -170,6 +172,7 @@ always += xdp_sample_pkts_kern.o
 always += ibumad_kern.o
 always += hbm_out_kern.o
 always += hbm_edt_kern.o
+always += landlock1_kern.o
 
 KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include
 KBUILD_HOSTCFLAGS += -I$(srctree)/tools/lib/bpf/
diff --git a/samples/bpf/landlock1.h b/samples/bpf/landlock1.h
new file mode 100644
index 000000000000..53b0a9447855
--- /dev/null
+++ b/samples/bpf/landlock1.h
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Landlock sample 1 - common header
+ *
+ * Copyright © 2018-2019 Mickaël Salaün <mic@digikod.net>
+ */
+
+#define MAP_FLAG_DENY		(1ULL << 0)
diff --git a/samples/bpf/landlock1_kern.c b/samples/bpf/landlock1_kern.c
new file mode 100644
index 000000000000..d6946659f891
--- /dev/null
+++ b/samples/bpf/landlock1_kern.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Landlock sample 1 - whitelist of read only or read-write file hierarchy
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ */
+
+/*
+ * This file contains a function that will be compiled to eBPF bytecode thanks
+ * to LLVM/Clang.
+ *
+ * Each SEC() means that the following function or variable will be part of a
+ * custom ELF section. This sections are then processed by the userspace part
+ * (see landlock1_user.c) to extract eBPF bytecode and metadata.
+ */
+
+#include <uapi/linux/bpf.h>
+#include <uapi/linux/landlock.h>
+
+#include "bpf_helpers.h"
+#include "landlock1.h" /* MAP_FLAG_DENY */
+
+#define MAP_MAX_ENTRIES		20
+
+struct bpf_map_def SEC("maps") inode_map = {
+	.type = BPF_MAP_TYPE_INODE,
+	.key_size = sizeof(u32),
+	.value_size = sizeof(u64),
+	.max_entries = MAP_MAX_ENTRIES,
+	.map_flags = BPF_F_RDONLY_PROG,
+};
+
+static __always_inline __u64 get_access(void *inode)
+{
+	u64 *flags;
+
+	flags = bpf_inode_map_lookup_elem(&inode_map, inode);
+	if (flags && (*flags & MAP_FLAG_DENY))
+		return LANDLOCK_RET_DENY;
+	return LANDLOCK_RET_ALLOW;
+}
+
+SEC("landlock/fs_walk")
+int fs_walk(struct landlock_ctx_fs_walk *ctx)
+{
+	return get_access((void *)ctx->inode);
+}
+
+SEC("landlock/fs_pick")
+int fs_pick_ro(struct landlock_ctx_fs_pick *ctx)
+{
+	return get_access((void *)ctx->inode);
+}
+
+static const char SEC("license") _license[] = "GPL";
diff --git a/samples/bpf/landlock1_user.c b/samples/bpf/landlock1_user.c
new file mode 100644
index 000000000000..2082ca367f94
--- /dev/null
+++ b/samples/bpf/landlock1_user.c
@@ -0,0 +1,250 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Landlock sample 1 - deny access to a set of directories (blacklisting)
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ */
+
+#include "bpf/libbpf.h"
+#include "bpf_load.h"
+#include "landlock1.h" /* MAP_FLAG_DENY */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h> /* open() */
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <linux/landlock.h>
+#include <linux/prctl.h>
+#include <linux/seccomp.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+#ifndef seccomp
+static int seccomp(unsigned int op, unsigned int flags, void *args)
+{
+	errno = 0;
+	return syscall(__NR_seccomp, op, flags, args);
+}
+#endif
+
+static int apply_sandbox(int prog_fd)
+{
+	int ret = 0;
+
+	/* set up the test sandbox */
+	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+		perror("prctl(no_new_priv)");
+		return 1;
+	}
+	if (seccomp(SECCOMP_PREPEND_LANDLOCK_PROG, 0, &prog_fd)) {
+		perror("seccomp(set_hook)");
+		ret = 1;
+	}
+	close(prog_fd);
+
+	return ret;
+}
+
+#define ENV_FS_PATH_DENY_NAME "LL_PATH_DENY"
+#define ENV_PATH_TOKEN ":"
+
+static int parse_path(char *env_path, const char ***path_list)
+{
+	int i, path_nb = 0;
+
+	if (env_path) {
+		path_nb++;
+		for (i = 0; env_path[i]; i++) {
+			if (env_path[i] == ENV_PATH_TOKEN[0])
+				path_nb++;
+		}
+	}
+	*path_list = malloc(path_nb * sizeof(**path_list));
+	for (i = 0; i < path_nb; i++)
+		(*path_list)[i] = strsep(&env_path, ENV_PATH_TOKEN);
+
+	return path_nb;
+}
+
+static int populate_map(const char *env_var, unsigned long long value,
+		int map_fd)
+{
+	int path_nb, ref_fd, i;
+	char *env_path_name;
+	const char **path_list = NULL;
+
+	env_path_name = getenv(env_var);
+	if (!env_path_name)
+		return 0;
+	env_path_name = strdup(env_path_name);
+	path_nb = parse_path(env_path_name, &path_list);
+
+	for (i = 0; i < path_nb; i++) {
+		ref_fd = open(path_list[i], O_RDONLY | O_CLOEXEC);
+		if (ref_fd < 0) {
+			fprintf(stderr, "Failed to open \"%s\": %s\n",
+					path_list[i],
+					strerror(errno));
+			return 1;
+		}
+		if (bpf_map_update_elem(map_fd, &ref_fd, &value, BPF_ANY)) {
+			fprintf(stderr, "Failed to update the map with"
+					" \"%s\": %s\n", path_list[i],
+					strerror(errno));
+			return 1;
+		}
+		close(ref_fd);
+	}
+	free(env_path_name);
+	return 0;
+}
+
+/* need to call bpf_object__close(obj) once every FD is used */
+static int ll_load_file(const char *filename, struct bpf_object **obj,
+		int *ll_map, int *ll_prog_walk, int *ll_prog_pick)
+{
+	int first_bpf_prog, map_fd, prog_walk_fd, prog_pick_fd, err;
+	struct bpf_map *map;
+	struct bpf_program *prog;
+	struct bpf_object *tmp_obj;
+	struct bpf_prog_load_attr prog_load_attr = {
+		.prog_type = BPF_PROG_TYPE_UNSPEC,
+		.file = filename,
+	};
+
+	/*
+	 * allowed:
+	 * - LANDLOCK_TRIGGER_FS_PICK_LINK
+	 * - LANDLOCK_TRIGGER_FS_PICK_LINKTO
+	 * - LANDLOCK_TRIGGER_FS_PICK_RECEIVE
+	 * - LANDLOCK_TRIGGER_FS_PICK_MOUNTON
+	 */
+	prog_load_attr.expected_attach_triggers =
+		LANDLOCK_TRIGGER_FS_PICK_APPEND |
+		LANDLOCK_TRIGGER_FS_PICK_CHDIR |
+		LANDLOCK_TRIGGER_FS_PICK_CHROOT |
+		LANDLOCK_TRIGGER_FS_PICK_CREATE |
+		LANDLOCK_TRIGGER_FS_PICK_EXECUTE |
+		LANDLOCK_TRIGGER_FS_PICK_FCNTL |
+		LANDLOCK_TRIGGER_FS_PICK_GETATTR |
+		LANDLOCK_TRIGGER_FS_PICK_IOCTL |
+		LANDLOCK_TRIGGER_FS_PICK_LOCK |
+		LANDLOCK_TRIGGER_FS_PICK_MAP |
+		LANDLOCK_TRIGGER_FS_PICK_OPEN |
+		LANDLOCK_TRIGGER_FS_PICK_READ |
+		LANDLOCK_TRIGGER_FS_PICK_READDIR |
+		LANDLOCK_TRIGGER_FS_PICK_RENAME |
+		LANDLOCK_TRIGGER_FS_PICK_RENAMETO |
+		LANDLOCK_TRIGGER_FS_PICK_RMDIR |
+		LANDLOCK_TRIGGER_FS_PICK_SETATTR |
+		LANDLOCK_TRIGGER_FS_PICK_TRANSFER |
+		LANDLOCK_TRIGGER_FS_PICK_UNLINK |
+		LANDLOCK_TRIGGER_FS_PICK_WRITE;
+
+	if (access(filename, O_RDONLY) < 0) {
+		printf("Failed to access file %s: %s\n", filename,
+				strerror(errno));
+		return 1;
+	}
+	err = bpf_prog_load_xattr(&prog_load_attr, &tmp_obj, &first_bpf_prog);
+	if (err) {
+		printf("Failed to parse file %s: %s\n", filename, strerror(err));
+		goto error_load;
+	}
+
+	map = bpf_object__find_map_by_name(tmp_obj, "inode_map");
+	map_fd = bpf_map__fd(map);
+	if (map_fd < 0) {
+		printf("Map not found: %s\n", strerror(map_fd));
+		goto put_obj;
+	}
+
+	prog = bpf_object__find_program_by_title(tmp_obj, "landlock/fs_walk");
+	if (!prog) {
+		printf("Program for FS_WALK not found in file %s\n", filename);
+		goto put_obj;
+	}
+	prog_walk_fd = bpf_program__fd(prog);
+	if (prog_walk_fd < 0) {
+		printf("Failed to load the FS_WALK program from file %s\n",
+				strerror(prog_walk_fd));
+		goto put_obj;
+	}
+
+	prog = bpf_object__find_program_by_title(tmp_obj, "landlock/fs_pick");
+	if (!prog) {
+		printf("Failed to get a file descriptor for program %s from file %s\n",
+				bpf_program__title(prog, false), filename);
+		goto put_obj;
+	}
+	prog_pick_fd = bpf_program__fd(prog);
+	if (prog_pick_fd < 0) {
+		printf("Failed to get a file descriptor for program %s from file %s\n",
+				bpf_program__title(prog, false), filename);
+		goto put_obj;
+	}
+
+	*obj = tmp_obj;
+	*ll_prog_walk = prog_walk_fd;
+	*ll_prog_pick = prog_pick_fd;
+	*ll_map = map_fd;
+	return 0;
+
+put_obj:
+	/* All FDs are closed with bpf_object__close() */
+	bpf_object__close(tmp_obj);
+error_load:
+	printf("ERROR: load_bpf_file failed for: %s\n", filename);
+	printf("  Output from verifier:\n%s\n------\n", bpf_log_buf);
+	return 1;
+}
+
+int main(int argc, char * const argv[], char * const *envp)
+{
+	char filename[256];
+	char *cmd_path;
+	char * const *cmd_argv;
+	struct bpf_object *obj;
+	int ll_map, ll_prog_walk, ll_prog_pick;
+
+	if (argc < 2) {
+		fprintf(stderr, "usage: %s <cmd> [args]...\n\n", argv[0]);
+		fprintf(stderr, "Launch a command in a restricted environment.\n\n");
+		fprintf(stderr, "Environment variables containing paths, each separated by a colon:\n");
+		fprintf(stderr, "* %s: list of files and directories which are denied\n",
+				ENV_FS_PATH_DENY_NAME);
+		fprintf(stderr, "\nexample:\n"
+				"%s=\"${HOME}/.ssh:${HOME}/Images\" "
+				"%s /bin/sh -i\n",
+				ENV_FS_PATH_DENY_NAME, argv[0]);
+		return 1;
+	}
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+	if (ll_load_file(filename, &obj, &ll_map, &ll_prog_walk, &ll_prog_pick))
+		return 1;
+
+	if (populate_map(ENV_FS_PATH_DENY_NAME, MAP_FLAG_DENY, ll_map))
+		return 1;
+	//close(ll_map);
+
+	fprintf(stderr, "Launching a new sandboxed process\n");
+	if (apply_sandbox(ll_prog_walk))
+		return 1;
+	//close(ll_prog_walk);
+	if (apply_sandbox(ll_prog_pick))
+		return 1;
+	//close(ll_prog_pick);
+	//bpf_object__close(obj);
+	cmd_path = argv[1];
+	cmd_argv = argv + 1;
+	execve(cmd_path, cmd_argv, envp);
+	perror("Failed to call execve");
+	return 1;
+}
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index ab3b8b510b8a..f043e97bca0c 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -181,6 +181,7 @@ struct bpf_program {
 	bpf_program_clear_priv_t clear_priv;
 
 	enum bpf_attach_type expected_attach_type;
+	__u64 expected_attach_triggers;
 	int btf_fd;
 	void *func_info;
 	__u32 func_info_rec_size;
@@ -2459,6 +2460,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
 	load_attr.prog_type = prog->type;
 	load_attr.expected_attach_type = prog->expected_attach_type;
+	load_attr.expected_attach_triggers = prog->expected_attach_triggers;
 	if (prog->caps->name)
 		load_attr.name = prog->name;
 	load_attr.insns = insns;
@@ -3540,19 +3542,29 @@ void bpf_program__set_expected_attach_type(struct bpf_program *prog,
 	prog->expected_attach_type = type;
 }
 
-#define BPF_PROG_SEC_IMPL(string, ptype, eatype, is_attachable, atype) \
-	{ string, sizeof(string) - 1, ptype, eatype, is_attachable, atype }
+void bpf_program__set_expected_attach_triggers(struct bpf_program *prog,
+					       __u64 triggers)
+{
+	prog->expected_attach_triggers = triggers;
+}
+
+#define BPF_PROG_SEC_IMPL(string, ptype, eatype, is_attachable, atype, has_triggers) \
+	{ string, sizeof(string) - 1, ptype, eatype, is_attachable, atype, has_triggers }
 
 /* Programs that can NOT be attached. */
-#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, 0, 0)
+#define BPF_PROG_SEC(string, ptype) BPF_PROG_SEC_IMPL(string, ptype, 0, 0, 0, false)
 
 /* Programs that can be attached. */
 #define BPF_APROG_SEC(string, ptype, atype) \
-	BPF_PROG_SEC_IMPL(string, ptype, 0, 1, atype)
+	BPF_PROG_SEC_IMPL(string, ptype, 0, 1, atype, false)
 
 /* Programs that must specify expected attach type at load time. */
 #define BPF_EAPROG_SEC(string, ptype, eatype) \
-	BPF_PROG_SEC_IMPL(string, ptype, eatype, 1, eatype)
+	BPF_PROG_SEC_IMPL(string, ptype, eatype, 1, eatype, false)
+
+/* Programs that must specify expected attach type at load time and has triggers. */
+#define BPF_TEAPROG_SEC(string, ptype, eatype) \
+	BPF_PROG_SEC_IMPL(string, ptype, eatype, 1, eatype, true)
 
 /* Programs that can be attached but attach type can't be identified by section
  * name. Kept for backward compatibility.
@@ -3566,6 +3578,7 @@ static const struct {
 	enum bpf_attach_type expected_attach_type;
 	int is_attachable;
 	enum bpf_attach_type attach_type;
+	bool has_triggers;
 } section_names[] = {
 	BPF_PROG_SEC("socket",			BPF_PROG_TYPE_SOCKET_FILTER),
 	BPF_PROG_SEC("kprobe/",			BPF_PROG_TYPE_KPROBE),
@@ -3628,6 +3641,10 @@ static const struct {
 						BPF_CGROUP_GETSOCKOPT),
 	BPF_EAPROG_SEC("cgroup/setsockopt",	BPF_PROG_TYPE_CGROUP_SOCKOPT,
 						BPF_CGROUP_SETSOCKOPT),
+	BPF_EAPROG_SEC("landlock/fs_walk",	BPF_PROG_TYPE_LANDLOCK_HOOK,
+						BPF_LANDLOCK_FS_WALK),
+	BPF_TEAPROG_SEC("landlock/fs_pick",	BPF_PROG_TYPE_LANDLOCK_HOOK,
+						BPF_LANDLOCK_FS_PICK),
 };
 
 #undef BPF_PROG_SEC_IMPL
@@ -3665,7 +3682,8 @@ static char *libbpf_get_type_names(bool attach_type)
 }
 
 int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
-			     enum bpf_attach_type *expected_attach_type)
+			     enum bpf_attach_type *expected_attach_type,
+			     bool *has_triggers)
 {
 	char *type_names;
 	int i;
@@ -3678,6 +3696,7 @@ int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
 			continue;
 		*prog_type = section_names[i].prog_type;
 		*expected_attach_type = section_names[i].expected_attach_type;
+		*has_triggers = section_names[i].has_triggers;
 		return 0;
 	}
 	pr_warning("failed to guess program type based on ELF section name '%s'\n", name);
@@ -3720,10 +3739,11 @@ int libbpf_attach_type_by_name(const char *name,
 static int
 bpf_program__identify_section(struct bpf_program *prog,
 			      enum bpf_prog_type *prog_type,
-			      enum bpf_attach_type *expected_attach_type)
+			      enum bpf_attach_type *expected_attach_type,
+			      bool *has_triggers)
 {
 	return libbpf_prog_type_by_name(prog->section_name, prog_type,
-					expected_attach_type);
+					expected_attach_type, has_triggers);
 }
 
 int bpf_map__fd(const struct bpf_map *map)
@@ -3898,6 +3918,7 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 	struct bpf_object *obj;
 	struct bpf_map *map;
 	int err;
+	bool has_triggers = false;
 
 	if (!attr)
 		return -EINVAL;
@@ -3921,7 +3942,8 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 		expected_attach_type = attr->expected_attach_type;
 		if (prog_type == BPF_PROG_TYPE_UNSPEC) {
 			err = bpf_program__identify_section(prog, &prog_type,
-							    &expected_attach_type);
+							    &expected_attach_type,
+							    &has_triggers);
 			if (err < 0) {
 				bpf_object__close(obj);
 				return -EINVAL;
@@ -3931,6 +3953,9 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 		bpf_program__set_type(prog, prog_type);
 		bpf_program__set_expected_attach_type(prog,
 						      expected_attach_type);
+		if (has_triggers)
+			bpf_program__set_expected_attach_triggers(prog,
+					attr->expected_attach_triggers);
 
 		prog->log_level = attr->log_level;
 		prog->prog_flags = attr->prog_flags;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 5cbf459ece0b..07e153cebd5d 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -123,7 +123,8 @@ LIBBPF_API void *bpf_object__priv(const struct bpf_object *prog);
 
 LIBBPF_API int
 libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
-			 enum bpf_attach_type *expected_attach_type);
+			 enum bpf_attach_type *expected_attach_type,
+			 bool *has_triggers);
 LIBBPF_API int libbpf_attach_type_by_name(const char *name,
 					  enum bpf_attach_type *attach_type);
 
@@ -266,6 +267,9 @@ LIBBPF_API void bpf_program__set_type(struct bpf_program *prog,
 LIBBPF_API void
 bpf_program__set_expected_attach_type(struct bpf_program *prog,
 				      enum bpf_attach_type type);
+LIBBPF_API void
+bpf_program__set_expected_attach_triggers(struct bpf_program *prog,
+					  __u64 triggers);
 
 LIBBPF_API bool bpf_program__is_socket_filter(const struct bpf_program *prog);
 LIBBPF_API bool bpf_program__is_tracepoint(const struct bpf_program *prog);
@@ -345,6 +349,7 @@ struct bpf_prog_load_attr {
 	const char *file;
 	enum bpf_prog_type prog_type;
 	enum bpf_attach_type expected_attach_type;
+	__u64 expected_attach_triggers;
 	int ifindex;
 	int log_level;
 	int prog_flags;
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 36ac26bdfda0..4eb930bfc1d8 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -83,6 +83,7 @@ LIBBPF_0.0.1 {
 		bpf_program__prev;
 		bpf_program__priv;
 		bpf_program__set_expected_attach_type;
+		bpf_program__set_expected_attach_triggers;
 		bpf_program__set_ifindex;
 		bpf_program__set_kprobe;
 		bpf_program__set_perf_event;
diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 5a3d92c8bec8..db2a84a88f5c 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -228,6 +228,8 @@ static void *(*bpf_sk_storage_get)(void *map, struct bpf_sock *sk,
 static int (*bpf_sk_storage_delete)(void *map, struct bpf_sock *sk) =
 	(void *)BPF_FUNC_sk_storage_delete;
 static int (*bpf_send_signal)(unsigned sig) = (void *)BPF_FUNC_send_signal;
+static void *(*bpf_inode_map_lookup_elem)(void *map, const void *key) =
+	(void *) BPF_FUNC_inode_map_lookup_elem;
 
 /* llvm builtin functions that eBPF C program may use to
  * emit BPF_LD_ABS and BPF_LD_IND instructions
diff --git a/tools/testing/selftests/bpf/test_section_names.c b/tools/testing/selftests/bpf/test_section_names.c
index 29833aeaf0de..2d08df9156bd 100644
--- a/tools/testing/selftests/bpf/test_section_names.c
+++ b/tools/testing/selftests/bpf/test_section_names.c
@@ -153,7 +153,7 @@ static int test_prog_type_by_name(const struct sec_name_test *test)
 	int rc;
 
 	rc = libbpf_prog_type_by_name(test->sec_name, &prog_type,
-				      &expected_attach_type);
+				      &expected_attach_type, false);
 
 	if (rc != test->expected_load.rc) {
 		warnx("prog: unexpected rc=%d for %s", rc, test->sec_name);
diff --git a/tools/testing/selftests/bpf/test_sockopt_multi.c b/tools/testing/selftests/bpf/test_sockopt_multi.c
index 4be3441db867..e499c91f2953 100644
--- a/tools/testing/selftests/bpf/test_sockopt_multi.c
+++ b/tools/testing/selftests/bpf/test_sockopt_multi.c
@@ -23,7 +23,7 @@ static int prog_attach(struct bpf_object *obj, int cgroup_fd, const char *title)
 	struct bpf_program *prog;
 	int err;
 
-	err = libbpf_prog_type_by_name(title, &prog_type, &attach_type);
+	err = libbpf_prog_type_by_name(title, &prog_type, &attach_type, false);
 	if (err) {
 		log_err("Failed to deduct types for %s BPF program", title);
 		return -1;
@@ -52,7 +52,7 @@ static int prog_detach(struct bpf_object *obj, int cgroup_fd, const char *title)
 	struct bpf_program *prog;
 	int err;
 
-	err = libbpf_prog_type_by_name(title, &prog_type, &attach_type);
+	err = libbpf_prog_type_by_name(title, &prog_type, &attach_type, false);
 	if (err)
 		return -1;
 
diff --git a/tools/testing/selftests/bpf/test_sockopt_sk.c b/tools/testing/selftests/bpf/test_sockopt_sk.c
index 036b652e5ca9..2d1ff616b139 100644
--- a/tools/testing/selftests/bpf/test_sockopt_sk.c
+++ b/tools/testing/selftests/bpf/test_sockopt_sk.c
@@ -129,7 +129,7 @@ static int prog_attach(struct bpf_object *obj, int cgroup_fd, const char *title)
 	struct bpf_program *prog;
 	int err;
 
-	err = libbpf_prog_type_by_name(title, &prog_type, &attach_type);
+	err = libbpf_prog_type_by_name(title, &prog_type, &attach_type, false);
 	if (err) {
 		log_err("Failed to deduct types for %s BPF program", title);
 		return -1;
-- 
2.22.0


^ permalink raw reply related

* [PATCH bpf-next v10 10/10] landlock: Add user and kernel documentation for Landlock
From: Mickaël Salaün @ 2019-07-21 21:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Andy Lutomirski, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	John Johansen, Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Stephen Smalley, Tejun Heo,
	Tetsuo Handa, Thomas Graf, Tycho Andersen, Will Drewry,
	kernel-hardening, linux-api, linux-fsdevel, linux-security-module,
	netdev
In-Reply-To: <20190721213116.23476-1-mic@digikod.net>

This documentation can be built with the Sphinx framework.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---

Changes since v9:
* update with expected attach type and expected attach triggers

Changes since v8:
* remove documentation related to chaining and tagging according to this
  patch series

Changes since v7:
* update documentation according to the Landlock revamp

Changes since v6:
* add a check for ctx->event
* rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE
* rename Landlock version to ABI to better reflect its purpose and add a
  dedicated changelog section
* update tables
* relax no_new_privs recommendations
* remove ABILITY_WRITE related functions
* reword rule "appending" to "prepending" and explain it
* cosmetic fixes

Changes since v5:
* update the rule hierarchy inheritance explanation
* briefly explain ctx->arg2
* add ptrace restrictions
* explain EPERM
* update example (subtype)
* use ":manpage:"
---
 Documentation/security/index.rst           |   1 +
 Documentation/security/landlock/index.rst  |  20 +++
 Documentation/security/landlock/kernel.rst |  99 ++++++++++++++
 Documentation/security/landlock/user.rst   | 147 +++++++++++++++++++++
 4 files changed, 267 insertions(+)
 create mode 100644 Documentation/security/landlock/index.rst
 create mode 100644 Documentation/security/landlock/kernel.rst
 create mode 100644 Documentation/security/landlock/user.rst

diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index aad6d92ffe31..32b4c1db2325 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -12,3 +12,4 @@ Security Documentation
    SCTP
    self-protection
    tpm/index
+   landlock/index
diff --git a/Documentation/security/landlock/index.rst b/Documentation/security/landlock/index.rst
new file mode 100644
index 000000000000..d0af868d1582
--- /dev/null
+++ b/Documentation/security/landlock/index.rst
@@ -0,0 +1,20 @@
+=========================================
+Landlock LSM: programmatic access control
+=========================================
+
+Landlock is a stackable Linux Security Module (LSM) that makes it possible to
+create security sandboxes, programmable access-controls or safe endpoint
+security agents.  This kind of sandbox is expected to help mitigate the
+security impact of bugs or unexpected/malicious behaviors in user-space
+applications.  The current version allows only a process with the global
+CAP_SYS_ADMIN capability to create such sandboxes but the ultimate goal of
+Landlock is to empower any process, including unprivileged ones, to securely
+restrict themselves.  Landlock is inspired by seccomp-bpf but instead of
+filtering syscalls and their raw arguments, a Landlock rule can inspect the use
+of kernel objects like files and hence make a decision according to the kernel
+semantic.
+
+.. toctree::
+
+    user
+    kernel
diff --git a/Documentation/security/landlock/kernel.rst b/Documentation/security/landlock/kernel.rst
new file mode 100644
index 000000000000..7d1e06d544bf
--- /dev/null
+++ b/Documentation/security/landlock/kernel.rst
@@ -0,0 +1,99 @@
+==============================
+Landlock: kernel documentation
+==============================
+
+eBPF properties
+===============
+
+To get an expressive language while still being safe and small, Landlock is
+based on eBPF. Landlock should be usable by untrusted processes and must
+therefore expose a minimal attack surface. The eBPF bytecode is minimal,
+powerful, widely used and designed to be used by untrusted applications. Thus,
+reusing the eBPF support in the kernel enables a generic approach while
+minimizing new code.
+
+An eBPF program has access to an eBPF context containing some fields used to
+inspect the current object. These arguments can be used directly (e.g. cookie)
+or passed to helper functions according to their types (e.g. inode pointer). It
+is then possible to do complex access checks without race conditions or
+inconsistent evaluation (i.e.  `incorrect mirroring of the OS code and state
+<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).
+
+A Landlock hook describes a particular access type.  For now, there is two
+hooks dedicated to filesystem related operations: LANDLOCK_HOOK_FS_PICK and
+LANDLOCK_HOOK_FS_WALK.  A Landlock program is tied to one hook.  This makes it
+possible to statically check context accesses, potentially performed by such
+program, and hence prevents kernel address leaks and ensure the right use of
+hook arguments with eBPF functions.  Any user can add multiple Landlock
+programs per Landlock hook.  They are stacked and evaluated one after the
+other, starting from the most recent program, as seccomp-bpf does with its
+filters.  Underneath, a hook is an abstraction over a set of LSM hooks.
+
+
+Guiding principles
+==================
+
+Unprivileged use
+----------------
+
+* Landlock helpers and context should be usable by any unprivileged and
+  untrusted program while following the system security policy enforced by
+  other access control mechanisms (e.g. DAC, LSM).
+
+
+Landlock hook and context
+-------------------------
+
+* A Landlock hook shall be focused on access control on kernel objects instead
+  of syscall filtering (i.e. syscall arguments), which is the purpose of
+  seccomp-bpf.
+* A Landlock context provided by a hook shall express the minimal and more
+  generic interface to control an access for a kernel object.
+* A hook shall guaranty that all the BPF function calls from a program are
+  safe.  Thus, the related Landlock context arguments shall always be of the
+  same type for a particular hook.  For example, a network hook could share
+  helpers with a file hook because of UNIX socket.  However, the same helpers
+  may not be compatible for a file system handle and a net handle.
+* Multiple hooks may use the same context interface.
+
+
+Landlock helpers
+----------------
+
+* Landlock helpers shall be as generic as possible while at the same time being
+  as simple as possible and following the syscall creation principles (cf.
+  *Documentation/adding-syscalls.txt*).
+* The only behavior change allowed on a helper is to fix a (logical) bug to
+  match the initial semantic.
+* Helpers shall be reentrant, i.e. only take inputs from arguments (e.g. from
+  the BPF context), to enable a hook to use a cache.  Future program options
+  might change this cache behavior.
+* It is quite easy to add new helpers to extend Landlock.  The main concern
+  should be about the possibility to leak information from the kernel that may
+  not be accessible otherwise (i.e. side-channel attack).
+
+
+Questions and answers
+=====================
+
+Why not create a custom hook for each kind of action?
+-----------------------------------------------------
+
+Landlock programs can handle these checks.  Adding more exceptions to the
+kernel code would lead to more code complexity.  A decision to ignore a kind of
+action can and should be done at the beginning of a Landlock program.
+
+
+Why a program does not return an errno or a kill code?
+------------------------------------------------------
+
+seccomp filters can return multiple kind of code, including an errno value or a
+kill signal, which may be convenient for access control.  Those return codes
+are hardwired in the userland ABI.  Instead, Landlock's approach is to return a
+boolean to allow or deny an action, which is much simpler and more generic.
+Moreover, we do not really have a choice because, unlike to seccomp, Landlock
+programs are not enforced at the syscall entry point but may be executed at any
+point in the kernel (through LSM hooks) where an errno return code may not make
+sense.  However, with this simple ABI and with the ability to call helpers,
+Landlock may gain features similar to seccomp-bpf in the future while being
+compatible with previous programs.
diff --git a/Documentation/security/landlock/user.rst b/Documentation/security/landlock/user.rst
new file mode 100644
index 000000000000..14c4f3b377bd
--- /dev/null
+++ b/Documentation/security/landlock/user.rst
@@ -0,0 +1,147 @@
+================================
+Landlock: userland documentation
+================================
+
+Landlock programs
+=================
+
+eBPF programs are used to create security programs.  They are contained and can
+call only a whitelist of dedicated functions. Moreover, they can only loop
+under strict conditions, which protects from denial of service.  More
+information on BPF can be found in *Documentation/networking/filter.txt*.
+
+
+Writing a program
+-----------------
+
+To enforce a security policy, a thread first needs to create a Landlock program.
+The easiest way to write an eBPF program depicting a security program is to write
+it in the C language.  As described in *samples/bpf/README.rst*, LLVM can
+compile such programs.  Files *samples/bpf/landlock1_kern.c* and those in
+*tools/testing/selftests/landlock/* can be used as examples.
+
+Once the eBPF program is created, the next step is to create the metadata
+describing the Landlock program.  This metadata includes an expected attach type which
+contains the hook type to which the program is tied, and expected attach
+triggers which identify the actions for which the program should be run.
+
+A hook is a policy decision point which exposes the same context type for
+each program evaluation.
+
+A Landlock hook describes the kind of kernel object for which a program will be
+triggered to allow or deny an action.  For example, the hook
+BPF_LANDLOCK_FS_PICK can be triggered every time a landlocked thread performs a
+set of action related to the filesystem (e.g. open, read, write, mount...).
+This actions are identified by the `triggers` bitfield.
+
+The next step is to fill a :c:type:`struct bpf_load_program_attr
+<bpf_load_program_attr>` with BPF_PROG_TYPE_LANDLOCK_HOOK, the expected attach
+type and other BPF program metadata.  This bpf_attr must then be passed to the
+:manpage:`bpf(2)` syscall alongside the BPF_PROG_LOAD command.  If everything
+is deemed correct by the kernel, the thread gets a file descriptor referring to
+this program.
+
+In the following code, the *insn* variable is an array of BPF instructions
+which can be extracted from an ELF file as is done in bpf_load_file() from
+*samples/bpf/bpf_load.c*.
+
+.. code-block:: c
+
+    int prog_fd;
+    struct bpf_load_program_attr load_attr;
+
+    memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
+    load_attr.prog_type = BPF_PROG_TYPE_LANDLOCK_HOOK;
+    load_attr.expected_attach_type = BPF_LANDLOCK_FS_PICK;
+    load_attr.expected_attach_triggers = LANDLOCK_TRIGGER_FS_PICK_OPEN;
+    load_attr.insns = insns;
+    load_attr.insns_cnt = sizeof(insn) / sizeof(struct bpf_insn);
+    load_attr.license = "GPL";
+
+    prog_fd = bpf_load_program_xattr(&load_attr, log_buf, log_buf_sz);
+    if (prog_fd == -1)
+        exit(1);
+
+
+Enforcing a program
+-------------------
+
+Once the Landlock program has been created or received (e.g. through a UNIX
+socket), the thread willing to sandbox itself (and its future children) should
+perform the following two steps.
+
+The thread should first request to never be allowed to get new privileges with a
+call to :manpage:`prctl(2)` and the PR_SET_NO_NEW_PRIVS option.  More
+information can be found in *Documentation/prctl/no_new_privs.txt*.
+
+.. code-block:: c
+
+    if (prctl(PR_SET_NO_NEW_PRIVS, 1, NULL, 0, 0))
+        exit(1);
+
+A thread can apply a program to itself by using the :manpage:`seccomp(2)` syscall.
+The operation is SECCOMP_PREPEND_LANDLOCK_PROG, the flags must be empty and the
+*args* argument must point to a valid Landlock program file descriptor.
+
+.. code-block:: c
+
+    if (seccomp(SECCOMP_PREPEND_LANDLOCK_PROG, 0, &fd))
+        exit(1);
+
+If the syscall succeeds, the program is now enforced on the calling thread and
+will be enforced on all its subsequently created children of the thread as
+well.  Once a thread is landlocked, there is no way to remove this security
+policy, only stacking more restrictions is allowed.  The program evaluation is
+performed from the newest to the oldest.
+
+When a syscall ask for an action on a kernel object, if this action is denied,
+then an EACCES errno code is returned through the syscall.
+
+
+.. _inherited_programs:
+
+Inherited programs
+------------------
+
+Every new thread resulting from a :manpage:`clone(2)` inherits Landlock program
+restrictions from its parent.  This is similar to the seccomp inheritance as
+described in *Documentation/prctl/seccomp_filter.txt*.
+
+
+Ptrace restrictions
+-------------------
+
+A landlocked process has less privileges than a non-landlocked process and must
+then be subject to additional restrictions when manipulating another process.
+To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
+process, a landlocked process must have a subset of the target process programs.
+
+
+Landlock structures and constants
+=================================
+
+Hook types
+----------
+
+.. kernel-doc:: include/uapi/linux/landlock.h
+    :functions: landlock_hook_type
+
+
+Contexts
+--------
+
+.. kernel-doc:: include/uapi/linux/landlock.h
+    :functions: landlock_ctx_fs_pick landlock_ctx_fs_walk landlock_ctx_fs_get
+
+
+Triggers for fs_pick
+--------------------
+
+.. kernel-doc:: include/uapi/linux/landlock.h
+    :functions: landlock_triggers
+
+
+Additional documentation
+========================
+
+See https://landlock.io
-- 
2.22.0


^ permalink raw reply related

* [PATCH bpf-next v10 01/10] fs,security: Add a new file access type: MAY_CHROOT
From: Mickaël Salaün @ 2019-07-21 21:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Andy Lutomirski, Arnaldo Carvalho de Melo,
	Casey Schaufler, Daniel Borkmann, David Drysdale,
	David S . Miller, Eric W . Biederman, James Morris, Jann Horn,
	John Johansen, Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Shuah Khan, Stephen Smalley, Tejun Heo,
	Tetsuo Handa, Thomas Graf, Tycho Andersen, Will Drewry,
	kernel-hardening, linux-api, linux-fsdevel, linux-security-module,
	netdev
In-Reply-To: <20190721213116.23476-1-mic@digikod.net>

For compatibility reason, MAY_CHROOT is always set with MAY_CHDIR.
However, this new flag enable to differentiate a chdir form a chroot.

This is needed for the Landlock LSM to be able to evaluate a new root
directory.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/open.c          | 3 ++-
 include/linux/fs.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/open.c b/fs/open.c
index b5b80469b93d..e8767318fd03 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -494,7 +494,8 @@ int ksys_chroot(const char __user *filename)
 	if (error)
 		goto out;
 
-	error = inode_permission(path.dentry->d_inode, MAY_EXEC | MAY_CHDIR);
+	error = inode_permission(path.dentry->d_inode, MAY_EXEC | MAY_CHDIR |
+			MAY_CHROOT);
 	if (error)
 		goto dput_and_out;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 75f2ed289a3f..7a0d92b1da85 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -99,6 +99,7 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 #define MAY_CHDIR		0x00000040
 /* called from RCU mode, don't block */
 #define MAY_NOT_BLOCK		0x00000080
+#define MAY_CHROOT		0x00000100
 
 /*
  * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond
-- 
2.22.0


^ permalink raw reply related

* Re: [PATCH v5 15/23] LSM: Specify which LSM to display
From: John Johansen @ 2019-07-19 23:37 UTC (permalink / raw)
  To: Stephen Smalley, Casey Schaufler, casey.schaufler, jmorris,
	linux-security-module, selinux
  Cc: keescook, penguin-kernel, paul, linux-audit@redhat.com
In-Reply-To: <c592f686-e026-642e-b8b3-bca08d0a0f05@tycho.nsa.gov>

On 7/9/19 2:34 PM, Stephen Smalley wrote:
> On 7/9/19 5:18 PM, Casey Schaufler wrote:
>> On 7/9/2019 11:12 AM, Stephen Smalley wrote:
>>> On 7/9/19 1:51 PM, Casey Schaufler wrote:
>>>> On 7/9/2019 10:13 AM, Stephen Smalley wrote:
>>>>> On 7/3/19 5:25 PM, Casey Schaufler wrote:
>>>>>> Create a new entry "display" in /proc/.../attr for controlling
>>>>>> which LSM security information is displayed for a process.
>>>>>> The name of an active LSM that supplies hooks for human readable
>>>>>> data may be written to "display" to set the value. The name of
>>>>>> the LSM currently in use can be read from "display".
>>>>>> At this point there can only be one LSM capable of display
>>>>>> active. A helper function lsm_task_display() to get the display
>>>>>> slot for a task_struct.
>>>>>
>>>>> As I explained previously, this is a security hole waiting to happen. It still permits a process to affect the output of audit, alter the result of reading or writing /proc/self/attr nodes even by setuid/setgid/file-caps/context-changing programs, alter the contexts generated in netlink messages delivered to other processes (I think?), and possibly other effects beyond affecting the process' own view of things.
>>>>
>>>> I would very much like some feedback regarding which of the
>>>> possible formats for putting multiple subject contexts in
>>>> audit records would be preferred:
>>>>
>>>>      lsm=selinux,subj=xyzzy_t lsm=smack,subj=Xyzzy
>>>>      lsm=selinux,smack subj=xyzzy_t,Xyzzy
>>>>      subj="selinux='xyzzy_t',smack='Xyzzy'"
>>>
>>> (cc'd linux-audit mailing list)
>>>
>>>>
>>>> Or something else. Free bikeshedding!
>>>>
>>>> I don't see how you have a problem with netlink. My look
>>>> at what's in the kernel didn't expose anything, but I am
>>>> willing to be educated.
>>>
>>> I haven't traced through it in detail, but it wasn't clear to me that the security_secid_to_secctx() call always occurs in the context of the receiving process (and hence use its display value).  If not, then the display of the sender can affect what is reported to the receiver; hence, there is a forgery concern similar to the binder issue.  It would be cleaner if we didn't alter the default behavior of security_secid_to_secctx() and security_secctx_to_secid() and instead introduced new hooks for any case where we truly want the display to take effect.
>>
>> If the context is generated by security_secid_to_secctx() we
>> retain the slot number of the module that created it in lsmcontext.
>> We have to to ensure it is released correctly. If the potential
>> issue you're describing for netlink does in fact occur, we can check
>> the slot in lsmcontext to verify that it is the same.
>>
>> security_secid_to_secctx() is called nowhere in net/netlink,
>> at least not that grep finds. Where are you seeing this potential
>> problem?
> 
> Look under net/netfilter.
> 
>>
>>>
>>>>
>>>>> Before:
>>>>> $ id
>>>>> uid=1002(sds2) gid=1002(sds2) groups=1002(sds2) context=staff_u:staff_r:staff_t:s0-s0:c0.c1023
>>>>> $ su
>>>>> Password:
>>>>> su: Authentication failure
>>>>>
>>>>> syscall audit record:
>>>>> type=SYSCALL msg=audit(07/09/2019 11:52:49.784:365) : arch=x86_64 syscall=openat
>>>>>    success=no exit=EACCES(Permission denied) a0=0xffffff9c a1=0x560897e58e00 a2=O_
>>>>> WRONLY a3=0x0 items=1 ppid=3258 pid=3781 auid=sds2 uid=sds2 gid=sds2 euid=root s
>>>>> uid=root fsuid=root egid=sds2 sgid=sds2 fsgid=sds2 tty=pts2 ses=6 comm=su exe=/usr/bin/su subj=staff_u:staff_r:staff_t:s0-s0:c0.c1023 key=(null)
>>>>>
>>>>> After:
>>>>> $ id
>>>>> uid=1002(sds2) gid=1002(sds2) groups=1002(sds2) context=staff_u:staff_r:staff_t:s0-s0:c0.c1023
>>>>> $ echo apparmor > /proc/self/attr/display
>>>>> $ su
>>>>> Password:
>>>>> su: Authentication failure
>>>>>
>>>>> audit record:
>>>>> type=SYSCALL msg=audit(07/09/2019 12:05:32.402:406) : arch=x86_64 syscall=openat success=no exit=EACCES(Permission denied) a0=0xffffff9c a1=0x556b41e1ae00 a2=O_WRONLY a3=0x0 items=1 ppid=3258 pid=9426 auid=sds2 uid=sds2 gid=sds2 euid=root suid=root fsuid=root egid=sds2 sgid=sds2 fsgid=sds2 tty=pts2 ses=6 comm=su exe=/usr/bin/su subj==unconfined key=(null)
>>>>>
>>>>> NB The subj= field of the SYSCALL audit record is longer accurate and is potentially under the control of a process that would not be authorized to set its subject label to that value by SELinux.
>>>>
>>>> It's still accurate, it's just not complete. It's a matter
>>>> of how best to complete it.
>>>>
>>>>>
>>>>> Now, let's play with userspace.
>>>>>
>>>>> Before:
>>>>> # id
>>>>> uid=0(root) gid=0(root) groups=0(root) context=staff_u:staff_r:staff_t:s0-s0:c0.c1023
>>>>> # passwd root
>>>>> passwd: SELinux deny access due to security policy.
>>>>>
>>>>> audit record:
>>>>> type=USER_AVC msg=audit(07/09/2019 12:24:35.135:812) : pid=12693 uid=root auid=sds2 ses=7 subj=staff_u:staff_r:passwd_t:s0-s0:c0.c1023 msg='avc:  denied  { passwd } for scontext=staff_u:staff_r:staff_t:s0-s0:c0.c1023 tcontext=staff_u:staff_r:staff_t:s0-s0:c0.c1023 tclass=passwd permissive=0  exe=/usr/bin/passwd sauid=root hostname=? addr=? terminal=pts/2'
>>>>> type=USER_CHAUTHTOK msg=audit(07/09/2019 12:24:35.135:813) : pid=12693 uid=root auid=sds2 ses=7 subj=staff_u:staff_r:passwd_t:s0-s0:c0.c1023 msg='op=attempted-to-change-password id=root exe=/usr/bin/passwd hostname=moss-pluto.infosec.tycho.ncsc.mil addr=? terminal=pts/2 res=failed'
>>>>>
>>>>> After:
>>>>> # id
>>>>> uid=0(root) gid=0(root) groups=0(root) context=staff_u:staff_r:staff_t:s0-s0:c0.c1023
>>>>> # echo apparmor > /proc/self/attr/display
>>>>> # passwd root
>>>>> passwd: SELinux deny access due to security policy.
>>>>>
>>>>> audit record:
>>>>> type=USER_CHAUTHTOK msg=audit(07/09/2019 12:28:41.349:832) : pid=13083 uid=root auid=sds2 ses=7 subj==unconfined msg='op=attempted-to-change-password id=root exe=/usr/bin/passwd hostname=moss-pluto.infosec.tycho.ncsc.mil addr=? terminal=pts/2 res=failed'
>>>>>
>>>>> Here we again get the wrong value for subj= in the USER_CHAUTHTOK audit record, and we further lose the USER_AVC record entirely because it didn't even reach the point of the permission check due to not being able to get the caller context.
>>>>>
>>>>> The situation gets worse if the caller can set things up such that it can set an attribute value for one security module that is valid and privileged with respect to another security module.  This isn't a far-fetched scenario; AppArmor will default to running everything unconfined, so as soon as you enable it, any root process can potentially load a policy that defines contexts that look exactly like SELinux contexts. Smack is even simpler; you can set any arbitrary string you want as long as you are root (by default); no policy required.  So a root process that is confined by SELinux (or by AppAmor) can suddenly forge arbitrary contexts in audit records or reads of /proc/self/attr nodes or netlink messages or ..., just by virtue of applying these patches and enabling another security module like Smack. Or consider if ptags were ever made real and merged - by design, that's all about setting arbitrary tags from userspace.  Then there is the separate issue of switching
>>>>> display to prevent attempts by a more privileged program to set one of its attributes from taking effect. Where have we seen that before - sendmail capabilities bug anyone?  And it is actually worse than that bug, because with the assistance of a friendly security module, the write may actually succeed; it just won't alter the SELinux context of the program or anything it creates!
>>>>>
>>>>> This gets a NAK from me so long as it has these issues and setting the display remains outside the control of any security module.
>>>>
>>>> The issues you've raised around audit are meritorious.
>>>> Any suggestions regarding how to address them would be
>>>> quite welcome.
>>>>
>>>> As far as the general objection to the display mechanism,
>>>> I am eager to understand what you might propose as an
>>>> alternative. We can't dismiss backward compatibility for
>>>> any of the modules. We can't preclude any module combination.
>>>>
>>>> We can require user space changes for configurations that
>>>> were impossible before, just as the addition of SELinux to
>>>> a system required user space changes. Update libselinux
>>>> to check the display before using the attr interfaces and
>>>> you've addressed most of the issues.
>>>
>>> Either we ensure that setting of the display can only affect processes in the same security equivalence class (same credentials)
>>
>> In the process of trying to argue against your point I
>> may have come around to your thinking. There would still
>> be the case where a privileged program sets the display
>> and invokes an equally privileged program which is "tricked"
>> into setting the wrong attribute, but you have to put the
>> responsibility for use of privilege on someone, somewhere.
>>
>> I will propose a solution in the next round.
>>
>>> or the security modules need to be able to control who can set the display.
>>
>> That's a mechanism for a module to opt-out of stacking,
>> and Paul has been pretty clear that he won't go for that.
> 
> It doesn't have to be used that way; it can just be used to limit the set of authorized processes that can set the display to e.g. trusted container runtimes.

We need something more than just controlling which processes can set the display LSM. We needs to be able to consider the display LSM from boot and the LSM is going to have to be more involved. At a minimum we need to some cooperation and maybe virtualization to properly support existing userspace code.

The current approach is sufficient for applications that are just dumping the output (top, pstree, ...). The set of privileged programs is more problematic. LSM specific privileged programs can be dealt with by mediation but non-LSM specific programs need something more.  The current LSMs (selinux, smack, apparmor) that use the interfaces being virtualized by the display LSM are going to want to control the display for an applications has support for the respective LSMs. If the LSMs don't cooperate in some way we may end up in a situation where the display LSM can never be switched, and even worse userspace might still try to access the label without the display LSM being switched. So are there currently any programs where we have this problem. At least one, DBus.

So lets look at dbus.

It has code to support selinux, apparmor, and I have seen out-of-tree smack code as well. It can be built with support for all of them enabled (ubuntu does this), and whether a particular LSM is enabled is determined by the LSM. Each LSM has its own way of determining if it is enabled and that check is not information not under control of the LSM (is_selinux_enabled() tests whether its fs is mounted, and aa_is_enabled() is its fs and  kernel parameter). This mean booting the Ubuntu deskop with both apparmor and selinux enabled gets interesting. You can try this in ubuntu 19.10, it has a previous version of the stacking patchset so you don't even need to build this particular patchset to test it.

If you take a default Ubuntu 19.04 install you can do
  lsm=capability,yama,apparmor,selinux

this will default the display lsm to
  apparmor

and selinux will come up in permissive mode (no policy) the desktop boots successfully.

However setting the display lsm to selinux
  lsm=capability,yama,selinux,apparmor

will result in dbus apparmor failures due to selinux being the display lsm and apparmor policy being loaded and enforcing. Add selinux policy into the mix and you double the fun as it will fail to boot either way.

New userspaces can be updated to use new interfaces but to successfully boot existing userspaces the none display LSMs are going to have to lie to userspace about being enabled.

I should note that I have tried this in Fedora too with selinux policy enabled, a stacking kernel and apparmor as the display LSM. This fails as expected (though I haven't dug into whether its dbus or another piece of code that is the culprit). Booting with selinux as the display, with apparmor enabled with no policy boots fine.


^ permalink raw reply

* Re: Preferred subj= with multiple LSMs
From: Paul Moore @ 2019-07-19 21:21 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Steve Grubb, Richard Guy Briggs, linux-audit@redhat.com,
	Linux Security Module list
In-Reply-To: <f375c23c-29e6-dc98-d71c-328db91117bc@schaufler-ca.com>

On Wed, Jul 17, 2019 at 7:02 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/17/2019 9:23 AM, Paul Moore wrote:
> > On Wed, Jul 17, 2019 at 11:49 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 7/17/2019 5:14 AM, Paul Moore wrote:
> >>> On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>> On 7/16/2019 4:13 PM, Paul Moore wrote:
> >>>>> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>>>> It sounds as if some variant of the Hideous format:
> >>>>>>
> >>>>>>         subj=selinux='a:b:c:d',apparmor='z'
> >>>>>>         subj=selinux/a:b:c:d/apparmor/z
> >>>>>>         subj=(selinux)a:b:c:d/(apparmor)z
> >>>>>>
> >>>>>> would meet Steve's searchability requirements, but with significant
> >>>>>> parsing performance penalties.
> >>>>> I think "hideous format" sums it up nicely.  Whatever we choose here
> >>>>> we are likely going to be stuck with for some time and I'm near to
> >>>>> 100% that multiplexing the labels onto a single field is going to be a
> >>>>> disaster.
> >>>> If the requirement is that subj= be searchable I don't see much of
> >>>> an alternative to a Hideous format. If we can get past that, and say
> >>>> that all subj_* have to be searchable we can avoid that set of issues.
> >>>> Instead of:
> >>>>
> >>>>         s = strstr(source, "subj=")
> >>>>         search_after_subj(s, ...);
> >>> This example does a lot of hand waving in search_after_subj(...)
> >>> regarding parsing the multiplexed LSM label.  Unless we restrict the
> >>> LSM label formats (which seems both wrong, and too late IMHO)
> >> I don't think it's too late, and I think it would be healthy
> >> to restrict LSM "contexts" to character sets that make command
> >> line specification possible. Embedded newlines? Ewwww.
> > That would imply that the delimiter you would choose for the
> > multiplexed approach would be something odd (I think you suggested
> > 0x02, or similar, earlier) which would likely require the multiplexed
> > subj field to become a hex encoded field which would be very
> > unfortunate in my opinion and would technically break with the current
> > subj/obj field format spec.  Picking a normal-ish delimiter, and
> > restricting its use by LSMs seems wrong to me.
>
> Just say "no" to hex encoding!

Yes, it's best avoided.

> BTW, keys are not hex encoded.

The kernel keyring keys?  Not really relevant here I don't think.

> We've never had to think about having general rules on
> what security modules do before, because with only one
> active each could do whatever it wanted without fear of
> conflict. If there is already a character that none of
> the existing modules use, how would it be wrong to
> reserve it?

"We've never had to think about having general rules on what security
modules do before..."

We famously haven't imposed restrictions on the label format before
now, and this seems like a pretty poor reason to start.

> > It's important to remember that Steve's strstr() comment only reflects
> > his set of userspace tools.  When you start talking about log
> > aggregation and analytics, it seems very likely that there are other
> > tools in use, likely with their own parsers that do much more
> > complicated searches than a simple strstr() call.
>
> Point. But long term, they'll have to be updated to accommodate
> whatever we decide on. Which makes the "simple" case, where one
> security module is in use all the more important.

Both the multiplexed and subj_X proposals handle the single major LSM
case the same: identical to what we have now.  Regardless of how
important the single major LSM case may be, it isn't a distinguishing
factor in this discussion.

> >>>> we have
> >>>>
> >>>>         s = source
> >>>>         for (i = 0; i < lsm_slots ; i++) {
> >>>>                 s = strstr(s, "subj_")
> >>>>                 if (!s)
> >>>>                         break;
> >>>>                 s = search_after_subj_(s, lsm_slot_name[i], ...)
> >>> The hand waving here in search_after_subj_(...) is much less;
> >>> essentially you just match "subj_X" and then you can take the field
> >>> value as the LSM's label without having to know the format, the policy
> >>> loaded, etc.  It is both safer and doesn't require knowledge of the
> >>> LSMs (the LSM "name" can be specified as a parameter to the search
> >>> tool).
> >> You can do that with the Hideous format as well. I wouldn't
> >> say which would be easier without delving into the audit user
> >> space.
>
> > No, you can't.  You still need to parse the multiplexed mess, that's
> > the problem.
>
> You move the parsing problem to the record, where you have to
> look for subj_selinux= instead of having the parsing problem in
> the subj= field, where you look for something like selinux=
> within the field. Neither looks like the work of an afternoon to
> get right.

Finding subj_X in an audit record is no different than finding any
other field in a record.  Parsing the multiplexed label mess is a
whole different problem and prone to lots of mistakes.

> It probably looks like I'm arguing for the Hideous format option.
> That would require less work and code disruption, so it is tempting
> to push for it. But I would have to know the user space side a
> whole lot better than I do to feel good about pushing anything that
> isn't obviously a good choice. I kind of prefer Paul's "subj=?"
> approach, but as it's harder, I don't want to spend too much time
> on it if it gets me a big, juicy, well deserved NAK.

I didn't want to have to NAK this, but if that is what it is going to
take, so be it ... as it currently stands I'm NAK'ing the the
multiplexed approach.  You don't have to go with the subj_X approach,
but the multiplexed approach is a terrible idea and I can almost
guarantee that we would be regretting that choice in a few years time.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Dbus and multiple LSMs (was Preferred subj= with multiple LSMs)
From: Casey Schaufler @ 2019-07-19 20:02 UTC (permalink / raw)
  To: Simon McVittie
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs,
	linux-audit@redhat.com, Linux Security Module list, casey,
	SELinux
In-Reply-To: <20190719184720.GB24836@horizon>

On 7/19/2019 11:47 AM, Simon McVittie wrote:
> Thanks for considering user-space in this, and sorry if I'm hijacking
> this thread a bit (but I think some of the things I'm raising might be
> equally applicable for audit subjects).

Thank you for asking these questions. I think that if we
can address the issues around dbus we'll be in pretty good
shape in general.

> On Fri, 19 Jul 2019 at 09:29:17 -0700, Casey Schaufler wrote:
>> On 7/19/2019 5:15 AM, Simon McVittie wrote:
>>> However, I think it would be great to have multiple-"big"-LSM-aware
>>> replacements for those interfaces, which present the various LSMs as
>>> multiple parallel credentials.
>> Defining what would go into liblsm* is a task that has fallen to
>> the chicken/egg paradox. We can't really define how the user-space
>> should work without knowing how the kernel will work, and we can't
>> solidify how the kernel will work until we know what user-space
>> can use.
> I was hoping the syscall wrappers in glibc would be a viable user-space
> interface to the small amount of LSM stuff that dbus needs to use in an
> LSM-agnostic way. That's what we use in dbus at the moment (in practice
> just getsockopt, but I'd also be reading /proc/self/attr/current if there
> was a specification for how to normalize it to match SO_PEERSEC results)
> and it's no harder than the rest of the syscall-level APIs.

I don't see how to do that without making the Fedora and Ubuntu user space
environments remain functional.

> A single LSM-agnostic shared library would be the next best thing from
> my point of view.

Good, that's how it looks to me as well.

>> An option that hasn't been discussed is a display option to provide
>> the Hideous format for applications that know that's what they want.
>> Write "hideous" into /proc/self/attr/display, and from then on you
>> get selinux='a:b:c:d',apparmor='z'. This could be used widely in liblsm
>> interfaces.
> If the way to parse/split it is documented, then this would be easier
> for dbus-daemon than continually resetting attr/display. It would be
> especially good if you can document a way to find out which one of the
> many labels would have been seen by an older user-space process that never
> wrote to attr/display ("it's the first one in the list" would be fine),
> so that we can put that one in our backwards-compatible API to clients.

/sys/kernel/security/lsm provides the list of all LSMs active on the system.
It would be trivial to add /sys/kernel/security/default-display-lsm which
would contain that.

> Or, alternatively, we could pass it on directly to our clients and let
> *them* parse it (possibly by using liblsm), the same way AppArmor-aware
> D-Bus clients have to know how to use either aa_splitcon() or their
> own parsing to go from the raw SO_PEERSEC result
> "/usr/bin/firefox (enforce)" to the pair ("/usr/bin/firefox", "enforce")
> that they probably actually wanted.
>
>>> Do you mean that if process 11111 writes (for example) "apparmor" into
>>> /proc/11111/attr/display, and then reads /proc/22222/attr/current
>>> or queries the SO_PEERSEC of a socket opened by process 22222,
>>> it will specifically see 22222's AppArmor label and not 22222's SELinux
>>> label?
>> Process 11111 would see the AppArmor label when reading
>> /proc/22222/attr/current. The display value is controlled
>> by process 11111 so that it can control what data it wants
>> to see.
> OK, that's what I'd hoped.
>
>> The display is set at the task level, so should be thread safe.
> OK, good. However, thinking more about this, I have other concerns:
>
> * In library code that can be used by a thread (task) that also uses other
>   arbitrary libraries, or in an executable that uses libraries that might
>   be interested in LSMs, the only safe way to deal with attr/display would
>   be this sequence:
>
>     - write desired value to /proc/self/attr/display
>     - immediately read /proc/other/attr/current or query SO_PEERSEC
>
>   and it would not be safe to rely on writing /proc/self/attr/display
>   just once at startup, because some other library might have already
>   changed it between startup and the actual read. Paradoxically, this
>   maximizes the chance of breaking a reader that was relying on writing
>   /proc/self/attr/display once during startup.
>
> * If an async signal handler needs to know a LSM label for whatever
>   reason, it will break anything in the same thread that was relying on
>   that sequence, because it might have interrupted them between their
>   write and their read:
>
>     main execution path                  signal handler
>     -------------------                  --------------
>
>     write "apparmor" to attr/display
>     (interrupted by async signal)
>                                          write "selinux" to attr/display
>                                          read attr/current or SO_PEERSEC
>                                          do other stuff with SELinux label
>                                          return
>     (resumes)
>     read attr/current or SO_PEERSEC
>     expect an AppArmor label
>     get a SELinux label
>     sadness ensues
>
>   Of course it's probably crazy for an async signal handler to do
>   this... but people do lots of odd things in async signal handlers,
>   and open(), read(), write(), getsockopt() are all async-signal-safe
>   functions, so it's at least arguably valid.

Stephen Smalley has already pointed out some of these issues.
I see display being used in scripts:

	echo apparmor > /proc/self/attr/display
	apparmor-do-stuff --options --deamon

much more than inside new or updated programs.

>> Writing to display does not require privilege, as it affects only
>> the current process. The display is inherited on fork and reset on
>> a privileged exec.
> Another concern here: are you sure it shouldn't be reset on *any*
> exec?

Yes, because so much of the user-space ecosystem depends on programs
that rarely get updated there has to be a way to specify it externally.
I don't like the situation, but we can't ignore it.

> Lots of programs (including dbus-daemon) fork-and-exec arbitrary
> child processes that come from a different codebase not under our
> control and aren't necessarily LSM-stacking-aware. I don't really want
> to have to reset /proc/self/attr/display in our increasingly crowded
> after-fork-but-before-exec code path (which, according to POSIX, is not
> a safe place to invoke any non-async-signal-safe function, so we can't
> easily do error handling if something goes wrong there).

My hope is that new and updated programs will have to tools
they need to get it right, and that those that don't won't
fall over on a well configured system.

> Is there any possibility of having a parallel kernel API that,
> if it exists, always returns the whole stack, maybe something
> like /proc/<pid>/attr/current_stack and the SO_PEERSECLABELS that I
> suggested previously,

/proc/<pid>/attr/current_stack is easy. SO_PEERSECLABELS will be
harder to sell, but would not be hard to implement if we can get
agreement on the Hideous format.
 

> instead of repurposing /proc/<pid>/attr/current
> and SO_PEERSEC to have contents that vary according to ambient process
> state in their reader?

In addition, yes. Instead of? I don't think that we can have a
backward compatibility story that flies without it.

>  (Bonus points if they are documented/defined with
> a particular syntactic normalization this time, unlike the situation
> with /proc/<pid>/attr/current and SO_PEERSEC where in principle you
> need LSM-specific knowledge to know whether a trailing "\n" or "\0"
> is safe to discard.)

I think that's necessary.

>
>     smcv


^ permalink raw reply

* Re: Preferred subj= with multiple LSMs
From: Simon McVittie @ 2019-07-19 18:47 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs,
	linux-audit@redhat.com, Linux Security Module list
In-Reply-To: <720880ca-834c-1986-3baf-021c67221ae2@schaufler-ca.com>

Thanks for considering user-space in this, and sorry if I'm hijacking
this thread a bit (but I think some of the things I'm raising might be
equally applicable for audit subjects).

On Fri, 19 Jul 2019 at 09:29:17 -0700, Casey Schaufler wrote:
> On 7/19/2019 5:15 AM, Simon McVittie wrote:
> > However, I think it would be great to have multiple-"big"-LSM-aware
> > replacements for those interfaces, which present the various LSMs as
> > multiple parallel credentials.
> 
> Defining what would go into liblsm* is a task that has fallen to
> the chicken/egg paradox. We can't really define how the user-space
> should work without knowing how the kernel will work, and we can't
> solidify how the kernel will work until we know what user-space
> can use.

I was hoping the syscall wrappers in glibc would be a viable user-space
interface to the small amount of LSM stuff that dbus needs to use in an
LSM-agnostic way. That's what we use in dbus at the moment (in practice
just getsockopt, but I'd also be reading /proc/self/attr/current if there
was a specification for how to normalize it to match SO_PEERSEC results)
and it's no harder than the rest of the syscall-level APIs.

A single LSM-agnostic shared library would be the next best thing from
my point of view.

> An option that hasn't been discussed is a display option to provide
> the Hideous format for applications that know that's what they want.
> Write "hideous" into /proc/self/attr/display, and from then on you
> get selinux='a:b:c:d',apparmor='z'. This could be used widely in liblsm
> interfaces.

If the way to parse/split it is documented, then this would be easier
for dbus-daemon than continually resetting attr/display. It would be
especially good if you can document a way to find out which one of the
many labels would have been seen by an older user-space process that never
wrote to attr/display ("it's the first one in the list" would be fine),
so that we can put that one in our backwards-compatible API to clients.

Or, alternatively, we could pass it on directly to our clients and let
*them* parse it (possibly by using liblsm), the same way AppArmor-aware
D-Bus clients have to know how to use either aa_splitcon() or their
own parsing to go from the raw SO_PEERSEC result
"/usr/bin/firefox (enforce)" to the pair ("/usr/bin/firefox", "enforce")
that they probably actually wanted.

> > Do you mean that if process 11111 writes (for example) "apparmor" into
> > /proc/11111/attr/display, and then reads /proc/22222/attr/current
> > or queries the SO_PEERSEC of a socket opened by process 22222,
> > it will specifically see 22222's AppArmor label and not 22222's SELinux
> > label?
> 
> Process 11111 would see the AppArmor label when reading
> /proc/22222/attr/current. The display value is controlled
> by process 11111 so that it can control what data it wants
> to see.

OK, that's what I'd hoped.

> The display is set at the task level, so should be thread safe.

OK, good. However, thinking more about this, I have other concerns:

* In library code that can be used by a thread (task) that also uses other
  arbitrary libraries, or in an executable that uses libraries that might
  be interested in LSMs, the only safe way to deal with attr/display would
  be this sequence:

    - write desired value to /proc/self/attr/display
    - immediately read /proc/other/attr/current or query SO_PEERSEC

  and it would not be safe to rely on writing /proc/self/attr/display
  just once at startup, because some other library might have already
  changed it between startup and the actual read. Paradoxically, this
  maximizes the chance of breaking a reader that was relying on writing
  /proc/self/attr/display once during startup.

* If an async signal handler needs to know a LSM label for whatever
  reason, it will break anything in the same thread that was relying on
  that sequence, because it might have interrupted them between their
  write and their read:

    main execution path                  signal handler
    -------------------                  --------------

    write "apparmor" to attr/display
    (interrupted by async signal)
                                         write "selinux" to attr/display
                                         read attr/current or SO_PEERSEC
                                         do other stuff with SELinux label
                                         return
    (resumes)
    read attr/current or SO_PEERSEC
    expect an AppArmor label
    get a SELinux label
    sadness ensues

  Of course it's probably crazy for an async signal handler to do
  this... but people do lots of odd things in async signal handlers,
  and open(), read(), write(), getsockopt() are all async-signal-safe
  functions, so it's at least arguably valid.

> Writing to display does not require privilege, as it affects only
> the current process. The display is inherited on fork and reset on
> a privileged exec.

Another concern here: are you sure it shouldn't be reset on *any*
exec? Lots of programs (including dbus-daemon) fork-and-exec arbitrary
child processes that come from a different codebase not under our
control and aren't necessarily LSM-stacking-aware. I don't really want
to have to reset /proc/self/attr/display in our increasingly crowded
after-fork-but-before-exec code path (which, according to POSIX, is not
a safe place to invoke any non-async-signal-safe function, so we can't
easily do error handling if something goes wrong there).

Is there any possibility of having a parallel kernel API that,
if it exists, always returns the whole stack, maybe something
like /proc/<pid>/attr/current_stack and the SO_PEERSECLABELS that I
suggested previously, instead of repurposing /proc/<pid>/attr/current
and SO_PEERSEC to have contents that vary according to ambient process
state in their reader? (Bonus points if they are documented/defined with
a particular syntactic normalization this time, unlike the situation
with /proc/<pid>/attr/current and SO_PEERSEC where in principle you
need LSM-specific knowledge to know whether a trailing "\n" or "\0"
is safe to discard.)

    smcv

^ permalink raw reply

* Re: Preferred subj= with multiple LSMs
From: Casey Schaufler @ 2019-07-19 16:29 UTC (permalink / raw)
  To: Simon McVittie
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs,
	linux-audit@redhat.com, Linux Security Module list, casey
In-Reply-To: <20190719121540.GA1764@horizon>

On 7/19/2019 5:15 AM, Simon McVittie wrote:
> On Thu, 18 Jul 2019 at 09:13:52 -0700, Casey Schaufler wrote:
>> We have discussed what's currently being
>> called the "hideous" format, selinux='a:b:c:d',apparmor='x' which
>> in the past, and concluded that the compatibility issues would be too
>> great.
> I agree this might be too big a compat break for existing interfaces that
> were designed with the assumption that there can only be one "big" LSM
> at a time, like /proc/54321/attr/current and SO_PEERSEC. It would certainly
> break the current libapparmor, and presumably libselinux as well.
>
> However, I think it would be great to have multiple-"big"-LSM-aware
> replacements for those interfaces, which present the various LSMs as
> multiple parallel credentials.

Defining what would go into liblsm* is a task that has fallen to
the chicken/egg paradox. We can't really define how the user-space
should work without knowing how the kernel will work, and we can't
solidify how the kernel will work until we know what user-space
can use.

---
* I absolutely refuse to allow this to be libsecurity!

> I think it would also be valuable to take this opportunity to pin down
> what can and can't be in a label, to an extent where people who want
> to represent them in a similar encoding know what they can and can't
> assume about their format. For example, when dbus-daemon reports an
> unusual event (like rejecting a message due to policy rules or LSMs,
> or hitting a resource limit that isn't normally meant to be reached),
> the log entry contains miscellaneous information about the process for
> debugging purposes, and it would be good if we could include all the LSM
> labels in that string without ambiguity. This is essentially the same
> problem that the audit subsystem has, but with fewer constraints, since
> the audit subsystem has to meet externally-imposed security requirements
> but our equivalent is just a nice-to-have for debugging.

Sounds like the Hideous format, or a variant thereof, would be
fine for you, especially if you never parse it.

>> Have you been following the discussions on setting a "display" value
>> to specify which LSM data is presented by /proc/self/attr/current and
>> SO_PEERSEC? Briefly, a process can write the name of the LSM it wants
>> to see data from to /proc/self/attr/display, and the aforementioned
>> interfaces will use that LSM. If no value has been set the first LSM
>> registered that uses any of these interfaces gets the nod.
> I'm vaguely aware of the discussion, but LSMs aren't a big part of my
> D-Bus maintainer role, so I'm afraid I can't keep up with all of it.
>
> Do you mean that if process 11111 writes (for example) "apparmor" into
> /proc/11111/attr/display, and then reads /proc/22222/attr/current
> or queries the SO_PEERSEC of a socket opened by process 22222,
> it will specifically see 22222's AppArmor label and not 22222's SELinux
> label? Or is the contents of /proc/22222/attr/current controlled
> by /proc/22222/attr/display?

Process 11111 would see the AppArmor label when reading
/proc/22222/attr/current. The display value is controlled
by process 11111 so that it can control what data it wants
to see.

> How is this meant to work for generic LSM-aware user-space processes? If
> (for example) ps -Z 22222 wants to get both the AppArmor label and the
> SELinux label for process 22222, is it meant to write "apparmor" into
> attr/display, then read /proc/22222/attr/current, then write "selinux"
> to attr/display, then read /proc/22222/attr/current again? That sounds
> risky if another thread might be manipulating attr/display concurrently.

The display is set at the task level, so should be thread safe.

> The D-Bus message bus/broker (reference implementation: dbus-daemon)
> is somewhat tricky because it is returning data on behalf of processes
> other than itself, so it would be difficult for it to choose a good
> value for "display": there's no reason why it wouldn't be responding
> to requests from NetworkManager that expect to see SELinux labels, and
> also requests from lxd that expect to see AppArmor labels. Obviously
> it can't put both in LinuxSecurityLabel without the same compatibility
> issues you're discussing.

Just so.

> Also note that dbus-daemon is trusted but mostly unprivileged - it
> starts as root, then drops privileges to a system user normally called
> messagebus, dbus or _dbus for its normal operation (although it does
> retain CAP_AUDIT_WRITE) - so it can't carry out privileged operations
> on other processes' /proc entries, if that's what the API requires.

Writing to display does not require privilege, as it affects only
the current process. The display is inherited on fork and reset on
a privileged exec.

> I would strongly prefer it if we could get this information from
> the kernel in a way that is Linux-specific but LSM-agnostic, without
> having to link to libapparmor, libselinux, libsmack and everyone else's
> favourite LSM library. At the moment we only need to link to libraries
> for the LSMs where dbus-daemon can carry out mediation (asking the LSM
> whether to accept or reject messages), and we don't need the libraries
> if we are just passing through identity information.

I can see that making dbus-daemon have to decide which label of many
to pass on to its clients would be bad.

> I would also prefer it if we can get this information from SO_PEERSEC
> (or some newer SO_PEERSEC replacement) without having to manipulate
> ambient/implicit state like attr/display; but dbus-daemon is
> single-threaded, so if we must do that, it wouldn't be *so* horrible.

An option that hasn't been discussed is a display option to provide
the Hideous format for applications that know that's what they want.
Write "hideous" into /proc/self/attr/display, and from then on you
get selinux='a:b:c:d',apparmor='z'. This could be used widely in liblsm
interfaces.

> Ideally I would like to be able to get all the LSM labels in O(1)
> syscalls. Perhaps something with the same (buffer,length) kernel <->
> user-space API as SO_PEERSEC and SO_PEERGROUPS, but instead of returning
> a single \0-terminated string, it could return either the "hideous" format,
> or a byte-blob that looks something like this?
>
>     char buffer[ENOUGH_LENGTH] = { 0 };
>     socklen_t len = sizeof (buffer);
>     char[] expected =
>     "apparmor=unconfined\0"
>     "selinux=system_u:system_r:init_t:s0\0"
>     "\0"
>     ;
>
>     getsockopt (fd, SOL_SOCKET, SO_PEERSECLABELS, &buffer, &length);
>     /* should return 0 */
>     /* now buffer should have the same bytes as expected, ending with
>      * "\0\0" */
>
> (Obviously in real life you'd have a retry loop to get the length right,
> like the SO_PEERSEC code in dbus does.)

I would see creating a friendly interface like this as part of
my mythical liblsm, but I see your point.

> Because GetConnectionCredentials() is extensible, if there is some way
> to enumerate all the security labels and get their values individually,
> we could have (pseudocode)
>
> GetConnectionCredentials(":1.1") -> {
>   "UnixUserID": 0,
>   "ProcessID": 1,
>   "LinuxSecurityLabel.apparmor": "unconfined",
>   "LinuxSecurityLabel.selinux": "system_u:system_r:init_t:s0",
>   "LinuxSecurityLabel": "unconfined",    /* deprecated */
> }
>
> or (using D-Bus' structured type system)
>
> GetConnectionCredentials(":1.1") -> {
>   "UnixUserID": 0,
>   "ProcessID": 1,
>   "LinuxSecurityLabels": {
>     "apparmor": "unconfined",
>     "selinux": "system_u:system_r:init_t:s0",
>   },
>   "LinuxSecurityLabel": "unconfined",    /* deprecated */
> }
>
> with LinuxSecurityLabel showing the first LSM registered for backwards
> compatibility? Or we could make LinuxSecurityLabel always be in the
> "hideous" format if you chose to go that way in the kernel interfaces:
> it's defined in terms of SO_PEERSEC, so whatever you do at the kernel
> level, D-Bus should mimic that.

If providing the Hideous format makes library code easier or more
efficient I'm happy to make that happen. It can't be the default due to
backward compatibility, but it can be easy as
"echo hideous > /proc/self/attr/display".

>
>     smcv


^ permalink raw reply

* Re: Preferred subj= with multiple LSMs
From: Simon McVittie @ 2019-07-19 12:15 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Paul Moore, Steve Grubb, Richard Guy Briggs,
	linux-audit@redhat.com, Linux Security Module list
In-Reply-To: <45661e97-2ed0-22e5-992e-5d562ff11488@schaufler-ca.com>

On Thu, 18 Jul 2019 at 09:13:52 -0700, Casey Schaufler wrote:
> We have discussed what's currently being
> called the "hideous" format, selinux='a:b:c:d',apparmor='x' which
> in the past, and concluded that the compatibility issues would be too
> great.

I agree this might be too big a compat break for existing interfaces that
were designed with the assumption that there can only be one "big" LSM
at a time, like /proc/54321/attr/current and SO_PEERSEC. It would certainly
break the current libapparmor, and presumably libselinux as well.

However, I think it would be great to have multiple-"big"-LSM-aware
replacements for those interfaces, which present the various LSMs as
multiple parallel credentials.

I think it would also be valuable to take this opportunity to pin down
what can and can't be in a label, to an extent where people who want
to represent them in a similar encoding know what they can and can't
assume about their format. For example, when dbus-daemon reports an
unusual event (like rejecting a message due to policy rules or LSMs,
or hitting a resource limit that isn't normally meant to be reached),
the log entry contains miscellaneous information about the process for
debugging purposes, and it would be good if we could include all the LSM
labels in that string without ambiguity. This is essentially the same
problem that the audit subsystem has, but with fewer constraints, since
the audit subsystem has to meet externally-imposed security requirements
but our equivalent is just a nice-to-have for debugging.

> Have you been following the discussions on setting a "display" value
> to specify which LSM data is presented by /proc/self/attr/current and
> SO_PEERSEC? Briefly, a process can write the name of the LSM it wants
> to see data from to /proc/self/attr/display, and the aforementioned
> interfaces will use that LSM. If no value has been set the first LSM
> registered that uses any of these interfaces gets the nod.

I'm vaguely aware of the discussion, but LSMs aren't a big part of my
D-Bus maintainer role, so I'm afraid I can't keep up with all of it.

Do you mean that if process 11111 writes (for example) "apparmor" into
/proc/11111/attr/display, and then reads /proc/22222/attr/current
or queries the SO_PEERSEC of a socket opened by process 22222,
it will specifically see 22222's AppArmor label and not 22222's SELinux
label? Or is the contents of /proc/22222/attr/current controlled
by /proc/22222/attr/display?

How is this meant to work for generic LSM-aware user-space processes? If
(for example) ps -Z 22222 wants to get both the AppArmor label and the
SELinux label for process 22222, is it meant to write "apparmor" into
attr/display, then read /proc/22222/attr/current, then write "selinux"
to attr/display, then read /proc/22222/attr/current again? That sounds
risky if another thread might be manipulating attr/display concurrently.

The D-Bus message bus/broker (reference implementation: dbus-daemon)
is somewhat tricky because it is returning data on behalf of processes
other than itself, so it would be difficult for it to choose a good
value for "display": there's no reason why it wouldn't be responding
to requests from NetworkManager that expect to see SELinux labels, and
also requests from lxd that expect to see AppArmor labels. Obviously
it can't put both in LinuxSecurityLabel without the same compatibility
issues you're discussing.

Also note that dbus-daemon is trusted but mostly unprivileged - it
starts as root, then drops privileges to a system user normally called
messagebus, dbus or _dbus for its normal operation (although it does
retain CAP_AUDIT_WRITE) - so it can't carry out privileged operations
on other processes' /proc entries, if that's what the API requires.

I would strongly prefer it if we could get this information from
the kernel in a way that is Linux-specific but LSM-agnostic, without
having to link to libapparmor, libselinux, libsmack and everyone else's
favourite LSM library. At the moment we only need to link to libraries
for the LSMs where dbus-daemon can carry out mediation (asking the LSM
whether to accept or reject messages), and we don't need the libraries
if we are just passing through identity information.

I would also prefer it if we can get this information from SO_PEERSEC
(or some newer SO_PEERSEC replacement) without having to manipulate
ambient/implicit state like attr/display; but dbus-daemon is
single-threaded, so if we must do that, it wouldn't be *so* horrible.

Ideally I would like to be able to get all the LSM labels in O(1)
syscalls. Perhaps something with the same (buffer,length) kernel <->
user-space API as SO_PEERSEC and SO_PEERGROUPS, but instead of returning
a single \0-terminated string, it could return either the "hideous" format,
or a byte-blob that looks something like this?

    char buffer[ENOUGH_LENGTH] = { 0 };
    socklen_t len = sizeof (buffer);
    char[] expected =
    "apparmor=unconfined\0"
    "selinux=system_u:system_r:init_t:s0\0"
    "\0"
    ;

    getsockopt (fd, SOL_SOCKET, SO_PEERSECLABELS, &buffer, &length);
    /* should return 0 */
    /* now buffer should have the same bytes as expected, ending with
     * "\0\0" */

(Obviously in real life you'd have a retry loop to get the length right,
like the SO_PEERSEC code in dbus does.)

Because GetConnectionCredentials() is extensible, if there is some way
to enumerate all the security labels and get their values individually,
we could have (pseudocode)

GetConnectionCredentials(":1.1") -> {
  "UnixUserID": 0,
  "ProcessID": 1,
  "LinuxSecurityLabel.apparmor": "unconfined",
  "LinuxSecurityLabel.selinux": "system_u:system_r:init_t:s0",
  "LinuxSecurityLabel": "unconfined",    /* deprecated */
}

or (using D-Bus' structured type system)

GetConnectionCredentials(":1.1") -> {
  "UnixUserID": 0,
  "ProcessID": 1,
  "LinuxSecurityLabels": {
    "apparmor": "unconfined",
    "selinux": "system_u:system_r:init_t:s0",
  },
  "LinuxSecurityLabel": "unconfined",    /* deprecated */
}

with LinuxSecurityLabel showing the first LSM registered for backwards
compatibility? Or we could make LinuxSecurityLabel always be in the
"hideous" format if you chose to go that way in the kernel interfaces:
it's defined in terms of SO_PEERSEC, so whatever you do at the kernel
level, D-Bus should mimic that.

    smcv

^ permalink raw reply

* Re: [PATCH V36 23/29] bpf: Restrict bpf when kernel lockdown is in confidentiality mode
From: Kees Cook @ 2019-07-18 21:06 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	David Howells, Alexei Starovoitov, Matthew Garrett, netdev,
	Chun-Yi Lee, Daniel Borkmann
In-Reply-To: <20190718194415.108476-24-matthewgarrett@google.com>

On Thu, Jul 18, 2019 at 12:44:09PM -0700, Matthew Garrett wrote:
> From: David Howells <dhowells@redhat.com>
> 
> bpf_read() and bpf_read_str() could potentially be abused to (eg) allow
> private keys in kernel memory to be leaked. Disable them if the kernel
> has been locked down in confidentiality mode.
> 
> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Matthew Garrett <mjg59@google.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> cc: netdev@vger.kernel.org
> cc: Chun-Yi Lee <jlee@suse.com>
> cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> ---
>  include/linux/security.h     |  1 +
>  kernel/trace/bpf_trace.c     | 10 ++++++++++
>  security/lockdown/lockdown.c |  1 +
>  3 files changed, 12 insertions(+)
> 
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 987d8427f091..8dd1741a52cd 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -118,6 +118,7 @@ enum lockdown_reason {
>  	LOCKDOWN_INTEGRITY_MAX,
>  	LOCKDOWN_KCORE,
>  	LOCKDOWN_KPROBES,
> +	LOCKDOWN_BPF_READ,
>  	LOCKDOWN_CONFIDENTIALITY_MAX,
>  };
>  
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index ca1255d14576..492a8bfaae98 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -142,8 +142,13 @@ BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr)
>  {
>  	int ret;
>  
> +	ret = security_locked_down(LOCKDOWN_BPF_READ);
> +	if (ret < 0)
> +		goto out;
> +
>  	ret = probe_kernel_read(dst, unsafe_ptr, size);
>  	if (unlikely(ret < 0))
> +out:
>  		memset(dst, 0, size);
>  
>  	return ret;
> @@ -569,6 +574,10 @@ BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
>  {
>  	int ret;
>  
> +	ret = security_locked_down(LOCKDOWN_BPF_READ);
> +	if (ret < 0)
> +		goto out;
> +
>  	/*
>  	 * The strncpy_from_unsafe() call will likely not fill the entire
>  	 * buffer, but that's okay in this circumstance as we're probing
> @@ -580,6 +589,7 @@ BPF_CALL_3(bpf_probe_read_str, void *, dst, u32, size,
>  	 */
>  	ret = strncpy_from_unsafe(dst, unsafe_ptr, size);
>  	if (unlikely(ret < 0))
> +out:
>  		memset(dst, 0, size);
>  
>  	return ret;
> diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
> index 6b123cbf3748..1b89d3e8e54d 100644
> --- a/security/lockdown/lockdown.c
> +++ b/security/lockdown/lockdown.c
> @@ -33,6 +33,7 @@ static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
>  	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
>  	[LOCKDOWN_KCORE] = "/proc/kcore access",
>  	[LOCKDOWN_KPROBES] = "use of kprobes",
> +	[LOCKDOWN_BPF_READ] = "use of bpf to read kernel RAM",
>  	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
>  };
>  
> -- 
> 2.22.0.510.g264f2c817a-goog
> 

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH V36 20/29] x86/mmiotrace: Lock down the testmmiotrace module
From: Kees Cook @ 2019-07-18 21:06 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: jmorris, linux-security-module, linux-kernel, linux-api,
	David Howells, Thomas Gleixner, Matthew Garrett, Steven Rostedt,
	Ingo Molnar, H. Peter Anvin, x86
In-Reply-To: <20190718194415.108476-21-matthewgarrett@google.com>

On Thu, Jul 18, 2019 at 12:44:06PM -0700, Matthew Garrett wrote:
> From: David Howells <dhowells@redhat.com>
> 
> The testmmiotrace module shouldn't be permitted when the kernel is locked
> down as it can be used to arbitrarily read and write MMIO space. This is
> a runtime check rather than buildtime in order to allow configurations
> where the same kernel may be run in both locked down or permissive modes
> depending on local policy.
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: David Howells <dhowells@redhat.com

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> Signed-off-by: Matthew Garrett <mjg59@google.com>
> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> cc: Thomas Gleixner <tglx@linutronix.de>
> cc: Steven Rostedt <rostedt@goodmis.org>
> cc: Ingo Molnar <mingo@kernel.org>
> cc: "H. Peter Anvin" <hpa@zytor.com>
> cc: x86@kernel.org
> ---
>  arch/x86/mm/testmmiotrace.c  | 5 +++++
>  include/linux/security.h     | 1 +
>  security/lockdown/lockdown.c | 1 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/arch/x86/mm/testmmiotrace.c b/arch/x86/mm/testmmiotrace.c
> index 0881e1ff1e58..a8bd952e136d 100644
> --- a/arch/x86/mm/testmmiotrace.c
> +++ b/arch/x86/mm/testmmiotrace.c
> @@ -8,6 +8,7 @@
>  #include <linux/module.h>
>  #include <linux/io.h>
>  #include <linux/mmiotrace.h>
> +#include <linux/security.h>
>  
>  static unsigned long mmio_address;
>  module_param_hw(mmio_address, ulong, iomem, 0);
> @@ -115,6 +116,10 @@ static void do_test_bulk_ioremapping(void)
>  static int __init init(void)
>  {
>  	unsigned long size = (read_far) ? (8 << 20) : (16 << 10);
> +	int ret = security_locked_down(LOCKDOWN_MMIOTRACE);
> +
> +	if (ret)
> +		return ret;
>  
>  	if (mmio_address == 0) {
>  		pr_err("you have to use the module argument mmio_address.\n");
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 43fa3486522b..3f7b6a4cd65a 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -114,6 +114,7 @@ enum lockdown_reason {
>  	LOCKDOWN_PCMCIA_CIS,
>  	LOCKDOWN_TIOCSSERIAL,
>  	LOCKDOWN_MODULE_PARAMETERS,
> +	LOCKDOWN_MMIOTRACE,
>  	LOCKDOWN_INTEGRITY_MAX,
>  	LOCKDOWN_CONFIDENTIALITY_MAX,
>  };
> diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
> index 5177938cfa0d..37b7d7e50474 100644
> --- a/security/lockdown/lockdown.c
> +++ b/security/lockdown/lockdown.c
> @@ -29,6 +29,7 @@ static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
>  	[LOCKDOWN_PCMCIA_CIS] = "direct PCMCIA CIS storage",
>  	[LOCKDOWN_TIOCSSERIAL] = "reconfiguration of serial port IO",
>  	[LOCKDOWN_MODULE_PARAMETERS] = "unsafe module parameters",
> +	[LOCKDOWN_MMIOTRACE] = "unsafe mmio",
>  	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
>  	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
>  };
> -- 
> 2.22.0.510.g264f2c817a-goog
> 

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH V36 02/29] security: Add a "locked down" LSM hook
From: Casey Schaufler @ 2019-07-18 20:03 UTC (permalink / raw)
  To: Matthew Garrett, jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Matthew Garrett,
	Kees Cook
In-Reply-To: <20190718194415.108476-3-matthewgarrett@google.com>

On 7/18/2019 12:43 PM, Matthew Garrett wrote:
> Add a mechanism to allow LSMs to make a policy decision around whether
> kernel functionality that would allow tampering with or examining the
> runtime state of the kernel should be permitted.
>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> Acked-by: Kees Cook <keescook@chromium.org>

Acked-by: Casey Schaufler <casey@schaufler-ca.com>

> ---
>  include/linux/lsm_hooks.h |  2 ++
>  include/linux/security.h  | 32 ++++++++++++++++++++++++++++++++
>  security/security.c       |  6 ++++++
>  3 files changed, 40 insertions(+)
>
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index aebb0e032072..29c22cf40113 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -1807,6 +1807,7 @@ union security_list_options {
>  	int (*bpf_prog_alloc_security)(struct bpf_prog_aux *aux);
>  	void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
>  #endif /* CONFIG_BPF_SYSCALL */
> +	int (*locked_down)(enum lockdown_reason what);
>  };
>  
>  struct security_hook_heads {
> @@ -2046,6 +2047,7 @@ struct security_hook_heads {
>  	struct hlist_head bpf_prog_alloc_security;
>  	struct hlist_head bpf_prog_free_security;
>  #endif /* CONFIG_BPF_SYSCALL */
> +	struct hlist_head locked_down;
>  } __randomize_layout;
>  
>  /*
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 66a2fcbe6ab0..c2b1204e8e26 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -77,6 +77,33 @@ enum lsm_event {
>  	LSM_POLICY_CHANGE,
>  };
>  
> +/*
> + * These are reasons that can be passed to the security_locked_down()
> + * LSM hook. Lockdown reasons that protect kernel integrity (ie, the
> + * ability for userland to modify kernel code) are placed before
> + * LOCKDOWN_INTEGRITY_MAX.  Lockdown reasons that protect kernel
> + * confidentiality (ie, the ability for userland to extract
> + * information from the running kernel that would otherwise be
> + * restricted) are placed before LOCKDOWN_CONFIDENTIALITY_MAX.
> + *
> + * LSM authors should note that the semantics of any given lockdown
> + * reason are not guaranteed to be stable - the same reason may block
> + * one set of features in one kernel release, and a slightly different
> + * set of features in a later kernel release. LSMs that seek to expose
> + * lockdown policy at any level of granularity other than "none",
> + * "integrity" or "confidentiality" are responsible for either
> + * ensuring that they expose a consistent level of functionality to
> + * userland, or ensuring that userland is aware that this is
> + * potentially a moving target. It is easy to misuse this information
> + * in a way that could break userspace. Please be careful not to do
> + * so.
> + */
> +enum lockdown_reason {
> +	LOCKDOWN_NONE,
> +	LOCKDOWN_INTEGRITY_MAX,
> +	LOCKDOWN_CONFIDENTIALITY_MAX,
> +};
> +
>  /* These functions are in security/commoncap.c */
>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
>  		       int cap, unsigned int opts);
> @@ -393,6 +420,7 @@ void security_inode_invalidate_secctx(struct inode *inode);
>  int security_inode_notifysecctx(struct inode *inode, void *ctx, u32 ctxlen);
>  int security_inode_setsecctx(struct dentry *dentry, void *ctx, u32 ctxlen);
>  int security_inode_getsecctx(struct inode *inode, void **ctx, u32 *ctxlen);
> +int security_locked_down(enum lockdown_reason what);
>  #else /* CONFIG_SECURITY */
>  
>  static inline int call_blocking_lsm_notifier(enum lsm_event event, void *data)
> @@ -1205,6 +1233,10 @@ static inline int security_inode_getsecctx(struct inode *inode, void **ctx, u32
>  {
>  	return -EOPNOTSUPP;
>  }
> +static inline int security_locked_down(enum lockdown_reason what)
> +{
> +	return 0;
> +}
>  #endif	/* CONFIG_SECURITY */
>  
>  #ifdef CONFIG_SECURITY_NETWORK
> diff --git a/security/security.c b/security/security.c
> index 90f1e291c800..ce6c945bf347 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -2392,3 +2392,9 @@ void security_bpf_prog_free(struct bpf_prog_aux *aux)
>  	call_void_hook(bpf_prog_free_security, aux);
>  }
>  #endif /* CONFIG_BPF_SYSCALL */
> +
> +int security_locked_down(enum lockdown_reason what)
> +{
> +	return call_int_hook(locked_down, 0, what);
> +}
> +EXPORT_SYMBOL(security_locked_down);

^ permalink raw reply

* Re: [PATCH V36 01/29] security: Support early LSMs
From: Casey Schaufler @ 2019-07-18 20:02 UTC (permalink / raw)
  To: Matthew Garrett, jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Matthew Garrett,
	Kees Cook
In-Reply-To: <20190718194415.108476-2-matthewgarrett@google.com>

On 7/18/2019 12:43 PM, Matthew Garrett wrote:
> The lockdown module is intended to allow for kernels to be locked down
> early in boot - sufficiently early that we don't have the ability to
> kmalloc() yet. Add support for early initialisation of some LSMs, and
> then add them to the list of names when we do full initialisation later.
> Early LSMs are initialised in link order and cannot be overridden via
> boot parameters, and cannot make use of kmalloc() (since the allocator
> isn't initialised yet).
>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> Acked-by: Kees Cook <keescook@chromium.org>


Acked-by: Casey Schaufler <casey@schaufler-ca.com>

> ---
>  include/asm-generic/vmlinux.lds.h |  8 ++++-
>  include/linux/lsm_hooks.h         |  6 ++++
>  include/linux/security.h          |  1 +
>  init/main.c                       |  1 +
>  security/security.c               | 50 ++++++++++++++++++++++++++-----
>  5 files changed, 57 insertions(+), 9 deletions(-)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index ca42182992a5..6cc6174a2a4c 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -215,8 +215,13 @@
>  			__start_lsm_info = .;				\
>  			KEEP(*(.lsm_info.init))				\
>  			__end_lsm_info = .;
> +#define EARLY_LSM_TABLE()	. = ALIGN(8);				\
> +			__start_early_lsm_info = .;			\
> +			KEEP(*(.early_lsm_info.init))			\
> +			__end_early_lsm_info = .;
>  #else
>  #define LSM_TABLE()
> +#define EARLY_LSM_TABLE()
>  #endif
>  
>  #define ___OF_TABLE(cfg, name)	_OF_TABLE_##cfg(name)
> @@ -616,7 +621,8 @@
>  	ACPI_PROBE_TABLE(irqchip)					\
>  	ACPI_PROBE_TABLE(timer)						\
>  	EARLYCON_TABLE()						\
> -	LSM_TABLE()
> +	LSM_TABLE()							\
> +	EARLY_LSM_TABLE()
>  
>  #define INIT_TEXT							\
>  	*(.init.text .init.text.*)					\
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index df1318d85f7d..aebb0e032072 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -2104,12 +2104,18 @@ struct lsm_info {
>  };
>  
>  extern struct lsm_info __start_lsm_info[], __end_lsm_info[];
> +extern struct lsm_info __start_early_lsm_info[], __end_early_lsm_info[];
>  
>  #define DEFINE_LSM(lsm)							\
>  	static struct lsm_info __lsm_##lsm				\
>  		__used __section(.lsm_info.init)			\
>  		__aligned(sizeof(unsigned long))
>  
> +#define DEFINE_EARLY_LSM(lsm)						\
> +	static struct lsm_info __early_lsm_##lsm			\
> +		__used __section(.early_lsm_info.init)			\
> +		__aligned(sizeof(unsigned long))
> +
>  #ifdef CONFIG_SECURITY_SELINUX_DISABLE
>  /*
>   * Assuring the safety of deleting a security module is up to
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 5f7441abbf42..66a2fcbe6ab0 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -195,6 +195,7 @@ int unregister_blocking_lsm_notifier(struct notifier_block *nb);
>  
>  /* prototypes */
>  extern int security_init(void);
> +extern int early_security_init(void);
>  
>  /* Security operations */
>  int security_binder_set_context_mgr(struct task_struct *mgr);
> diff --git a/init/main.c b/init/main.c
> index ff5803b0841c..0fefca3fd43c 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -593,6 +593,7 @@ asmlinkage __visible void __init start_kernel(void)
>  	boot_cpu_init();
>  	page_address_init();
>  	pr_notice("%s", linux_banner);
> +	early_security_init();
>  	setup_arch(&command_line);
>  	mm_init_cpumask(&init_mm);
>  	setup_command_line(command_line);
> diff --git a/security/security.c b/security/security.c
> index 250ee2d76406..90f1e291c800 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -33,6 +33,7 @@
>  
>  /* How many LSMs were built into the kernel? */
>  #define LSM_COUNT (__end_lsm_info - __start_lsm_info)
> +#define EARLY_LSM_COUNT (__end_early_lsm_info - __start_early_lsm_info)
>  
>  struct security_hook_heads security_hook_heads __lsm_ro_after_init;
>  static BLOCKING_NOTIFIER_HEAD(blocking_lsm_notifier_chain);
> @@ -277,6 +278,8 @@ static void __init ordered_lsm_parse(const char *order, const char *origin)
>  static void __init lsm_early_cred(struct cred *cred);
>  static void __init lsm_early_task(struct task_struct *task);
>  
> +static int lsm_append(const char *new, char **result);
> +
>  static void __init ordered_lsm_init(void)
>  {
>  	struct lsm_info **lsm;
> @@ -323,6 +326,26 @@ static void __init ordered_lsm_init(void)
>  	kfree(ordered_lsms);
>  }
>  
> +int __init early_security_init(void)
> +{
> +	int i;
> +	struct hlist_head *list = (struct hlist_head *) &security_hook_heads;
> +	struct lsm_info *lsm;
> +
> +	for (i = 0; i < sizeof(security_hook_heads) / sizeof(struct hlist_head);
> +	     i++)
> +		INIT_HLIST_HEAD(&list[i]);
> +
> +	for (lsm = __start_early_lsm_info; lsm < __end_early_lsm_info; lsm++) {
> +		if (!lsm->enabled)
> +			lsm->enabled = &lsm_enabled_true;
> +		prepare_lsm(lsm);
> +		initialize_lsm(lsm);
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * security_init - initializes the security framework
>   *
> @@ -330,14 +353,18 @@ static void __init ordered_lsm_init(void)
>   */
>  int __init security_init(void)
>  {
> -	int i;
> -	struct hlist_head *list = (struct hlist_head *) &security_hook_heads;
> +	struct lsm_info *lsm;
>  
>  	pr_info("Security Framework initializing\n");
>  
> -	for (i = 0; i < sizeof(security_hook_heads) / sizeof(struct hlist_head);
> -	     i++)
> -		INIT_HLIST_HEAD(&list[i]);
> +	/*
> +	 * Append the names of the early LSM modules now that kmalloc() is
> +	 * available
> +	 */
> +	for (lsm = __start_early_lsm_info; lsm < __end_early_lsm_info; lsm++) {
> +		if (lsm->enabled)
> +			lsm_append(lsm->name, &lsm_names);
> +	}
>  
>  	/* Load LSMs in specified order. */
>  	ordered_lsm_init();
> @@ -384,7 +411,7 @@ static bool match_last_lsm(const char *list, const char *lsm)
>  	return !strcmp(last, lsm);
>  }
>  
> -static int lsm_append(char *new, char **result)
> +static int lsm_append(const char *new, char **result)
>  {
>  	char *cp;
>  
> @@ -422,8 +449,15 @@ void __init security_add_hooks(struct security_hook_list *hooks, int count,
>  		hooks[i].lsm = lsm;
>  		hlist_add_tail_rcu(&hooks[i].list, hooks[i].head);
>  	}
> -	if (lsm_append(lsm, &lsm_names) < 0)
> -		panic("%s - Cannot get early memory.\n", __func__);
> +
> +	/*
> +	 * Don't try to append during early_security_init(), we'll come back
> +	 * and fix this up afterwards.
> +	 */
> +	if (slab_is_available()) {
> +		if (lsm_append(lsm, &lsm_names) < 0)
> +			panic("%s - Cannot get early memory.\n", __func__);
> +	}
>  }
>  
>  int call_blocking_lsm_notifier(enum lsm_event event, void *data)

^ permalink raw reply

* [PATCH V36 01/29] security: Support early LSMs
From: Matthew Garrett @ 2019-07-18 19:43 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Matthew Garrett,
	Matthew Garrett, Kees Cook
In-Reply-To: <20190718194415.108476-1-matthewgarrett@google.com>

The lockdown module is intended to allow for kernels to be locked down
early in boot - sufficiently early that we don't have the ability to
kmalloc() yet. Add support for early initialisation of some LSMs, and
then add them to the list of names when we do full initialisation later.
Early LSMs are initialised in link order and cannot be overridden via
boot parameters, and cannot make use of kmalloc() (since the allocator
isn't initialised yet).

Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
 include/asm-generic/vmlinux.lds.h |  8 ++++-
 include/linux/lsm_hooks.h         |  6 ++++
 include/linux/security.h          |  1 +
 init/main.c                       |  1 +
 security/security.c               | 50 ++++++++++++++++++++++++++-----
 5 files changed, 57 insertions(+), 9 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index ca42182992a5..6cc6174a2a4c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -215,8 +215,13 @@
 			__start_lsm_info = .;				\
 			KEEP(*(.lsm_info.init))				\
 			__end_lsm_info = .;
+#define EARLY_LSM_TABLE()	. = ALIGN(8);				\
+			__start_early_lsm_info = .;			\
+			KEEP(*(.early_lsm_info.init))			\
+			__end_early_lsm_info = .;
 #else
 #define LSM_TABLE()
+#define EARLY_LSM_TABLE()
 #endif
 
 #define ___OF_TABLE(cfg, name)	_OF_TABLE_##cfg(name)
@@ -616,7 +621,8 @@
 	ACPI_PROBE_TABLE(irqchip)					\
 	ACPI_PROBE_TABLE(timer)						\
 	EARLYCON_TABLE()						\
-	LSM_TABLE()
+	LSM_TABLE()							\
+	EARLY_LSM_TABLE()
 
 #define INIT_TEXT							\
 	*(.init.text .init.text.*)					\
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index df1318d85f7d..aebb0e032072 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -2104,12 +2104,18 @@ struct lsm_info {
 };
 
 extern struct lsm_info __start_lsm_info[], __end_lsm_info[];
+extern struct lsm_info __start_early_lsm_info[], __end_early_lsm_info[];
 
 #define DEFINE_LSM(lsm)							\
 	static struct lsm_info __lsm_##lsm				\
 		__used __section(.lsm_info.init)			\
 		__aligned(sizeof(unsigned long))
 
+#define DEFINE_EARLY_LSM(lsm)						\
+	static struct lsm_info __early_lsm_##lsm			\
+		__used __section(.early_lsm_info.init)			\
+		__aligned(sizeof(unsigned long))
+
 #ifdef CONFIG_SECURITY_SELINUX_DISABLE
 /*
  * Assuring the safety of deleting a security module is up to
diff --git a/include/linux/security.h b/include/linux/security.h
index 5f7441abbf42..66a2fcbe6ab0 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -195,6 +195,7 @@ int unregister_blocking_lsm_notifier(struct notifier_block *nb);
 
 /* prototypes */
 extern int security_init(void);
+extern int early_security_init(void);
 
 /* Security operations */
 int security_binder_set_context_mgr(struct task_struct *mgr);
diff --git a/init/main.c b/init/main.c
index ff5803b0841c..0fefca3fd43c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -593,6 +593,7 @@ asmlinkage __visible void __init start_kernel(void)
 	boot_cpu_init();
 	page_address_init();
 	pr_notice("%s", linux_banner);
+	early_security_init();
 	setup_arch(&command_line);
 	mm_init_cpumask(&init_mm);
 	setup_command_line(command_line);
diff --git a/security/security.c b/security/security.c
index 250ee2d76406..90f1e291c800 100644
--- a/security/security.c
+++ b/security/security.c
@@ -33,6 +33,7 @@
 
 /* How many LSMs were built into the kernel? */
 #define LSM_COUNT (__end_lsm_info - __start_lsm_info)
+#define EARLY_LSM_COUNT (__end_early_lsm_info - __start_early_lsm_info)
 
 struct security_hook_heads security_hook_heads __lsm_ro_after_init;
 static BLOCKING_NOTIFIER_HEAD(blocking_lsm_notifier_chain);
@@ -277,6 +278,8 @@ static void __init ordered_lsm_parse(const char *order, const char *origin)
 static void __init lsm_early_cred(struct cred *cred);
 static void __init lsm_early_task(struct task_struct *task);
 
+static int lsm_append(const char *new, char **result);
+
 static void __init ordered_lsm_init(void)
 {
 	struct lsm_info **lsm;
@@ -323,6 +326,26 @@ static void __init ordered_lsm_init(void)
 	kfree(ordered_lsms);
 }
 
+int __init early_security_init(void)
+{
+	int i;
+	struct hlist_head *list = (struct hlist_head *) &security_hook_heads;
+	struct lsm_info *lsm;
+
+	for (i = 0; i < sizeof(security_hook_heads) / sizeof(struct hlist_head);
+	     i++)
+		INIT_HLIST_HEAD(&list[i]);
+
+	for (lsm = __start_early_lsm_info; lsm < __end_early_lsm_info; lsm++) {
+		if (!lsm->enabled)
+			lsm->enabled = &lsm_enabled_true;
+		prepare_lsm(lsm);
+		initialize_lsm(lsm);
+	}
+
+	return 0;
+}
+
 /**
  * security_init - initializes the security framework
  *
@@ -330,14 +353,18 @@ static void __init ordered_lsm_init(void)
  */
 int __init security_init(void)
 {
-	int i;
-	struct hlist_head *list = (struct hlist_head *) &security_hook_heads;
+	struct lsm_info *lsm;
 
 	pr_info("Security Framework initializing\n");
 
-	for (i = 0; i < sizeof(security_hook_heads) / sizeof(struct hlist_head);
-	     i++)
-		INIT_HLIST_HEAD(&list[i]);
+	/*
+	 * Append the names of the early LSM modules now that kmalloc() is
+	 * available
+	 */
+	for (lsm = __start_early_lsm_info; lsm < __end_early_lsm_info; lsm++) {
+		if (lsm->enabled)
+			lsm_append(lsm->name, &lsm_names);
+	}
 
 	/* Load LSMs in specified order. */
 	ordered_lsm_init();
@@ -384,7 +411,7 @@ static bool match_last_lsm(const char *list, const char *lsm)
 	return !strcmp(last, lsm);
 }
 
-static int lsm_append(char *new, char **result)
+static int lsm_append(const char *new, char **result)
 {
 	char *cp;
 
@@ -422,8 +449,15 @@ void __init security_add_hooks(struct security_hook_list *hooks, int count,
 		hooks[i].lsm = lsm;
 		hlist_add_tail_rcu(&hooks[i].list, hooks[i].head);
 	}
-	if (lsm_append(lsm, &lsm_names) < 0)
-		panic("%s - Cannot get early memory.\n", __func__);
+
+	/*
+	 * Don't try to append during early_security_init(), we'll come back
+	 * and fix this up afterwards.
+	 */
+	if (slab_is_available()) {
+		if (lsm_append(lsm, &lsm_names) < 0)
+			panic("%s - Cannot get early memory.\n", __func__);
+	}
 }
 
 int call_blocking_lsm_notifier(enum lsm_event event, void *data)
-- 
2.22.0.510.g264f2c817a-goog


^ permalink raw reply related

* [PATCH V36 02/29] security: Add a "locked down" LSM hook
From: Matthew Garrett @ 2019-07-18 19:43 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Matthew Garrett,
	Matthew Garrett, Kees Cook
In-Reply-To: <20190718194415.108476-1-matthewgarrett@google.com>

Add a mechanism to allow LSMs to make a policy decision around whether
kernel functionality that would allow tampering with or examining the
runtime state of the kernel should be permitted.

Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
 include/linux/lsm_hooks.h |  2 ++
 include/linux/security.h  | 32 ++++++++++++++++++++++++++++++++
 security/security.c       |  6 ++++++
 3 files changed, 40 insertions(+)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index aebb0e032072..29c22cf40113 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1807,6 +1807,7 @@ union security_list_options {
 	int (*bpf_prog_alloc_security)(struct bpf_prog_aux *aux);
 	void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
 #endif /* CONFIG_BPF_SYSCALL */
+	int (*locked_down)(enum lockdown_reason what);
 };
 
 struct security_hook_heads {
@@ -2046,6 +2047,7 @@ struct security_hook_heads {
 	struct hlist_head bpf_prog_alloc_security;
 	struct hlist_head bpf_prog_free_security;
 #endif /* CONFIG_BPF_SYSCALL */
+	struct hlist_head locked_down;
 } __randomize_layout;
 
 /*
diff --git a/include/linux/security.h b/include/linux/security.h
index 66a2fcbe6ab0..c2b1204e8e26 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -77,6 +77,33 @@ enum lsm_event {
 	LSM_POLICY_CHANGE,
 };
 
+/*
+ * These are reasons that can be passed to the security_locked_down()
+ * LSM hook. Lockdown reasons that protect kernel integrity (ie, the
+ * ability for userland to modify kernel code) are placed before
+ * LOCKDOWN_INTEGRITY_MAX.  Lockdown reasons that protect kernel
+ * confidentiality (ie, the ability for userland to extract
+ * information from the running kernel that would otherwise be
+ * restricted) are placed before LOCKDOWN_CONFIDENTIALITY_MAX.
+ *
+ * LSM authors should note that the semantics of any given lockdown
+ * reason are not guaranteed to be stable - the same reason may block
+ * one set of features in one kernel release, and a slightly different
+ * set of features in a later kernel release. LSMs that seek to expose
+ * lockdown policy at any level of granularity other than "none",
+ * "integrity" or "confidentiality" are responsible for either
+ * ensuring that they expose a consistent level of functionality to
+ * userland, or ensuring that userland is aware that this is
+ * potentially a moving target. It is easy to misuse this information
+ * in a way that could break userspace. Please be careful not to do
+ * so.
+ */
+enum lockdown_reason {
+	LOCKDOWN_NONE,
+	LOCKDOWN_INTEGRITY_MAX,
+	LOCKDOWN_CONFIDENTIALITY_MAX,
+};
+
 /* These functions are in security/commoncap.c */
 extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
 		       int cap, unsigned int opts);
@@ -393,6 +420,7 @@ void security_inode_invalidate_secctx(struct inode *inode);
 int security_inode_notifysecctx(struct inode *inode, void *ctx, u32 ctxlen);
 int security_inode_setsecctx(struct dentry *dentry, void *ctx, u32 ctxlen);
 int security_inode_getsecctx(struct inode *inode, void **ctx, u32 *ctxlen);
+int security_locked_down(enum lockdown_reason what);
 #else /* CONFIG_SECURITY */
 
 static inline int call_blocking_lsm_notifier(enum lsm_event event, void *data)
@@ -1205,6 +1233,10 @@ static inline int security_inode_getsecctx(struct inode *inode, void **ctx, u32
 {
 	return -EOPNOTSUPP;
 }
+static inline int security_locked_down(enum lockdown_reason what)
+{
+	return 0;
+}
 #endif	/* CONFIG_SECURITY */
 
 #ifdef CONFIG_SECURITY_NETWORK
diff --git a/security/security.c b/security/security.c
index 90f1e291c800..ce6c945bf347 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2392,3 +2392,9 @@ void security_bpf_prog_free(struct bpf_prog_aux *aux)
 	call_void_hook(bpf_prog_free_security, aux);
 }
 #endif /* CONFIG_BPF_SYSCALL */
+
+int security_locked_down(enum lockdown_reason what)
+{
+	return call_int_hook(locked_down, 0, what);
+}
+EXPORT_SYMBOL(security_locked_down);
-- 
2.22.0.510.g264f2c817a-goog


^ permalink raw reply related

* [PATCH V36 03/29] security: Add a static lockdown policy LSM
From: Matthew Garrett @ 2019-07-18 19:43 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Matthew Garrett,
	Matthew Garrett, Kees Cook, David Howells
In-Reply-To: <20190718194415.108476-1-matthewgarrett@google.com>

While existing LSMs can be extended to handle lockdown policy,
distributions generally want to be able to apply a straightforward
static policy. This patch adds a simple LSM that can be configured to
reject either integrity or all lockdown queries, and can be configured
at runtime (through securityfs), boot time (via a kernel parameter) or
build time (via a kconfig option). Based on initial code by David
Howells.

Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
---
 .../admin-guide/kernel-parameters.txt         |   9 +
 include/linux/security.h                      |   3 +
 security/Kconfig                              |  11 +-
 security/Makefile                             |   2 +
 security/lockdown/Kconfig                     |  47 +++++
 security/lockdown/Makefile                    |   1 +
 security/lockdown/lockdown.c                  | 172 ++++++++++++++++++
 7 files changed, 240 insertions(+), 5 deletions(-)
 create mode 100644 security/lockdown/Kconfig
 create mode 100644 security/lockdown/Makefile
 create mode 100644 security/lockdown/lockdown.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 099c5a4be95b..95acd46fd891 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2248,6 +2248,15 @@
 	lockd.nlm_udpport=M	[NFS] Assign UDP port.
 			Format: <integer>
 
+	lockdown=	[SECURITY]
+			{ integrity | confidentiality }
+			Enable the kernel lockdown feature. If set to
+			integrity, kernel features that allow userland to
+			modify the running kernel are disabled. If set to
+			confidentiality, kernel features that allow userland
+			to extract confidential information from the kernel
+			are also disabled.
+
 	locktorture.nreaders_stress= [KNL]
 			Set the number of locking read-acquisition kthreads.
 			Defaults to being automatically set based on the
diff --git a/include/linux/security.h b/include/linux/security.h
index c2b1204e8e26..54a0532ec12f 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -97,6 +97,9 @@ enum lsm_event {
  * potentially a moving target. It is easy to misuse this information
  * in a way that could break userspace. Please be careful not to do
  * so.
+ *
+ * If you add to this, remember to extend lockdown_reasons in
+ * security/lockdown/lockdown.c.
  */
 enum lockdown_reason {
 	LOCKDOWN_NONE,
diff --git a/security/Kconfig b/security/Kconfig
index 06a30851511a..967e86fc415a 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -237,6 +237,7 @@ source "security/apparmor/Kconfig"
 source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
+source "security/lockdown/Kconfig"
 
 source "security/integrity/Kconfig"
 
@@ -276,11 +277,11 @@ endchoice
 
 config LSM
 	string "Ordered list of enabled LSMs"
-	default "yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor" if DEFAULT_SECURITY_SMACK
-	default "yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo" if DEFAULT_SECURITY_APPARMOR
-	default "yama,loadpin,safesetid,integrity,tomoyo" if DEFAULT_SECURITY_TOMOYO
-	default "yama,loadpin,safesetid,integrity" if DEFAULT_SECURITY_DAC
-	default "yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
+	default "lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor" if DEFAULT_SECURITY_SMACK
+	default "lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo" if DEFAULT_SECURITY_APPARMOR
+	default "lockdown,yama,loadpin,safesetid,integrity,tomoyo" if DEFAULT_SECURITY_TOMOYO
+	default "lockdown,yama,loadpin,safesetid,integrity" if DEFAULT_SECURITY_DAC
+	default "lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
 	help
 	  A comma-separated list of LSMs, in initialization order.
 	  Any LSMs left off this list will be ignored. This can be
diff --git a/security/Makefile b/security/Makefile
index c598b904938f..be1dd9d2cb2f 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -11,6 +11,7 @@ subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
 subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
+subdir-$(CONFIG_SECURITY_LOCKDOWN_LSM)	+= lockdown
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -27,6 +28,7 @@ obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
 obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
+obj-$(CONFIG_SECURITY_LOCKDOWN_LSM)	+= lockdown/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/lockdown/Kconfig b/security/lockdown/Kconfig
new file mode 100644
index 000000000000..7374ba76d8eb
--- /dev/null
+++ b/security/lockdown/Kconfig
@@ -0,0 +1,47 @@
+config SECURITY_LOCKDOWN_LSM
+	bool "Basic module for enforcing kernel lockdown"
+	depends on SECURITY
+	help
+	  Build support for an LSM that enforces a coarse kernel lockdown
+	  behaviour.
+
+config SECURITY_LOCKDOWN_LSM_EARLY
+	bool "Enable lockdown LSM early in init"
+	depends on SECURITY_LOCKDOWN_LSM
+	help
+	  Enable the lockdown LSM early in boot. This is necessary in order
+	  to ensure that lockdown enforcement can be carried out on kernel
+	  boot parameters that are otherwise parsed before the security
+	  subsystem is fully initialised. If enabled, lockdown will
+	  unconditionally be called before any other LSMs.
+
+choice
+	prompt "Kernel default lockdown mode"
+	default LOCK_DOWN_KERNEL_FORCE_NONE
+	depends on SECURITY_LOCKDOWN_LSM
+	help
+	  The kernel can be configured to default to differing levels of
+	  lockdown.
+
+config LOCK_DOWN_KERNEL_FORCE_NONE
+	bool "None"
+	help
+	  No lockdown functionality is enabled by default. Lockdown may be
+	  enabled via the kernel commandline or /sys/kernel/security/lockdown.
+
+config LOCK_DOWN_KERNEL_FORCE_INTEGRITY
+	bool "Integrity"
+	help
+	 The kernel runs in integrity mode by default. Features that allow
+	 the kernel to be modified at runtime are disabled.
+
+config LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY
+	bool "Confidentiality"
+	help
+	 The kernel runs in confidentiality mode by default. Features that
+	 allow the kernel to be modified at runtime or that permit userland
+	 code to read confidential material held inside the kernel are
+	 disabled.
+
+endchoice
+
diff --git a/security/lockdown/Makefile b/security/lockdown/Makefile
new file mode 100644
index 000000000000..e3634b9017e7
--- /dev/null
+++ b/security/lockdown/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown.o
diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
new file mode 100644
index 000000000000..d30c4d254b5f
--- /dev/null
+++ b/security/lockdown/lockdown.c
@@ -0,0 +1,172 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Lock down the kernel
+ *
+ * Copyright (C) 2016 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include <linux/security.h>
+#include <linux/export.h>
+#include <linux/lsm_hooks.h>
+
+static enum lockdown_reason kernel_locked_down;
+
+static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
+	[LOCKDOWN_NONE] = "none",
+	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
+	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
+};
+
+static enum lockdown_reason lockdown_levels[] = {LOCKDOWN_NONE,
+						 LOCKDOWN_INTEGRITY_MAX,
+						 LOCKDOWN_CONFIDENTIALITY_MAX};
+
+/*
+ * Put the kernel into lock-down mode.
+ */
+static int lock_kernel_down(const char *where, enum lockdown_reason level)
+{
+	if (kernel_locked_down >= level)
+		return -EPERM;
+
+	kernel_locked_down = level;
+	pr_notice("Kernel is locked down from %s; see man kernel_lockdown.7\n",
+		  where);
+	return 0;
+}
+
+static int __init lockdown_param(char *level)
+{
+	if (!level)
+		return -EINVAL;
+
+	if (strcmp(level, "integrity") == 0)
+		lock_kernel_down("command line", LOCKDOWN_INTEGRITY_MAX);
+	else if (strcmp(level, "confidentiality") == 0)
+		lock_kernel_down("command line", LOCKDOWN_CONFIDENTIALITY_MAX);
+	else
+		return -EINVAL;
+
+	return 0;
+}
+
+early_param("lockdown", lockdown_param);
+
+/**
+ * lockdown_is_locked_down - Find out if the kernel is locked down
+ * @what: Tag to use in notice generated if lockdown is in effect
+ */
+static int lockdown_is_locked_down(enum lockdown_reason what)
+{
+	if (kernel_locked_down >= what) {
+		if (lockdown_reasons[what])
+			pr_notice("Lockdown: %s is restricted; see man kernel_lockdown.7\n",
+				  lockdown_reasons[what]);
+		return -EPERM;
+	}
+
+	return 0;
+}
+
+static struct security_hook_list lockdown_hooks[] __lsm_ro_after_init = {
+	LSM_HOOK_INIT(locked_down, lockdown_is_locked_down),
+};
+
+static int __init lockdown_lsm_init(void)
+{
+#if defined(CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY)
+	lock_kernel_down("Kernel configuration", LOCKDOWN_INTEGRITY_MAX);
+#elif defined(CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY)
+	lock_kernel_down("Kernel configuration", LOCKDOWN_CONFIDENTIALITY_MAX);
+#endif
+	security_add_hooks(lockdown_hooks, ARRAY_SIZE(lockdown_hooks),
+			   "lockdown");
+	return 0;
+}
+
+static ssize_t lockdown_read(struct file *filp, char __user *buf, size_t count,
+			     loff_t *ppos)
+{
+	char temp[80];
+	int i, offset = 0;
+
+	for (i = 0; i < ARRAY_SIZE(lockdown_levels); i++) {
+		enum lockdown_reason level = lockdown_levels[i];
+
+		if (lockdown_reasons[level]) {
+			const char *label = lockdown_reasons[level];
+
+			if (kernel_locked_down == level)
+				offset += sprintf(temp+offset, "[%s] ", label);
+			else
+				offset += sprintf(temp+offset, "%s ", label);
+		}
+	}
+
+	/* Convert the last space to a newline if needed. */
+	if (offset > 0)
+		temp[offset-1] = '\n';
+
+	return simple_read_from_buffer(buf, count, ppos, temp, strlen(temp));
+}
+
+static ssize_t lockdown_write(struct file *file, const char __user *buf,
+			      size_t n, loff_t *ppos)
+{
+	char *state;
+	int i, len, err = -EINVAL;
+
+	state = memdup_user_nul(buf, n);
+	if (IS_ERR(state))
+		return PTR_ERR(state);
+
+	len = strlen(state);
+	if (len && state[len-1] == '\n') {
+		state[len-1] = '\0';
+		len--;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(lockdown_levels); i++) {
+		enum lockdown_reason level = lockdown_levels[i];
+		const char *label = lockdown_reasons[level];
+
+		if (label && !strcmp(state, label))
+			err = lock_kernel_down("securityfs", level);
+	}
+
+	kfree(state);
+	return err ? err : n;
+}
+
+static const struct file_operations lockdown_ops = {
+	.read  = lockdown_read,
+	.write = lockdown_write,
+};
+
+static int __init lockdown_secfs_init(void)
+{
+	struct dentry *dentry;
+
+	dentry = securityfs_create_file("lockdown", 0600, NULL, NULL,
+					&lockdown_ops);
+	if (IS_ERR(dentry))
+		return PTR_ERR(dentry);
+
+	return 0;
+}
+
+core_initcall(lockdown_secfs_init);
+
+#ifdef CONFIG_SECURITY_LOCKDOWN_LSM_EARLY
+DEFINE_EARLY_LSM(lockdown) = {
+#else
+DEFINE_LSM(lockdown) = {
+#endif
+	.name = "lockdown",
+	.init = lockdown_lsm_init,
+};
-- 
2.22.0.510.g264f2c817a-goog


^ permalink raw reply related

* [PATCH V36 04/29] Enforce module signatures if the kernel is locked down
From: Matthew Garrett @ 2019-07-18 19:43 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, David Howells,
	Matthew Garrett, Kees Cook, Jessica Yu
In-Reply-To: <20190718194415.108476-1-matthewgarrett@google.com>

From: David Howells <dhowells@redhat.com>

If the kernel is locked down, require that all modules have valid
signatures that we can verify.

I have adjusted the errors generated:

 (1) If there's no signature (ENODATA) or we can't check it (ENOPKG,
     ENOKEY), then:

     (a) If signatures are enforced then EKEYREJECTED is returned.

     (b) If there's no signature or we can't check it, but the kernel is
	 locked down then EPERM is returned (this is then consistent with
	 other lockdown cases).

 (2) If the signature is unparseable (EBADMSG, EINVAL), the signature fails
     the check (EKEYREJECTED) or a system error occurs (eg. ENOMEM), we
     return the error we got.

Note that the X.509 code doesn't check for key expiry as the RTC might not
be valid or might not have been transferred to the kernel's clock yet.

 [Modified by Matthew Garrett to remove the IMA integration. This will
  be replaced with integration with the IMA architecture policy
  patchset.]

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <matthewgarrett@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Jessica Yu <jeyu@kernel.org>
---
 include/linux/security.h     |  1 +
 kernel/module.c              | 37 +++++++++++++++++++++++++++++-------
 security/lockdown/lockdown.c |  1 +
 3 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index 54a0532ec12f..8e70063074a1 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -103,6 +103,7 @@ enum lsm_event {
  */
 enum lockdown_reason {
 	LOCKDOWN_NONE,
+	LOCKDOWN_MODULE_SIGNATURE,
 	LOCKDOWN_INTEGRITY_MAX,
 	LOCKDOWN_CONFIDENTIALITY_MAX,
 };
diff --git a/kernel/module.c b/kernel/module.c
index a2cee14a83f3..d8e1258e54af 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2753,8 +2753,9 @@ static inline void kmemleak_load_module(const struct module *mod,
 #ifdef CONFIG_MODULE_SIG
 static int module_sig_check(struct load_info *info, int flags)
 {
-	int err = -ENOKEY;
+	int err = -ENODATA;
 	const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
+	const char *reason;
 	const void *mod = info->hdr;
 
 	/*
@@ -2769,16 +2770,38 @@ static int module_sig_check(struct load_info *info, int flags)
 		err = mod_verify_sig(mod, info);
 	}
 
-	if (!err) {
+	switch (err) {
+	case 0:
 		info->sig_ok = true;
 		return 0;
-	}
 
-	/* Not having a signature is only an error if we're strict. */
-	if (err == -ENOKEY && !is_module_sig_enforced())
-		err = 0;
+		/* We don't permit modules to be loaded into trusted kernels
+		 * without a valid signature on them, but if we're not
+		 * enforcing, certain errors are non-fatal.
+		 */
+	case -ENODATA:
+		reason = "Loading of unsigned module";
+		goto decide;
+	case -ENOPKG:
+		reason = "Loading of module with unsupported crypto";
+		goto decide;
+	case -ENOKEY:
+		reason = "Loading of module with unavailable key";
+	decide:
+		if (is_module_sig_enforced()) {
+			pr_notice("%s is rejected\n", reason);
+			return -EKEYREJECTED;
+		}
 
-	return err;
+		return security_locked_down(LOCKDOWN_MODULE_SIGNATURE);
+
+		/* All other errors are fatal, including nomem, unparseable
+		 * signatures and signature check failures - even if signatures
+		 * aren't required.
+		 */
+	default:
+		return err;
+	}
 }
 #else /* !CONFIG_MODULE_SIG */
 static int module_sig_check(struct load_info *info, int flags)
diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
index d30c4d254b5f..2c53fd9f5c9b 100644
--- a/security/lockdown/lockdown.c
+++ b/security/lockdown/lockdown.c
@@ -18,6 +18,7 @@ static enum lockdown_reason kernel_locked_down;
 
 static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
 	[LOCKDOWN_NONE] = "none",
+	[LOCKDOWN_MODULE_SIGNATURE] = "unsigned module loading",
 	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
 	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
 };
-- 
2.22.0.510.g264f2c817a-goog


^ permalink raw reply related

* [PATCH V36 08/29] kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE
From: Matthew Garrett @ 2019-07-18 19:43 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Jiri Bohac,
	David Howells, Matthew Garrett, Dave Young, kexec
In-Reply-To: <20190718194415.108476-1-matthewgarrett@google.com>

From: Jiri Bohac <jbohac@suse.cz>

This is a preparatory patch for kexec_file_load() lockdown.  A locked down
kernel needs to prevent unsigned kernel images from being loaded with
kexec_file_load().  Currently, the only way to force the signature
verification is compiling with KEXEC_VERIFY_SIG.  This prevents loading
usigned images even when the kernel is not locked down at runtime.

This patch splits KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE.
Analogous to the MODULE_SIG and MODULE_SIG_FORCE for modules, KEXEC_SIG
turns on the signature verification but allows unsigned images to be
loaded.  KEXEC_SIG_FORCE disallows images without a valid signature.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Jiri Bohac <jbohac@suse.cz>
Reviewed-by: Dave Young <dyoung@redhat.com>
cc: kexec@lists.infradead.org
---
 arch/x86/Kconfig                       | 20 +++++++++----
 crypto/asymmetric_keys/verify_pefile.c |  4 ++-
 include/linux/kexec.h                  |  4 +--
 kernel/kexec_file.c                    | 41 ++++++++++++++++++++++----
 4 files changed, 55 insertions(+), 14 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9df2d1cb7a9e..104995fd32d0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2026,20 +2026,30 @@ config KEXEC_FILE
 config ARCH_HAS_KEXEC_PURGATORY
 	def_bool KEXEC_FILE
 
-config KEXEC_VERIFY_SIG
+config KEXEC_SIG
 	bool "Verify kernel signature during kexec_file_load() syscall"
 	depends on KEXEC_FILE
 	---help---
-	  This option makes kernel signature verification mandatory for
-	  the kexec_file_load() syscall.
 
-	  In addition to that option, you need to enable signature
+	  This option makes the kexec_file_load() syscall check for a valid
+	  signature of the kernel image.  The image can still be loaded without
+	  a valid signature unless you also enable KEXEC_SIG_FORCE, though if
+	  there's a signature that we can check, then it must be valid.
+
+	  In addition to this option, you need to enable signature
 	  verification for the corresponding kernel image type being
 	  loaded in order for this to work.
 
+config KEXEC_SIG_FORCE
+	bool "Require a valid signature in kexec_file_load() syscall"
+	depends on KEXEC_SIG
+	---help---
+	  This option makes kernel signature verification mandatory for
+	  the kexec_file_load() syscall.
+
 config KEXEC_BZIMAGE_VERIFY_SIG
 	bool "Enable bzImage signature verification support"
-	depends on KEXEC_VERIFY_SIG
+	depends on KEXEC_SIG
 	depends on SIGNED_PE_FILE_VERIFICATION
 	select SYSTEM_TRUSTED_KEYRING
 	---help---
diff --git a/crypto/asymmetric_keys/verify_pefile.c b/crypto/asymmetric_keys/verify_pefile.c
index 3b303fe2f061..cc9dbcecaaca 100644
--- a/crypto/asymmetric_keys/verify_pefile.c
+++ b/crypto/asymmetric_keys/verify_pefile.c
@@ -96,7 +96,7 @@ static int pefile_parse_binary(const void *pebuf, unsigned int pelen,
 
 	if (!ddir->certs.virtual_address || !ddir->certs.size) {
 		pr_debug("Unsigned PE binary\n");
-		return -EKEYREJECTED;
+		return -ENODATA;
 	}
 
 	chkaddr(ctx->header_size, ddir->certs.virtual_address,
@@ -403,6 +403,8 @@ static int pefile_digest_pe(const void *pebuf, unsigned int pelen,
  *  (*) 0 if at least one signature chain intersects with the keys in the trust
  *	keyring, or:
  *
+ *  (*) -ENODATA if there is no signature present.
+ *
  *  (*) -ENOPKG if a suitable crypto module couldn't be found for a check on a
  *	chain.
  *
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index b9b1bc5f9669..58b27c7bdc2b 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -125,7 +125,7 @@ typedef void *(kexec_load_t)(struct kimage *image, char *kernel_buf,
 			     unsigned long cmdline_len);
 typedef int (kexec_cleanup_t)(void *loader_data);
 
-#ifdef CONFIG_KEXEC_VERIFY_SIG
+#ifdef CONFIG_KEXEC_SIG
 typedef int (kexec_verify_sig_t)(const char *kernel_buf,
 				 unsigned long kernel_len);
 #endif
@@ -134,7 +134,7 @@ struct kexec_file_ops {
 	kexec_probe_t *probe;
 	kexec_load_t *load;
 	kexec_cleanup_t *cleanup;
-#ifdef CONFIG_KEXEC_VERIFY_SIG
+#ifdef CONFIG_KEXEC_SIG
 	kexec_verify_sig_t *verify_sig;
 #endif
 };
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index b8cc032d5620..875482c34154 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -88,7 +88,7 @@ int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
 	return kexec_image_post_load_cleanup_default(image);
 }
 
-#ifdef CONFIG_KEXEC_VERIFY_SIG
+#ifdef CONFIG_KEXEC_SIG
 static int kexec_image_verify_sig_default(struct kimage *image, void *buf,
 					  unsigned long buf_len)
 {
@@ -186,7 +186,8 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
 			     const char __user *cmdline_ptr,
 			     unsigned long cmdline_len, unsigned flags)
 {
-	int ret = 0;
+	const char *reason;
+	int ret;
 	void *ldata;
 	loff_t size;
 
@@ -202,14 +203,42 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
 	if (ret)
 		goto out;
 
-#ifdef CONFIG_KEXEC_VERIFY_SIG
+#ifdef CONFIG_KEXEC_SIG
 	ret = arch_kexec_kernel_verify_sig(image, image->kernel_buf,
 					   image->kernel_buf_len);
-	if (ret) {
-		pr_debug("kernel signature verification failed.\n");
+	switch (ret) {
+	case 0:
+		break;
+
+		/* Certain verification errors are non-fatal if we're not
+		 * checking errors, provided we aren't mandating that there
+		 * must be a valid signature.
+		 */
+	case -ENODATA:
+		reason = "kexec of unsigned image";
+		goto decide;
+	case -ENOPKG:
+		reason = "kexec of image with unsupported crypto";
+		goto decide;
+	case -ENOKEY:
+		reason = "kexec of image with unavailable key";
+	decide:
+		if (IS_ENABLED(CONFIG_KEXEC_SIG_FORCE)) {
+			pr_notice("%s rejected\n", reason);
+			goto out;
+		}
+
+		ret = 0;
+		break;
+
+		/* All other errors are fatal, including nomem, unparseable
+		 * signatures and signature check failures - even if signatures
+		 * aren't required.
+		 */
+	default:
+		pr_notice("kernel signature verification failed (%d).\n", ret);
 		goto out;
 	}
-	pr_debug("kernel signature verification successful.\n");
 #endif
 	/* It is possible that there no initramfs is being loaded */
 	if (!(flags & KEXEC_FILE_NO_INITRAMFS)) {
-- 
2.22.0.510.g264f2c817a-goog


^ permalink raw reply related

* [PATCH V36 05/29] Restrict /dev/{mem,kmem,port} when the kernel is locked down
From: Matthew Garrett @ 2019-07-18 19:43 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Matthew Garrett,
	David Howells, Matthew Garrett, Kees Cook, x86
In-Reply-To: <20190718194415.108476-1-matthewgarrett@google.com>

From: Matthew Garrett <mjg59@srcf.ucam.org>

Allowing users to read and write to core kernel memory makes it possible
for the kernel to be subverted, avoiding module loading restrictions, and
also to steal cryptographic information.

Disallow /dev/mem and /dev/kmem from being opened this when the kernel has
been locked down to prevent this.

Also disallow /dev/port from being opened to prevent raw ioport access and
thus DMA from being used to accomplish the same thing.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: x86@kernel.org
---
 drivers/char/mem.c           | 7 +++++--
 include/linux/security.h     | 1 +
 security/lockdown/lockdown.c | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index b08dc50f9f26..d0148aee1aab 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -29,8 +29,8 @@
 #include <linux/export.h>
 #include <linux/io.h>
 #include <linux/uio.h>
-
 #include <linux/uaccess.h>
+#include <linux/security.h>
 
 #ifdef CONFIG_IA64
 # include <linux/efi.h>
@@ -786,7 +786,10 @@ static loff_t memory_lseek(struct file *file, loff_t offset, int orig)
 
 static int open_port(struct inode *inode, struct file *filp)
 {
-	return capable(CAP_SYS_RAWIO) ? 0 : -EPERM;
+	if (!capable(CAP_SYS_RAWIO))
+		return -EPERM;
+
+	return security_locked_down(LOCKDOWN_DEV_MEM);
 }
 
 #define zero_lseek	null_lseek
diff --git a/include/linux/security.h b/include/linux/security.h
index 8e70063074a1..9458152601b5 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -104,6 +104,7 @@ enum lsm_event {
 enum lockdown_reason {
 	LOCKDOWN_NONE,
 	LOCKDOWN_MODULE_SIGNATURE,
+	LOCKDOWN_DEV_MEM,
 	LOCKDOWN_INTEGRITY_MAX,
 	LOCKDOWN_CONFIDENTIALITY_MAX,
 };
diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
index 2c53fd9f5c9b..d2ef29d9f0b2 100644
--- a/security/lockdown/lockdown.c
+++ b/security/lockdown/lockdown.c
@@ -19,6 +19,7 @@ static enum lockdown_reason kernel_locked_down;
 static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
 	[LOCKDOWN_NONE] = "none",
 	[LOCKDOWN_MODULE_SIGNATURE] = "unsigned module loading",
+	[LOCKDOWN_DEV_MEM] = "/dev/mem,kmem,port",
 	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
 	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
 };
-- 
2.22.0.510.g264f2c817a-goog


^ permalink raw reply related

* [PATCH V36 07/29] Copy secure_boot flag in boot params across kexec reboot
From: Matthew Garrett @ 2019-07-18 19:43 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Dave Young,
	David Howells, Matthew Garrett, Kees Cook, kexec
In-Reply-To: <20190718194415.108476-1-matthewgarrett@google.com>

From: Dave Young <dyoung@redhat.com>

Kexec reboot in case secure boot being enabled does not keep the secure
boot mode in new kernel, so later one can load unsigned kernel via legacy
kexec_load.  In this state, the system is missing the protections provided
by secure boot.

Adding a patch to fix this by retain the secure_boot flag in original
kernel.

secure_boot flag in boot_params is set in EFI stub, but kexec bypasses the
stub.  Fixing this issue by copying secure_boot flag across kexec reboot.

Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
cc: kexec@lists.infradead.org
---
 arch/x86/kernel/kexec-bzimage64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index 5ebcd02cbca7..d2f4e706a428 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -180,6 +180,7 @@ setup_efi_state(struct boot_params *params, unsigned long params_load_addr,
 	if (efi_enabled(EFI_OLD_MEMMAP))
 		return 0;
 
+	params->secure_boot = boot_params.secure_boot;
 	ei->efi_loader_signature = current_ei->efi_loader_signature;
 	ei->efi_systab = current_ei->efi_systab;
 	ei->efi_systab_hi = current_ei->efi_systab_hi;
-- 
2.22.0.510.g264f2c817a-goog


^ permalink raw reply related

* [PATCH V36 11/29] PCI: Lock down BAR access when the kernel is locked down
From: Matthew Garrett @ 2019-07-18 19:43 UTC (permalink / raw)
  To: jmorris
  Cc: linux-security-module, linux-kernel, linux-api, Matthew Garrett,
	David Howells, Matthew Garrett, Bjorn Helgaas, Kees Cook,
	linux-pci
In-Reply-To: <20190718194415.108476-1-matthewgarrett@google.com>

From: Matthew Garrett <mjg59@srcf.ucam.org>

Any hardware that can potentially generate DMA has to be locked down in
order to avoid it being possible for an attacker to modify kernel code,
allowing them to circumvent disabled module loading or module signing.
Default to paranoid - in future we can potentially relax this for
sufficiently IOMMU-isolated devices.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
cc: linux-pci@vger.kernel.org
---
 drivers/pci/pci-sysfs.c      | 16 ++++++++++++++++
 drivers/pci/proc.c           | 14 ++++++++++++--
 drivers/pci/syscall.c        |  4 +++-
 include/linux/security.h     |  1 +
 security/lockdown/lockdown.c |  1 +
 5 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 6d27475e39b2..ec103a7e13fc 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -903,6 +903,11 @@ static ssize_t pci_write_config(struct file *filp, struct kobject *kobj,
 	unsigned int size = count;
 	loff_t init_off = off;
 	u8 *data = (u8 *) buf;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+	if (ret)
+		return ret;
 
 	if (off > dev->cfg_size)
 		return 0;
@@ -1164,6 +1169,11 @@ static int pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
 	int bar = (unsigned long)attr->private;
 	enum pci_mmap_state mmap_type;
 	struct resource *res = &pdev->resource[bar];
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+	if (ret)
+		return ret;
 
 	if (res->flags & IORESOURCE_MEM && iomem_is_exclusive(res->start))
 		return -EINVAL;
@@ -1240,6 +1250,12 @@ static ssize_t pci_write_resource_io(struct file *filp, struct kobject *kobj,
 				     struct bin_attribute *attr, char *buf,
 				     loff_t off, size_t count)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+	if (ret)
+		return ret;
+
 	return pci_resource_io(filp, kobj, attr, buf, off, count, true);
 }
 
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 445b51db75b0..e29b0d5ced62 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -13,6 +13,7 @@
 #include <linux/seq_file.h>
 #include <linux/capability.h>
 #include <linux/uaccess.h>
+#include <linux/security.h>
 #include <asm/byteorder.h>
 #include "pci.h"
 
@@ -115,7 +116,11 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf,
 	struct pci_dev *dev = PDE_DATA(ino);
 	int pos = *ppos;
 	int size = dev->cfg_size;
-	int cnt;
+	int cnt, ret;
+
+	ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+	if (ret)
+		return ret;
 
 	if (pos >= size)
 		return 0;
@@ -196,6 +201,10 @@ static long proc_bus_pci_ioctl(struct file *file, unsigned int cmd,
 #endif /* HAVE_PCI_MMAP */
 	int ret = 0;
 
+	ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+	if (ret)
+		return ret;
+
 	switch (cmd) {
 	case PCIIOC_CONTROLLER:
 		ret = pci_domain_nr(dev->bus);
@@ -238,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 	struct pci_filp_private *fpriv = file->private_data;
 	int i, ret, write_combine = 0, res_bit = IORESOURCE_MEM;
 
-	if (!capable(CAP_SYS_RAWIO))
+	if (!capable(CAP_SYS_RAWIO) ||
+	    security_locked_down(LOCKDOWN_PCI_ACCESS))
 		return -EPERM;
 
 	if (fpriv->mmap_state == pci_mmap_io) {
diff --git a/drivers/pci/syscall.c b/drivers/pci/syscall.c
index d96626c614f5..31e39558d49d 100644
--- a/drivers/pci/syscall.c
+++ b/drivers/pci/syscall.c
@@ -7,6 +7,7 @@
 
 #include <linux/errno.h>
 #include <linux/pci.h>
+#include <linux/security.h>
 #include <linux/syscalls.h>
 #include <linux/uaccess.h>
 #include "pci.h"
@@ -90,7 +91,8 @@ SYSCALL_DEFINE5(pciconfig_write, unsigned long, bus, unsigned long, dfn,
 	u32 dword;
 	int err = 0;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN) ||
+	    security_locked_down(LOCKDOWN_PCI_ACCESS))
 		return -EPERM;
 
 	dev = pci_get_domain_bus_and_slot(0, bus, dfn);
diff --git a/include/linux/security.h b/include/linux/security.h
index 304a155a5628..8adbd62b7669 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -107,6 +107,7 @@ enum lockdown_reason {
 	LOCKDOWN_DEV_MEM,
 	LOCKDOWN_KEXEC,
 	LOCKDOWN_HIBERNATION,
+	LOCKDOWN_PCI_ACCESS,
 	LOCKDOWN_INTEGRITY_MAX,
 	LOCKDOWN_CONFIDENTIALITY_MAX,
 };
diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
index a0996f75629f..655fe388e615 100644
--- a/security/lockdown/lockdown.c
+++ b/security/lockdown/lockdown.c
@@ -22,6 +22,7 @@ static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
 	[LOCKDOWN_DEV_MEM] = "/dev/mem,kmem,port",
 	[LOCKDOWN_KEXEC] = "kexec of unsigned images",
 	[LOCKDOWN_HIBERNATION] = "hibernation",
+	[LOCKDOWN_PCI_ACCESS] = "direct PCI access",
 	[LOCKDOWN_INTEGRITY_MAX] = "integrity",
 	[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
 };
-- 
2.22.0.510.g264f2c817a-goog


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox