Linux Security Modules development
 help / color / mirror / Atom feed
* [PATCH 1/7 v2] tracefs: Revert ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down")
From: Steven Rostedt @ 2019-10-12  0:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Matthew Garrett,
	James Morris James Morris, LSM List, Linux API, Ben Hutchings,
	Al Viro
In-Reply-To: <20191012005747.210722465@goodmis.org>

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Running the latest kernel through my "make instances" stress tests, I
triggered the following bug (with KASAN and kmemleak enabled):

mkdir invoked oom-killer:
gfp_mask=0x40cd0(GFP_KERNEL|__GFP_COMP|__GFP_RECLAIMABLE), order=0,
oom_score_adj=0
CPU: 1 PID: 2229 Comm: mkdir Not tainted 5.4.0-rc2-test #325
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
Call Trace:
 dump_stack+0x64/0x8c
 dump_header+0x43/0x3b7
 ? trace_hardirqs_on+0x48/0x4a
 oom_kill_process+0x68/0x2d5
 out_of_memory+0x2aa/0x2d0
 __alloc_pages_nodemask+0x96d/0xb67
 __alloc_pages_node+0x19/0x1e
 alloc_slab_page+0x17/0x45
 new_slab+0xd0/0x234
 ___slab_alloc.constprop.86+0x18f/0x336
 ? alloc_inode+0x2c/0x74
 ? irq_trace+0x12/0x1e
 ? tracer_hardirqs_off+0x1d/0xd7
 ? __slab_alloc.constprop.85+0x21/0x53
 __slab_alloc.constprop.85+0x31/0x53
 ? __slab_alloc.constprop.85+0x31/0x53
 ? alloc_inode+0x2c/0x74
 kmem_cache_alloc+0x50/0x179
 ? alloc_inode+0x2c/0x74
 alloc_inode+0x2c/0x74
 new_inode_pseudo+0xf/0x48
 new_inode+0x15/0x25
 tracefs_get_inode+0x23/0x7c
 ? lookup_one_len+0x54/0x6c
 tracefs_create_file+0x53/0x11d
 trace_create_file+0x15/0x33
 event_create_dir+0x2a3/0x34b
 __trace_add_new_event+0x1c/0x26
 event_trace_add_tracer+0x56/0x86
 trace_array_create+0x13e/0x1e1
 instance_mkdir+0x8/0x17
 tracefs_syscall_mkdir+0x39/0x50
 ? get_dname+0x31/0x31
 vfs_mkdir+0x78/0xa3
 do_mkdirat+0x71/0xb0
 sys_mkdir+0x19/0x1b
 do_fast_syscall_32+0xb0/0xed

I bisected this down to the addition of the proxy_ops into tracefs for
lockdown. It appears that the allocation of the proxy_ops and then freeing
it in the destroy_inode callback, is causing havoc with the memory system.
Reading the documentation about destroy_inode and talking with Linus about
this, this is buggy and wrong.

Instead of allocating the proxy_ops (and then having to free it) the checks
should be done by the open functions themselves, and not hack into the
tracefs directory. First revert the tracefs updates for locked_down and then
later we can add the locked_down checks in the kernel/trace files.

Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home

Fixes: ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 fs/tracefs/inode.c | 42 +-----------------------------------------
 1 file changed, 1 insertion(+), 41 deletions(-)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 9fc14e38927f..eeeae0475da9 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -20,7 +20,6 @@
 #include <linux/parser.h>
 #include <linux/magic.h>
 #include <linux/slab.h>
-#include <linux/security.h>
 
 #define TRACEFS_DEFAULT_MODE	0700
 
@@ -28,25 +27,6 @@ static struct vfsmount *tracefs_mount;
 static int tracefs_mount_count;
 static bool tracefs_registered;
 
-static int default_open_file(struct inode *inode, struct file *filp)
-{
-	struct dentry *dentry = filp->f_path.dentry;
-	struct file_operations *real_fops;
-	int ret;
-
-	if (!dentry)
-		return -EINVAL;
-
-	ret = security_locked_down(LOCKDOWN_TRACEFS);
-	if (ret)
-		return ret;
-
-	real_fops = dentry->d_fsdata;
-	if (!real_fops->open)
-		return 0;
-	return real_fops->open(inode, filp);
-}
-
 static ssize_t default_read_file(struct file *file, char __user *buf,
 				 size_t count, loff_t *ppos)
 {
@@ -241,12 +221,6 @@ static int tracefs_apply_options(struct super_block *sb)
 	return 0;
 }
 
-static void tracefs_destroy_inode(struct inode *inode)
-{
-	if (S_ISREG(inode->i_mode))
-		kfree(inode->i_fop);
-}
-
 static int tracefs_remount(struct super_block *sb, int *flags, char *data)
 {
 	int err;
@@ -283,7 +257,6 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
 static const struct super_operations tracefs_super_operations = {
 	.statfs		= simple_statfs,
 	.remount_fs	= tracefs_remount,
-	.destroy_inode  = tracefs_destroy_inode,
 	.show_options	= tracefs_show_options,
 };
 
@@ -414,7 +387,6 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 				   struct dentry *parent, void *data,
 				   const struct file_operations *fops)
 {
-	struct file_operations *proxy_fops;
 	struct dentry *dentry;
 	struct inode *inode;
 
@@ -430,20 +402,8 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 	if (unlikely(!inode))
 		return failed_creating(dentry);
 
-	proxy_fops = kzalloc(sizeof(struct file_operations), GFP_KERNEL);
-	if (unlikely(!proxy_fops)) {
-		iput(inode);
-		return failed_creating(dentry);
-	}
-
-	if (!fops)
-		fops = &tracefs_file_operations;
-
-	dentry->d_fsdata = (void *)fops;
-	memcpy(proxy_fops, fops, sizeof(*proxy_fops));
-	proxy_fops->open = default_open_file;
 	inode->i_mode = mode;
-	inode->i_fop = proxy_fops;
+	inode->i_fop = fops ? fops : &tracefs_file_operations;
 	inode->i_private = data;
 	d_instantiate(dentry, inode);
 	fsnotify_create(dentry->d_parent->d_inode, dentry);
-- 
2.23.0



^ permalink raw reply related

* [PATCH 5/7 v2] tracing: Add tracing_check_open_get_tr()
From: Steven Rostedt @ 2019-10-12  0:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Matthew Garrett,
	James Morris James Morris, LSM List, Linux API, Ben Hutchings,
	Al Viro
In-Reply-To: <20191012005747.210722465@goodmis.org>

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Currently, most files in the tracefs directory that gets opened needs to
test if tracing_disabled is set. If so, it should return -ENODEV. The
tracing_disabled is called when tracing is found to be broken. Originally it
was done in case the ring buffer was found to be corrupted, and we wanted to
prevent reading it from crashing the kernel. But it's also called if a
tracing selftest fails on boot. It's a one way switch. That is, once it is
triggered, tracing is disabled until reboot.

As most tracefs files can also be used by instances in the tracefs
directory, they need to be carefully done. Each instance has a trace_array
associated to it, and when the instance is removed, the trace_array is
freed. But if an instance is opened with a reference to the trace_array, the
getting the trace_array has to be done by a lookup (as there could be a race
with it being deleted and the open itself), and then once it is found, a
reference is added to prevent the instance from being removed (and the
trace_array associated with it freed).

Combine the two checks (tracing_disabled and trace_array_get()) into a
single helper function. This will also make it easier to add lockdown to
tracefs later.

Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c            |   7 +-
 kernel/trace/trace.c             | 117 +++++++++++++++++--------------
 kernel/trace/trace.h             |   1 +
 kernel/trace/trace_dynevent.c    |   4 ++
 kernel/trace/trace_events.c      |  10 +--
 kernel/trace/trace_events_hist.c |   2 +-
 6 files changed, 81 insertions(+), 60 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 32c2eb167de0..8b765a55e01c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3547,7 +3547,7 @@ ftrace_regex_open(struct ftrace_ops *ops, int flag,
 	if (unlikely(ftrace_disabled))
 		return -ENODEV;
 
-	if (tr && trace_array_get(tr) < 0)
+	if (tracing_check_open_get_tr(tr))
 		return -ENODEV;
 
 	iter = kzalloc(sizeof(*iter), GFP_KERNEL);
@@ -6546,8 +6546,9 @@ ftrace_pid_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret = 0;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	if ((file->f_mode & FMODE_WRITE) &&
 	    (file->f_flags & O_TRUNC))
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 94f1b9124939..26ee280f852b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -304,6 +304,17 @@ void trace_array_put(struct trace_array *this_tr)
 	mutex_unlock(&trace_types_lock);
 }
 
+int tracing_check_open_get_tr(struct trace_array *tr)
+{
+	if (tracing_disabled)
+		return -ENODEV;
+
+	if (tr && trace_array_get(tr) < 0)
+		return -ENODEV;
+
+	return 0;
+}
+
 int call_filter_check_discard(struct trace_event_call *call, void *rec,
 			      struct ring_buffer *buffer,
 			      struct ring_buffer_event *event)
@@ -4140,8 +4151,11 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
 
 int tracing_open_generic(struct inode *inode, struct file *filp)
 {
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
+
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	filp->private_data = inode->i_private;
 	return 0;
@@ -4159,12 +4173,11 @@ bool tracing_is_disabled(void)
 int tracing_open_generic_tr(struct inode *inode, struct file *filp)
 {
 	struct trace_array *tr = inode->i_private;
+	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	filp->private_data = inode->i_private;
 
@@ -4233,10 +4246,11 @@ static int tracing_open(struct inode *inode, struct file *file)
 {
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
-	int ret = 0;
+	int ret;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	/* If this file was open for write, then erase contents */
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
@@ -4352,11 +4366,9 @@ static int show_traces_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	ret = seq_open(file, &show_traces_seq_ops);
 	if (ret) {
@@ -4710,11 +4722,9 @@ static int tracing_trace_options_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	ret = single_open(file, tracing_trace_options_show, inode->i_private);
 	if (ret < 0)
@@ -5051,8 +5061,11 @@ static const struct seq_operations tracing_saved_tgids_seq_ops = {
 
 static int tracing_saved_tgids_open(struct inode *inode, struct file *filp)
 {
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
+
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	return seq_open(filp, &tracing_saved_tgids_seq_ops);
 }
@@ -5128,8 +5141,11 @@ static const struct seq_operations tracing_saved_cmdlines_seq_ops = {
 
 static int tracing_saved_cmdlines_open(struct inode *inode, struct file *filp)
 {
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
+
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	return seq_open(filp, &tracing_saved_cmdlines_seq_ops);
 }
@@ -5293,8 +5309,11 @@ static const struct seq_operations tracing_eval_map_seq_ops = {
 
 static int tracing_eval_map_open(struct inode *inode, struct file *filp)
 {
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
+
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	return seq_open(filp, &tracing_eval_map_seq_ops);
 }
@@ -5817,13 +5836,11 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
 {
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
-	int ret = 0;
-
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	mutex_lock(&trace_types_lock);
 
@@ -6560,11 +6577,9 @@ static int tracing_clock_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr))
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	ret = single_open(file, tracing_clock_show, inode->i_private);
 	if (ret < 0)
@@ -6594,11 +6609,9 @@ static int tracing_time_stamp_mode_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr))
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	ret = single_open(file, tracing_time_stamp_mode_show, inode->i_private);
 	if (ret < 0)
@@ -6651,10 +6664,11 @@ static int tracing_snapshot_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
 	struct seq_file *m;
-	int ret = 0;
+	int ret;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	if (file->f_mode & FMODE_READ) {
 		iter = __tracing_open(inode, file, true);
@@ -7118,8 +7132,9 @@ static int tracing_err_log_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret = 0;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	/* If this file was opened for write, then erase contents */
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC))
@@ -7170,11 +7185,9 @@ static int tracing_buffers_open(struct inode *inode, struct file *filp)
 	struct ftrace_buffer_info *info;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	info = kzalloc(sizeof(*info), GFP_KERNEL);
 	if (!info) {
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 854dbf4050f8..d685c61085c0 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -338,6 +338,7 @@ extern struct mutex trace_types_lock;
 
 extern int trace_array_get(struct trace_array *tr);
 extern void trace_array_put(struct trace_array *tr);
+extern int tracing_check_open_get_tr(struct trace_array *tr);
 
 extern int tracing_set_time_stamp_abs(struct trace_array *tr, bool abs);
 extern int tracing_set_clock(struct trace_array *tr, const char *clockstr);
diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
index a41fed46c285..89779eb84a07 100644
--- a/kernel/trace/trace_dynevent.c
+++ b/kernel/trace/trace_dynevent.c
@@ -174,6 +174,10 @@ static int dyn_event_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(NULL);
 		if (ret < 0)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 0d29f2f13477..ae35d6cb802a 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1784,8 +1784,9 @@ ftrace_event_set_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	if ((file->f_mode & FMODE_WRITE) &&
 	    (file->f_flags & O_TRUNC))
@@ -1804,8 +1805,9 @@ ftrace_event_set_pid_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	if ((file->f_mode & FMODE_WRITE) &&
 	    (file->f_flags & O_TRUNC))
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 9468bd8d44a2..dd18d76bf1bd 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1680,7 +1680,7 @@ static int save_hist_vars(struct hist_trigger_data *hist_data)
 	if (var_data)
 		return 0;
 
-	if (trace_array_get(tr) < 0)
+	if (tracing_check_open_get_tr(tr))
 		return -ENODEV;
 
 	var_data = kzalloc(sizeof(*var_data), GFP_KERNEL);
-- 
2.23.0



^ permalink raw reply related

* [PATCH 6/7 v2] tracing: Add some more locked_down checks
From: Steven Rostedt @ 2019-10-12  0:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Matthew Garrett,
	James Morris James Morris, LSM List, Linux API, Ben Hutchings,
	Al Viro
In-Reply-To: <20191012005747.210722465@goodmis.org>

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Added various checks on open tracefs calls to see if tracefs is in lockdown
mode, and if so, to return -EPERM.

Note, the event format files (which are basically standard on all machines)
as well as the enabled_functions file (which shows what is currently being
traced) are not lockde down. Perhaps they should be, but it seems counter
intuitive to lockdown information to help you know if the system has been
modified.

Link: http://lkml.kernel.org/r/CAHk-=wj7fGPKUspr579Cii-w_y60PtRaiDgKuxVtBAMK0VNNkA@mail.gmail.com

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c               | 23 ++++++++++++++++++++++-
 kernel/trace/trace.c                |  8 ++++++++
 kernel/trace/trace_events.c         |  8 ++++++++
 kernel/trace/trace_events_hist.c    | 11 +++++++++++
 kernel/trace/trace_events_trigger.c |  8 +++++++-
 kernel/trace/trace_kprobe.c         | 12 +++++++++++-
 kernel/trace/trace_printk.c         |  7 +++++++
 kernel/trace/trace_stack.c          |  8 ++++++++
 kernel/trace/trace_stat.c           |  6 +++++-
 kernel/trace/trace_uprobe.c         | 11 +++++++++++
 10 files changed, 98 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 8b765a55e01c..f296d89be757 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -18,6 +18,7 @@
 #include <linux/clocksource.h>
 #include <linux/sched/task.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/seq_file.h>
 #include <linux/tracefs.h>
 #include <linux/hardirq.h>
@@ -3486,6 +3487,11 @@ static int
 ftrace_avail_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_iterator *iter;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	if (unlikely(ftrace_disabled))
 		return -ENODEV;
@@ -3505,6 +3511,15 @@ ftrace_enabled_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_iterator *iter;
 
+	/*
+	 * This shows us what functions are currently being
+	 * traced and by what. Not sure if we want lockdown
+	 * to hide such critical information for an admin.
+	 * Although, perhaps it can show information we don't
+	 * want people to see, but if something is tracing
+	 * something, we probably want to know about it.
+	 */
+
 	iter = __seq_open_private(file, &show_ftrace_seq_ops, sizeof(*iter));
 	if (!iter)
 		return -ENOMEM;
@@ -3625,6 +3640,7 @@ ftrace_filter_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops,
 			FTRACE_ITER_FILTER | FTRACE_ITER_DO_PROBES,
 			inode, file);
@@ -3635,6 +3651,7 @@ ftrace_notrace_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops, FTRACE_ITER_NOTRACE,
 				 inode, file);
 }
@@ -5203,9 +5220,13 @@ static int
 __ftrace_graph_open(struct inode *inode, struct file *file,
 		    struct ftrace_graph_data *fgd)
 {
-	int ret = 0;
+	int ret;
 	struct ftrace_hash *new_hash = NULL;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (file->f_mode & FMODE_WRITE) {
 		const int size_bits = FTRACE_HASH_DEFAULT_BITS;
 
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 26ee280f852b..2b4eff383505 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -17,6 +17,7 @@
 #include <linux/stacktrace.h>
 #include <linux/writeback.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/seq_file.h>
 #include <linux/notifier.h>
 #include <linux/irqflags.h>
@@ -306,6 +307,12 @@ void trace_array_put(struct trace_array *this_tr)
 
 int tracing_check_open_get_tr(struct trace_array *tr)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -6813,6 +6820,7 @@ static int snapshot_raw_open(struct inode *inode, struct file *filp)
 	struct ftrace_buffer_info *info;
 	int ret;
 
+	/* The following checks for tracefs lockdown */
 	ret = tracing_buffers_open(inode, filp);
 	if (ret < 0)
 		return ret;
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index ae35d6cb802a..994b7a408c5f 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -12,6 +12,7 @@
 #define pr_fmt(fmt) fmt
 
 #include <linux/workqueue.h>
+#include <linux/security.h>
 #include <linux/spinlock.h>
 #include <linux/kthread.h>
 #include <linux/tracefs.h>
@@ -1294,6 +1295,8 @@ static int trace_format_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret;
 
+	/* Do we want to hide event format files on tracefs lockdown? */
+
 	ret = seq_open(file, &trace_format_seq_ops);
 	if (ret < 0)
 		return ret;
@@ -1750,6 +1753,10 @@ ftrace_event_open(struct inode *inode, struct file *file,
 	struct seq_file *m;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	ret = seq_open(file, seq_ops);
 	if (ret < 0)
 		return ret;
@@ -1774,6 +1781,7 @@ ftrace_event_avail_open(struct inode *inode, struct file *file)
 {
 	const struct seq_operations *seq_ops = &show_event_seq_ops;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_event_open(inode, file, seq_ops);
 }
 
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index dd18d76bf1bd..57648c5aa679 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -7,6 +7,7 @@
 
 #include <linux/module.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/stacktrace.h>
@@ -1448,6 +1449,10 @@ static int synth_events_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&synth_event_ops);
 		if (ret < 0)
@@ -5515,6 +5520,12 @@ static int hist_show(struct seq_file *m, void *v)
 
 static int event_hist_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return single_open(file, hist_show, file);
 }
 
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 2a2912cb4533..2cd53ca21b51 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2013 Tom Zanussi <tom.zanussi@linux.intel.com>
  */
 
+#include <linux/security.h>
 #include <linux/module.h>
 #include <linux/ctype.h>
 #include <linux/mutex.h>
@@ -173,7 +174,11 @@ static const struct seq_operations event_triggers_seq_ops = {
 
 static int event_trigger_regex_open(struct inode *inode, struct file *file)
 {
-	int ret = 0;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	mutex_lock(&event_mutex);
 
@@ -292,6 +297,7 @@ event_trigger_write(struct file *filp, const char __user *ubuf,
 static int
 event_trigger_open(struct inode *inode, struct file *filp)
 {
+	/* Checks for tracefs lockdown */
 	return event_trigger_regex_open(inode, filp);
 }
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 324ffbea3556..1552a95c743b 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -7,11 +7,11 @@
  */
 #define pr_fmt(fmt)	"trace_kprobe: " fmt
 
+#include <linux/security.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
 #include <linux/rculist.h>
 #include <linux/error-injection.h>
-#include <linux/security.h>
 
 #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
 
@@ -936,6 +936,10 @@ static int probes_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&trace_kprobe_ops);
 		if (ret < 0)
@@ -988,6 +992,12 @@ static const struct seq_operations profile_seq_op = {
 
 static int profile_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &profile_seq_op);
 }
 
diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c
index c3fd849d4a8f..d4e31e969206 100644
--- a/kernel/trace/trace_printk.c
+++ b/kernel/trace/trace_printk.c
@@ -6,6 +6,7 @@
  *
  */
 #include <linux/seq_file.h>
+#include <linux/security.h>
 #include <linux/uaccess.h>
 #include <linux/kernel.h>
 #include <linux/ftrace.h>
@@ -348,6 +349,12 @@ static const struct seq_operations show_format_seq_ops = {
 static int
 ftrace_formats_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &show_format_seq_ops);
 }
 
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index ec9a34a97129..4df9a209f7ca 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -5,6 +5,7 @@
  */
 #include <linux/sched/task_stack.h>
 #include <linux/stacktrace.h>
+#include <linux/security.h>
 #include <linux/kallsyms.h>
 #include <linux/seq_file.h>
 #include <linux/spinlock.h>
@@ -470,6 +471,12 @@ static const struct seq_operations stack_trace_seq_ops = {
 
 static int stack_trace_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &stack_trace_seq_ops);
 }
 
@@ -487,6 +494,7 @@ stack_trace_filter_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops, FTRACE_ITER_FILTER,
 				 inode, file);
 }
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 75bf1bcb4a8a..9ab0a1a7ad5e 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -9,7 +9,7 @@
  *
  */
 
-
+#include <linux/security.h>
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/rbtree.h>
@@ -238,6 +238,10 @@ static int tracing_stat_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	struct stat_session *session = inode->i_private;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	ret = stat_seq_init(session);
 	if (ret)
 		return ret;
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index dd884341f5c5..352073d36585 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -7,6 +7,7 @@
  */
 #define pr_fmt(fmt)	"trace_uprobe: " fmt
 
+#include <linux/security.h>
 #include <linux/ctype.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
@@ -769,6 +770,10 @@ static int probes_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&trace_uprobe_ops);
 		if (ret)
@@ -818,6 +823,12 @@ static const struct seq_operations profile_seq_op = {
 
 static int profile_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &profile_seq_op);
 }
 
-- 
2.23.0



^ permalink raw reply related

* [PATCH 2/7 v2] ftrace: Get a reference counter for the trace_array on filter files
From: Steven Rostedt @ 2019-10-12  0:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Matthew Garrett,
	James Morris James Morris, LSM List, Linux API, Ben Hutchings,
	Al Viro, stable
In-Reply-To: <20191012005747.210722465@goodmis.org>

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The ftrace set_ftrace_filter and set_ftrace_notrace files are specific for
an instance now. They need to take a reference to the instance otherwise
there could be a race between accessing the files and deleting the instance.

It wasn't until the :mod: caching where these file operations stated
referencing the trace_arry directly.

Cc: stable@vger.kernel.org
Fixes: 673feb9d76ab3 ("ftrace: Add :mod: caching infrastructure to trace_array")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 62a50bf399d6..32c2eb167de0 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3540,21 +3540,22 @@ ftrace_regex_open(struct ftrace_ops *ops, int flag,
 	struct ftrace_hash *hash;
 	struct list_head *mod_head;
 	struct trace_array *tr = ops->private;
-	int ret = 0;
+	int ret = -ENOMEM;
 
 	ftrace_ops_init(ops);
 
 	if (unlikely(ftrace_disabled))
 		return -ENODEV;
 
+	if (tr && trace_array_get(tr) < 0)
+		return -ENODEV;
+
 	iter = kzalloc(sizeof(*iter), GFP_KERNEL);
 	if (!iter)
-		return -ENOMEM;
+		goto out;
 
-	if (trace_parser_get_init(&iter->parser, FTRACE_BUFF_MAX)) {
-		kfree(iter);
-		return -ENOMEM;
-	}
+	if (trace_parser_get_init(&iter->parser, FTRACE_BUFF_MAX))
+		goto out;
 
 	iter->ops = ops;
 	iter->flags = flag;
@@ -3584,13 +3585,13 @@ ftrace_regex_open(struct ftrace_ops *ops, int flag,
 
 		if (!iter->hash) {
 			trace_parser_put(&iter->parser);
-			kfree(iter);
-			ret = -ENOMEM;
 			goto out_unlock;
 		}
 	} else
 		iter->hash = hash;
 
+	ret = 0;
+
 	if (file->f_mode & FMODE_READ) {
 		iter->pg = ftrace_pages_start;
 
@@ -3602,7 +3603,6 @@ ftrace_regex_open(struct ftrace_ops *ops, int flag,
 			/* Failed */
 			free_ftrace_hash(iter->hash);
 			trace_parser_put(&iter->parser);
-			kfree(iter);
 		}
 	} else
 		file->private_data = iter;
@@ -3610,6 +3610,13 @@ ftrace_regex_open(struct ftrace_ops *ops, int flag,
  out_unlock:
 	mutex_unlock(&ops->func_hash->regex_lock);
 
+ out:
+	if (ret) {
+		kfree(iter);
+		if (tr)
+			trace_array_put(tr);
+	}
+
 	return ret;
 }
 
@@ -5037,6 +5044,8 @@ int ftrace_regex_release(struct inode *inode, struct file *file)
 
 	mutex_unlock(&iter->ops->func_hash->regex_lock);
 	free_ftrace_hash(iter->hash);
+	if (iter->tr)
+		trace_array_put(iter->tr);
 	kfree(iter);
 
 	return 0;
-- 
2.23.0



^ permalink raw reply related

* Re: [Patch v7 2/4] KEYS: Use common tpm_buf for trusted and asymmetric keys
From: Jerry Snitselaar @ 2019-10-11 22:34 UTC (permalink / raw)
  To: Sumit Garg
  Cc: jarkko.sakkinen, dhowells, peterhuewe, keyrings, linux-integrity,
	linux-crypto, linux-security-module, herbert, davem, jgg, arnd,
	gregkh, jejb, zohar, jmorris, serge, linux-kernel,
	daniel.thompson
In-Reply-To: <1570425935-7435-3-git-send-email-sumit.garg@linaro.org>

On Mon Oct 07 19, Sumit Garg wrote:
>Switch to utilize common heap based tpm_buf code for TPM based trusted
>and asymmetric keys rather than using stack based tpm1_buf code. Also,
>remove tpm1_buf code.
>
>Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
>Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
>---

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>

^ permalink raw reply

* Re: [Patch v7 1/4] tpm: Move tpm_buf code to include/linux/
From: Jerry Snitselaar @ 2019-10-11 22:33 UTC (permalink / raw)
  To: Sumit Garg
  Cc: jarkko.sakkinen, dhowells, peterhuewe, keyrings, linux-integrity,
	linux-crypto, linux-security-module, herbert, davem, jgg, arnd,
	gregkh, jejb, zohar, jmorris, serge, linux-kernel,
	daniel.thompson
In-Reply-To: <1570425935-7435-2-git-send-email-sumit.garg@linaro.org>

On Mon Oct 07 19, Sumit Garg wrote:
>Move tpm_buf code to common include/linux/tpm.h header so that it can
>be reused via other subsystems like trusted keys etc.
>
>Also rename trusted keys and asymmetric keys usage of TPM 1.x buffer
>implementation to tpm1_buf to avoid any compilation errors.
>
>Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
>Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
>---
> crypto/asymmetric_keys/asym_tpm.c |  12 +--
> drivers/char/tpm/tpm.h            | 215 --------------------------------------
> include/keys/trusted.h            |  12 +--
> include/linux/tpm.h               | 215 ++++++++++++++++++++++++++++++++++++++
> security/keys/trusted.c           |  12 +--
> 5 files changed, 233 insertions(+), 233 deletions(-)
>
>diff --git a/crypto/asymmetric_keys/asym_tpm.c b/crypto/asymmetric_keys/asym_tpm.c
>index 76d2ce3..b88968d 100644
>--- a/crypto/asymmetric_keys/asym_tpm.c
>+++ b/crypto/asymmetric_keys/asym_tpm.c
>@@ -31,7 +31,7 @@
> /*
>  * Load a TPM key from the blob provided by userspace
>  */
>-static int tpm_loadkey2(struct tpm_buf *tb,
>+static int tpm_loadkey2(struct tpm1_buf *tb,
> 			uint32_t keyhandle, unsigned char *keyauth,
> 			const unsigned char *keyblob, int keybloblen,
> 			uint32_t *newhandle)
>@@ -99,7 +99,7 @@ static int tpm_loadkey2(struct tpm_buf *tb,
> /*
>  * Execute the FlushSpecific TPM command
>  */
>-static int tpm_flushspecific(struct tpm_buf *tb, uint32_t handle)
>+static int tpm_flushspecific(struct tpm1_buf *tb, uint32_t handle)
> {
> 	INIT_BUF(tb);
> 	store16(tb, TPM_TAG_RQU_COMMAND);
>@@ -115,7 +115,7 @@ static int tpm_flushspecific(struct tpm_buf *tb, uint32_t handle)
>  * Decrypt a blob provided by userspace using a specific key handle.
>  * The handle is a well known handle or previously loaded by e.g. LoadKey2
>  */
>-static int tpm_unbind(struct tpm_buf *tb,
>+static int tpm_unbind(struct tpm1_buf *tb,
> 			uint32_t keyhandle, unsigned char *keyauth,
> 			const unsigned char *blob, uint32_t bloblen,
> 			void *out, uint32_t outlen)
>@@ -201,7 +201,7 @@ static int tpm_unbind(struct tpm_buf *tb,
>  * up to key_length_in_bytes - 11 and not be limited to size 20 like the
>  * TPM_SS_RSASSAPKCS1v15_SHA1 signature scheme.
>  */
>-static int tpm_sign(struct tpm_buf *tb,
>+static int tpm_sign(struct tpm1_buf *tb,
> 		    uint32_t keyhandle, unsigned char *keyauth,
> 		    const unsigned char *blob, uint32_t bloblen,
> 		    void *out, uint32_t outlen)
>@@ -519,7 +519,7 @@ static int tpm_key_decrypt(struct tpm_key *tk,
> 			   struct kernel_pkey_params *params,
> 			   const void *in, void *out)
> {
>-	struct tpm_buf *tb;
>+	struct tpm1_buf *tb;
> 	uint32_t keyhandle;
> 	uint8_t srkauth[SHA1_DIGEST_SIZE];
> 	uint8_t keyauth[SHA1_DIGEST_SIZE];
>@@ -643,7 +643,7 @@ static int tpm_key_sign(struct tpm_key *tk,
> 			struct kernel_pkey_params *params,
> 			const void *in, void *out)
> {
>-	struct tpm_buf *tb;
>+	struct tpm1_buf *tb;
> 	uint32_t keyhandle;
> 	uint8_t srkauth[SHA1_DIGEST_SIZE];
> 	uint8_t keyauth[SHA1_DIGEST_SIZE];
>diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>index 80bca88..b174cf4 100644
>--- a/drivers/char/tpm/tpm.h
>+++ b/drivers/char/tpm/tpm.h
>@@ -25,7 +25,6 @@
> #include <linux/platform_device.h>
> #include <linux/io.h>
> #include <linux/tpm.h>
>-#include <linux/highmem.h>
> #include <linux/tpm_eventlog.h>
>
> #ifdef CONFIG_X86
>@@ -58,124 +57,6 @@ enum tpm_addr {
> #define TPM_ERR_DISABLED        0x7
> #define TPM_ERR_INVALID_POSTINIT 38
>
>-#define TPM_HEADER_SIZE		10
>-
>-enum tpm2_const {
>-	TPM2_PLATFORM_PCR       =     24,
>-	TPM2_PCR_SELECT_MIN     = ((TPM2_PLATFORM_PCR + 7) / 8),
>-};
>-
>-enum tpm2_timeouts {
>-	TPM2_TIMEOUT_A          =    750,
>-	TPM2_TIMEOUT_B          =   2000,
>-	TPM2_TIMEOUT_C          =    200,
>-	TPM2_TIMEOUT_D          =     30,
>-	TPM2_DURATION_SHORT     =     20,
>-	TPM2_DURATION_MEDIUM    =    750,
>-	TPM2_DURATION_LONG      =   2000,
>-	TPM2_DURATION_LONG_LONG = 300000,
>-	TPM2_DURATION_DEFAULT   = 120000,
>-};
>-
>-enum tpm2_structures {
>-	TPM2_ST_NO_SESSIONS	= 0x8001,
>-	TPM2_ST_SESSIONS	= 0x8002,
>-};
>-
>-/* Indicates from what layer of the software stack the error comes from */
>-#define TSS2_RC_LAYER_SHIFT	 16
>-#define TSS2_RESMGR_TPM_RC_LAYER (11 << TSS2_RC_LAYER_SHIFT)
>-
>-enum tpm2_return_codes {
>-	TPM2_RC_SUCCESS		= 0x0000,
>-	TPM2_RC_HASH		= 0x0083, /* RC_FMT1 */
>-	TPM2_RC_HANDLE		= 0x008B,
>-	TPM2_RC_INITIALIZE	= 0x0100, /* RC_VER1 */
>-	TPM2_RC_FAILURE		= 0x0101,
>-	TPM2_RC_DISABLED	= 0x0120,
>-	TPM2_RC_COMMAND_CODE    = 0x0143,
>-	TPM2_RC_TESTING		= 0x090A, /* RC_WARN */
>-	TPM2_RC_REFERENCE_H0	= 0x0910,
>-	TPM2_RC_RETRY		= 0x0922,
>-};
>-
>-enum tpm2_command_codes {
>-	TPM2_CC_FIRST		        = 0x011F,
>-	TPM2_CC_HIERARCHY_CONTROL       = 0x0121,
>-	TPM2_CC_HIERARCHY_CHANGE_AUTH   = 0x0129,
>-	TPM2_CC_CREATE_PRIMARY          = 0x0131,
>-	TPM2_CC_SEQUENCE_COMPLETE       = 0x013E,
>-	TPM2_CC_SELF_TEST	        = 0x0143,
>-	TPM2_CC_STARTUP		        = 0x0144,
>-	TPM2_CC_SHUTDOWN	        = 0x0145,
>-	TPM2_CC_NV_READ                 = 0x014E,
>-	TPM2_CC_CREATE		        = 0x0153,
>-	TPM2_CC_LOAD		        = 0x0157,
>-	TPM2_CC_SEQUENCE_UPDATE         = 0x015C,
>-	TPM2_CC_UNSEAL		        = 0x015E,
>-	TPM2_CC_CONTEXT_LOAD	        = 0x0161,
>-	TPM2_CC_CONTEXT_SAVE	        = 0x0162,
>-	TPM2_CC_FLUSH_CONTEXT	        = 0x0165,
>-	TPM2_CC_VERIFY_SIGNATURE        = 0x0177,
>-	TPM2_CC_GET_CAPABILITY	        = 0x017A,
>-	TPM2_CC_GET_RANDOM	        = 0x017B,
>-	TPM2_CC_PCR_READ	        = 0x017E,
>-	TPM2_CC_PCR_EXTEND	        = 0x0182,
>-	TPM2_CC_EVENT_SEQUENCE_COMPLETE = 0x0185,
>-	TPM2_CC_HASH_SEQUENCE_START     = 0x0186,
>-	TPM2_CC_CREATE_LOADED           = 0x0191,
>-	TPM2_CC_LAST		        = 0x0193, /* Spec 1.36 */
>-};
>-
>-enum tpm2_permanent_handles {
>-	TPM2_RS_PW		= 0x40000009,
>-};
>-
>-enum tpm2_capabilities {
>-	TPM2_CAP_HANDLES	= 1,
>-	TPM2_CAP_COMMANDS	= 2,
>-	TPM2_CAP_PCRS		= 5,
>-	TPM2_CAP_TPM_PROPERTIES = 6,
>-};
>-
>-enum tpm2_properties {
>-	TPM_PT_TOTAL_COMMANDS	= 0x0129,
>-};
>-
>-enum tpm2_startup_types {
>-	TPM2_SU_CLEAR	= 0x0000,
>-	TPM2_SU_STATE	= 0x0001,
>-};
>-
>-enum tpm2_cc_attrs {
>-	TPM2_CC_ATTR_CHANDLES	= 25,
>-	TPM2_CC_ATTR_RHANDLE	= 28,
>-};
>-
>-#define TPM_VID_INTEL    0x8086
>-#define TPM_VID_WINBOND  0x1050
>-#define TPM_VID_STM      0x104A
>-
>-enum tpm_chip_flags {
>-	TPM_CHIP_FLAG_TPM2		= BIT(1),
>-	TPM_CHIP_FLAG_IRQ		= BIT(2),
>-	TPM_CHIP_FLAG_VIRTUAL		= BIT(3),
>-	TPM_CHIP_FLAG_HAVE_TIMEOUTS	= BIT(4),
>-	TPM_CHIP_FLAG_ALWAYS_POWERED	= BIT(5),
>-	TPM_CHIP_FLAG_FIRMWARE_POWER_MANAGED	= BIT(6),
>-};
>-
>-#define to_tpm_chip(d) container_of(d, struct tpm_chip, dev)
>-
>-struct tpm_header {
>-	__be16 tag;
>-	__be32 length;
>-	union {
>-		__be32 ordinal;
>-		__be32 return_code;
>-	};
>-} __packed;
>-
> #define TPM_TAG_RQU_COMMAND 193
>
> struct	stclear_flags_t {
>@@ -272,102 +153,6 @@ enum tpm_sub_capabilities {
>  * compiler warnings about stack frame size. */
> #define TPM_MAX_RNG_DATA	128
>
>-/* A string buffer type for constructing TPM commands. This is based on the
>- * ideas of string buffer code in security/keys/trusted.h but is heap based
>- * in order to keep the stack usage minimal.
>- */
>-
>-enum tpm_buf_flags {
>-	TPM_BUF_OVERFLOW	= BIT(0),
>-};
>-
>-struct tpm_buf {
>-	struct page *data_page;
>-	unsigned int flags;
>-	u8 *data;
>-};
>-
>-static inline void tpm_buf_reset(struct tpm_buf *buf, u16 tag, u32 ordinal)
>-{
>-	struct tpm_header *head = (struct tpm_header *)buf->data;
>-
>-	head->tag = cpu_to_be16(tag);
>-	head->length = cpu_to_be32(sizeof(*head));
>-	head->ordinal = cpu_to_be32(ordinal);
>-}
>-
>-static inline int tpm_buf_init(struct tpm_buf *buf, u16 tag, u32 ordinal)
>-{
>-	buf->data_page = alloc_page(GFP_HIGHUSER);
>-	if (!buf->data_page)
>-		return -ENOMEM;
>-
>-	buf->flags = 0;
>-	buf->data = kmap(buf->data_page);
>-	tpm_buf_reset(buf, tag, ordinal);
>-	return 0;
>-}
>-
>-static inline void tpm_buf_destroy(struct tpm_buf *buf)
>-{
>-	kunmap(buf->data_page);
>-	__free_page(buf->data_page);
>-}
>-
>-static inline u32 tpm_buf_length(struct tpm_buf *buf)
>-{
>-	struct tpm_header *head = (struct tpm_header *)buf->data;
>-
>-	return be32_to_cpu(head->length);
>-}
>-
>-static inline u16 tpm_buf_tag(struct tpm_buf *buf)
>-{
>-	struct tpm_header *head = (struct tpm_header *)buf->data;
>-
>-	return be16_to_cpu(head->tag);
>-}
>-
>-static inline void tpm_buf_append(struct tpm_buf *buf,
>-				  const unsigned char *new_data,
>-				  unsigned int new_len)
>-{
>-	struct tpm_header *head = (struct tpm_header *)buf->data;
>-	u32 len = tpm_buf_length(buf);
>-
>-	/* Return silently if overflow has already happened. */
>-	if (buf->flags & TPM_BUF_OVERFLOW)
>-		return;
>-
>-	if ((len + new_len) > PAGE_SIZE) {
>-		WARN(1, "tpm_buf: overflow\n");
>-		buf->flags |= TPM_BUF_OVERFLOW;
>-		return;
>-	}
>-
>-	memcpy(&buf->data[len], new_data, new_len);
>-	head->length = cpu_to_be32(len + new_len);
>-}
>-
>-static inline void tpm_buf_append_u8(struct tpm_buf *buf, const u8 value)
>-{
>-	tpm_buf_append(buf, &value, 1);
>-}
>-
>-static inline void tpm_buf_append_u16(struct tpm_buf *buf, const u16 value)
>-{
>-	__be16 value2 = cpu_to_be16(value);
>-
>-	tpm_buf_append(buf, (u8 *) &value2, 2);
>-}
>-
>-static inline void tpm_buf_append_u32(struct tpm_buf *buf, const u32 value)
>-{
>-	__be32 value2 = cpu_to_be32(value);
>-
>-	tpm_buf_append(buf, (u8 *) &value2, 4);
>-}
>-
> extern struct class *tpm_class;
> extern struct class *tpmrm_class;
> extern dev_t tpm_devt;
>diff --git a/include/keys/trusted.h b/include/keys/trusted.h
>index 0071298..841ae11 100644
>--- a/include/keys/trusted.h
>+++ b/include/keys/trusted.h
>@@ -17,7 +17,7 @@
> #define LOAD32N(buffer, offset)	(*(uint32_t *)&buffer[offset])
> #define LOAD16(buffer, offset)	(ntohs(*(uint16_t *)&buffer[offset]))
>
>-struct tpm_buf {
>+struct tpm1_buf {
> 	int len;
> 	unsigned char data[MAX_BUF_SIZE];
> };
>@@ -46,7 +46,7 @@ int TSS_checkhmac1(unsigned char *buffer,
> 			  unsigned int keylen, ...);
>
> int trusted_tpm_send(unsigned char *cmd, size_t buflen);
>-int oiap(struct tpm_buf *tb, uint32_t *handle, unsigned char *nonce);
>+int oiap(struct tpm1_buf *tb, uint32_t *handle, unsigned char *nonce);
>
> #define TPM_DEBUG 0
>
>@@ -110,24 +110,24 @@ static inline void dump_tpm_buf(unsigned char *buf)
> }
> #endif
>
>-static inline void store8(struct tpm_buf *buf, const unsigned char value)
>+static inline void store8(struct tpm1_buf *buf, const unsigned char value)
> {
> 	buf->data[buf->len++] = value;
> }
>
>-static inline void store16(struct tpm_buf *buf, const uint16_t value)
>+static inline void store16(struct tpm1_buf *buf, const uint16_t value)
> {
> 	*(uint16_t *) & buf->data[buf->len] = htons(value);
> 	buf->len += sizeof value;
> }
>
>-static inline void store32(struct tpm_buf *buf, const uint32_t value)
>+static inline void store32(struct tpm1_buf *buf, const uint32_t value)
> {
> 	*(uint32_t *) & buf->data[buf->len] = htonl(value);
> 	buf->len += sizeof value;
> }
>
>-static inline void storebytes(struct tpm_buf *buf, const unsigned char *in,
>+static inline void storebytes(struct tpm1_buf *buf, const unsigned char *in,
> 			      const int len)
> {
> 	memcpy(buf->data + buf->len, in, len);
>diff --git a/include/linux/tpm.h b/include/linux/tpm.h
>index bb1d1ac..19c68f8 100644
>--- a/include/linux/tpm.h
>+++ b/include/linux/tpm.h
>@@ -21,6 +21,7 @@
> #include <linux/acpi.h>
> #include <linux/cdev.h>
> #include <linux/fs.h>
>+#include <linux/highmem.h>
> #include <crypto/hash_info.h>
>
> #define TPM_DIGEST_SIZE 20	/* Max TPM v1.2 PCR size */
>@@ -163,6 +164,220 @@ struct tpm_chip {
> 	int locality;
> };
>
>+#define TPM_HEADER_SIZE		10
>+
>+enum tpm2_const {
>+	TPM2_PLATFORM_PCR       =     24,
>+	TPM2_PCR_SELECT_MIN     = ((TPM2_PLATFORM_PCR + 7) / 8),
>+};
>+
>+enum tpm2_timeouts {
>+	TPM2_TIMEOUT_A          =    750,
>+	TPM2_TIMEOUT_B          =   2000,
>+	TPM2_TIMEOUT_C          =    200,
>+	TPM2_TIMEOUT_D          =     30,
>+	TPM2_DURATION_SHORT     =     20,
>+	TPM2_DURATION_MEDIUM    =    750,
>+	TPM2_DURATION_LONG      =   2000,
>+	TPM2_DURATION_LONG_LONG = 300000,
>+	TPM2_DURATION_DEFAULT   = 120000,
>+};
>+
>+enum tpm2_structures {
>+	TPM2_ST_NO_SESSIONS	= 0x8001,
>+	TPM2_ST_SESSIONS	= 0x8002,
>+};
>+
>+/* Indicates from what layer of the software stack the error comes from */
>+#define TSS2_RC_LAYER_SHIFT	 16
>+#define TSS2_RESMGR_TPM_RC_LAYER (11 << TSS2_RC_LAYER_SHIFT)
>+
>+enum tpm2_return_codes {
>+	TPM2_RC_SUCCESS		= 0x0000,
>+	TPM2_RC_HASH		= 0x0083, /* RC_FMT1 */
>+	TPM2_RC_HANDLE		= 0x008B,
>+	TPM2_RC_INITIALIZE	= 0x0100, /* RC_VER1 */
>+	TPM2_RC_FAILURE		= 0x0101,
>+	TPM2_RC_DISABLED	= 0x0120,
>+	TPM2_RC_COMMAND_CODE    = 0x0143,
>+	TPM2_RC_TESTING		= 0x090A, /* RC_WARN */
>+	TPM2_RC_REFERENCE_H0	= 0x0910,
>+	TPM2_RC_RETRY		= 0x0922,
>+};
>+
>+enum tpm2_command_codes {
>+	TPM2_CC_FIRST		        = 0x011F,
>+	TPM2_CC_HIERARCHY_CONTROL       = 0x0121,
>+	TPM2_CC_HIERARCHY_CHANGE_AUTH   = 0x0129,
>+	TPM2_CC_CREATE_PRIMARY          = 0x0131,
>+	TPM2_CC_SEQUENCE_COMPLETE       = 0x013E,
>+	TPM2_CC_SELF_TEST	        = 0x0143,
>+	TPM2_CC_STARTUP		        = 0x0144,
>+	TPM2_CC_SHUTDOWN	        = 0x0145,
>+	TPM2_CC_NV_READ                 = 0x014E,
>+	TPM2_CC_CREATE		        = 0x0153,
>+	TPM2_CC_LOAD		        = 0x0157,
>+	TPM2_CC_SEQUENCE_UPDATE         = 0x015C,
>+	TPM2_CC_UNSEAL		        = 0x015E,
>+	TPM2_CC_CONTEXT_LOAD	        = 0x0161,
>+	TPM2_CC_CONTEXT_SAVE	        = 0x0162,
>+	TPM2_CC_FLUSH_CONTEXT	        = 0x0165,
>+	TPM2_CC_VERIFY_SIGNATURE        = 0x0177,
>+	TPM2_CC_GET_CAPABILITY	        = 0x017A,
>+	TPM2_CC_GET_RANDOM	        = 0x017B,
>+	TPM2_CC_PCR_READ	        = 0x017E,
>+	TPM2_CC_PCR_EXTEND	        = 0x0182,
>+	TPM2_CC_EVENT_SEQUENCE_COMPLETE = 0x0185,
>+	TPM2_CC_HASH_SEQUENCE_START     = 0x0186,
>+	TPM2_CC_CREATE_LOADED           = 0x0191,
>+	TPM2_CC_LAST		        = 0x0193, /* Spec 1.36 */
>+};
>+
>+enum tpm2_permanent_handles {
>+	TPM2_RS_PW		= 0x40000009,
>+};
>+
>+enum tpm2_capabilities {
>+	TPM2_CAP_HANDLES	= 1,
>+	TPM2_CAP_COMMANDS	= 2,
>+	TPM2_CAP_PCRS		= 5,
>+	TPM2_CAP_TPM_PROPERTIES = 6,
>+};
>+
>+enum tpm2_properties {
>+	TPM_PT_TOTAL_COMMANDS	= 0x0129,
>+};
>+
>+enum tpm2_startup_types {
>+	TPM2_SU_CLEAR	= 0x0000,
>+	TPM2_SU_STATE	= 0x0001,
>+};
>+
>+enum tpm2_cc_attrs {
>+	TPM2_CC_ATTR_CHANDLES	= 25,
>+	TPM2_CC_ATTR_RHANDLE	= 28,
>+};
>+
>+#define TPM_VID_INTEL    0x8086
>+#define TPM_VID_WINBOND  0x1050
>+#define TPM_VID_STM      0x104A
>+
>+enum tpm_chip_flags {
>+	TPM_CHIP_FLAG_TPM2		= BIT(1),
>+	TPM_CHIP_FLAG_IRQ		= BIT(2),
>+	TPM_CHIP_FLAG_VIRTUAL		= BIT(3),
>+	TPM_CHIP_FLAG_HAVE_TIMEOUTS	= BIT(4),
>+	TPM_CHIP_FLAG_ALWAYS_POWERED	= BIT(5),
>+	TPM_CHIP_FLAG_FIRMWARE_POWER_MANAGED	= BIT(6),
>+};
>+
>+#define to_tpm_chip(d) container_of(d, struct tpm_chip, dev)
>+
>+struct tpm_header {
>+	__be16 tag;
>+	__be32 length;
>+	union {
>+		__be32 ordinal;
>+		__be32 return_code;
>+	};
>+} __packed;
>+
>+/* A string buffer type for constructing TPM commands. This is based on the
>+ * ideas of string buffer code in security/keys/trusted.h but is heap based
>+ * in order to keep the stack usage minimal.
>+ */
>+
>+enum tpm_buf_flags {
>+	TPM_BUF_OVERFLOW	= BIT(0),
>+};
>+
>+struct tpm_buf {
>+	struct page *data_page;
>+	unsigned int flags;
>+	u8 *data;
>+};
>+
>+static inline void tpm_buf_reset(struct tpm_buf *buf, u16 tag, u32 ordinal)
>+{
>+	struct tpm_header *head = (struct tpm_header *)buf->data;
>+
>+	head->tag = cpu_to_be16(tag);
>+	head->length = cpu_to_be32(sizeof(*head));
>+	head->ordinal = cpu_to_be32(ordinal);
>+}
>+
>+static inline int tpm_buf_init(struct tpm_buf *buf, u16 tag, u32 ordinal)
>+{
>+	buf->data_page = alloc_page(GFP_HIGHUSER);
>+	if (!buf->data_page)
>+		return -ENOMEM;
>+
>+	buf->flags = 0;
>+	buf->data = kmap(buf->data_page);
>+	tpm_buf_reset(buf, tag, ordinal);
>+	return 0;
>+}
>+
>+static inline void tpm_buf_destroy(struct tpm_buf *buf)
>+{
>+	kunmap(buf->data_page);
>+	__free_page(buf->data_page);
>+}
>+
>+static inline u32 tpm_buf_length(struct tpm_buf *buf)
>+{
>+	struct tpm_header *head = (struct tpm_header *)buf->data;
>+
>+	return be32_to_cpu(head->length);
>+}
>+
>+static inline u16 tpm_buf_tag(struct tpm_buf *buf)
>+{
>+	struct tpm_header *head = (struct tpm_header *)buf->data;
>+
>+	return be16_to_cpu(head->tag);
>+}
>+
>+static inline void tpm_buf_append(struct tpm_buf *buf,
>+				  const unsigned char *new_data,
>+				  unsigned int new_len)
>+{
>+	struct tpm_header *head = (struct tpm_header *)buf->data;
>+	u32 len = tpm_buf_length(buf);
>+
>+	/* Return silently if overflow has already happened. */
>+	if (buf->flags & TPM_BUF_OVERFLOW)
>+		return;
>+
>+	if ((len + new_len) > PAGE_SIZE) {
>+		WARN(1, "tpm_buf: overflow\n");
>+		buf->flags |= TPM_BUF_OVERFLOW;
>+		return;
>+	}
>+
>+	memcpy(&buf->data[len], new_data, new_len);
>+	head->length = cpu_to_be32(len + new_len);
>+}
>+
>+static inline void tpm_buf_append_u8(struct tpm_buf *buf, const u8 value)
>+{
>+	tpm_buf_append(buf, &value, 1);
>+}
>+
>+static inline void tpm_buf_append_u16(struct tpm_buf *buf, const u16 value)
>+{
>+	__be16 value2 = cpu_to_be16(value);
>+
>+	tpm_buf_append(buf, (u8 *) &value2, 2);
>+}
>+
>+static inline void tpm_buf_append_u32(struct tpm_buf *buf, const u32 value)
>+{
>+	__be32 value2 = cpu_to_be32(value);
>+
>+	tpm_buf_append(buf, (u8 *) &value2, 4);
>+}
>+
> #if defined(CONFIG_TCG_TPM) || defined(CONFIG_TCG_TPM_MODULE)
>
> extern int tpm_is_tpm2(struct tpm_chip *chip);
>diff --git a/security/keys/trusted.c b/security/keys/trusted.c
>index 1fbd778..4cfae208 100644
>--- a/security/keys/trusted.c
>+++ b/security/keys/trusted.c
>@@ -395,7 +395,7 @@ static int pcrlock(const int pcrnum)
> /*
>  * Create an object specific authorisation protocol (OSAP) session
>  */
>-static int osap(struct tpm_buf *tb, struct osapsess *s,
>+static int osap(struct tpm1_buf *tb, struct osapsess *s,
> 		const unsigned char *key, uint16_t type, uint32_t handle)
> {
> 	unsigned char enonce[TPM_NONCE_SIZE];
>@@ -430,7 +430,7 @@ static int osap(struct tpm_buf *tb, struct osapsess *s,
> /*
>  * Create an object independent authorisation protocol (oiap) session
>  */
>-int oiap(struct tpm_buf *tb, uint32_t *handle, unsigned char *nonce)
>+int oiap(struct tpm1_buf *tb, uint32_t *handle, unsigned char *nonce)
> {
> 	int ret;
>
>@@ -464,7 +464,7 @@ struct tpm_digests {
>  * Have the TPM seal(encrypt) the trusted key, possibly based on
>  * Platform Configuration Registers (PCRs). AUTH1 for sealing key.
>  */
>-static int tpm_seal(struct tpm_buf *tb, uint16_t keytype,
>+static int tpm_seal(struct tpm1_buf *tb, uint16_t keytype,
> 		    uint32_t keyhandle, const unsigned char *keyauth,
> 		    const unsigned char *data, uint32_t datalen,
> 		    unsigned char *blob, uint32_t *bloblen,
>@@ -579,7 +579,7 @@ static int tpm_seal(struct tpm_buf *tb, uint16_t keytype,
> /*
>  * use the AUTH2_COMMAND form of unseal, to authorize both key and blob
>  */
>-static int tpm_unseal(struct tpm_buf *tb,
>+static int tpm_unseal(struct tpm1_buf *tb,
> 		      uint32_t keyhandle, const unsigned char *keyauth,
> 		      const unsigned char *blob, int bloblen,
> 		      const unsigned char *blobauth,
>@@ -670,7 +670,7 @@ static int tpm_unseal(struct tpm_buf *tb,
> static int key_seal(struct trusted_key_payload *p,
> 		    struct trusted_key_options *o)
> {
>-	struct tpm_buf *tb;
>+	struct tpm1_buf *tb;
> 	int ret;
>
> 	tb = kzalloc(sizeof *tb, GFP_KERNEL);
>@@ -696,7 +696,7 @@ static int key_seal(struct trusted_key_payload *p,
> static int key_unseal(struct trusted_key_payload *p,
> 		      struct trusted_key_options *o)
> {
>-	struct tpm_buf *tb;
>+	struct tpm1_buf *tb;
> 	int ret;
>
> 	tb = kzalloc(sizeof *tb, GFP_KERNEL);
>-- 
>2.7.4
>

I reviewed version on tpmdd/master. Will have to massage it again
once James' v2 patch is merged, but that is trivial.

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Steven Rostedt @ 2019-10-11 22:27 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Linus Torvalds, LKML, Matthew Garrett, James Morris James Morris,
	LSM List, Linux API, Ben Hutchings, Al Viro
In-Reply-To: <87tv8f9cr7.fsf@mid.deneb.enyo.de>

On Fri, 11 Oct 2019 23:46:20 +0200
Florian Weimer <fw@deneb.enyo.de> wrote:

> * Steven Rostedt:
> 
> > Once locked down is set, can it ever be undone without rebooting?  
> 
> I think this is the original intent with such patches, yes.  But then
> reality interferes and people add some escape hatch, so that it's
> possible again to load arbitrary kernel modules.  And for servers, you
> can't have a meaningful physical presence check, so you end up with a
> lot of complexity for something that offers absolutely zero gains in
> security.
> 
> The other practical issue is that general-purpose Linux distributions
> cannot prevent kernel downgrades, so even if there's a
> cryptographically signed chain from the firmware to the kernel, you
> can boot last year's kernel, use a root-to-ring-0 exploit to disable
> its particular implementation of lockdown, and then kexec the real
> kernel with lockdown disabled.
> 
> I'm sure that kernel lockdown has applications somewhere, but for
> general-purpose distributions (who usually want to support third-party
> kernel modules), it's an endless source of problems that wouldn't
> exist without it.

I just decided to keep the two separate. The tracing_disable is
permanent (unless you actually do something that writes into kernel
memory to change the variable). When set, there's nothing to clear it.

Thus, I decided not to couple that with lockdown, and let the lockdown
folks do whatever they damn well please ;-)

-- Steve

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Florian Weimer @ 2019-10-11 21:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, LKML, Matthew Garrett, James Morris James Morris,
	LSM List, Linux API, Ben Hutchings, Al Viro
In-Reply-To: <20191011143610.21bcd9c0@gandalf.local.home>

* Steven Rostedt:

> Once locked down is set, can it ever be undone without rebooting?

I think this is the original intent with such patches, yes.  But then
reality interferes and people add some escape hatch, so that it's
possible again to load arbitrary kernel modules.  And for servers, you
can't have a meaningful physical presence check, so you end up with a
lot of complexity for something that offers absolutely zero gains in
security.

The other practical issue is that general-purpose Linux distributions
cannot prevent kernel downgrades, so even if there's a
cryptographically signed chain from the firmware to the kernel, you
can boot last year's kernel, use a root-to-ring-0 exploit to disable
its particular implementation of lockdown, and then kexec the real
kernel with lockdown disabled.

I'm sure that kernel lockdown has applications somewhere, but for
general-purpose distributions (who usually want to support third-party
kernel modules), it's an endless source of problems that wouldn't
exist without it.

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Steven Rostedt @ 2019-10-11 21:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <CAHk-=wiGtEDhwJab7+tQzmjDssynBruRvgj9NJY2bzNrVzw+0Q@mail.gmail.com>

On Fri, 11 Oct 2019 14:00:50 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, Oct 11, 2019 at 1:55 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > I guess I can keep it this way. Thoughts?  
> 
> That looks fine to me. I'm still not sure you actually need to do all
> this, but it doesn't look _wrong_.

Yep, I sent this before seeing your other email.

> 
> That said, I still do think that if things are locked down from the
> very get-go, tracefs_create_file() shouldn't even create the files.

Agreed. I forgot to add in my last email:

4) Add the lockdown check to create_file()

> 
> That's mostly an independent thing from the "what about if they exists
> and things got locked down afterwards", though.

Well, I'll be combining it with the tracing_disabled code, which was
there originally to prevent corrupted buffers from causing harm by
being read.

> 
> I do wonder about the whole "well, if you started tracing before
> locking things down, don't you want to see the end results"?
> 

I don't think it hurts to disable it. Although, I don't think reading
the trace event formats will be a issue. I'm not aware of any of them
from giving too much info to the system. And if you really do care
about that, do it at boot up, and with the create_file part, it wont
have any files to read.

-- Steve

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Steven Rostedt @ 2019-10-11 21:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <CAHk-=whC6Ji=fWnjh2+eS4b15TnbsS4VPVtvBOwCy1jjEG_JHQ@mail.gmail.com>

On Fri, 11 Oct 2019 13:46:11 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, Oct 11, 2019 at 1:25 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > OK, so I tried this approach, and there's a bit more than just a "few"
> > extra cases that use tracing_open_generic().  
> 
> Yeah, that's more than I would have expected.
> 
> That said, you also did what I consider a somewhat over-done conversion.
> 
> Just do
> 
>    static inline bool tracefs_lockdown_or_disabled(void)
>    { return tracing_disabled || security_locked_down(LOCKDOWN_TRACEFS); }
> 
> and ignore the pointless return value (we know it's EPERM, and we
> really don't care).
> 
> And then several of those conversions just turn into oneliner
> 
> -       if (tracing_disabled)
> +       if (tracefs_lockdown_or_disabled())
>                  return -ENODEV;

OK, so this is similar to what I just sent. But instead I made it into
a function that combines tracing_disabled and the locked_down, plus, it
did the reference update for the trace_array if needed (which in doing
this exercise, I think I found a couple of places that need the ref
count update).

> 
> patches instead.
> 
> I'm also not sure why you bothered with a lot of the files that don't
> currently even have that "tracing_disabled" logic at all. Yeah, they
> show pre-existing tracing info, but even if you locked down _after_
> starting some tracing, that's probably what you want. You just don't
> want people to be able to add new trace events.

I guess just preventing new traces would be good enough (or anything
that records data), as well as seeing that recorded data.

Note, the tracing_disabled was added in case the ring buffer got
corrupted, and we wanted to prevent the system from crashing if someone
read the corrupted ring buffer. In the over 10 years of having that
check, I don't recall a single instance of someone triggering it :-/


> 
> For example, maybe you want to have lockdown after you've booted, but
> still show boot time traces?
> 
> I dunno. I feel like you re-did the original patch, and the original
> patch was designed for "least line impact" rather than for anything
> else.
> 
> I also suspect that if you *actually* do lockdown at early boot (which
> is presumably one common case), then all you need is to do
> 
>     --- a/fs/tracefs/inode.c
>     +++ b/fs/tracefs/inode.c
>     @@ -418,6 +418,9 @@ struct dentry *tracefs_create_file(
>         struct dentry *dentry;
>         struct inode *inode;
> 
>     +   if (security_locked_down(LOCKDOWN_TRACEFS))
>     +           return NULL;
>     +

Sounds reasonable.

>         if (!(mode & S_IFMT))
>                 mode |= S_IFREG;
>         BUG_ON(!S_ISREG(mode));
> 
> and the "open-time check" is purely for "we did lockdown _after_ boot,
> but then you might well want to be able to read the existing trace
> information that you did before you locked down.
> 
> Again - I think trying to emulate exactly what that broken lockdown
> patch did is not really what you should aim for.
> 
> That patch was not only buggy, it really wasn't about what people
> really necessarily _want_, as about "we don't want to deal with
> tracing, so here's a minimal patch that doesn't even work".

OK. My new approach is to:

1) revert the tracefs lockdown patch
2) Add my tracing_check_open_get_tr() patch (and consolidate the calls)
 (fix up anything that should have this too)
3) Add the lockdown to the tracing_check_open_get_tr() routine, and be
   done with it.

-- Steve

^ permalink raw reply

* Re: [Patch v7 0/4] Create and consolidate trusted keys subsystem
From: Jerry Snitselaar @ 2019-10-11 21:05 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Sumit Garg, dhowells, peterhuewe, keyrings, linux-integrity,
	linux-crypto, linux-security-module, herbert, davem, jgg, arnd,
	gregkh, jejb, zohar, jmorris, serge, linux-kernel,
	daniel.thompson
In-Reply-To: <20191011123757.GD3129@linux.intel.com>

On Fri Oct 11 19, Jarkko Sakkinen wrote:
>On Mon, Oct 07, 2019 at 10:55:31AM +0530, Sumit Garg wrote:
>> This patch-set does restructuring of trusted keys code to create and
>> consolidate trusted keys subsystem.
>>
>> Also, patch #2 replaces tpm1_buf code used in security/keys/trusted.c and
>> crypto/asymmertic_keys/asym_tpm.c files to use the common tpm_buf code.
>>
>> Changes in v7:
>> 1. Rebased to top of tpmdd/master
>> 2. Patch #4: update tpm2 trusted keys code to use tpm_send() instead of
>>    tpm_transmit_cmd() which is an internal function.
>>
>> Changes in v6:
>> 1. Switch TPM asymmetric code also to use common tpm_buf code. These
>>    changes required patches #1 and #2 update, so I have dropped review
>>    tags from those patches.
>> 2. Incorporated miscellaneous comments from Jarkko.
>>
>> Changes in v5:
>> 1. Drop 5/5 patch as its more relavant along with TEE patch-set.
>> 2. Add Reviewed-by tag for patch #2.
>> 3. Fix build failure when "CONFIG_HEADER_TEST" and
>>    "CONFIG_KERNEL_HEADER_TEST" config options are enabled.
>> 4. Misc changes to rename files.
>>
>> Changes in v4:
>> 1. Separate patch for export of tpm_buf code to include/linux/tpm.h
>> 2. Change TPM1.x trusted keys code to use common tpm_buf
>> 3. Keep module name as trusted.ko only
>>
>> Changes in v3:
>>
>> Move TPM2 trusted keys code to trusted keys subsystem.
>>
>> Changes in v2:
>>
>> Split trusted keys abstraction patch for ease of review.
>>
>> Sumit Garg (4):
>>   tpm: Move tpm_buf code to include/linux/
>>   KEYS: Use common tpm_buf for trusted and asymmetric keys
>>   KEYS: trusted: Create trusted keys subsystem
>>   KEYS: trusted: Move TPM2 trusted keys code
>>
>>  crypto/asymmetric_keys/asym_tpm.c                  | 101 +++----
>>  drivers/char/tpm/tpm-interface.c                   |  56 ----
>>  drivers/char/tpm/tpm.h                             | 226 ---------------
>>  drivers/char/tpm/tpm2-cmd.c                        | 307 --------------------
>>  include/Kbuild                                     |   1 -
>>  include/keys/{trusted.h => trusted_tpm.h}          |  49 +---
>>  include/linux/tpm.h                                | 251 ++++++++++++++--
>>  security/keys/Makefile                             |   2 +-
>>  security/keys/trusted-keys/Makefile                |   8 +
>>  .../{trusted.c => trusted-keys/trusted_tpm1.c}     |  96 +++----
>>  security/keys/trusted-keys/trusted_tpm2.c          | 314 +++++++++++++++++++++
>>  11 files changed, 652 insertions(+), 759 deletions(-)
>>  rename include/keys/{trusted.h => trusted_tpm.h} (77%)
>>  create mode 100644 security/keys/trusted-keys/Makefile
>>  rename security/keys/{trusted.c => trusted-keys/trusted_tpm1.c} (94%)
>>  create mode 100644 security/keys/trusted-keys/trusted_tpm2.c
>>
>> --
>> 2.7.4
>>
>
>I fixed a merge conflict caused by James' commit. Already pushed.
>Compiling test kernel ATM i.e. tested-by's will follow later.
>
>/Jarkko

Are you missing patch 4 on master?

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Linus Torvalds @ 2019-10-11 21:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <20191011165455.32666d53@gandalf.local.home>

On Fri, Oct 11, 2019 at 1:55 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I guess I can keep it this way. Thoughts?

That looks fine to me. I'm still not sure you actually need to do all
this, but it doesn't look _wrong_.

That said, I still do think that if things are locked down from the
very get-go, tracefs_create_file() shouldn't even create the files.

That's mostly an independent thing from the "what about if they exists
and things got locked down afterwards", though.

I do wonder about the whole "well, if you started tracing before
locking things down, don't you want to see the end results"?

             Linus

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Steven Rostedt @ 2019-10-11 20:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <20191011162518.2f8c99ca@gandalf.local.home>

On Fri, 11 Oct 2019 16:25:18 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> Here's the diffstat:
> 
>  fs/tracefs/inode.c                  |   42 --------------------
>  kernel/trace/ftrace.c               |   29 +++++++++++++-
>  kernel/trace/trace.c                |   73 ++++++++++++++++++++++++++++++++++--
>  kernel/trace/trace_dynevent.c       |    5 ++
>  kernel/trace/trace_events.c         |   10 ++++
>  kernel/trace/trace_events_hist.c    |   11 +++++
>  kernel/trace/trace_events_trigger.c |    8 +++
>  kernel/trace/trace_kprobe.c         |   11 +++++
>  kernel/trace/trace_printk.c         |    7 +++
>  kernel/trace/trace_stack.c          |    8 +++
>  kernel/trace/trace_stat.c           |    6 ++
>  kernel/trace/trace_uprobe.c         |   11 +++++
>  12 files changed, 173 insertions(+), 48 deletions(-)
> 
> Compared to the patch with the full wrapper:
> 
>  fs/tracefs/inode.c | 153 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 135 insertions(+), 18 deletions(-)
> 

Consolidating some of the functions did a little better on the deletion
front:

 fs/tracefs/inode.c                  |   42 ------------
 kernel/trace/ftrace.c               |   30 +++++++-
 kernel/trace/trace.c                |  122 +++++++++++++++++++++---------------
 kernel/trace/trace.h                |    1 
 kernel/trace/trace_dynevent.c       |    5 +
 kernel/trace/trace_events.c         |   41 ++++--------
 kernel/trace/trace_events_hist.c    |   11 +++
 kernel/trace/trace_events_trigger.c |    8 ++
 kernel/trace/trace_kprobe.c         |   11 +++
 kernel/trace/trace_printk.c         |    7 ++
 kernel/trace/trace_stack.c          |    8 ++
 kernel/trace/trace_stat.c           |    6 +
 kernel/trace/trace_uprobe.c         |   11 +++
 13 files changed, 181 insertions(+), 122 deletions(-)

I guess I can keep it this way. Thoughts?

-- Steve


diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 9fc14e38927f..eeeae0475da9 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -20,7 +20,6 @@
 #include <linux/parser.h>
 #include <linux/magic.h>
 #include <linux/slab.h>
-#include <linux/security.h>
 
 #define TRACEFS_DEFAULT_MODE	0700
 
@@ -28,25 +27,6 @@ static struct vfsmount *tracefs_mount;
 static int tracefs_mount_count;
 static bool tracefs_registered;
 
-static int default_open_file(struct inode *inode, struct file *filp)
-{
-	struct dentry *dentry = filp->f_path.dentry;
-	struct file_operations *real_fops;
-	int ret;
-
-	if (!dentry)
-		return -EINVAL;
-
-	ret = security_locked_down(LOCKDOWN_TRACEFS);
-	if (ret)
-		return ret;
-
-	real_fops = dentry->d_fsdata;
-	if (!real_fops->open)
-		return 0;
-	return real_fops->open(inode, filp);
-}
-
 static ssize_t default_read_file(struct file *file, char __user *buf,
 				 size_t count, loff_t *ppos)
 {
@@ -241,12 +221,6 @@ static int tracefs_apply_options(struct super_block *sb)
 	return 0;
 }
 
-static void tracefs_destroy_inode(struct inode *inode)
-{
-	if (S_ISREG(inode->i_mode))
-		kfree(inode->i_fop);
-}
-
 static int tracefs_remount(struct super_block *sb, int *flags, char *data)
 {
 	int err;
@@ -283,7 +257,6 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
 static const struct super_operations tracefs_super_operations = {
 	.statfs		= simple_statfs,
 	.remount_fs	= tracefs_remount,
-	.destroy_inode  = tracefs_destroy_inode,
 	.show_options	= tracefs_show_options,
 };
 
@@ -414,7 +387,6 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 				   struct dentry *parent, void *data,
 				   const struct file_operations *fops)
 {
-	struct file_operations *proxy_fops;
 	struct dentry *dentry;
 	struct inode *inode;
 
@@ -430,20 +402,8 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 	if (unlikely(!inode))
 		return failed_creating(dentry);
 
-	proxy_fops = kzalloc(sizeof(struct file_operations), GFP_KERNEL);
-	if (unlikely(!proxy_fops)) {
-		iput(inode);
-		return failed_creating(dentry);
-	}
-
-	if (!fops)
-		fops = &tracefs_file_operations;
-
-	dentry->d_fsdata = (void *)fops;
-	memcpy(proxy_fops, fops, sizeof(*proxy_fops));
-	proxy_fops->open = default_open_file;
 	inode->i_mode = mode;
-	inode->i_fop = proxy_fops;
+	inode->i_fop = fops ? fops : &tracefs_file_operations;
 	inode->i_private = data;
 	d_instantiate(dentry, inode);
 	fsnotify_create(dentry->d_parent->d_inode, dentry);
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 62a50bf399d6..7bcbfef9ab56 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -18,6 +18,7 @@
 #include <linux/clocksource.h>
 #include <linux/sched/task.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/seq_file.h>
 #include <linux/tracefs.h>
 #include <linux/hardirq.h>
@@ -3486,6 +3487,11 @@ static int
 ftrace_avail_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_iterator *iter;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	if (unlikely(ftrace_disabled))
 		return -ENODEV;
@@ -3504,6 +3510,11 @@ static int
 ftrace_enabled_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_iterator *iter;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	iter = __seq_open_private(file, &show_ftrace_seq_ops, sizeof(*iter));
 	if (!iter)
@@ -3540,7 +3551,11 @@ ftrace_regex_open(struct ftrace_ops *ops, int flag,
 	struct ftrace_hash *hash;
 	struct list_head *mod_head;
 	struct trace_array *tr = ops->private;
-	int ret = 0;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	ftrace_ops_init(ops);
 
@@ -3618,6 +3633,7 @@ ftrace_filter_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops,
 			FTRACE_ITER_FILTER | FTRACE_ITER_DO_PROBES,
 			inode, file);
@@ -3628,6 +3644,7 @@ ftrace_notrace_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops, FTRACE_ITER_NOTRACE,
 				 inode, file);
 }
@@ -5194,9 +5211,13 @@ static int
 __ftrace_graph_open(struct inode *inode, struct file *file,
 		    struct ftrace_graph_data *fgd)
 {
-	int ret = 0;
+	int ret;
 	struct ftrace_hash *new_hash = NULL;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (file->f_mode & FMODE_WRITE) {
 		const int size_bits = FTRACE_HASH_DEFAULT_BITS;
 
@@ -6537,8 +6558,9 @@ ftrace_pid_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret = 0;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	if ((file->f_mode & FMODE_WRITE) &&
 	    (file->f_flags & O_TRUNC))
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 252f79c435f8..d6e467281bbc 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -17,6 +17,7 @@
 #include <linux/stacktrace.h>
 #include <linux/writeback.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/seq_file.h>
 #include <linux/notifier.h>
 #include <linux/irqflags.h>
@@ -304,6 +305,23 @@ void trace_array_put(struct trace_array *this_tr)
 	mutex_unlock(&trace_types_lock);
 }
 
+int tracing_check_open_get_tr(struct trace_array *tr)
+{
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
+	if (tracing_disabled)
+		return -ENODEV;
+
+	if (tr && trace_array_get(tr) < 0)
+		return -ENODEV;
+
+	return 0;
+}
+
 int call_filter_check_discard(struct trace_event_call *call, void *rec,
 			      struct ring_buffer *buffer,
 			      struct ring_buffer_event *event)
@@ -4140,8 +4158,11 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
 
 int tracing_open_generic(struct inode *inode, struct file *filp)
 {
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
+
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	filp->private_data = inode->i_private;
 	return 0;
@@ -4159,12 +4180,11 @@ bool tracing_is_disabled(void)
 static int tracing_open_generic_tr(struct inode *inode, struct file *filp)
 {
 	struct trace_array *tr = inode->i_private;
+	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	filp->private_data = inode->i_private;
 
@@ -4233,10 +4253,11 @@ static int tracing_open(struct inode *inode, struct file *file)
 {
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
-	int ret = 0;
+	int ret;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	/* If this file was open for write, then erase contents */
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
@@ -4352,8 +4373,9 @@ static int show_traces_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	ret = seq_open(file, &show_traces_seq_ops);
 	if (ret)
@@ -4697,11 +4719,9 @@ static int tracing_trace_options_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	ret = single_open(file, tracing_trace_options_show, inode->i_private);
 	if (ret < 0)
@@ -5038,8 +5058,11 @@ static const struct seq_operations tracing_saved_tgids_seq_ops = {
 
 static int tracing_saved_tgids_open(struct inode *inode, struct file *filp)
 {
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
+
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	return seq_open(filp, &tracing_saved_tgids_seq_ops);
 }
@@ -5115,8 +5138,11 @@ static const struct seq_operations tracing_saved_cmdlines_seq_ops = {
 
 static int tracing_saved_cmdlines_open(struct inode *inode, struct file *filp)
 {
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
+
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	return seq_open(filp, &tracing_saved_cmdlines_seq_ops);
 }
@@ -5280,8 +5306,11 @@ static const struct seq_operations tracing_eval_map_seq_ops = {
 
 static int tracing_eval_map_open(struct inode *inode, struct file *filp)
 {
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
+
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
 
 	return seq_open(filp, &tracing_eval_map_seq_ops);
 }
@@ -5804,13 +5833,11 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
 {
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
-	int ret = 0;
-
-	if (tracing_disabled)
-		return -ENODEV;
+	int ret;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	mutex_lock(&trace_types_lock);
 
@@ -6547,11 +6574,9 @@ static int tracing_clock_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr))
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	ret = single_open(file, tracing_clock_show, inode->i_private);
 	if (ret < 0)
@@ -6581,11 +6606,9 @@ static int tracing_time_stamp_mode_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr))
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	ret = single_open(file, tracing_time_stamp_mode_show, inode->i_private);
 	if (ret < 0)
@@ -6638,10 +6661,11 @@ static int tracing_snapshot_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
 	struct seq_file *m;
-	int ret = 0;
+	int ret;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	if (file->f_mode & FMODE_READ) {
 		iter = __tracing_open(inode, file, true);
@@ -6786,6 +6810,7 @@ static int snapshot_raw_open(struct inode *inode, struct file *filp)
 	struct ftrace_buffer_info *info;
 	int ret;
 
+	/* The following checks for tracefs lockdown */
 	ret = tracing_buffers_open(inode, filp);
 	if (ret < 0)
 		return ret;
@@ -7105,8 +7130,9 @@ static int tracing_err_log_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret = 0;
 
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	/* If this file was opened for write, then erase contents */
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC))
@@ -7157,11 +7183,9 @@ static int tracing_buffers_open(struct inode *inode, struct file *filp)
 	struct ftrace_buffer_info *info;
 	int ret;
 
-	if (tracing_disabled)
-		return -ENODEV;
-
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
 
 	info = kzalloc(sizeof(*info), GFP_KERNEL);
 	if (!info) {
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index f801d154ff6a..fc41971306c1 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -338,6 +338,7 @@ extern struct mutex trace_types_lock;
 
 extern int trace_array_get(struct trace_array *tr);
 extern void trace_array_put(struct trace_array *tr);
+extern int tracing_check_open_get_tr(struct trace_array *tr);
 
 extern int tracing_set_time_stamp_abs(struct trace_array *tr, bool abs);
 extern int tracing_set_clock(struct trace_array *tr, const char *clockstr);
diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
index a41fed46c285..74df50375fd9 100644
--- a/kernel/trace/trace_dynevent.c
+++ b/kernel/trace/trace_dynevent.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2018 Masami Hiramatsu <mhiramat@kernel.org>
  */
 
+#include <linux/security.h>
 #include <linux/debugfs.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
@@ -174,6 +175,10 @@ static int dyn_event_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = tracing_check_open_get_tr(NULL);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(NULL);
 		if (ret < 0)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index b89cdfe20bc1..24a642489437 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -12,6 +12,7 @@
 #define pr_fmt(fmt) fmt
 
 #include <linux/workqueue.h>
+#include <linux/security.h>
 #include <linux/spinlock.h>
 #include <linux/kthread.h>
 #include <linux/tracefs.h>
@@ -1294,6 +1295,10 @@ static int trace_format_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	ret = seq_open(file, &trace_format_seq_ops);
 	if (ret < 0)
 		return ret;
@@ -1391,9 +1396,6 @@ static int subsystem_open(struct inode *inode, struct file *filp)
 	struct trace_array *tr;
 	int ret;
 
-	if (tracing_is_disabled())
-		return -ENODEV;
-
 	/* Make sure the system still exists */
 	mutex_lock(&event_mutex);
 	mutex_lock(&trace_types_lock);
@@ -1420,16 +1422,9 @@ static int subsystem_open(struct inode *inode, struct file *filp)
 	WARN_ON(!dir);
 
 	/* Still need to increment the ref count of the system */
-	if (trace_array_get(tr) < 0) {
-		put_system(dir);
-		return -ENODEV;
-	}
-
-	ret = tracing_open_generic(inode, filp);
-	if (ret < 0) {
-		trace_array_put(tr);
+	ret = tracing_open_generic_tr(inode, filp);
+	if (ret < 0)
 		put_system(dir);
-	}
 
 	return ret;
 }
@@ -1440,28 +1435,17 @@ static int system_tr_open(struct inode *inode, struct file *filp)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
-	if (tracing_is_disabled())
-		return -ENODEV;
-
-	if (trace_array_get(tr) < 0)
-		return -ENODEV;
-
 	/* Make a temporary dir that has no system but points to tr */
 	dir = kzalloc(sizeof(*dir), GFP_KERNEL);
-	if (!dir) {
-		trace_array_put(tr);
+	if (!dir)
 		return -ENOMEM;
-	}
-
-	dir->tr = tr;
 
-	ret = tracing_open_generic(inode, filp);
+	ret = tracing_open_generic_tr(inode, filp);
 	if (ret < 0) {
-		trace_array_put(tr);
 		kfree(dir);
 		return ret;
 	}
-
+	dir->tr = tr;
 	filp->private_data = dir;
 
 	return 0;
@@ -1771,6 +1755,10 @@ ftrace_event_open(struct inode *inode, struct file *file,
 	struct seq_file *m;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	ret = seq_open(file, seq_ops);
 	if (ret < 0)
 		return ret;
@@ -1795,6 +1783,7 @@ ftrace_event_avail_open(struct inode *inode, struct file *file)
 {
 	const struct seq_operations *seq_ops = &show_event_seq_ops;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_event_open(inode, file, seq_ops);
 }
 
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 9468bd8d44a2..71dfe6889d62 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -7,6 +7,7 @@
 
 #include <linux/module.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/stacktrace.h>
@@ -1448,6 +1449,10 @@ static int synth_events_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&synth_event_ops);
 		if (ret < 0)
@@ -5515,6 +5520,12 @@ static int hist_show(struct seq_file *m, void *v)
 
 static int event_hist_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return single_open(file, hist_show, file);
 }
 
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 2a2912cb4533..2cd53ca21b51 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2013 Tom Zanussi <tom.zanussi@linux.intel.com>
  */
 
+#include <linux/security.h>
 #include <linux/module.h>
 #include <linux/ctype.h>
 #include <linux/mutex.h>
@@ -173,7 +174,11 @@ static const struct seq_operations event_triggers_seq_ops = {
 
 static int event_trigger_regex_open(struct inode *inode, struct file *file)
 {
-	int ret = 0;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	mutex_lock(&event_mutex);
 
@@ -292,6 +297,7 @@ event_trigger_write(struct file *filp, const char __user *ubuf,
 static int
 event_trigger_open(struct inode *inode, struct file *filp)
 {
+	/* Checks for tracefs lockdown */
 	return event_trigger_regex_open(inode, filp);
 }
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 324ffbea3556..e4422746cb4c 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -7,6 +7,7 @@
  */
 #define pr_fmt(fmt)	"trace_kprobe: " fmt
 
+#include <linux/security.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
 #include <linux/rculist.h>
@@ -936,6 +937,10 @@ static int probes_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&trace_kprobe_ops);
 		if (ret < 0)
@@ -988,6 +993,12 @@ static const struct seq_operations profile_seq_op = {
 
 static int profile_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &profile_seq_op);
 }
 
diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c
index c3fd849d4a8f..d4e31e969206 100644
--- a/kernel/trace/trace_printk.c
+++ b/kernel/trace/trace_printk.c
@@ -6,6 +6,7 @@
  *
  */
 #include <linux/seq_file.h>
+#include <linux/security.h>
 #include <linux/uaccess.h>
 #include <linux/kernel.h>
 #include <linux/ftrace.h>
@@ -348,6 +349,12 @@ static const struct seq_operations show_format_seq_ops = {
 static int
 ftrace_formats_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &show_format_seq_ops);
 }
 
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index ec9a34a97129..4df9a209f7ca 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -5,6 +5,7 @@
  */
 #include <linux/sched/task_stack.h>
 #include <linux/stacktrace.h>
+#include <linux/security.h>
 #include <linux/kallsyms.h>
 #include <linux/seq_file.h>
 #include <linux/spinlock.h>
@@ -470,6 +471,12 @@ static const struct seq_operations stack_trace_seq_ops = {
 
 static int stack_trace_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &stack_trace_seq_ops);
 }
 
@@ -487,6 +494,7 @@ stack_trace_filter_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops, FTRACE_ITER_FILTER,
 				 inode, file);
 }
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 75bf1bcb4a8a..9ab0a1a7ad5e 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -9,7 +9,7 @@
  *
  */
 
-
+#include <linux/security.h>
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/rbtree.h>
@@ -238,6 +238,10 @@ static int tracing_stat_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	struct stat_session *session = inode->i_private;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	ret = stat_seq_init(session);
 	if (ret)
 		return ret;
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index dd884341f5c5..352073d36585 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -7,6 +7,7 @@
  */
 #define pr_fmt(fmt)	"trace_uprobe: " fmt
 
+#include <linux/security.h>
 #include <linux/ctype.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
@@ -769,6 +770,10 @@ static int probes_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&trace_uprobe_ops);
 		if (ret)
@@ -818,6 +823,12 @@ static const struct seq_operations profile_seq_op = {
 
 static int profile_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &profile_seq_op);
 }
 

^ permalink raw reply related

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Linus Torvalds @ 2019-10-11 20:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <20191011162518.2f8c99ca@gandalf.local.home>

On Fri, Oct 11, 2019 at 1:25 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> OK, so I tried this approach, and there's a bit more than just a "few"
> extra cases that use tracing_open_generic().

Yeah, that's more than I would have expected.

That said, you also did what I consider a somewhat over-done conversion.

Just do

   static inline bool tracefs_lockdown_or_disabled(void)
   { return tracing_disabled || security_locked_down(LOCKDOWN_TRACEFS); }

and ignore the pointless return value (we know it's EPERM, and we
really don't care).

And then several of those conversions just turn into oneliner

-       if (tracing_disabled)
+       if (tracefs_lockdown_or_disabled())
                 return -ENODEV;

patches instead.

I'm also not sure why you bothered with a lot of the files that don't
currently even have that "tracing_disabled" logic at all. Yeah, they
show pre-existing tracing info, but even if you locked down _after_
starting some tracing, that's probably what you want. You just don't
want people to be able to add new trace events.

For example, maybe you want to have lockdown after you've booted, but
still show boot time traces?

I dunno. I feel like you re-did the original patch, and the original
patch was designed for "least line impact" rather than for anything
else.

I also suspect that if you *actually* do lockdown at early boot (which
is presumably one common case), then all you need is to do

    --- a/fs/tracefs/inode.c
    +++ b/fs/tracefs/inode.c
    @@ -418,6 +418,9 @@ struct dentry *tracefs_create_file(
        struct dentry *dentry;
        struct inode *inode;

    +   if (security_locked_down(LOCKDOWN_TRACEFS))
    +           return NULL;
    +
        if (!(mode & S_IFMT))
                mode |= S_IFREG;
        BUG_ON(!S_ISREG(mode));

and the "open-time check" is purely for "we did lockdown _after_ boot,
but then you might well want to be able to read the existing trace
information that you did before you locked down.

Again - I think trying to emulate exactly what that broken lockdown
patch did is not really what you should aim for.

That patch was not only buggy, it really wasn't about what people
really necessarily _want_, as about "we don't want to deal with
tracing, so here's a minimal patch that doesn't even work".

          Linus

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Steven Rostedt @ 2019-10-11 20:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <CAHk-=wj7fGPKUspr579Cii-w_y60PtRaiDgKuxVtBAMK0VNNkA@mail.gmail.com>

On Fri, 11 Oct 2019 11:20:30 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> So from a quick glance, just making tracing_open_generic() do the
> lockdown testing will take care of most cases.  Add a few other cases
> to fill up the whole set, and your'e done.
> 
> Willing to do that instead?

OK, so I tried this approach, and there's a bit more than just a "few"
extra cases that use tracing_open_generic(). Below is the full patch.

Here's the diffstat:

 fs/tracefs/inode.c                  |   42 --------------------
 kernel/trace/ftrace.c               |   29 +++++++++++++-
 kernel/trace/trace.c                |   73 ++++++++++++++++++++++++++++++++++--
 kernel/trace/trace_dynevent.c       |    5 ++
 kernel/trace/trace_events.c         |   10 ++++
 kernel/trace/trace_events_hist.c    |   11 +++++
 kernel/trace/trace_events_trigger.c |    8 +++
 kernel/trace/trace_kprobe.c         |   11 +++++
 kernel/trace/trace_printk.c         |    7 +++
 kernel/trace/trace_stack.c          |    8 +++
 kernel/trace/trace_stat.c           |    6 ++
 kernel/trace/trace_uprobe.c         |   11 +++++
 12 files changed, 173 insertions(+), 48 deletions(-)

Compared to the patch with the full wrapper:

 fs/tracefs/inode.c | 153 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 135 insertions(+), 18 deletions(-)

I'm thinking that the full wrapper may not be as bad. But then again, I
could probably clean up some of this code and have a single check for
both the lockdown and the "tracing_disabled" check.

-- Steve

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 9fc14e38927f..eeeae0475da9 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -20,7 +20,6 @@
 #include <linux/parser.h>
 #include <linux/magic.h>
 #include <linux/slab.h>
-#include <linux/security.h>
 
 #define TRACEFS_DEFAULT_MODE	0700
 
@@ -28,25 +27,6 @@ static struct vfsmount *tracefs_mount;
 static int tracefs_mount_count;
 static bool tracefs_registered;
 
-static int default_open_file(struct inode *inode, struct file *filp)
-{
-	struct dentry *dentry = filp->f_path.dentry;
-	struct file_operations *real_fops;
-	int ret;
-
-	if (!dentry)
-		return -EINVAL;
-
-	ret = security_locked_down(LOCKDOWN_TRACEFS);
-	if (ret)
-		return ret;
-
-	real_fops = dentry->d_fsdata;
-	if (!real_fops->open)
-		return 0;
-	return real_fops->open(inode, filp);
-}
-
 static ssize_t default_read_file(struct file *file, char __user *buf,
 				 size_t count, loff_t *ppos)
 {
@@ -241,12 +221,6 @@ static int tracefs_apply_options(struct super_block *sb)
 	return 0;
 }
 
-static void tracefs_destroy_inode(struct inode *inode)
-{
-	if (S_ISREG(inode->i_mode))
-		kfree(inode->i_fop);
-}
-
 static int tracefs_remount(struct super_block *sb, int *flags, char *data)
 {
 	int err;
@@ -283,7 +257,6 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
 static const struct super_operations tracefs_super_operations = {
 	.statfs		= simple_statfs,
 	.remount_fs	= tracefs_remount,
-	.destroy_inode  = tracefs_destroy_inode,
 	.show_options	= tracefs_show_options,
 };
 
@@ -414,7 +387,6 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 				   struct dentry *parent, void *data,
 				   const struct file_operations *fops)
 {
-	struct file_operations *proxy_fops;
 	struct dentry *dentry;
 	struct inode *inode;
 
@@ -430,20 +402,8 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 	if (unlikely(!inode))
 		return failed_creating(dentry);
 
-	proxy_fops = kzalloc(sizeof(struct file_operations), GFP_KERNEL);
-	if (unlikely(!proxy_fops)) {
-		iput(inode);
-		return failed_creating(dentry);
-	}
-
-	if (!fops)
-		fops = &tracefs_file_operations;
-
-	dentry->d_fsdata = (void *)fops;
-	memcpy(proxy_fops, fops, sizeof(*proxy_fops));
-	proxy_fops->open = default_open_file;
 	inode->i_mode = mode;
-	inode->i_fop = proxy_fops;
+	inode->i_fop = fops ? fops : &tracefs_file_operations;
 	inode->i_private = data;
 	d_instantiate(dentry, inode);
 	fsnotify_create(dentry->d_parent->d_inode, dentry);
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 62a50bf399d6..326253b4de93 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -18,6 +18,7 @@
 #include <linux/clocksource.h>
 #include <linux/sched/task.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/seq_file.h>
 #include <linux/tracefs.h>
 #include <linux/hardirq.h>
@@ -3486,6 +3487,11 @@ static int
 ftrace_avail_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_iterator *iter;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	if (unlikely(ftrace_disabled))
 		return -ENODEV;
@@ -3504,6 +3510,11 @@ static int
 ftrace_enabled_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_iterator *iter;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	iter = __seq_open_private(file, &show_ftrace_seq_ops, sizeof(*iter));
 	if (!iter)
@@ -3540,7 +3551,11 @@ ftrace_regex_open(struct ftrace_ops *ops, int flag,
 	struct ftrace_hash *hash;
 	struct list_head *mod_head;
 	struct trace_array *tr = ops->private;
-	int ret = 0;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	ftrace_ops_init(ops);
 
@@ -3618,6 +3633,7 @@ ftrace_filter_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops,
 			FTRACE_ITER_FILTER | FTRACE_ITER_DO_PROBES,
 			inode, file);
@@ -3628,6 +3644,7 @@ ftrace_notrace_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops, FTRACE_ITER_NOTRACE,
 				 inode, file);
 }
@@ -5194,9 +5211,13 @@ static int
 __ftrace_graph_open(struct inode *inode, struct file *file,
 		    struct ftrace_graph_data *fgd)
 {
-	int ret = 0;
+	int ret;
 	struct ftrace_hash *new_hash = NULL;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (file->f_mode & FMODE_WRITE) {
 		const int size_bits = FTRACE_HASH_DEFAULT_BITS;
 
@@ -6537,6 +6558,10 @@ ftrace_pid_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret = 0;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (trace_array_get(tr) < 0)
 		return -ENODEV;
 
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 252f79c435f8..4be1be27d064 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -17,6 +17,7 @@
 #include <linux/stacktrace.h>
 #include <linux/writeback.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/seq_file.h>
 #include <linux/notifier.h>
 #include <linux/irqflags.h>
@@ -4140,6 +4141,12 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
 
 int tracing_open_generic(struct inode *inode, struct file *filp)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -4159,6 +4166,11 @@ bool tracing_is_disabled(void)
 static int tracing_open_generic_tr(struct inode *inode, struct file *filp)
 {
 	struct trace_array *tr = inode->i_private;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	if (tracing_disabled)
 		return -ENODEV;
@@ -4233,7 +4245,11 @@ static int tracing_open(struct inode *inode, struct file *file)
 {
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
-	int ret = 0;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	if (trace_array_get(tr) < 0)
 		return -ENODEV;
@@ -4352,6 +4368,10 @@ static int show_traces_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -4697,6 +4717,10 @@ static int tracing_trace_options_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -5038,6 +5062,12 @@ static const struct seq_operations tracing_saved_tgids_seq_ops = {
 
 static int tracing_saved_tgids_open(struct inode *inode, struct file *filp)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -5115,6 +5145,12 @@ static const struct seq_operations tracing_saved_cmdlines_seq_ops = {
 
 static int tracing_saved_cmdlines_open(struct inode *inode, struct file *filp)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -5280,6 +5316,12 @@ static const struct seq_operations tracing_eval_map_seq_ops = {
 
 static int tracing_eval_map_open(struct inode *inode, struct file *filp)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -5804,7 +5846,11 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
 {
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
-	int ret = 0;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	if (tracing_disabled)
 		return -ENODEV;
@@ -6547,6 +6593,10 @@ static int tracing_clock_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -6581,6 +6631,10 @@ static int tracing_time_stamp_mode_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
@@ -6638,7 +6692,11 @@ static int tracing_snapshot_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	struct trace_iterator *iter;
 	struct seq_file *m;
-	int ret = 0;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	if (trace_array_get(tr) < 0)
 		return -ENODEV;
@@ -6786,6 +6844,7 @@ static int snapshot_raw_open(struct inode *inode, struct file *filp)
 	struct ftrace_buffer_info *info;
 	int ret;
 
+	/* The following checks for tracefs lockdown */
 	ret = tracing_buffers_open(inode, filp);
 	if (ret < 0)
 		return ret;
@@ -7105,6 +7164,10 @@ static int tracing_err_log_open(struct inode *inode, struct file *file)
 	struct trace_array *tr = inode->i_private;
 	int ret = 0;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (trace_array_get(tr) < 0)
 		return -ENODEV;
 
@@ -7157,6 +7220,10 @@ static int tracing_buffers_open(struct inode *inode, struct file *filp)
 	struct ftrace_buffer_info *info;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if (tracing_disabled)
 		return -ENODEV;
 
diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
index a41fed46c285..7a731416bbdd 100644
--- a/kernel/trace/trace_dynevent.c
+++ b/kernel/trace/trace_dynevent.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2018 Masami Hiramatsu <mhiramat@kernel.org>
  */
 
+#include <linux/security.h>
 #include <linux/debugfs.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
@@ -174,6 +175,10 @@ static int dyn_event_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(NULL);
 		if (ret < 0)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index b89cdfe20bc1..cc02257fd6df 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -12,6 +12,7 @@
 #define pr_fmt(fmt) fmt
 
 #include <linux/workqueue.h>
+#include <linux/security.h>
 #include <linux/spinlock.h>
 #include <linux/kthread.h>
 #include <linux/tracefs.h>
@@ -1294,6 +1295,10 @@ static int trace_format_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	ret = seq_open(file, &trace_format_seq_ops);
 	if (ret < 0)
 		return ret;
@@ -1771,6 +1776,10 @@ ftrace_event_open(struct inode *inode, struct file *file,
 	struct seq_file *m;
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	ret = seq_open(file, seq_ops);
 	if (ret < 0)
 		return ret;
@@ -1795,6 +1804,7 @@ ftrace_event_avail_open(struct inode *inode, struct file *file)
 {
 	const struct seq_operations *seq_ops = &show_event_seq_ops;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_event_open(inode, file, seq_ops);
 }
 
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 9468bd8d44a2..71dfe6889d62 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -7,6 +7,7 @@
 
 #include <linux/module.h>
 #include <linux/kallsyms.h>
+#include <linux/security.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/stacktrace.h>
@@ -1448,6 +1449,10 @@ static int synth_events_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&synth_event_ops);
 		if (ret < 0)
@@ -5515,6 +5520,12 @@ static int hist_show(struct seq_file *m, void *v)
 
 static int event_hist_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return single_open(file, hist_show, file);
 }
 
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index 2a2912cb4533..2cd53ca21b51 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2013 Tom Zanussi <tom.zanussi@linux.intel.com>
  */
 
+#include <linux/security.h>
 #include <linux/module.h>
 #include <linux/ctype.h>
 #include <linux/mutex.h>
@@ -173,7 +174,11 @@ static const struct seq_operations event_triggers_seq_ops = {
 
 static int event_trigger_regex_open(struct inode *inode, struct file *file)
 {
-	int ret = 0;
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
 
 	mutex_lock(&event_mutex);
 
@@ -292,6 +297,7 @@ event_trigger_write(struct file *filp, const char __user *ubuf,
 static int
 event_trigger_open(struct inode *inode, struct file *filp)
 {
+	/* Checks for tracefs lockdown */
 	return event_trigger_regex_open(inode, filp);
 }
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 324ffbea3556..e4422746cb4c 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -7,6 +7,7 @@
  */
 #define pr_fmt(fmt)	"trace_kprobe: " fmt
 
+#include <linux/security.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
 #include <linux/rculist.h>
@@ -936,6 +937,10 @@ static int probes_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&trace_kprobe_ops);
 		if (ret < 0)
@@ -988,6 +993,12 @@ static const struct seq_operations profile_seq_op = {
 
 static int profile_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &profile_seq_op);
 }
 
diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c
index c3fd849d4a8f..d4e31e969206 100644
--- a/kernel/trace/trace_printk.c
+++ b/kernel/trace/trace_printk.c
@@ -6,6 +6,7 @@
  *
  */
 #include <linux/seq_file.h>
+#include <linux/security.h>
 #include <linux/uaccess.h>
 #include <linux/kernel.h>
 #include <linux/ftrace.h>
@@ -348,6 +349,12 @@ static const struct seq_operations show_format_seq_ops = {
 static int
 ftrace_formats_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &show_format_seq_ops);
 }
 
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index ec9a34a97129..4df9a209f7ca 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -5,6 +5,7 @@
  */
 #include <linux/sched/task_stack.h>
 #include <linux/stacktrace.h>
+#include <linux/security.h>
 #include <linux/kallsyms.h>
 #include <linux/seq_file.h>
 #include <linux/spinlock.h>
@@ -470,6 +471,12 @@ static const struct seq_operations stack_trace_seq_ops = {
 
 static int stack_trace_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &stack_trace_seq_ops);
 }
 
@@ -487,6 +494,7 @@ stack_trace_filter_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_ops *ops = inode->i_private;
 
+	/* Checks for tracefs lockdown */
 	return ftrace_regex_open(ops, FTRACE_ITER_FILTER,
 				 inode, file);
 }
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 75bf1bcb4a8a..9ab0a1a7ad5e 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -9,7 +9,7 @@
  *
  */
 
-
+#include <linux/security.h>
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/rbtree.h>
@@ -238,6 +238,10 @@ static int tracing_stat_open(struct inode *inode, struct file *file)
 	struct seq_file *m;
 	struct stat_session *session = inode->i_private;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	ret = stat_seq_init(session);
 	if (ret)
 		return ret;
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index dd884341f5c5..352073d36585 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -7,6 +7,7 @@
  */
 #define pr_fmt(fmt)	"trace_uprobe: " fmt
 
+#include <linux/security.h>
 #include <linux/ctype.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
@@ -769,6 +770,10 @@ static int probes_open(struct inode *inode, struct file *file)
 {
 	int ret;
 
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
 		ret = dyn_events_release_all(&trace_uprobe_ops);
 		if (ret)
@@ -818,6 +823,12 @@ static const struct seq_operations profile_seq_op = {
 
 static int profile_open(struct inode *inode, struct file *file)
 {
+	int ret;
+
+	ret = security_locked_down(LOCKDOWN_TRACEFS);
+	if (ret)
+		return ret;
+
 	return seq_open(file, &profile_seq_op);
 }
 

^ permalink raw reply related

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Ben Hutchings @ 2019-10-11 19:50 UTC (permalink / raw)
  To: Steven Rostedt, Linus Torvalds
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Al Viro
In-Reply-To: <20191011143610.21bcd9c0@gandalf.local.home>

[-- Attachment #1: Type: text/plain, Size: 724 bytes --]

On Fri, 2019-10-11 at 14:36 -0400, Steven Rostedt wrote:
> On Fri, 11 Oct 2019 11:20:30 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > Willing to do that instead?
> 
> Honestly, what you described was my preferred solution ;-)
> 
> I just didn't want to upset the lockdown crowd if a new tracefs file
> was opened without doing this.
> 
> Once locked down is set, can it ever be undone without rebooting?
[...]

Earlier versions of the lockdown patch set added a magic SysRq command
to turn it off.  That's not currently present upstream but there may be
plans to add it.

Ben.

-- 
Ben Hutchings
It is easier to change the specification to fit the program
than vice versa.



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Linus Torvalds @ 2019-10-11 19:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <20191011143610.21bcd9c0@gandalf.local.home>

On Fri, Oct 11, 2019 at 11:36 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 11 Oct 2019 11:20:30 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > Willing to do that instead?
>
> Honestly, what you described was my preferred solution ;-)
>
> I just didn't want to upset the lockdown crowd if a new tracefs file
> was opened without doing this.

Well, since they introduced a bug in your code that killed your
machine with the patch _they_ did, I don't think they get to complain
when you fix it the way you (and me) want to...

Fair is fair.

             Linus

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Steven Rostedt @ 2019-10-11 18:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <CAHk-=wj7fGPKUspr579Cii-w_y60PtRaiDgKuxVtBAMK0VNNkA@mail.gmail.com>

On Fri, 11 Oct 2019 11:20:30 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Willing to do that instead?

Honestly, what you described was my preferred solution ;-)

I just didn't want to upset the lockdown crowd if a new tracefs file
was opened without doing this.

Once locked down is set, can it ever be undone without rebooting? If
not, a lockdown call could also trigger setting tracing_disabled to 1.
Which is much stronger, as that was the code we added to kill tracing
if anything abnormal was detected (and it does a hard shutdown of all
the tracing utilities).

It's set to one on bootup and cleared, after tracing is initialized.
But it is never cleared again. If lockdown can be enabled at bootup, we
could simply not clear it, and we can have something to allow lockdown
to set it as well.

Currently, the only places tracing_disabled gets set is in the self
tests and if the ring buffer gets corrupted.

-- Steve

^ permalink raw reply

* Re: [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Linus Torvalds @ 2019-10-11 18:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Matthew Garrett, James Morris James Morris, LSM List,
	Linux API, Ben Hutchings, Al Viro
In-Reply-To: <20191011135458.7399da44@gandalf.local.home>

On Fri, Oct 11, 2019 at 10:55 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I bisected this down to the addition of the proxy_ops into tracefs for
> lockdown. It appears that the allocation of the proxy_ops and then freeing
> it in the destroy_inode callback, is causing havoc with the memory system.
> Reading the documentation about destroy_inode, I'm not sure that this is the
> proper way to handle allocating and then freeing the fops of the inode.

What is happening is that *because* it has a "destroy_inode()"
callback, it now expects destroy_inode to not just free free the proxy
ops, but to also schedule the inode itself for freeing.

Which that tracefs)destroy_inode() doesn't do.

So the inode never gets free'd at all - and you eventually run out of
memory due to the inode leak.

The trivial fix would be to instead use the "free_inode()" callback
(which is called after the required RCU grace period) and then free
the proxy op there _and_ call free_inode_nonrcu() too.

But I think your patch to not need a proxy op allocation at all is
probably better.

That said, I think the _best_ option would be to just getting rid of
the proxy fops _entirely_, and just make the (very few)
tracefs_create_file() cases do that LOCKDOWN_TRACEFS in their open in
the first place.

The proxy_fops was a hacky attempt to make the patch smaller. Your
"just wrap all ops" thing now made the patch bigger than just doing
the lockdown thing in all the callers.

In fact, many of them use tracing_open_generic(). And others - like
subsystem_open() already call tracing_open_generic() as part of their
open.

So from a quick glance, just making tracing_open_generic() do the
lockdown testing will take care of most cases.  Add a few other cases
to fill up the whole set, and your'e done.

Willing to do that instead?

                Linus

^ permalink raw reply

* [PATCH] tracefs: Do not allocate and free proxy_ops for lockdown
From: Steven Rostedt @ 2019-10-11 17:54 UTC (permalink / raw)
  To: LKML
  Cc: Matthew Garrett, jmorris, linux-security-module, linux-api,
	Ben Hutchings, Al Viro, Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 7978 bytes --]


[ Attached the reproducers to this email ]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Running the latest kernel through my "make instances" stress tests, I
triggered the following bug (with KASAN and kmemleak enabled):

mkdir invoked oom-killer:
gfp_mask=0x40cd0(GFP_KERNEL|__GFP_COMP|__GFP_RECLAIMABLE), order=0,
oom_score_adj=0
CPU: 1 PID: 2229 Comm: mkdir Not tainted 5.4.0-rc2-test #325
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
Call Trace:
 dump_stack+0x64/0x8c
 dump_header+0x43/0x3b7
 ? trace_hardirqs_on+0x48/0x4a
 oom_kill_process+0x68/0x2d5
 out_of_memory+0x2aa/0x2d0
 __alloc_pages_nodemask+0x96d/0xb67
 __alloc_pages_node+0x19/0x1e
 alloc_slab_page+0x17/0x45
 new_slab+0xd0/0x234
 ___slab_alloc.constprop.86+0x18f/0x336
 ? alloc_inode+0x2c/0x74
 ? irq_trace+0x12/0x1e
 ? tracer_hardirqs_off+0x1d/0xd7
 ? __slab_alloc.constprop.85+0x21/0x53
 __slab_alloc.constprop.85+0x31/0x53
 ? __slab_alloc.constprop.85+0x31/0x53
 ? alloc_inode+0x2c/0x74
 kmem_cache_alloc+0x50/0x179
 ? alloc_inode+0x2c/0x74
 alloc_inode+0x2c/0x74
 new_inode_pseudo+0xf/0x48
 new_inode+0x15/0x25
 tracefs_get_inode+0x23/0x7c
 ? lookup_one_len+0x54/0x6c
 tracefs_create_file+0x53/0x11d
 trace_create_file+0x15/0x33
 event_create_dir+0x2a3/0x34b
 __trace_add_new_event+0x1c/0x26
 event_trace_add_tracer+0x56/0x86
 trace_array_create+0x13e/0x1e1
 instance_mkdir+0x8/0x17
 tracefs_syscall_mkdir+0x39/0x50
 ? get_dname+0x31/0x31
 vfs_mkdir+0x78/0xa3
 do_mkdirat+0x71/0xb0
 sys_mkdir+0x19/0x1b
 do_fast_syscall_32+0xb0/0xed

I bisected this down to the addition of the proxy_ops into tracefs for
lockdown. It appears that the allocation of the proxy_ops and then freeing
it in the destroy_inode callback, is causing havoc with the memory system.
Reading the documentation about destroy_inode, I'm not sure that this is the
proper way to handle allocating and then freeing the fops of the inode.

Instead of allocating the proxy_ops (and then having to free it), I created
a static proxy_ops. As tracefs only uses a subset of all the file_operations
methods, that subset can be defined in the static proxy_ops, and then the
passed in fops during the creation of the inode is saved in the dentry, and
that is use to call the real functions by the proxy_ops.

Fixes: ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 fs/tracefs/inode.c | 153 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 135 insertions(+), 18 deletions(-)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 9fc14e38927f..d0e8e4a16812 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -20,6 +20,7 @@
 #include <linux/parser.h>
 #include <linux/magic.h>
 #include <linux/slab.h>
+#include <linux/poll.h>
 #include <linux/security.h>
 
 #define TRACEFS_DEFAULT_MODE	0700
@@ -28,7 +29,7 @@ static struct vfsmount *tracefs_mount;
 static int tracefs_mount_count;
 static bool tracefs_registered;
 
-static int default_open_file(struct inode *inode, struct file *filp)
+static int proxy_open(struct inode *inode, struct file *filp)
 {
 	struct dentry *dentry = filp->f_path.dentry;
 	struct file_operations *real_fops;
@@ -47,6 +48,138 @@ static int default_open_file(struct inode *inode, struct file *filp)
 	return real_fops->open(inode, filp);
 }
 
+static ssize_t proxy_read(struct file *file, char __user *buf,
+			  size_t count, loff_t *pos)
+{
+	struct dentry *dentry = file->f_path.dentry;
+	struct file_operations *real_fops;
+
+	if (!dentry)
+		return -EINVAL;
+
+	real_fops = dentry->d_fsdata;
+
+	if (real_fops->read)
+		return real_fops->read(file, buf, count, pos);
+	else
+		return -EINVAL;
+}
+
+static ssize_t proxy_write(struct file *file, const char __user *p,
+			   size_t count, loff_t *pos)
+{
+	struct dentry *dentry = file->f_path.dentry;
+	struct file_operations *real_fops;
+
+	if (!dentry)
+		return -EINVAL;
+
+	real_fops = dentry->d_fsdata;
+
+	if (real_fops->write)
+		return real_fops->write(file, p, count, pos);
+	else
+		return -EINVAL;
+}
+
+static loff_t proxy_llseek(struct file *file, loff_t offset, int whence)
+{
+	struct dentry *dentry = file->f_path.dentry;
+	struct file_operations *real_fops;
+	loff_t (*fn)(struct file *, loff_t, int);
+
+	if (!dentry)
+		return -EINVAL;
+
+	real_fops = dentry->d_fsdata;
+
+	fn = no_llseek;
+	if (file->f_mode & FMODE_LSEEK) {
+		if (real_fops->llseek)
+			fn = real_fops->llseek;
+	}
+	return fn(file, offset, whence);
+}
+
+static int proxy_release(struct inode *inode, struct file *filp)
+{
+	struct dentry *dentry = filp->f_path.dentry;
+	struct file_operations *real_fops;
+
+	if (!dentry)
+		return 0;
+
+	real_fops = dentry->d_fsdata;
+
+	if (real_fops->release)
+		return real_fops->release(inode, filp);
+	return 0;
+}
+
+static __poll_t proxy_poll(struct file *file, struct poll_table_struct *pt)
+{
+	struct dentry *dentry = file->f_path.dentry;
+	struct file_operations *real_fops;
+
+	if (!dentry)
+		return 0;
+
+	real_fops = dentry->d_fsdata;
+
+	if (unlikely(!real_fops->poll))
+		return DEFAULT_POLLMASK;
+	return real_fops->poll(file, pt);
+}
+
+static ssize_t proxy_splice_read(struct file *in, loff_t *ppos,
+				 struct pipe_inode_info *pipe, size_t len,
+				 unsigned int flags)
+{
+	struct dentry *dentry = in->f_path.dentry;
+	struct file_operations *real_fops;
+	ssize_t (*splice_read)(struct file *, loff_t *,
+			       struct pipe_inode_info *, size_t, unsigned int);
+
+	if (!dentry)
+		return 0;
+
+	real_fops = dentry->d_fsdata;
+
+	if (real_fops->splice_read)
+		splice_read = real_fops->splice_read;
+	else
+		splice_read = generic_file_splice_read;
+
+	return splice_read(in, ppos, pipe, len, flags);
+}
+
+static int proxy_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct dentry *dentry = file->f_path.dentry;
+	struct file_operations *real_fops;
+
+	if (!dentry)
+		return 0;
+
+	real_fops = dentry->d_fsdata;
+
+	if (!real_fops->mmap)
+		return -ENODEV;
+
+	return real_fops->mmap(file, vma);
+}
+
+static const struct file_operations proxy_fops = {
+	.open		= proxy_open,
+	.read		= proxy_read,
+	.write		= proxy_write,
+	.llseek		= proxy_llseek,
+	.release	= proxy_release,
+	.poll		= proxy_poll,
+	.splice_read	= proxy_splice_read,
+	.mmap		= proxy_mmap,
+};
+
 static ssize_t default_read_file(struct file *file, char __user *buf,
 				 size_t count, loff_t *ppos)
 {
@@ -241,12 +374,6 @@ static int tracefs_apply_options(struct super_block *sb)
 	return 0;
 }
 
-static void tracefs_destroy_inode(struct inode *inode)
-{
-	if (S_ISREG(inode->i_mode))
-		kfree(inode->i_fop);
-}
-
 static int tracefs_remount(struct super_block *sb, int *flags, char *data)
 {
 	int err;
@@ -283,7 +410,6 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
 static const struct super_operations tracefs_super_operations = {
 	.statfs		= simple_statfs,
 	.remount_fs	= tracefs_remount,
-	.destroy_inode  = tracefs_destroy_inode,
 	.show_options	= tracefs_show_options,
 };
 
@@ -414,7 +540,6 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 				   struct dentry *parent, void *data,
 				   const struct file_operations *fops)
 {
-	struct file_operations *proxy_fops;
 	struct dentry *dentry;
 	struct inode *inode;
 
@@ -430,20 +555,12 @@ struct dentry *tracefs_create_file(const char *name, umode_t mode,
 	if (unlikely(!inode))
 		return failed_creating(dentry);
 
-	proxy_fops = kzalloc(sizeof(struct file_operations), GFP_KERNEL);
-	if (unlikely(!proxy_fops)) {
-		iput(inode);
-		return failed_creating(dentry);
-	}
-
 	if (!fops)
 		fops = &tracefs_file_operations;
 
 	dentry->d_fsdata = (void *)fops;
-	memcpy(proxy_fops, fops, sizeof(*proxy_fops));
-	proxy_fops->open = default_open_file;
 	inode->i_mode = mode;
-	inode->i_fop = proxy_fops;
+	inode->i_fop = &proxy_fops;
 	inode->i_private = data;
 	d_instantiate(dentry, inode);
 	fsnotify_create(dentry->d_parent->d_inode, dentry);
-- 
2.20.1


[-- Attachment #2: ftrace-test-mkinstances --]
[-- Type: application/octet-stream, Size: 1000 bytes --]

#!/bin/bash

tracefs=`cat /proc/mounts  |grep tracefs| head -1 | cut -d' ' -f2`

if [ -z "$tracefs" ]; then
	echo "tracefs not mounted"
	exit 0
fi

if [ ! -d $tracefs/instances ]; then
	echo "No instances directory"
	exit 0
fi

cd $tracefs/instances

mkdir x
rmdir x
result=$?

if [ $result -ne 0 ]; then
	echo "instance rmdir not supported, skipping this test"
	exit 0
fi

instance_slam() {
	while :; do
		mkdir x
		mkdir y
		mkdir z
		rmdir x
		rmdir y
		rmdir z
	done 2>/dev/null
}

instance_slam &
p1=$!
echo $p1

instance_slam &
p2=$!
echo $p2

instance_slam &
p3=$!
echo $p3

instance_slam &
p4=$!
echo $p4

instance_slam &
p5=$!
echo $p5

for i in `seq 10`; do
	ls
	sleep 1
done
kill -1 $p1 
kill -1 $p2
kill -1 $p3
kill -1 $p4
kill -1 $p5

echo "Wait for processes to finish"
wait $p1 $p2 $p3 $p4 $p5
echo "all processes finished, wait for cleanup"
sleep 2

mkdir x y z
ls x y z
rmdir x y z
for d in x y z; do
	if [ -d $d ]; then
		echo $d still exists
		exit -1
	fi
done
echo SUCCESS
exit 0

[-- Attachment #3: ftrace-test-mkinstances-2 --]
[-- Type: application/octet-stream, Size: 890 bytes --]

#!/bin/bash

tracefs=`cat /proc/mounts  |grep tracefs| head -1 | cut -d' ' -f2`

if [ -z "$tracefs" ]; then
	echo "tracefs not mounted"
	exit 0
fi

if [ ! -d $tracefs/instances ]; then
	echo "No instances directory"
	exit 0
fi

cd $tracefs/instances

instance_slam() {
	while :; do
		mkdir foo &> /dev/null
		rmdir foo &> /dev/null
	done
}

instance_read() {
	while :; do
		cat foo/trace &> /dev/null
	done
}

instance_set() {
	while :; do
		echo 1 > foo/events/sched/sched_switch
	done 2> /dev/null
}

instance_slam &
x=`jobs -l`
p1=`echo $x | cut -d' ' -f2`
echo $p1

instance_set &
x=`jobs -l | tail -1`
p2=`echo $x | cut -d' ' -f2`
echo $p2

sleep 10

kill -1 $p1 
kill -1 $p2

echo "Wait for processes to finish"
wait $p1 $p2
echo "all processes finished, wait for cleanup"
sleep 2

mkdir foo
ls foo
rmdir foo
if [ -d foo ]; then
	echo foo still exists
	exit -1
fi
echo SUCCESS
exit 0

^ permalink raw reply related

* Re: [PATCH] perf_event: Add support for LSM and SELinux checks
From: James Morris @ 2019-10-11 16:35 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, Peter Zijlstra, rostedt, primiano, rsavitski, jeffv,
	kernel-team, Alexei Starovoitov, Arnaldo Carvalho de Melo, bpf,
	Daniel Borkmann, Ingo Molnar, Jiri Olsa, Kees Cook,
	linux-security-module, Matthew Garrett, Namhyung Kim, selinux,
	Song Liu, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Yonghong Song
In-Reply-To: <20191011160330.199604-1-joel@joelfernandes.org>

On Fri, 11 Oct 2019, Joel Fernandes (Google) wrote:

> This patch adds LSM and SELinux access checking which will be used in
> Android to access perf_event_open(2) for the purposes of attaching BPF
> programs to tracepoints, perf profiling and other operations from
> userspace. These operations are intended for production systems.


Acked-by: James Morris <jmorris@namei.org>

-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply

* [PATCH] perf_event: Add support for LSM and SELinux checks
From: Joel Fernandes (Google) @ 2019-10-11 16:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google), Peter Zijlstra, rostedt, primiano,
	rsavitski, jeffv, kernel-team, Alexei Starovoitov,
	Arnaldo Carvalho de Melo, bpf, Daniel Borkmann, Ingo Molnar,
	James Morris, Jiri Olsa, Kees Cook, linux-security-module,
	Matthew Garrett, Namhyung Kim, selinux, Song Liu,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Yonghong Song

In currentl mainline, the degree of access to perf_event_open(2) system
call depends on the perf_event_paranoid sysctl.  This has a number of
limitations:

1. The sysctl is only a single value. Many types of accesses are controlled
   based on the single value thus making the control very limited and
   coarse grained.
2. The sysctl is global, so if the sysctl is changed, then that means
   all processes get access to perf_event_open(2) opening the door to
   security issues.

This patch adds LSM and SELinux access checking which will be used in
Android to access perf_event_open(2) for the purposes of attaching BPF
programs to tracepoints, perf profiling and other operations from
userspace. These operations are intended for production systems.

5 new LSM hooks are added:
1. perf_event_open: This controls access during the perf_event_open(2)
   syscall itself. The hook is called from all the places that the
   perf_event_paranoid sysctl is checked to keep it consistent with the
   systctl. The hook gets passed a 'type' argument which controls CPU,
   kernel and tracepoint accesses (in this context, CPU, kernel and
   tracepoint have the same semantics as the perf_event_paranoid sysctl).
   Additionally, I added an 'open' type which is similar to
   perf_event_paranoid sysctl == 3 patch carried in Android and several other
   distros but was rejected in mainline [1] in 2016.

2. perf_event_alloc: This allocates a new security object for the event
   which stores the current SID within the event. It will be useful when
   the perf event's FD is passed through IPC to another process which may
   try to read the FD. Appropriate security checks will limit access.

3. perf_event_free: Called when the event is closed.

4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.

5. perf_event_write: Called from the ioctl(2) syscalls for the event.

[1] https://lwn.net/Articles/696240/

Since Peter had suggest LSM hooks in 2016 [1], I am adding his
Suggested-by tag below.

To use this patch, we set the perf_event_paranoid sysctl to -1 and then
apply selinux checking as appropriate (default deny everything, and then
add policy rules to give access to domains that need it). In the future
we can remove the perf_event_paranoid sysctl altogether.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rostedt@goodmis.org
Cc: primiano@google.com
Cc: rsavitski@google.com
Cc: jeffv@google.com
Cc: kernel-team@android.com
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
Changes since RFC:
 o Small nits, style changes (James Morris).
 o Consolidation of code (Peter Zijlstra).


 arch/x86/events/intel/bts.c         |  8 ++--
 arch/x86/events/intel/core.c        |  5 ++-
 arch/x86/events/intel/p4.c          |  5 ++-
 include/linux/lsm_hooks.h           | 15 +++++++
 include/linux/perf_event.h          | 28 +++++++++---
 include/linux/security.h            | 39 +++++++++++++++-
 include/uapi/linux/perf_event.h     |  9 ++++
 kernel/events/core.c                | 52 +++++++++++++++++-----
 kernel/trace/trace_event_perf.c     | 15 ++++---
 security/security.c                 | 27 +++++++++++
 security/selinux/hooks.c            | 69 +++++++++++++++++++++++++++++
 security/selinux/include/classmap.h |  2 +
 security/selinux/include/objsec.h   |  6 ++-
 13 files changed, 250 insertions(+), 30 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 5ee3fed881d3..38de4a7f6752 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -549,9 +549,11 @@ static int bts_event_init(struct perf_event *event)
 	 * Note that the default paranoia setting permits unprivileged
 	 * users to profile the kernel.
 	 */
-	if (event->attr.exclude_kernel && perf_paranoid_kernel() &&
-	    !capable(CAP_SYS_ADMIN))
-		return -EACCES;
+	if (event->attr.exclude_kernel) {
+		ret = perf_allow_kernel(&event->attr);
+		if (ret)
+			return ret;
+	}
 
 	if (x86_add_exclusive(x86_lbr_exclusive_bts))
 		return -EBUSY;
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 27ee47a7be66..32967a9e9962 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3315,8 +3315,9 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	if (x86_pmu.version < 3)
 		return -EINVAL;
 
-	if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
-		return -EACCES;
+	ret = perf_allow_cpu(&event->attr);
+	if (ret)
+		return ret;
 
 	event->hw.config |= ARCH_PERFMON_EVENTSEL_ANY;
 
diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index dee579efb2b2..a4cc66005ce8 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -776,8 +776,9 @@ static int p4_validate_raw_event(struct perf_event *event)
 	 * the user needs special permissions to be able to use it
 	 */
 	if (p4_ht_active() && p4_event_bind_map[v].shared) {
-		if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
-			return -EACCES;
+		v = perf_allow_cpu(&event->attr);
+		if (v)
+			return v;
 	}
 
 	/* ESCR EventMask bits may be invalid */
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index a3763247547c..20d8cf194fb7 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1818,6 +1818,14 @@ union security_list_options {
 	void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
 #endif /* CONFIG_BPF_SYSCALL */
 	int (*locked_down)(enum lockdown_reason what);
+#ifdef CONFIG_PERF_EVENTS
+	int (*perf_event_open)(struct perf_event_attr *attr, int type);
+	int (*perf_event_alloc)(struct perf_event *event);
+	void (*perf_event_free)(struct perf_event *event);
+	int (*perf_event_read)(struct perf_event *event);
+	int (*perf_event_write)(struct perf_event *event);
+
+#endif
 };
 
 struct security_hook_heads {
@@ -2060,6 +2068,13 @@ struct security_hook_heads {
 	struct hlist_head bpf_prog_free_security;
 #endif /* CONFIG_BPF_SYSCALL */
 	struct hlist_head locked_down;
+#ifdef CONFIG_PERF_EVENTS
+	struct hlist_head perf_event_open;
+	struct hlist_head perf_event_alloc;
+	struct hlist_head perf_event_free;
+	struct hlist_head perf_event_read;
+	struct hlist_head perf_event_write;
+#endif
 } __randomize_layout;
 
 /*
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 61448c19a132..664bb7f99c46 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -56,6 +56,7 @@ struct perf_guest_info_callbacks {
 #include <linux/perf_regs.h>
 #include <linux/cgroup.h>
 #include <linux/refcount.h>
+#include <linux/security.h>
 #include <asm/local.h>
 
 struct perf_callchain_entry {
@@ -721,6 +722,9 @@ struct perf_event {
 	struct perf_cgroup		*cgrp; /* cgroup event is attach to */
 #endif
 
+#ifdef CONFIG_SECURITY
+	void *security;
+#endif
 	struct list_head		sb_list;
 #endif /* CONFIG_PERF_EVENTS */
 };
@@ -1241,19 +1245,33 @@ extern int perf_cpu_time_max_percent_handler(struct ctl_table *table, int write,
 int perf_event_max_stack_handler(struct ctl_table *table, int write,
 				 void __user *buffer, size_t *lenp, loff_t *ppos);
 
-static inline bool perf_paranoid_tracepoint_raw(void)
+static inline int perf_is_paranoid(void)
 {
 	return sysctl_perf_event_paranoid > -1;
 }
 
-static inline bool perf_paranoid_cpu(void)
+static inline int perf_allow_kernel(struct perf_event_attr *attr)
 {
-	return sysctl_perf_event_paranoid > 0;
+	if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
 }
 
-static inline bool perf_paranoid_kernel(void)
+static inline int perf_allow_cpu(struct perf_event_attr *attr)
 {
-	return sysctl_perf_event_paranoid > 1;
+	if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	return security_perf_event_open(attr, PERF_SECURITY_CPU);
+}
+
+static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
+{
+	if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);
 }
 
 extern void perf_event_init(void);
diff --git a/include/linux/security.h b/include/linux/security.h
index a8d59d612d27..273e11c66ed7 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1894,5 +1894,42 @@ static inline void security_bpf_prog_free(struct bpf_prog_aux *aux)
 #endif /* CONFIG_SECURITY */
 #endif /* CONFIG_BPF_SYSCALL */
 
-#endif /* ! __LINUX_SECURITY_H */
+#ifdef CONFIG_PERF_EVENTS
+struct perf_event_attr;
+
+#ifdef CONFIG_SECURITY
+extern int security_perf_event_open(struct perf_event_attr *attr, int type);
+extern int security_perf_event_alloc(struct perf_event *event);
+extern void security_perf_event_free(struct perf_event *event);
+extern int security_perf_event_read(struct perf_event *event);
+extern int security_perf_event_write(struct perf_event *event);
+#else
+static inline int security_perf_event_open(struct perf_event_attr *attr,
+					   int type)
+{
+	return 0;
+}
 
+static inline int security_perf_event_alloc(struct perf_event *event)
+{
+	return 0;
+}
+
+static inline void security_perf_event_free(struct perf_event *event)
+{
+	return 0;
+}
+
+static inline int security_perf_event_read(struct perf_event *event)
+{
+	return 0;
+}
+
+static inline int security_perf_event_write(struct perf_event *event)
+{
+	return 0;
+}
+#endif /* CONFIG_SECURITY */
+#endif /* CONFIG_PERF_EVENTS */
+
+#endif /* ! __LINUX_SECURITY_H */
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index bb7b271397a6..2af95f937a5b 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -427,6 +427,15 @@ struct perf_event_attr {
 	__u16	__reserved_2;	/* align to __u64 */
 };
 
+
+/* Access to perf_event_open(2) syscall. */
+#define PERF_SECURITY_OPEN		0
+
+/* Finer grained perf_event_open(2) access control. */
+#define PERF_SECURITY_CPU		1
+#define PERF_SECURITY_KERNEL		2
+#define PERF_SECURITY_TRACEPOINT	3
+
 /*
  * Structure used by below PERF_EVENT_IOC_QUERY_BPF command
  * to query bpf programs attached to the same perf tracepoint
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4655adbbae10..913e64e0790c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4217,8 +4217,9 @@ find_get_context(struct pmu *pmu, struct task_struct *task,
 
 	if (!task) {
 		/* Must be root to operate on a CPU event: */
-		if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
-			return ERR_PTR(-EACCES);
+		err = perf_allow_cpu(&event->attr);
+		if (err)
+			return ERR_PTR(err);
 
 		cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
 		ctx = &cpuctx->ctx;
@@ -4761,6 +4762,7 @@ int perf_event_release_kernel(struct perf_event *event)
 	}
 
 no_ctx:
+	security_perf_event_free(event);
 	put_event(event); /* Must be the 'last' reference */
 	return 0;
 }
@@ -4980,6 +4982,10 @@ perf_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 	struct perf_event_context *ctx;
 	int ret;
 
+	ret = security_perf_event_read(event);
+	if (ret)
+		return ret;
+
 	ctx = perf_event_ctx_lock(event);
 	ret = __perf_read(event, buf, count);
 	perf_event_ctx_unlock(event, ctx);
@@ -5244,6 +5250,11 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	struct perf_event_context *ctx;
 	long ret;
 
+	/* Treat ioctl like writes as it is likely a mutating operation. */
+	ret = security_perf_event_write(event);
+	if (ret)
+		return ret;
+
 	ctx = perf_event_ctx_lock(event);
 	ret = _perf_ioctl(event, cmd, arg);
 	perf_event_ctx_unlock(event, ctx);
@@ -5706,6 +5717,10 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
 	if (!(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
 
+	ret = security_perf_event_read(event);
+	if (ret)
+		return ret;
+
 	vma_size = vma->vm_end - vma->vm_start;
 
 	if (vma->vm_pgoff == 0) {
@@ -5819,7 +5834,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
 	lock_limit >>= PAGE_SHIFT;
 	locked = atomic64_read(&vma->vm_mm->pinned_vm) + extra;
 
-	if ((locked > lock_limit) && perf_paranoid_tracepoint_raw() &&
+	if ((locked > lock_limit) && perf_is_paranoid() &&
 		!capable(CAP_IPC_LOCK)) {
 		ret = -EPERM;
 		goto unlock;
@@ -10553,11 +10568,16 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		}
 	}
 
+	err = security_perf_event_alloc(event);
+	if (err)
+		goto err_security;
+
 	/* symmetric to unaccount_event() in _free_event() */
 	account_event(event);
 
 	return event;
 
+err_security:
 err_addr_filters:
 	kfree(event->addr_filter_ranges);
 
@@ -10675,9 +10695,11 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 			attr->branch_sample_type = mask;
 		}
 		/* privileged levels capture (kernel, hv): check permissions */
-		if ((mask & PERF_SAMPLE_BRANCH_PERM_PLM)
-		    && perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
-			return -EACCES;
+		if (mask & PERF_SAMPLE_BRANCH_PERM_PLM) {
+			ret = perf_allow_kernel(attr);
+			if (ret)
+				return ret;
+		}
 	}
 
 	if (attr->sample_type & PERF_SAMPLE_REGS_USER) {
@@ -10890,13 +10912,19 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (flags & ~PERF_FLAG_ALL)
 		return -EINVAL;
 
+	/* Do we allow access to perf_event_open(2) ? */
+	err = security_perf_event_open(&attr, PERF_SECURITY_OPEN);
+	if (err)
+		return err;
+
 	err = perf_copy_attr(attr_uptr, &attr);
 	if (err)
 		return err;
 
 	if (!attr.exclude_kernel) {
-		if (perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
-			return -EACCES;
+		err = perf_allow_kernel(&attr);
+		if (err)
+			return err;
 	}
 
 	if (attr.namespaces) {
@@ -10913,9 +10941,11 @@ SYSCALL_DEFINE5(perf_event_open,
 	}
 
 	/* Only privileged users can get physical addresses */
-	if ((attr.sample_type & PERF_SAMPLE_PHYS_ADDR) &&
-	    perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
-		return -EACCES;
+	if ((attr.sample_type & PERF_SAMPLE_PHYS_ADDR)) {
+		err = perf_allow_kernel(&attr);
+		if (err)
+			return err;
+	}
 
 	err = security_locked_down(LOCKDOWN_PERF);
 	if (err && (attr.sample_type & PERF_SAMPLE_REGS_INTR))
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 0892e38ed6fb..0917fee6ee7c 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -8,6 +8,7 @@
 
 #include <linux/module.h>
 #include <linux/kprobes.h>
+#include <linux/security.h>
 #include "trace.h"
 #include "trace_probe.h"
 
@@ -26,8 +27,10 @@ static int	total_ref_count;
 static int perf_trace_event_perm(struct trace_event_call *tp_event,
 				 struct perf_event *p_event)
 {
+	int ret;
+
 	if (tp_event->perf_perm) {
-		int ret = tp_event->perf_perm(tp_event, p_event);
+		ret = tp_event->perf_perm(tp_event, p_event);
 		if (ret)
 			return ret;
 	}
@@ -46,8 +49,9 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event,
 
 	/* The ftrace function trace is allowed only for root. */
 	if (ftrace_event_is_function(tp_event)) {
-		if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
-			return -EPERM;
+		ret = perf_allow_tracepoint(&p_event->attr);
+		if (ret)
+			return ret;
 
 		if (!is_sampling_event(p_event))
 			return 0;
@@ -82,8 +86,9 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event,
 	 * ...otherwise raw tracepoint data can be a severe data leak,
 	 * only allow root to have these.
 	 */
-	if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
-		return -EPERM;
+	ret = perf_allow_tracepoint(&p_event->attr);
+	if (ret)
+		return ret;
 
 	return 0;
 }
diff --git a/security/security.c b/security/security.c
index 1bc000f834e2..cd2d18d2d279 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2404,3 +2404,30 @@ int security_locked_down(enum lockdown_reason what)
 	return call_int_hook(locked_down, 0, what);
 }
 EXPORT_SYMBOL(security_locked_down);
+
+#ifdef CONFIG_PERF_EVENTS
+int security_perf_event_open(struct perf_event_attr *attr, int type)
+{
+	return call_int_hook(perf_event_open, 0, attr, type);
+}
+
+int security_perf_event_alloc(struct perf_event *event)
+{
+	return call_int_hook(perf_event_alloc, 0, event);
+}
+
+void security_perf_event_free(struct perf_event *event)
+{
+	call_void_hook(perf_event_free, event);
+}
+
+int security_perf_event_read(struct perf_event *event)
+{
+	return call_int_hook(perf_event_read, 0, event);
+}
+
+int security_perf_event_write(struct perf_event *event)
+{
+	return call_int_hook(perf_event_write, 0, event);
+}
+#endif /* CONFIG_PERF_EVENTS */
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 9625b99e677f..28eb05490d59 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -6795,6 +6795,67 @@ struct lsm_blob_sizes selinux_blob_sizes __lsm_ro_after_init = {
 	.lbs_msg_msg = sizeof(struct msg_security_struct),
 };
 
+#ifdef CONFIG_PERF_EVENTS
+static int selinux_perf_event_open(struct perf_event_attr *attr, int type)
+{
+	u32 requested, sid = current_sid();
+
+	if (type == PERF_SECURITY_OPEN)
+		requested = PERF_EVENT__OPEN;
+	else if (type == PERF_SECURITY_CPU)
+		requested = PERF_EVENT__CPU;
+	else if (type == PERF_SECURITY_KERNEL)
+		requested = PERF_EVENT__KERNEL;
+	else if (type == PERF_SECURITY_TRACEPOINT)
+		requested = PERF_EVENT__TRACEPOINT;
+	else
+		return -EINVAL;
+
+	return avc_has_perm(&selinux_state, sid, sid, SECCLASS_PERF_EVENT,
+			    requested, NULL);
+}
+
+static int selinux_perf_event_alloc(struct perf_event *event)
+{
+	struct perf_event_security_struct *perfsec;
+
+	perfsec = kzalloc(sizeof(*perfsec), GFP_KERNEL);
+	if (!perfsec)
+		return -ENOMEM;
+
+	perfsec->sid = current_sid();
+	event->security = perfsec;
+
+	return 0;
+}
+
+static void selinux_perf_event_free(struct perf_event *event)
+{
+	struct perf_event_security_struct *perfsec = event->security;
+
+	event->security = NULL;
+	kfree(perfsec);
+}
+
+static int selinux_perf_event_read(struct perf_event *event)
+{
+	struct perf_event_security_struct *perfsec = event->security;
+	u32 sid = current_sid();
+
+	return avc_has_perm(&selinux_state, sid, perfsec->sid,
+			    SECCLASS_PERF_EVENT, PERF_EVENT__READ, NULL);
+}
+
+static int selinux_perf_event_write(struct perf_event *event)
+{
+	struct perf_event_security_struct *perfsec = event->security;
+	u32 sid = current_sid();
+
+	return avc_has_perm(&selinux_state, sid, perfsec->sid,
+			    SECCLASS_PERF_EVENT, PERF_EVENT__WRITE, NULL);
+}
+#endif
+
 static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(binder_set_context_mgr, selinux_binder_set_context_mgr),
 	LSM_HOOK_INIT(binder_transaction, selinux_binder_transaction),
@@ -7030,6 +7091,14 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(bpf_map_free_security, selinux_bpf_map_free),
 	LSM_HOOK_INIT(bpf_prog_free_security, selinux_bpf_prog_free),
 #endif
+
+#ifdef CONFIG_PERF_EVENTS
+	LSM_HOOK_INIT(perf_event_open, selinux_perf_event_open),
+	LSM_HOOK_INIT(perf_event_alloc, selinux_perf_event_alloc),
+	LSM_HOOK_INIT(perf_event_free, selinux_perf_event_free),
+	LSM_HOOK_INIT(perf_event_read, selinux_perf_event_read),
+	LSM_HOOK_INIT(perf_event_write, selinux_perf_event_write),
+#endif
 };
 
 static __init int selinux_init(void)
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 32e9b03be3dd..7db24855e12d 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -244,6 +244,8 @@ struct security_class_mapping secclass_map[] = {
 	  {"map_create", "map_read", "map_write", "prog_load", "prog_run"} },
 	{ "xdp_socket",
 	  { COMMON_SOCK_PERMS, NULL } },
+	{ "perf_event",
+	  {"open", "cpu", "kernel", "tracepoint", "read", "write"} },
 	{ NULL }
   };
 
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 586b7abd0aa7..a4a86cbcfb0a 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -141,7 +141,11 @@ struct pkey_security_struct {
 };
 
 struct bpf_security_struct {
-	u32 sid;  /*SID of bpf obj creater*/
+	u32 sid;  /* SID of bpf obj creator */
+};
+
+struct perf_event_security_struct {
+	u32 sid;  /* SID of perf_event obj creator */
 };
 
 extern struct lsm_blob_sizes selinux_blob_sizes;
-- 
2.23.0.700.g56cf767bdb-goog

^ permalink raw reply related

* Re: [PATCH RFC] perf_event: Add support for LSM and SELinux checks
From: Joel Fernandes @ 2019-10-11 15:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, rostedt, primiano, rsavitski, jeffv, kernel-team,
	Alexei Starovoitov, Arnaldo Carvalho de Melo, bpf,
	Daniel Borkmann, Ingo Molnar, James Morris, Jiri Olsa, Kees Cook,
	linux-security-module, Matthew Garrett, Namhyung Kim, selinux,
	Song Liu, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Yonghong Song
In-Reply-To: <20191011070543.GV2328@hirez.programming.kicks-ass.net>

On Fri, Oct 11, 2019 at 09:05:43AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 10, 2019 at 02:31:14PM -0400, Joel Fernandes wrote:
> > On Thu, Oct 10, 2019 at 07:09:49PM +0200, Peter Zijlstra wrote:
> 
> > > Yes, I did notice, I found it weird.
> > > 
> > > If you have CAP_IPC_LIMIT you should be able to bust mlock memory
> > > limits, so I don't see why we should further relate that to paranoid.
> > > 
> > > The way I wrote it, we also allow to bust the limit if we have disabled
> > > all paranoid checks. Which makes some sense I suppose.
> > > 
> > > The original commit is this:
> > > 
> > >   459ec28ab404 ("perf_counter: Allow mmap if paranoid checks are turned off")
> > 
> > I am thinking we can just a new function perf_is_paranoid() that has nothing
> > to do with the CAP_SYS_ADMIN check and doesn't have tracepoint wording:
> > 
> > static inline int perf_is_paranoid(void)
> > {
> > 	return sysctl_perf_event_paranoid > -1;
> > }
> > 
> > And then call that from the mmap() code:
> > if (locked > lock_limit && perf_is_paranoid() && !capable(CAP_IPC_LOCK)) {
> > 	return -EPERM;
> > }
> > 
> > I don't think we need to add selinux security checks here since we are
> > already adding security checks earlier in mmap(). This will make the code and
> > its intention more clear and in line with the commit 459ec28ab404 you
> > mentioned. Thoughts?
> 
> Mostly that I'm confused by the current code ;-)
> 
> Like I said, CAP_IPC_LIMIT on its own should already allow busting the
> limit, I don't really see why we should make it conditional on paranoid.
> 
> But if you want to preserve behaviour (arguably a sane thing for your
> patch) then yes, feel free to do as you propose.

Ok, I will do it as I proposed above and resend patch today. Thanks!

 - Joel


^ permalink raw reply

* Re: [Patch v7 0/4] Create and consolidate trusted keys subsystem
From: Jarkko Sakkinen @ 2019-10-11 12:37 UTC (permalink / raw)
  To: Sumit Garg
  Cc: dhowells, peterhuewe, keyrings, linux-integrity, linux-crypto,
	linux-security-module, herbert, davem, jgg, arnd, gregkh, jejb,
	zohar, jmorris, serge, jsnitsel, linux-kernel, daniel.thompson
In-Reply-To: <1570425935-7435-1-git-send-email-sumit.garg@linaro.org>

On Mon, Oct 07, 2019 at 10:55:31AM +0530, Sumit Garg wrote:
> This patch-set does restructuring of trusted keys code to create and
> consolidate trusted keys subsystem.
> 
> Also, patch #2 replaces tpm1_buf code used in security/keys/trusted.c and
> crypto/asymmertic_keys/asym_tpm.c files to use the common tpm_buf code.
> 
> Changes in v7:
> 1. Rebased to top of tpmdd/master
> 2. Patch #4: update tpm2 trusted keys code to use tpm_send() instead of
>    tpm_transmit_cmd() which is an internal function.
> 
> Changes in v6:
> 1. Switch TPM asymmetric code also to use common tpm_buf code. These
>    changes required patches #1 and #2 update, so I have dropped review
>    tags from those patches.
> 2. Incorporated miscellaneous comments from Jarkko.
> 
> Changes in v5:
> 1. Drop 5/5 patch as its more relavant along with TEE patch-set.
> 2. Add Reviewed-by tag for patch #2.
> 3. Fix build failure when "CONFIG_HEADER_TEST" and
>    "CONFIG_KERNEL_HEADER_TEST" config options are enabled.
> 4. Misc changes to rename files.
> 
> Changes in v4:
> 1. Separate patch for export of tpm_buf code to include/linux/tpm.h
> 2. Change TPM1.x trusted keys code to use common tpm_buf
> 3. Keep module name as trusted.ko only
> 
> Changes in v3:
> 
> Move TPM2 trusted keys code to trusted keys subsystem.
> 
> Changes in v2:
> 
> Split trusted keys abstraction patch for ease of review.
> 
> Sumit Garg (4):
>   tpm: Move tpm_buf code to include/linux/
>   KEYS: Use common tpm_buf for trusted and asymmetric keys
>   KEYS: trusted: Create trusted keys subsystem
>   KEYS: trusted: Move TPM2 trusted keys code
> 
>  crypto/asymmetric_keys/asym_tpm.c                  | 101 +++----
>  drivers/char/tpm/tpm-interface.c                   |  56 ----
>  drivers/char/tpm/tpm.h                             | 226 ---------------
>  drivers/char/tpm/tpm2-cmd.c                        | 307 --------------------
>  include/Kbuild                                     |   1 -
>  include/keys/{trusted.h => trusted_tpm.h}          |  49 +---
>  include/linux/tpm.h                                | 251 ++++++++++++++--
>  security/keys/Makefile                             |   2 +-
>  security/keys/trusted-keys/Makefile                |   8 +
>  .../{trusted.c => trusted-keys/trusted_tpm1.c}     |  96 +++----
>  security/keys/trusted-keys/trusted_tpm2.c          | 314 +++++++++++++++++++++
>  11 files changed, 652 insertions(+), 759 deletions(-)
>  rename include/keys/{trusted.h => trusted_tpm.h} (77%)
>  create mode 100644 security/keys/trusted-keys/Makefile
>  rename security/keys/{trusted.c => trusted-keys/trusted_tpm1.c} (94%)
>  create mode 100644 security/keys/trusted-keys/trusted_tpm2.c
> 
> -- 
> 2.7.4
> 

I fixed a merge conflict caused by James' commit. Already pushed.
Compiling test kernel ATM i.e. tested-by's will follow later.

/Jarkko

^ permalink raw reply

* Re: [Patch v7 4/4] KEYS: trusted: Move TPM2 trusted keys code
From: Jarkko Sakkinen @ 2019-10-11 12:25 UTC (permalink / raw)
  To: Sumit Garg
  Cc: dhowells, peterhuewe, keyrings, linux-integrity, linux-crypto,
	linux-security-module, herbert, davem, jgg, arnd, gregkh, jejb,
	zohar, jmorris, serge, jsnitsel, linux-kernel, daniel.thompson
In-Reply-To: <1570425935-7435-5-git-send-email-sumit.garg@linaro.org>

On Mon, Oct 07, 2019 at 10:55:35AM +0530, Sumit Garg wrote:
> Move TPM2 trusted keys code to trusted keys subsystem. The reason
> being it's better to consolidate all the trusted keys code to a single
> location so that it can be maintained sanely.
> 
> Also, utilize existing tpm_send() exported API which wraps the internal
> tpm_transmit_cmd() API.
> 
> Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>

Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>

/Jarkko

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox