public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] audit: Only use the syscall slowpath when syscall audit rules exist
@ 2014-02-03 17:53 Andy Lutomirski
  2014-02-03 18:11 ` Oleg Nesterov
  2014-02-03 20:23 ` Steve Grubb
  0 siblings, 2 replies; 5+ messages in thread
From: Andy Lutomirski @ 2014-02-03 17:53 UTC (permalink / raw)
  To: linux-audit, linux-kernel
  Cc: Andy Lutomirski, Andi Kleen, Oleg Nesterov, Steve Grubb,
	Eric Paris

This toggles TIF_SYSCALL_AUDIT as needed when rules change instead of
leaving it set whenever rules might be set in the future.  This reduces
syscall latency from >60ns to closer to 40ns on my laptop.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---

This is not the best-tested patch in the world.  Someone who actually
knows how to use syscall auditing should probably give it a spin.  It
fixes an IMO huge performance issue, though.

 include/linux/audit.h |  9 +++++--
 kernel/auditfilter.c  |  4 ++--
 kernel/auditsc.c      | 65 ++++++++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 71 insertions(+), 7 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index a406419..534ba85 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -298,7 +298,9 @@ static inline void audit_mmap_fd(int fd, int flags)
 		__audit_mmap_fd(fd, flags);
 }
 
-extern int audit_n_rules;
+extern void audit_inc_n_rules(void);
+extern void audit_dec_n_rules(void);
+
 extern int audit_signals;
 #else /* CONFIG_AUDITSYSCALL */
 static inline int audit_alloc(struct task_struct *task)
@@ -404,7 +406,10 @@ static inline void audit_mmap_fd(int fd, int flags)
 { }
 static inline void audit_ptrace(struct task_struct *t)
 { }
-#define audit_n_rules 0
+static inline void audit_inc_n_rules(void)
+{ }
+static inline void audit_dec_n_rules(void)
+{ }
 #define audit_signals 0
 #endif /* CONFIG_AUDITSYSCALL */
 
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index 51f3fd4..0ce531d 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -903,7 +903,7 @@ static inline int audit_add_rule(struct audit_entry *entry)
 	}
 #ifdef CONFIG_AUDITSYSCALL
 	if (!dont_count)
-		audit_n_rules++;
+		audit_inc_n_rules();
 
 	if (!audit_match_signal(entry))
 		audit_signals++;
@@ -955,7 +955,7 @@ static inline int audit_del_rule(struct audit_entry *entry)
 
 #ifdef CONFIG_AUDITSYSCALL
 	if (!dont_count)
-		audit_n_rules--;
+		audit_dec_n_rules();
 
 	if (!audit_match_signal(entry))
 		audit_signals--;
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 90594c9..363d0bb 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -79,8 +79,15 @@
 /* no execve audit message should be longer than this (userspace limits) */
 #define MAX_EXECVE_AUDIT_LEN 7500
 
-/* number of audit rules */
-int audit_n_rules;
+/*
+ * number of audit rules
+ *
+ * n_rules_lock protects audit_n_rules.  It also prevents audit_alloc from
+ * racing against changes to audit_n_rules: to change someone else's
+ * audit_context or TIF_SYSCALL_AUDIT, you need write_lock(&n_rules_lock).
+ */
+static DEFINE_RWLOCK(n_rules_lock);
+static int audit_n_rules;
 
 /* determines whether we collect data for signals sent */
 int audit_signals;
@@ -911,6 +918,47 @@ static inline struct audit_context *audit_alloc_context(enum audit_state state)
 	return context;
 }
 
+void audit_inc_n_rules()
+{
+	struct task_struct *p, *g;
+
+	write_lock(&n_rules_lock);
+
+	if (audit_n_rules++ != 0)
+		goto out;  /* The overall state isn't changing. */
+
+	read_lock(&tasklist_lock);
+	do_each_thread(g, p) {
+		if (p->audit_context)
+			set_tsk_thread_flag(p, TIF_SYSCALL_AUDIT);
+	} while_each_thread(g, p);
+	read_unlock(&tasklist_lock);
+
+out:
+	write_unlock(&n_rules_lock);
+}
+
+void audit_dec_n_rules()
+{
+	struct task_struct *p, *g;
+
+	write_lock(&n_rules_lock);
+	--audit_n_rules;
+	BUG_ON(audit_n_rules < 0);
+
+	if (audit_n_rules != 0)
+		goto out;  /* The overall state isn't changing. */
+
+	read_lock(&tasklist_lock);
+	do_each_thread(g, p) {
+		clear_tsk_thread_flag(p, TIF_SYSCALL_AUDIT);
+	} while_each_thread(g, p);
+	read_unlock(&tasklist_lock);
+
+out:
+	write_unlock(&n_rules_lock);
+}
+
 /**
  * audit_alloc - allocate an audit context block for a task
  * @tsk: task
@@ -931,6 +979,11 @@ int audit_alloc(struct task_struct *tsk)
 
 	state = audit_filter_task(tsk, &key);
 	if (state == AUDIT_DISABLED) {
+		/*
+		 * There is no relevant race against inc/dec_n_rules
+		 * here -- inc won't set TIF_SYSCALL_AUDIT, and, if dec
+		 * clears it redundantly, there's no harm done.
+		 */
 		clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
 		return 0;
 	}
@@ -942,8 +995,14 @@ int audit_alloc(struct task_struct *tsk)
 	}
 	context->filterkey = key;
 
+	read_lock(&n_rules_lock);
 	tsk->audit_context  = context;
-	set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
+	if (audit_n_rules)
+		set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
+	else
+		clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
+	read_unlock(&n_rules_lock);
+
 	return 0;
 }
 
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] audit: Only use the syscall slowpath when syscall audit rules exist
  2014-02-03 17:53 [PATCH] audit: Only use the syscall slowpath when syscall audit rules exist Andy Lutomirski
@ 2014-02-03 18:11 ` Oleg Nesterov
  2014-02-03 18:33   ` Andy Lutomirski
  2014-02-03 20:23 ` Steve Grubb
  1 sibling, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2014-02-03 18:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-audit, linux-kernel, Andi Kleen, Steve Grubb, Eric Paris

On 02/03, Andy Lutomirski wrote:
>
> @@ -911,6 +918,47 @@ static inline struct audit_context *audit_alloc_context(enum audit_state state)
>  	return context;
>  }
>  
> +void audit_inc_n_rules()
> +{
> +	struct task_struct *p, *g;
> +
> +	write_lock(&n_rules_lock);
> +
> +	if (audit_n_rules++ != 0)
> +		goto out;  /* The overall state isn't changing. */
> +
> +	read_lock(&tasklist_lock);
> +	do_each_thread(g, p) {
> +		if (p->audit_context)
> +			set_tsk_thread_flag(p, TIF_SYSCALL_AUDIT);
> +	} while_each_thread(g, p);
> +	read_unlock(&tasklist_lock);

Cosmetic, but I'd suggest to use for_each_process_thread() instead
of do_each_thread/while_each_thread.

And I am not sure why n_rules_lock is rwlock_t... OK, to make
audit_alloc() more scalable, I guess. Please see below.

> @@ -942,8 +995,14 @@ int audit_alloc(struct task_struct *tsk)
>  	}
>  	context->filterkey = key;
>
> +	read_lock(&n_rules_lock);
>  	tsk->audit_context  = context;
> -	set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> +	if (audit_n_rules)
> +		set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> +	else
> +		clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> +	read_unlock(&n_rules_lock);

Perhaps this is fine, but n_rules_lock can't prevent the race with
audit_inc/dec_n_rules(). The problem is, this is called before the
new task is visible to for_each_process_thread().

If we want to fix this race, we need something like audit_sync_flags()
called after copy_process() drops tasklist, or from tasklist_lock
protected section (in this case it doesn't need n_rules_lock).

Or perhaps audit_alloc() should not try to clear TIF_SYSCALL_AUDIT at all.
In both cases n_rules_lock can be spinlock_t.

Oleg.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] audit: Only use the syscall slowpath when syscall audit rules exist
  2014-02-03 18:11 ` Oleg Nesterov
@ 2014-02-03 18:33   ` Andy Lutomirski
  0 siblings, 0 replies; 5+ messages in thread
From: Andy Lutomirski @ 2014-02-03 18:33 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-audit, linux-kernel@vger.kernel.org, Andi Kleen,
	Steve Grubb, Eric Paris

On Mon, Feb 3, 2014 at 10:11 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/03, Andy Lutomirski wrote:
>>
>> @@ -911,6 +918,47 @@ static inline struct audit_context *audit_alloc_context(enum audit_state state)
>>       return context;
>>  }
>>
>> +void audit_inc_n_rules()
>> +{
>> +     struct task_struct *p, *g;
>> +
>> +     write_lock(&n_rules_lock);
>> +
>> +     if (audit_n_rules++ != 0)
>> +             goto out;  /* The overall state isn't changing. */
>> +
>> +     read_lock(&tasklist_lock);
>> +     do_each_thread(g, p) {
>> +             if (p->audit_context)
>> +                     set_tsk_thread_flag(p, TIF_SYSCALL_AUDIT);
>> +     } while_each_thread(g, p);
>> +     read_unlock(&tasklist_lock);
>
> Cosmetic, but I'd suggest to use for_each_process_thread() instead
> of do_each_thread/while_each_thread.

I didn't notice that option :)

>
> And I am not sure why n_rules_lock is rwlock_t... OK, to make
> audit_alloc() more scalable, I guess. Please see below.
>

Yes.  I should probably also use the irqsave variant.

>> @@ -942,8 +995,14 @@ int audit_alloc(struct task_struct *tsk)
>>       }
>>       context->filterkey = key;
>>
>> +     read_lock(&n_rules_lock);
>>       tsk->audit_context  = context;
>> -     set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
>> +     if (audit_n_rules)
>> +             set_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
>> +     else
>> +             clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
>> +     read_unlock(&n_rules_lock);
>
> Perhaps this is fine, but n_rules_lock can't prevent the race with
> audit_inc/dec_n_rules(). The problem is, this is called before the
> new task is visible to for_each_process_thread().

Hmm.  I missed that.

>
> If we want to fix this race, we need something like audit_sync_flags()
> called after copy_process() drops tasklist, or from tasklist_lock
> protected section (in this case it doesn't need n_rules_lock).
>
> Or perhaps audit_alloc() should not try to clear TIF_SYSCALL_AUDIT at all.
> In both cases n_rules_lock can be spinlock_t.

Here are two options:

1. Sync the flags inside tasklist_lock, as I think you're suggesting.
This seems simple.

2. Make this whole thing lazy -- always set TIF_SYSCALL_AUDIT for new
tasks, but clear it on the first syscall.

I'm leaning toward number 1.

Thanks for the instant review!

--Andy

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] audit: Only use the syscall slowpath when syscall audit rules exist
  2014-02-03 17:53 [PATCH] audit: Only use the syscall slowpath when syscall audit rules exist Andy Lutomirski
  2014-02-03 18:11 ` Oleg Nesterov
@ 2014-02-03 20:23 ` Steve Grubb
  2014-02-03 22:08   ` Andy Lutomirski
  1 sibling, 1 reply; 5+ messages in thread
From: Steve Grubb @ 2014-02-03 20:23 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-audit, linux-kernel, Andi Kleen, Oleg Nesterov, Eric Paris

On Monday, February 03, 2014 09:53:23 AM Andy Lutomirski wrote:
> This toggles TIF_SYSCALL_AUDIT as needed when rules change instead of
> leaving it set whenever rules might be set in the future.  This reduces
> syscall latency from >60ns to closer to 40ns on my laptop.

Does this mean that we have processes that don't have the  TIF_SYSCALL_AUDIT 
flag set? When rules get loaded, how do we get the flag put back into all 
processes?

The theory of ops is supposed to be that for anyone not needing audit, there 
is only the cost of  "if (tif & TIF_SYSCALL_AUDIT)". That should be it. If you 
have audit enabled or had it enabled (which means it might be loaded with new 
rules), we want to inspect the syscall. There should be a short circuit based 
on checking that any rules has ever been loaded or are currently loaded before 
doing any real collection.

-Steve



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] audit: Only use the syscall slowpath when syscall audit rules exist
  2014-02-03 20:23 ` Steve Grubb
@ 2014-02-03 22:08   ` Andy Lutomirski
  0 siblings, 0 replies; 5+ messages in thread
From: Andy Lutomirski @ 2014-02-03 22:08 UTC (permalink / raw)
  To: Steve Grubb
  Cc: linux-audit, linux-kernel@vger.kernel.org, Andi Kleen,
	Oleg Nesterov, Eric Paris

On Mon, Feb 3, 2014 at 12:23 PM, Steve Grubb <sgrubb@redhat.com> wrote:
> On Monday, February 03, 2014 09:53:23 AM Andy Lutomirski wrote:
>> This toggles TIF_SYSCALL_AUDIT as needed when rules change instead of
>> leaving it set whenever rules might be set in the future.  This reduces
>> syscall latency from >60ns to closer to 40ns on my laptop.
>
> Does this mean that we have processes that don't have the  TIF_SYSCALL_AUDIT
> flag set? When rules get loaded, how do we get the flag put back into all
> processes?

By looping over all processes and setting the flag, which is what my patch does.

>
> The theory of ops is supposed to be that for anyone not needing audit, there
> is only the cost of  "if (tif & TIF_SYSCALL_AUDIT)".

On current kernels *all* processes have TIF_SYSCALL_AUDIT, even if
they don't need auditing because there's nothing to audit.  So
everything pays the full cost.

> That should be it. If you
> have audit enabled or had it enabled (which means it might be loaded with new
> rules), we want to inspect the syscall.
>

My point is that there's nothing to inspect -- there are no rules.
Unless the audit code needs to do something just in case a non-syscall
audit event gets written, in which case the audit code should IMO be
fixed.  (This is what Eric is talking about, I think.)

--Andy

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-02-03 22:09 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-03 17:53 [PATCH] audit: Only use the syscall slowpath when syscall audit rules exist Andy Lutomirski
2014-02-03 18:11 ` Oleg Nesterov
2014-02-03 18:33   ` Andy Lutomirski
2014-02-03 20:23 ` Steve Grubb
2014-02-03 22:08   ` Andy Lutomirski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox