PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
@ 2011-03-07 21:06 Tony Jones
  2011-03-08 18:02 ` David Howells
  0 siblings, 1 reply; 12+ messages in thread
From: Tony Jones @ 2011-03-07 21:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-audit, David Howells, Eric Paris, Al Viro

Commit c69e8d9c01db added calls to get_task_cred and put_cred in 
audit_filter_rules.  Profiling with a large number of audit rules active on the
exit chain shows that we are spending upto 48% in this routine for syscall 
intensive tests, most of which is in the atomic ops. 

The following patch acquires the cred if a rule requires it.  In our particular
case above, most rules had no cred requirement and this dropped the time spent
in audit_filter_rules down to ~12%.  An alternative would be for the caller to 
acquire the cred just once for the whole chain and pass into audit_filter_rules.
I can create an alternate patch doing this if required.

Signed-off-by: Tony Jones <tonyj@suse.de>
---

 kernel/auditsc.c |   24 +++++++++++++++++++++---
 1 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index f49a031..4a930a1 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -450,7 +450,7 @@ static int audit_filter_rules(struct task_struct *tsk,
 			      struct audit_names *name,
 			      enum audit_state *state)
 {
-	const struct cred *cred = get_task_cred(tsk);
+	const struct cred *cred=NULL;
 	int i, j, need_sid = 1;
 	u32 sid;
 
@@ -470,27 +470,43 @@ static int audit_filter_rules(struct task_struct *tsk,
 			}
 			break;
 		case AUDIT_UID:
+ 			if (!cred) 
+				cred=get_task_cred(tsk);
 			result = audit_comparator(cred->uid, f->op, f->val);
 			break;
 		case AUDIT_EUID:
+ 			if (!cred) 
+				cred=get_task_cred(tsk);
 			result = audit_comparator(cred->euid, f->op, f->val);
 			break;
 		case AUDIT_SUID:
+ 			if (!cred) 
+				cred=get_task_cred(tsk);
 			result = audit_comparator(cred->suid, f->op, f->val);
 			break;
 		case AUDIT_FSUID:
+ 			if (!cred) 
+				cred=get_task_cred(tsk);
 			result = audit_comparator(cred->fsuid, f->op, f->val);
 			break;
 		case AUDIT_GID:
+ 			if (!cred) 
+				cred=get_task_cred(tsk);
 			result = audit_comparator(cred->gid, f->op, f->val);
 			break;
 		case AUDIT_EGID:
+ 			if (!cred) 
+				cred=get_task_cred(tsk);
 			result = audit_comparator(cred->egid, f->op, f->val);
 			break;
 		case AUDIT_SGID:
+ 			if (!cred) 
+				cred=get_task_cred(tsk);
 			result = audit_comparator(cred->sgid, f->op, f->val);
 			break;
 		case AUDIT_FSGID:
+ 			if (!cred) 
+				cred=get_task_cred(tsk);
 			result = audit_comparator(cred->fsgid, f->op, f->val);
 			break;
 		case AUDIT_PERS:
@@ -638,7 +654,8 @@ static int audit_filter_rules(struct task_struct *tsk,
 		}
 
 		if (!result) {
-			put_cred(cred);
+			if (cred)
+				put_cred(cred);
 			return 0;
 		}
 	}
@@ -656,7 +673,8 @@ static int audit_filter_rules(struct task_struct *tsk,
 	case AUDIT_NEVER:    *state = AUDIT_DISABLED;	    break;
 	case AUDIT_ALWAYS:   *state = AUDIT_RECORD_CONTEXT; break;
 	}
-	put_cred(cred);
+	if (cred)
+		put_cred(cred);
 	return 1;
 }
 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-07 21:06 PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead Tony Jones
@ 2011-03-08 18:02 ` David Howells
  2011-03-10 20:25   ` Tony Jones
  0 siblings, 1 reply; 12+ messages in thread
From: David Howells @ 2011-03-08 18:02 UTC (permalink / raw)
  To: Tony Jones; +Cc: dhowells, linux-kernel, linux-audit, Eric Paris, Al Viro

Tony Jones <tonyj@suse.de> wrote:

> Commit c69e8d9c01db added calls to get_task_cred and put_cred in
> audit_filter_rules.  Profiling with a large number of audit rules active on
> the exit chain shows that we are spending upto 48% in this routine for
> syscall intensive tests, most of which is in the atomic ops.
> 
> The following patch acquires the cred if a rule requires it.  In our
> particular case above, most rules had no cred requirement and this dropped
> the time spent in audit_filter_rules down to ~12%.  An alternative would be
> for the caller to acquire the cred just once for the whole chain and pass
> into audit_filter_rules.  I can create an alternate patch doing this if
> required.

There's no actual need to get a ref on the named task's creds.

If tsk == current, no locking is needed at all.

If tsk != current, the RCU read lock is sufficient.  See task_cred_xxx() in
include/linux/cred.h.

Hmmm...  I wonder...  The audit filter uses tsk->real_cred, but is that
correct?  Should it be using tsk->cred?  And is tsk always going to be
current?

> +	const struct cred *cred=NULL;

Binary operators like '=' should have a space on each side.

David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-08 18:02 ` David Howells
@ 2011-03-10 20:25   ` Tony Jones
  2011-03-11 16:33     ` David Howells
  0 siblings, 1 reply; 12+ messages in thread
From: Tony Jones @ 2011-03-10 20:25 UTC (permalink / raw)
  To: David Howells; +Cc: linux-kernel, linux-audit, Eric Paris, Al Viro

On Tue, Mar 08, 2011 at 06:02:53PM +0000, David Howells wrote:
> Tony Jones <tonyj@suse.de> wrote:
> 
> > Commit c69e8d9c01db added calls to get_task_cred and put_cred in
> > audit_filter_rules.  Profiling with a large number of audit rules active on
> > the exit chain shows that we are spending upto 48% in this routine for
> > syscall intensive tests, most of which is in the atomic ops.
> > 
> > The following patch acquires the cred if a rule requires it.  In our
> > particular case above, most rules had no cred requirement and this dropped
> > the time spent in audit_filter_rules down to ~12%.  An alternative would be
> > for the caller to acquire the cred just once for the whole chain and pass
> > into audit_filter_rules.  I can create an alternate patch doing this if
> > required.
> 
> There's no actual need to get a ref on the named task's creds.
> 
> If tsk == current, no locking is needed at all.
> 
> If tsk != current, the RCU read lock is sufficient.  See task_cred_xxx() in
> include/linux/cred.h.
> 
> Hmmm...  I wonder...  The audit filter uses tsk->real_cred, but is that
> correct?  Should it be using tsk->cred?  And is tsk always going to be
> current?

Hi David.

I'm not seeing the 'tsk->real_cred' usage, can you clarify?

I went through the call tree.  Assuming my analysis is correct the only case 
where it's not current is the calls from copy_process.  I believe there is no 
need for rcu in this case either.

Tony

------

audit_filter_rules
	<- audit_filter_task
		<- audit_alloc
			<- copy_process

	<- audit_filter_syscall
		<- audit_get_context 
			<- audit_free 
				<- copy_process (error path)
				<- do_exit (tsk == current)
			<- audit_syscall_exit (tsk == current)

		<- audit_syscall_entry (tsk == current)

	<- audit_filter_inodes
		<- audit_update_watch (tsk == current)
		<- audit_get_context (see above)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-10 20:25   ` Tony Jones
@ 2011-03-11 16:33     ` David Howells
  2011-03-15 17:38       ` Tony Jones
  0 siblings, 1 reply; 12+ messages in thread
From: David Howells @ 2011-03-11 16:33 UTC (permalink / raw)
  To: Tony Jones; +Cc: dhowells, linux-kernel, linux-audit, Eric Paris, Al Viro

Tony Jones <tonyj@suse.de> wrote:

> I'm not seeing the 'tsk->real_cred' usage, can you clarify?

get_task_cred() and task_cred_xxxx() call __task_cred() which uses
tsk->real_cred.  These are the real credentials of the process, and the ones
that are used when the process is being acted upon and the ones that are
visible through /proc.

However, if a task is acting upon something, task->cred is used instead.  These
are not visible from the outside and may be overridden.  current_cred_xxx()
uses these.

It's possible that the credentials being used in audit_filter_rules() are
incorrect under most circumstances and should be task->cred, not
task->real_cred.

David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-11 16:33     ` David Howells
@ 2011-03-15 17:38       ` Tony Jones
  2011-03-15 17:44         ` Eric Paris
  2011-03-15 20:04         ` David Howells
  0 siblings, 2 replies; 12+ messages in thread
From: Tony Jones @ 2011-03-15 17:38 UTC (permalink / raw)
  To: David Howells; +Cc: linux-kernel, linux-audit, Eric Paris, Al Viro

On Fri, Mar 11, 2011 at 04:33:52PM +0000, David Howells wrote:
> get_task_cred() and task_cred_xxxx() call __task_cred() which uses

Sorry, I thought you were referring to the remainder of auditsc.c

> It's possible that the credentials being used in audit_filter_rules() are
> incorrect under most circumstances and should be task->cred, not
> task->real_cred.

Agree. Also I believe it is safe to use tsk->cred directly as tsk == current 
or tsk is being created by copy_process.  

Eric, can you ACK/NACK?

The following patch corresponds to the above.

Tony
---


Commit c69e8d9c01db added calls to get_task_cred and put_cred in
audit_filter_rules.  Profiling with a large number of audit rules active on the
exit chain shows that we are spending upto 48% in this routine for syscall 
intensive tests, most of which is in the atomic ops.

The code should be accessing tsk->cred rather than tsk->real_cred.  Also, since
tsk is current (or tsk is being created by copy_process) direct access to 
tsk->cred is possible.

Signed-off-by: Tony Jones <tonyj@suse.de>
---

 kernel/auditsc.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index f49a031..750c08ef 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -441,16 +441,21 @@ static int match_tree_refs(struct audit_context *ctx, struct audit_tree *tree)
 	return 0;
 }
 
-/* Determine if any context name data matches a rule's watch data */
-/* Compare a task_struct with an audit_rule.  Return 1 on match, 0
- * otherwise. */
+/*
+ * Determine if any context name data matches a rule's watch data
+ * Compare a task_struct with an audit_rule.  Return 1 on match, 0
+ * otherwise.
+ *
+ * Note: tsk==current or we are being indirectly called from copy_process()
+ * so direct access to tsk->cred is allowed.
+ */
 static int audit_filter_rules(struct task_struct *tsk,
 			      struct audit_krule *rule,
 			      struct audit_context *ctx,
 			      struct audit_names *name,
 			      enum audit_state *state)
 {
-	const struct cred *cred = get_task_cred(tsk);
+	const struct cred *cred = tsk->cred;
 	int i, j, need_sid = 1;
 	u32 sid;
 
@@ -637,10 +642,8 @@ static int audit_filter_rules(struct task_struct *tsk,
 			break;
 		}
 
-		if (!result) {
-			put_cred(cred);
+		if (!result)
 			return 0;
-		}
 	}
 
 	if (ctx) {
@@ -656,7 +659,6 @@ static int audit_filter_rules(struct task_struct *tsk,
 	case AUDIT_NEVER:    *state = AUDIT_DISABLED;	    break;
 	case AUDIT_ALWAYS:   *state = AUDIT_RECORD_CONTEXT; break;
 	}
-	put_cred(cred);
 	return 1;
 }
 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-15 17:38       ` Tony Jones
@ 2011-03-15 17:44         ` Eric Paris
  2011-03-15 20:11           ` David Howells
  2011-03-15 20:04         ` David Howells
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Paris @ 2011-03-15 17:44 UTC (permalink / raw)
  To: Tony Jones; +Cc: David Howells, linux-kernel, linux-audit, Al Viro

On Tue, 2011-03-15 at 10:38 -0700, Tony Jones wrote:
> On Fri, Mar 11, 2011 at 04:33:52PM +0000, David Howells wrote:
> > get_task_cred() and task_cred_xxxx() call __task_cred() which uses
> 
> Sorry, I thought you were referring to the remainder of auditsc.c
> 
> > It's possible that the credentials being used in audit_filter_rules() are
> > incorrect under most circumstances and should be task->cred, not
> > task->real_cred.
> 
> Agree. Also I believe it is safe to use tsk->cred directly as tsk == current 
> or tsk is being created by copy_process.

Is there any way we can enforce this condition in the code?
WARN_ON(cred != current->cred && cred->refcnt != 1) or something like
that (I just made that conditional up off the top of my head, but it
explains what I'm talking about)?  Who knows what somebody else might do
with this code someday.....

> Eric, can you ACK/NACK?
> 
> The following patch corresponds to the above.
> 
> Tony
> ---
> 
> 
> Commit c69e8d9c01db added calls to get_task_cred and put_cred in
> audit_filter_rules.  Profiling with a large number of audit rules active on the
> exit chain shows that we are spending upto 48% in this routine for syscall 
> intensive tests, most of which is in the atomic ops.
> 
> The code should be accessing tsk->cred rather than tsk->real_cred.  Also, since
> tsk is current (or tsk is being created by copy_process) direct access to 
> tsk->cred is possible.
> 
> Signed-off-by: Tony Jones <tonyj@suse.de>

Acked-by: Eric Paris <eparis@redhat.com>

> ---
> 
>  kernel/auditsc.c |   18 ++++++++++--------
>  1 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index f49a031..750c08ef 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -441,16 +441,21 @@ static int match_tree_refs(struct audit_context *ctx, struct audit_tree *tree)
>  	return 0;
>  }
>  
> -/* Determine if any context name data matches a rule's watch data */
> -/* Compare a task_struct with an audit_rule.  Return 1 on match, 0
> - * otherwise. */
> +/*
> + * Determine if any context name data matches a rule's watch data
> + * Compare a task_struct with an audit_rule.  Return 1 on match, 0
> + * otherwise.
> + *
> + * Note: tsk==current or we are being indirectly called from copy_process()
> + * so direct access to tsk->cred is allowed.
> + */
>  static int audit_filter_rules(struct task_struct *tsk,
>  			      struct audit_krule *rule,
>  			      struct audit_context *ctx,
>  			      struct audit_names *name,
>  			      enum audit_state *state)
>  {
> -	const struct cred *cred = get_task_cred(tsk);
> +	const struct cred *cred = tsk->cred;
>  	int i, j, need_sid = 1;
>  	u32 sid;
>  
> @@ -637,10 +642,8 @@ static int audit_filter_rules(struct task_struct *tsk,
>  			break;
>  		}
>  
> -		if (!result) {
> -			put_cred(cred);
> +		if (!result)
>  			return 0;
> -		}
>  	}
>  
>  	if (ctx) {
> @@ -656,7 +659,6 @@ static int audit_filter_rules(struct task_struct *tsk,
>  	case AUDIT_NEVER:    *state = AUDIT_DISABLED;	    break;
>  	case AUDIT_ALWAYS:   *state = AUDIT_RECORD_CONTEXT; break;
>  	}
> -	put_cred(cred);
>  	return 1;
>  }
>  



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-15 17:44         ` Eric Paris
@ 2011-03-15 20:11           ` David Howells
  2011-03-17 18:11             ` Tony Jones
  0 siblings, 1 reply; 12+ messages in thread
From: David Howells @ 2011-03-15 20:11 UTC (permalink / raw)
  To: Eric Paris; +Cc: dhowells, Tony Jones, linux-kernel, linux-audit, Al Viro

Eric Paris <eparis@redhat.com> wrote:

> WARN_ON(cred != current->cred && cred->refcnt != 1)

'tsk->parent == current' perhaps?  Or audit_alloc() could pass a flag
indicating the state, or just look to see if tsk->audit_context is still NULL.

David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-15 20:11           ` David Howells
@ 2011-03-17 18:11             ` Tony Jones
  2011-03-21 13:57               ` Eric Paris
  0 siblings, 1 reply; 12+ messages in thread
From: Tony Jones @ 2011-03-17 18:11 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Paris, linux-kernel, linux-audit, Al Viro

On Tue, Mar 15, 2011 at 08:11:17PM +0000, David Howells wrote:
> Eric Paris <eparis@redhat.com> wrote:
> 
> > WARN_ON(cred != current->cred && cred->refcnt != 1)
> 
> 'tsk->parent == current' perhaps?  Or audit_alloc() could pass a flag
> indicating the state, or just look to see if tsk->audit_context is still NULL.
> 
> David

Round 3.   tsk->parent == current isn't an option as it's not set by
copy_process until after audit_alloc.  I used a flag to provide an explicit
indication.  I didn't have audit_alloc pass the flag into audit_filter_task
as there is already a explicit "process creation time" comment for this static
function. If you want it pushed all the way upto audit_alloc, let me know and
I'll revise.

Oddly sparse didn't throw any warnings about the direct use of tsk->cred.

tony
---

Commit c69e8d9c01db added calls to get_task_cred and put_cred in
audit_filter_rules.  Profiling with a large number of audit rules active on the
exit chain shows that we are spending upto 48% in this routine for syscall
intensive tests, most of which is in the atomic ops.

1. The code should be accessing tsk->cred rather than tsk->real_cred.  
2. Since tsk is current (or tsk is being created by copy_process) access to 
tsk->cred without rcu read lock is possible.  At the request of the audit
maintainer, a new flag has been added to audit_filter_rules in order to make 
this explicit and guide future code.

Signed-off-by: Tony Jones <tonyj@suse.de>
---
 kernel/auditsc.c |   27 +++++++++++++++++----------
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index f49a031..281dcf1 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -443,17 +443,25 @@ static int match_tree_refs(struct audit_context *ctx, struct audit_tree *tree)
 
 /* Determine if any context name data matches a rule's watch data */
 /* Compare a task_struct with an audit_rule.  Return 1 on match, 0
- * otherwise. */
+ * otherwise.
+ *
+ * If task_creation is true, this is an explicit indication that we are
+ * filtering a task rule at task creation time.  This and tsk == current are
+ * the only situations where tsk->cred may be accessed without an rcu read lock.
+ */
 static int audit_filter_rules(struct task_struct *tsk,
 			      struct audit_krule *rule,
 			      struct audit_context *ctx,
 			      struct audit_names *name,
-			      enum audit_state *state)
+			      enum audit_state *state,
+			      bool task_creation)
 {
-	const struct cred *cred = get_task_cred(tsk);
+	const struct cred *cred;
 	int i, j, need_sid = 1;
 	u32 sid;
 
+	cred = rcu_dereference_check(tsk->cred, tsk == current || task_creation);
+
 	for (i = 0; i < rule->field_count; i++) {
 		struct audit_field *f = &rule->fields[i];
 		int result = 0;
@@ -637,10 +645,8 @@ static int audit_filter_rules(struct task_struct *tsk,
 			break;
 		}
 
-		if (!result) {
-			put_cred(cred);
+		if (!result)
 			return 0;
-		}
 	}
 
 	if (ctx) {
@@ -656,7 +662,6 @@ static int audit_filter_rules(struct task_struct *tsk,
 	case AUDIT_NEVER:    *state = AUDIT_DISABLED;	    break;
 	case AUDIT_ALWAYS:   *state = AUDIT_RECORD_CONTEXT; break;
 	}
-	put_cred(cred);
 	return 1;
 }
 
@@ -671,7 +676,8 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK], list) {
-		if (audit_filter_rules(tsk, &e->rule, NULL, NULL, &state)) {
+		if (audit_filter_rules(tsk, &e->rule, NULL, NULL,
+				       &state, true)) {
 			if (state == AUDIT_RECORD_CONTEXT)
 				*key = kstrdup(e->rule.filterkey, GFP_ATOMIC);
 			rcu_read_unlock();
@@ -705,7 +711,7 @@ static enum audit_state audit_filter_syscall(struct task_struct *tsk,
 		list_for_each_entry_rcu(e, list, list) {
 			if ((e->rule.mask[word] & bit) == bit &&
 			    audit_filter_rules(tsk, &e->rule, ctx, NULL,
-					       &state)) {
+					       &state, false)) {
 				rcu_read_unlock();
 				ctx->current_state = state;
 				return state;
@@ -743,7 +749,8 @@ void audit_filter_inodes(struct task_struct *tsk, struct audit_context *ctx)
 
 		list_for_each_entry_rcu(e, list, list) {
 			if ((e->rule.mask[word] & bit) == bit &&
-			    audit_filter_rules(tsk, &e->rule, ctx, n, &state)) {
+			    audit_filter_rules(tsk, &e->rule, ctx, n,
+				    	       &state, false)) {
 				rcu_read_unlock();
 				ctx->current_state = state;
 				return;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-17 18:11             ` Tony Jones
@ 2011-03-21 13:57               ` Eric Paris
  2011-04-27 13:12                 ` Jiri Kosina
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Paris @ 2011-03-21 13:57 UTC (permalink / raw)
  To: Tony Jones; +Cc: David Howells, linux-kernel, linux-audit, Al Viro

On Thu, 2011-03-17 at 11:11 -0700, Tony Jones wrote:
> On Tue, Mar 15, 2011 at 08:11:17PM +0000, David Howells wrote:
> > Eric Paris <eparis@redhat.com> wrote:
> > 
> > > WARN_ON(cred != current->cred && cred->refcnt != 1)
> > 
> > 'tsk->parent == current' perhaps?  Or audit_alloc() could pass a flag
> > indicating the state, or just look to see if tsk->audit_context is still NULL.
> > 
> > David
> 
> Round 3.   tsk->parent == current isn't an option as it's not set by
> copy_process until after audit_alloc.  I used a flag to provide an explicit
> indication.  I didn't have audit_alloc pass the flag into audit_filter_task
> as there is already a explicit "process creation time" comment for this static
> function. If you want it pushed all the way upto audit_alloc, let me know and
> I'll revise.
> 
> Oddly sparse didn't throw any warnings about the direct use of tsk->cred.
> 
> tony
> ---
> 
> Commit c69e8d9c01db added calls to get_task_cred and put_cred in
> audit_filter_rules.  Profiling with a large number of audit rules active on the
> exit chain shows that we are spending upto 48% in this routine for syscall
> intensive tests, most of which is in the atomic ops.
> 
> 1. The code should be accessing tsk->cred rather than tsk->real_cred.  
> 2. Since tsk is current (or tsk is being created by copy_process) access to 
> tsk->cred without rcu read lock is possible.  At the request of the audit
> maintainer, a new flag has been added to audit_filter_rules in order to make 
> this explicit and guide future code.
> 
> Signed-off-by: Tony Jones <tonyj@suse.de>

Acked-by: Eric Paris <eparis@redhat.com>

> ---
>  kernel/auditsc.c |   27 +++++++++++++++++----------
>  1 files changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index f49a031..281dcf1 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -443,17 +443,25 @@ static int match_tree_refs(struct audit_context *ctx, struct audit_tree *tree)
>  
>  /* Determine if any context name data matches a rule's watch data */
>  /* Compare a task_struct with an audit_rule.  Return 1 on match, 0
> - * otherwise. */
> + * otherwise.
> + *
> + * If task_creation is true, this is an explicit indication that we are
> + * filtering a task rule at task creation time.  This and tsk == current are
> + * the only situations where tsk->cred may be accessed without an rcu read lock.
> + */
>  static int audit_filter_rules(struct task_struct *tsk,
>  			      struct audit_krule *rule,
>  			      struct audit_context *ctx,
>  			      struct audit_names *name,
> -			      enum audit_state *state)
> +			      enum audit_state *state,
> +			      bool task_creation)
>  {
> -	const struct cred *cred = get_task_cred(tsk);
> +	const struct cred *cred;
>  	int i, j, need_sid = 1;
>  	u32 sid;
>  
> +	cred = rcu_dereference_check(tsk->cred, tsk == current || task_creation);
> +
>  	for (i = 0; i < rule->field_count; i++) {
>  		struct audit_field *f = &rule->fields[i];
>  		int result = 0;
> @@ -637,10 +645,8 @@ static int audit_filter_rules(struct task_struct *tsk,
>  			break;
>  		}
>  
> -		if (!result) {
> -			put_cred(cred);
> +		if (!result)
>  			return 0;
> -		}
>  	}
>  
>  	if (ctx) {
> @@ -656,7 +662,6 @@ static int audit_filter_rules(struct task_struct *tsk,
>  	case AUDIT_NEVER:    *state = AUDIT_DISABLED;	    break;
>  	case AUDIT_ALWAYS:   *state = AUDIT_RECORD_CONTEXT; break;
>  	}
> -	put_cred(cred);
>  	return 1;
>  }
>  
> @@ -671,7 +676,8 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
>  
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK], list) {
> -		if (audit_filter_rules(tsk, &e->rule, NULL, NULL, &state)) {
> +		if (audit_filter_rules(tsk, &e->rule, NULL, NULL,
> +				       &state, true)) {
>  			if (state == AUDIT_RECORD_CONTEXT)
>  				*key = kstrdup(e->rule.filterkey, GFP_ATOMIC);
>  			rcu_read_unlock();
> @@ -705,7 +711,7 @@ static enum audit_state audit_filter_syscall(struct task_struct *tsk,
>  		list_for_each_entry_rcu(e, list, list) {
>  			if ((e->rule.mask[word] & bit) == bit &&
>  			    audit_filter_rules(tsk, &e->rule, ctx, NULL,
> -					       &state)) {
> +					       &state, false)) {
>  				rcu_read_unlock();
>  				ctx->current_state = state;
>  				return state;
> @@ -743,7 +749,8 @@ void audit_filter_inodes(struct task_struct *tsk, struct audit_context *ctx)
>  
>  		list_for_each_entry_rcu(e, list, list) {
>  			if ((e->rule.mask[word] & bit) == bit &&
> -			    audit_filter_rules(tsk, &e->rule, ctx, n, &state)) {
> +			    audit_filter_rules(tsk, &e->rule, ctx, n,
> +				    	       &state, false)) {
>  				rcu_read_unlock();
>  				ctx->current_state = state;
>  				return;



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-21 13:57               ` Eric Paris
@ 2011-04-27 13:12                 ` Jiri Kosina
  2011-04-27 16:26                   ` Tony Jones
  0 siblings, 1 reply; 12+ messages in thread
From: Jiri Kosina @ 2011-04-27 13:12 UTC (permalink / raw)
  To: Eric Paris; +Cc: Tony Jones, David Howells, linux-kernel, linux-audit, Al Viro

On Mon, 21 Mar 2011, Eric Paris wrote:

> > Commit c69e8d9c01db added calls to get_task_cred and put_cred in
> > audit_filter_rules.  Profiling with a large number of audit rules active on the
> > exit chain shows that we are spending upto 48% in this routine for syscall
> > intensive tests, most of which is in the atomic ops.
> > 
> > 1. The code should be accessing tsk->cred rather than tsk->real_cred.  
> > 2. Since tsk is current (or tsk is being created by copy_process) access to 
> > tsk->cred without rcu read lock is possible.  At the request of the audit
> > maintainer, a new flag has been added to audit_filter_rules in order to make 
> > this explicit and guide future code.
> > 
> > Signed-off-by: Tony Jones <tonyj@suse.de>
> 
> Acked-by: Eric Paris <eparis@redhat.com>

I don't see the patch in linux-next as of today. As it has been acked by 
subsystem maintainer, I have picked it up into my tree ("retransmission 
mode").

If anyone has any objections, please let me know. Thanks.

> 
> > ---
> >  kernel/auditsc.c |   27 +++++++++++++++++----------
> >  1 files changed, 17 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > index f49a031..281dcf1 100644
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -443,17 +443,25 @@ static int match_tree_refs(struct audit_context *ctx, struct audit_tree *tree)
> >  
> >  /* Determine if any context name data matches a rule's watch data */
> >  /* Compare a task_struct with an audit_rule.  Return 1 on match, 0
> > - * otherwise. */
> > + * otherwise.
> > + *
> > + * If task_creation is true, this is an explicit indication that we are
> > + * filtering a task rule at task creation time.  This and tsk == current are
> > + * the only situations where tsk->cred may be accessed without an rcu read lock.
> > + */
> >  static int audit_filter_rules(struct task_struct *tsk,
> >  			      struct audit_krule *rule,
> >  			      struct audit_context *ctx,
> >  			      struct audit_names *name,
> > -			      enum audit_state *state)
> > +			      enum audit_state *state,
> > +			      bool task_creation)
> >  {
> > -	const struct cred *cred = get_task_cred(tsk);
> > +	const struct cred *cred;
> >  	int i, j, need_sid = 1;
> >  	u32 sid;
> >  
> > +	cred = rcu_dereference_check(tsk->cred, tsk == current || task_creation);
> > +
> >  	for (i = 0; i < rule->field_count; i++) {
> >  		struct audit_field *f = &rule->fields[i];
> >  		int result = 0;
> > @@ -637,10 +645,8 @@ static int audit_filter_rules(struct task_struct *tsk,
> >  			break;
> >  		}
> >  
> > -		if (!result) {
> > -			put_cred(cred);
> > +		if (!result)
> >  			return 0;
> > -		}
> >  	}
> >  
> >  	if (ctx) {
> > @@ -656,7 +662,6 @@ static int audit_filter_rules(struct task_struct *tsk,
> >  	case AUDIT_NEVER:    *state = AUDIT_DISABLED;	    break;
> >  	case AUDIT_ALWAYS:   *state = AUDIT_RECORD_CONTEXT; break;
> >  	}
> > -	put_cred(cred);
> >  	return 1;
> >  }
> >  
> > @@ -671,7 +676,8 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
> >  
> >  	rcu_read_lock();
> >  	list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK], list) {
> > -		if (audit_filter_rules(tsk, &e->rule, NULL, NULL, &state)) {
> > +		if (audit_filter_rules(tsk, &e->rule, NULL, NULL,
> > +				       &state, true)) {
> >  			if (state == AUDIT_RECORD_CONTEXT)
> >  				*key = kstrdup(e->rule.filterkey, GFP_ATOMIC);
> >  			rcu_read_unlock();
> > @@ -705,7 +711,7 @@ static enum audit_state audit_filter_syscall(struct task_struct *tsk,
> >  		list_for_each_entry_rcu(e, list, list) {
> >  			if ((e->rule.mask[word] & bit) == bit &&
> >  			    audit_filter_rules(tsk, &e->rule, ctx, NULL,
> > -					       &state)) {
> > +					       &state, false)) {
> >  				rcu_read_unlock();
> >  				ctx->current_state = state;
> >  				return state;
> > @@ -743,7 +749,8 @@ void audit_filter_inodes(struct task_struct *tsk, struct audit_context *ctx)
> >  
> >  		list_for_each_entry_rcu(e, list, list) {
> >  			if ((e->rule.mask[word] & bit) == bit &&
> > -			    audit_filter_rules(tsk, &e->rule, ctx, n, &state)) {
> > +			    audit_filter_rules(tsk, &e->rule, ctx, n,
> > +				    	       &state, false)) {
> >  				rcu_read_unlock();
> >  				ctx->current_state = state;
> >  				return;

-- 
Jiri Kosina
SUSE Labs, Novell Inc.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-04-27 13:12                 ` Jiri Kosina
@ 2011-04-27 16:26                   ` Tony Jones
  0 siblings, 0 replies; 12+ messages in thread
From: Tony Jones @ 2011-04-27 16:26 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: Eric Paris, David Howells, linux-kernel, linux-audit, Al Viro

On Wed, Apr 27, 2011 at 03:12:23PM +0200, Jiri Kosina wrote:
> I don't see the patch in linux-next as of today. As it has been acked by 
> subsystem maintainer, I have picked it up into my tree ("retransmission 
> mode").
> 
> If anyone has any objections, please let me know. Thanks.

I spoke to Eric about this 10 days ago and he was going to create an audit
tree for the next window. 

Tony

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead
  2011-03-15 17:38       ` Tony Jones
  2011-03-15 17:44         ` Eric Paris
@ 2011-03-15 20:04         ` David Howells
  1 sibling, 0 replies; 12+ messages in thread
From: David Howells @ 2011-03-15 20:04 UTC (permalink / raw)
  To: Tony Jones; +Cc: dhowells, linux-kernel, linux-audit, Eric Paris, Al Viro

Tony Jones <tonyj@suse.de> wrote:

> Agree. Also I believe it is safe to use tsk->cred directly as tsk == current 
> or tsk is being created by copy_process.  

You can't quite access it like that without sparse throwing a warning.  The
pointer is marked with an __rcu attribute, so you need to use something like
this:

	cred = rcu_dereference_check(tsk->cred, (tsk == current ||
						 called_from_copy_process());

David

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-04-27 16:27 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-07 21:06 PATCH [1/1]: audit: acquire creds selectively to reduce atomic op overhead Tony Jones
2011-03-08 18:02 ` David Howells
2011-03-10 20:25   ` Tony Jones
2011-03-11 16:33     ` David Howells
2011-03-15 17:38       ` Tony Jones
2011-03-15 17:44         ` Eric Paris
2011-03-15 20:11           ` David Howells
2011-03-17 18:11             ` Tony Jones
2011-03-21 13:57               ` Eric Paris
2011-04-27 13:12                 ` Jiri Kosina
2011-04-27 16:26                   ` Tony Jones
2011-03-15 20:04         ` David Howells

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).