Re: [kernel-hardening] [RFC PATCH 1/1] seccomp: provide information about the previous syscall

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Jann Horn <jann@thejh.net>
To: Daniel Sangorrin <daniel.sangorrin@toshiba.co.jp>
Cc: keescook@chromium.org, luto@amacapital.net, wad@chromium.org,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	kernel-hardening@lists.openwall.com
Subject: Re: [kernel-hardening] [RFC PATCH 1/1] seccomp: provide information about the previous syscall
Date: Fri, 22 Jan 2016 11:48:37 +0100	[thread overview]
Message-ID: <20160122104837.GA14440@pc.thejh.net> (raw)
In-Reply-To: <1453444200-8604-2-git-send-email-daniel.sangorrin@toshiba.co.jp>

[-- Attachment #1: Type: text/plain, Size: 4780 bytes --]

On Fri, Jan 22, 2016 at 03:30:00PM +0900, Daniel Sangorrin wrote:
> This patch allows applications to restrict the order in which
> its system calls may be requested. In order to do that, we
> provide seccomp-BPF scripts with information about the
> previous system call requested.
>
> An example use case consists of detecting (and stopping) return
> oriented attacks that disturb the normal execution flow of
> a user program.


The intent here is to mitigate attacks in which an attacker has
e.g. a function pointer overwrite without a high degree of stack
control or the ability to perform a stack pivot, correct? So that
e.g. a one-gadget system() call won't succeed?

Do you have data on how effective this protection is using just
the previous system call number?

I think that for example, the "magic ROP gadget" in glibc that
can be used given just a single pointer overwrite and stdin
control (https://gist.github.com/zachriggle/ca24daf4e8be953a3f96),
which (as far as I can tell) is in the middle of the system()
implementation, could be used as long as a transition to one of
the following syscalls is allowed:

 - rt_sigaction
 - rt_sigprocmask
 - clone
 - execve

I'm not sure how many interesting syscalls typically transition
to that, perhaps you can comment on that?

However, when exploiting network servers, this magic gadget
won't help much - an attacker would probably have to either
call into an interesting function in the application or use
ROP. In the latter case, this protection won't help much -
especially considering that most syscalls just return
-EFAULT / -EINVAL when you supply nonsense arguments, ROPping
through a "pop rax;ret" gadget and a "syscall;ret" gadget
should make it fairly easy to bypass the protection. There
are a bunch of occurences of both gadgets in Debian's libc
(and these are just the trivial ones):

$ hexdump -C /lib/x86_64-linux-gnu/libc-2.19.so | grep '58 c3'
000382e0  00 00 48 8b 00 5b 8b 40  58 c3 48 8d 05 4f 8a 36  |..H..[.@X.H..O.6|
000383b0  58 c3 48 8d 05 87 89 36  00 48 39 c3 74 0e 48 89  |X.H....6.H9.t.H.|
00038450  40 58 c3 48 8d 05 e6 88  36 00 48 39 c3 74 0e 48  |@X.H....6.H9.t.H|
000d9a00  48 89 44 24 18 e8 56 ff  ff ff 48 83 c4 58 c3 90  |H.D$..V...H..X..|
000e51d0  c3 0f 1f 80 00 00 00 00  48 8b 40 58 c3 0f 1f 00  |........H.@X....|
000ea2f0  48 83 3d 58 c3 2b 00 00  48 8b 1d 69 8b 2b 00 64  |H.=X.+..H..i.+.d|
00160520  48 c3 fa ff 58 c3 fa ff  68 c3 fa ff 80 c3 fa ff  |H...X...h.......|
00171470  58 c3 f8 ff 84 60 02 00  74 c3 f8 ff 94 62 02 00  |X....`..t....b..|
$ hexdump -C /lib/x86_64-linux-gnu/libc-2.19.so | grep '0f 05 c3'
000b85b0  b8 6e 00 00 00 0f 05 c3  0f 1f 84 00 00 00 00 00  |.n..............|
000b85c0  b8 66 00 00 00 0f 05 c3  0f 1f 84 00 00 00 00 00  |.f..............|
000b85d0  b8 6b 00 00 00 0f 05 c3  0f 1f 84 00 00 00 00 00  |.k..............|
000b85e0  b8 68 00 00 00 0f 05 c3  0f 1f 84 00 00 00 00 00  |.h..............|
000b85f0  b8 6c 00 00 00 0f 05 c3  0f 1f 84 00 00 00 00 00  |.l..............|
000b87f0  b8 6f 00 00 00 0f 05 c3  0f 1f 84 00 00 00 00 00  |.o..............|
000d9260  b8 5f 00 00 00 0f 05 c3  0f 1f 84 00 00 00 00 00  |._..............|
000e6400  b8 e4 00 00 00 0f 05 c3  0f 1f 84 00 00 00 00 00  |................|
000fff60  48 63 3f b8 03 00 00 00  0f 05 c3 0f 1f 44 00 00  |Hc?..........D..|

So an attacker would craft the stack like this:
[pop rax;ret address]
[first syscall for transition]
[syscall;ret address]
[pop rax;ret address]
[second syscall for transition]
[syscall;ret address]
[...]
[normal ROP for whatever the attacker wants to do]


Maybe someone who knows a bit more about binary exploiting
can comment on this, especially how likely it is that a
manipulation of a network service's program flow is successful
in the presence of full ASLR and so on without ROP.


Also, there is a potential functional issue: What about signal handlers?
Signal handlers will require transitions from all syscalls to any syscall
that occurs at the start of a signal handler to be allowed as far as I
can tell.


> @@ -443,6 +448,11 @@ static long seccomp_attach_filter(unsigned int flags,
>  			return ret;
>  	}
>  
> +	/* Initialize the prev_nr field only once */
> +	if (current->seccomp.filter == NULL)
> +		current->seccomp.prev_nr =
> +			syscall_get_nr(current, task_pt_regs(current));
> +
>  	/*
>  	 * If there is an existing filter, make it the prev and don't drop its
>  	 * task reference.

What about SECCOMP_FILTER_FLAG_TSYNC? When a thread is transitioned from
SECCOMP_MODE_DISABLED to SECCOMP_MODE_FILTER by another thread, its initial
prev_nr will be 0, which would e.g. appear to be the read() syscall on
x86_64, right?

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

next prev parent reply	other threads:[~2016-01-22 10:48 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-22  6:29 [RFC PATCH 0/1] Adding previous syscall context to seccomp Daniel Sangorrin
2016-01-22  6:30 ` [RFC PATCH 1/1] seccomp: provide information about the previous syscall Daniel Sangorrin
2016-01-22 10:48   ` Jann Horn [this message]
2016-01-22 17:17     ` [kernel-hardening] " Andy Lutomirski
2016-01-25  3:39     ` Daniel Sangorrin
2016-01-22 17:30   ` Alexei Starovoitov
2016-01-22 21:23     ` Kees Cook
2016-01-22 22:10       ` [kernel-hardening] " Paul Moore

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160122104837.GA14440@pc.thejh.net \
    --to=jann@thejh.net \
    --cc=daniel.sangorrin@toshiba.co.jp \
    --cc=keescook@chromium.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=wad@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox