linux-api.vger.kernel.org archive mirror
* [RFC][PATCH] seccomp: add SECCOMP_RET_ACK for non-fatal SIGSYS
@ 2016-01-29  1:06 Kees Cook
  2016-01-29  2:33 ` Andy Lutomirski
  0 siblings, 1 reply; 4+ messages in thread
From: Kees Cook @ 2016-01-29  1:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: jeffv, Andy Lutomirski, oleg, Will Drewry, linux-doc, linux-api,
	linux-security-module, kernel-hardening

Tracing processes for syscall usage can be done one step at a time with
SECCOMP_RET_TRAP, but this will block the syscall. Alternatively, a
ptrace manager can handle SECCOMP_RET_TRACE returns, but this is
heavyweight and depends on the ptrace infrastructure. A lightweight
method to learn syscalls is needed, one that reuses the existing delivery
of SIGSYS without skipping the syscall. This is implemented as
SECCOMP_RET_ACK, which is as permissive as SECCOMP_RET_ALLOW but delivers
SIGSYS after syscall completion, as long as the SECCOMP_RET_DATA is
non-zero. As each syscall is signaled, a signal handler can install a
new rule for it with SECCOMP_RET_DATA set to 0, which disables reporting
for that syscall in the future (this is required for restarting syscalls
that are signal-sensitive, like nanosleep).

Registers from the signal will reflect registers after the syscall returns
rather than before. Signal-sensitive syscalls will trigger EINTR, so they
must be whitelisted before they are resumed. A filter that does not allow
the sigreturn syscall (and likely prctl, for whitelisting) makes
SECCOMP_RET_ACK useless.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
I don't like the name SECCOMP_RET_ACK, and SECCOMP_RET_ALLOW_SIGSYS
seems too long. SECCOMP_RET_RAISE? SECCOMP_RET_SIGSYS?
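
For illustration, a learning filter built on the proposed action might look
roughly like the sketch below. It only builds the BPF program and does not
install it, since a kernel without this patch does not implement the action.
Assumptions: the SECCOMP_RET_ACK value is the one from this patch, LEARN_TAG
and learn_prog() are hypothetical names, rt_sigreturn stands in for
"sigreturn", and the usual architecture check is left out for brevity.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>

/* Value proposed by this patch; released headers do not define it. */
#ifndef SECCOMP_RET_ACK
#define SECCOMP_RET_ACK 0x7ffc0000U
#endif

/* Hypothetical tag carried in SECCOMP_RET_DATA; any non-zero value reports. */
#define LEARN_TAG 1U

/*
 * Allow rt_sigreturn and prctl silently so the SIGSYS handler can return
 * and install follow-up filters. Everything else runs normally but is
 * reported through SECCOMP_RET_ACK.
 */
static struct sock_filter learn_filter[] = {
	/* [0] Load the syscall number. */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
	/* [1] rt_sigreturn: skip two instructions to the ALLOW at [4]. */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_rt_sigreturn, 2, 0),
	/* [2] prctl: skip one instruction to the ALLOW at [4]. */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_prctl, 1, 0),
	/* [3] Anything else: allow, and ask for SIGSYS afterwards. */
	BPF_STMT(BPF_RET | BPF_K,
		 SECCOMP_RET_ACK | (LEARN_TAG & SECCOMP_RET_DATA)),
	/* [4] Silent allow. */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Wrap the instruction array in the structure that PR_SET_SECCOMP takes. */
static struct sock_fprog learn_prog(void)
{
	struct sock_fprog prog = {
		.len = sizeof(learn_filter) / sizeof(learn_filter[0]),
		.filter = learn_filter,
	};
	return prog;
}
```

On a patched kernel the program would be installed the usual way, with
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) followed by
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog).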
---
 Documentation/prctl/seccomp_filter.txt | 16 ++++++++++++++++
 include/uapi/linux/seccomp.h           |  1 +
 kernel/seccomp.c                       |  5 +++++
 3 files changed, 22 insertions(+)

diff --git a/Documentation/prctl/seccomp_filter.txt b/Documentation/prctl/seccomp_filter.txt
index 1e469ef75778..847da72d94f4 100644
--- a/Documentation/prctl/seccomp_filter.txt
+++ b/Documentation/prctl/seccomp_filter.txt
@@ -138,6 +138,22 @@ SECCOMP_RET_TRACE:
 	allow use of ptrace, even of other sandboxed processes, without
 	extreme care; ptracers can use this mechanism to escape.)
 
+SECCOMP_RET_ACK:
+	When the SECCOMP_RET_DATA portion is 0, this is the same
+	as SECCOMP_RET_ALLOW. When non-zero, this is the same as
+	SECCOMP_RET_TRAP except the syscall is executed normally
+	and register contents will show the state after the syscall.
+
+	For syscalls that are sensitive to pending signals, the
+	raised signal will interrupt the syscall. If these syscalls
+	are restarted immediately, they will loop forever. Users of
+	SECCOMP_RET_ACK need to add a new filter for each such syscall
+	that sets a zero SECCOMP_RET_DATA to disable its reporting,
+	unless the syscall is explicitly whitelisted to begin with.
+
+	Whitelisting sigreturn (and likely prctl) is needed to use
+	SECCOMP_RET_ACK in a meaningful way.
+
 SECCOMP_RET_ALLOW:
 	Results in the system call being executed.
 
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0f238a43ff1e..285cd3a04052 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -29,6 +29,7 @@
 #define SECCOMP_RET_TRAP	0x00030000U /* disallow and force a SIGSYS */
 #define SECCOMP_RET_ERRNO	0x00050000U /* returns an errno */
 #define SECCOMP_RET_TRACE	0x7ff00000U /* pass to a tracer or disallow */
+#define SECCOMP_RET_ACK		0x7ffc0000U /* allow and send SIGSYS */
 #define SECCOMP_RET_ALLOW	0x7fff0000U /* allow */
 
 /* Masks for the return value sections. */
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 580ac2d4024f..6eefbb2060d8 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -608,6 +608,11 @@ static u32 __seccomp_phase1_filter(int this_syscall, struct seccomp_data *sd)
 	case SECCOMP_RET_TRACE:
 		return filter_ret;  /* Save the rest for phase 2. */
 
+	case SECCOMP_RET_ACK:
+		/* Post SIGSYS on syscall return, with 16 bits of data. */
+		if (data)
+			seccomp_send_sigsys(this_syscall, data);
+		/* Fall through. */
 	case SECCOMP_RET_ALLOW:
 		return SECCOMP_PHASE1_OK;
 
-- 
2.6.3


-- 
Kees Cook
Chrome OS & Brillo Security


* Re: [RFC][PATCH] seccomp: add SECCOMP_RET_ACK for non-fatal SIGSYS
  2016-01-29  1:06 [RFC][PATCH] seccomp: add SECCOMP_RET_ACK for non-fatal SIGSYS Kees Cook
@ 2016-01-29  2:33 ` Andy Lutomirski
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Lutomirski @ 2016-01-29  2:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel@vger.kernel.org, jeffv, Oleg Nesterov, Will Drewry,
	linux-doc@vger.kernel.org, Linux API, LSM List,
	kernel-hardening@lists.openwall.com

On Thu, Jan 28, 2016 at 5:06 PM, Kees Cook <keescook@chromium.org> wrote:
> Tracing processes for syscall usage can be done one step at a time with
> SECCOMP_RET_TRAP, but this will block the syscall. Alternatively, a
> ptrace manager can handle SECCOMP_RET_TRACE returns, but this is
> heavyweight and depends on the ptrace infrastructure. A lightweight
> method to learn syscalls is needed, one that reuses the existing delivery
> of SIGSYS without skipping the syscall. This is implemented as
> SECCOMP_RET_ACK, which is as permissive as SECCOMP_RET_ALLOW but delivers
> SIGSYS after syscall completion, as long as the SECCOMP_RET_DATA is
> non-zero. As each syscall is signaled, a signal handler can install a
> new rule for it with SECCOMP_RET_DATA set to 0, which disables reporting
> for that syscall in the future (this is required for restarting syscalls
> that are signal-sensitive, like nanosleep).
>
> Registers from the signal will reflect registers after the syscall returns
> rather than before. Signal-sensitive syscalls will trigger EINTR, so they
> must be whitelisted before they are resumed. A filter that does not allow
> the sigreturn syscall (and likely prctl, for whitelisting) makes
> SECCOMP_RET_ACK useless.
>
> Signed-off-by: Kees Cook <keescook@chromium.org>

Could this use task_work to queue the signal on return to user mode
instead?  Would that solve the EINTR issues?

--Andy


* Re: [RFC][PATCH] seccomp: add SECCOMP_RET_ACK for non-fatal SIGSYS
@ 2016-01-29  3:03     ` Jeffrey Vander Stoep
  2016-01-31 20:19     ` Andy Lutomirski
  1 sibling, 0 replies; 4+ messages in thread
From: Jeffrey Vander Stoep @ 2016-01-29  3:03 UTC (permalink / raw)
  To: Andy Lutomirski, Kees Cook
  Cc: linux-kernel@vger.kernel.org, Oleg Nesterov, Will Drewry,
	linux-doc@vger.kernel.org, Linux API, LSM List,
	kernel-hardening@lists.openwall.com

Thanks! This is just what I need.

What are the drawbacks to returning the sigsys before executing the
system call? Otherwise this loses the benefit of properly reporting
registers for argument inspection.
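
To make the register concern concrete, here is a minimal sketch of what a
SIGSYS handler can recover from siginfo alone, independent of register
state: the syscall number, the architecture, and the SECCOMP_RET_DATA value
(delivered in si_errno, as with SECCOMP_RET_TRAP). Syscall arguments would
have to come from the ucontext registers, which is where before-versus-after
delivery matters. The struct and function names are hypothetical, and the
SYS_SECCOMP fallback is only for libcs that do not export it.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Not every libc exports this si_code value; 1 is the kernel's definition. */
#ifndef SYS_SECCOMP
#define SYS_SECCOMP 1
#endif

/* Hypothetical record type for this sketch. */
struct sigsys_report {
	int nr;            /* syscall number that matched the filter */
	unsigned int arch; /* AUDIT_ARCH_* value of the calling ABI */
	int data;          /* SECCOMP_RET_DATA chosen by the filter */
};

/* Fill *out from a seccomp-generated SIGSYS; return -1 for anything else. */
static int decode_sigsys(const siginfo_t *info, struct sigsys_report *out)
{
	if (info == NULL || info->si_signo != SIGSYS ||
	    info->si_code != SYS_SECCOMP)
		return -1;
	out->nr = info->si_syscall;
	out->arch = info->si_arch;
	out->data = info->si_errno;
	return 0;
}

static struct sigsys_report last_report;
static volatile sig_atomic_t have_report;

/* SA_SIGINFO handler: remember the most recent seccomp report. */
static void on_sigsys(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext; /* Argument registers live here; layout is per-arch. */
	if (decode_sigsys(info, &last_report) == 0)
		have_report = 1;
}

/* Register on_sigsys for SIGSYS; returns 0 on success, -1 on failure. */
static int install_sigsys_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigsys;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSYS, &sa, NULL);
}
```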

How about SECCOMP_RET_PERMISSIVE? Describes the application rather
than the implementation. Otherwise preference is for
SECCOMP_RET_ALLOW_SIGSYS.


* Re: [RFC][PATCH] seccomp: add SECCOMP_RET_ACK for non-fatal SIGSYS
  2016-01-29  3:03     ` Jeffrey Vander Stoep
@ 2016-01-31 20:19     ` Andy Lutomirski
  1 sibling, 0 replies; 4+ messages in thread
From: Andy Lutomirski @ 2016-01-31 20:19 UTC (permalink / raw)
  To: Kees Cook
  Cc: LSM List, linux-doc@vger.kernel.org, Oleg Nesterov, Linux API,
	Will Drewry, kernel-hardening@lists.openwall.com,
	linux-kernel@vger.kernel.org, Jeffrey Vander Stoep

On Jan 28, 2016 6:33 PM, "Andy Lutomirski" wrote:
>
> On Thu, Jan 28, 2016 at 5:06 PM, Kees Cook <keescook@chromium.org> wrote:
> > Tracing processes for syscall usage can be done one step at a time with
> > SECCOMP_RET_TRAP, but this will block the syscall. Alternatively, a
> > ptrace manager can handle SECCOMP_RET_TRACE returns, but this is
> > heavyweight and depends on the ptrace infrastructure. A lightweight
> > method to learn syscalls is needed, one that reuses the existing delivery
> > of SIGSYS without skipping the syscall. This is implemented as
> > SECCOMP_RET_ACK, which is as permissive as SECCOMP_RET_ALLOW but delivers
> > SIGSYS after syscall completion, as long as the SECCOMP_RET_DATA is
> > non-zero. As each syscall is signaled, a signal handler can install a
> > new rule for it with SECCOMP_RET_DATA set to 0, which disables reporting
> > for that syscall in the future (this is required for restarting syscalls
> > that are signal-sensitive, like nanosleep).
> >
> > Registers from the signal will reflect registers after the syscall returns
> > rather than before. Signal-sensitive syscalls will trigger EINTR, so they
> > must be whitelisted before they are resumed. A filter that does not allow
> > the sigreturn syscall (and likely prctl, for whitelisting) makes
> > SECCOMP_RET_ACK useless.
> >
> > Signed-off-by: Kees Cook <keescook@chromium.org>
>
> Could this use task_work to queue the signal on return to user mode
> instead?  Would that solve the EINTR issues?
>

As another option, use the existing TRAP option but add a way for a
process to set a flag such that it can delete and re-add a filter.
Then you get SIGSYS, delete the old filter, add a new one that allows
the current syscall, and resume.  No funny business with EINTR or
clobbered regs.
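
The userspace half of that could be sketched as below: keep a set of learned
syscalls and regenerate a filter that allows them and traps everything else.
The kernel half (a way to delete the old filter) does not exist, so this
only shows program generation. All names here are hypothetical, the
architecture check is omitted, and the caller would need to seed the set
with rt_sigreturn and whichever syscall installs filters.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Kept small so every forward jump fits in BPF's 8-bit jump offset. */
#define MAX_LEARNED 64

/* Hypothetical set of syscall numbers seen so far. */
struct learned {
	unsigned int nr[MAX_LEARNED];
	size_t count;
};

/* Record nr once. Returns 0 if stored or already present, -1 if full. */
static int learned_add(struct learned *l, unsigned int nr)
{
	size_t i;

	for (i = 0; i < l->count; i++)
		if (l->nr[i] == nr)
			return 0;
	if (l->count == MAX_LEARNED)
		return -1;
	l->nr[l->count++] = nr;
	return 0;
}

/*
 * Emit: load nr; one JEQ per learned syscall jumping to the final ALLOW;
 * then TRAP for everything else; then ALLOW.
 * Returns the number of instructions written, or 0 if cap is too small.
 */
static size_t build_trap_filter(const struct learned *l,
				struct sock_filter *out, size_t cap)
{
	size_t i, n = 0;

	if (cap < l->count + 3)
		return 0;

	out[n++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			offsetof(struct seccomp_data, nr));
	for (i = 0; i < l->count; i++) {
		/* Skip the remaining compares plus the TRAP instruction. */
		unsigned char skip = (unsigned char)(l->count - i);

		out[n++] = (struct sock_filter)BPF_JUMP(
				BPF_JMP | BPF_JEQ | BPF_K, l->nr[i], skip, 0);
	}
	out[n++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
			SECCOMP_RET_TRAP);
	out[n++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
			SECCOMP_RET_ALLOW);
	return n;
}
```

The SIGSYS handler would call learned_add() with the trapped syscall number,
rebuild the program, and swap it in before resuming.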

--Andy

