From: "Dmitry V. Levin" <ldv@altlinux.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Kees Cook <keescook@chromium.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Will Drewry <wad@chromium.org>, Oleg Nesterov <oleg@redhat.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	Linux MIPS Mailing List <linux-mips@linux-mips.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	linux-security-module <linux-security-module@vger.kernel.org>,
	Alexei Starovoitov <ast@plumgrid.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Frederic Weisbecker <fweisbec@gmail.com>
Subject: Re: [PATCH v5 3/5] x86: Split syscall_trace_enter into two phases
Date: Fri, 6 Feb 2015 05:32:49 +0300
Message-ID: <20150206023249.GB31540@altlinux.org>
In-Reply-To: <CALCETrXsCUje+_V=Ud+TB4A2jH2M7yqyoCFMLEyxOD6pd7Di5w@mail.gmail.com>

On Thu, Feb 05, 2015 at 04:09:06PM -0800, Andy Lutomirski wrote:
> On Thu, Feb 5, 2015 at 3:49 PM, Kees Cook <keescook@chromium.org> wrote:
> > On Thu, Feb 5, 2015 at 3:39 PM, Dmitry V. Levin <ldv@altlinux.org> wrote:
[...]
> >> There is a clear difference: before these changes, SECCOMP_RET_ERRNO used
> >> to keep the syscall number unchanged and suppress syscall-exit-stop event,
> >> which was awful because userspace cannot distinguish syscall-enter-stop
> >> from syscall-exit-stop and therefore relies on the kernel that
> >> syscall-enter-stop is followed by syscall-exit-stop (or tracee's death, etc.).
> >>
> >> After these changes, SECCOMP_RET_ERRNO no longer causes syscall-exit-stop
> >> events to be suppressed, but now the syscall number is lost.
> >
> > Ah-ha! Okay, thanks, I understand now. I think this means seccomp
> > phase1 should not treat RET_ERRNO as a "skip" event. Andy, what do you
> > think here?
> 
> I still don't quite see how this change caused this.

I have a test for this at
http://sourceforge.net/p/strace/code/ci/HEAD/~/tree/test/seccomp.c

> I can play with
> it a bit more.  But RET_ERRNO *has* to be some kind of skip event,
> because it needs to skip the syscall.
> 
> We could change this by treating RET_ERRNO as an instruction to enter
> phase 2 and then asking for a skip in phase 2 without changing
> orig_ax, but IMO this is pretty ugly.
> 
> I think this all kind of sucks.  We're trying to run ptrace after
> seccomp, so ptrace is seeing the syscalls as transformed by seccomp.
> That means that if we use RET_TRAP, then ptrace will see the
> possibly-modified syscall, if we use RET_ERRNO, then ptrace is (IMO
> correctly given the current design) showing syscall -1, and if we use
> RET_KILL, then ptrace just sees the process mysteriously die.

Userspace is usually not prepared to see syscall -1.
For example, strace had to be patched; otherwise it skipped such
syscalls as "not a syscall" events or mishandled them in other ways:
http://sourceforge.net/p/strace/code/ci/c3948327717c29b10b5e00a436dc138b4ab1a486
http://sourceforge.net/p/strace/code/ci/8e398b6c4020fb2d33a5b3e40271ebf63199b891

A slightly different but related story: userspace is also not prepared
to handle large errno values produced by seccomp filters like this:
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | SECCOMP_RET_DATA)

For example, glibc assumes that syscalls do not return errno values greater than 0xfff:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/x86_64/sysdep.h#l55
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/x86_64/syscall.S#l20

If it isn't too late, I'd recommend changing the SECCOMP_RET_DATA mask
applied in the SECCOMP_RET_ERRNO case from the current 0xffff to 0xfff.
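
The masking problem can be sketched in a few lines of C.  This is a
standalone model, not kernel or glibc code: the helper names and the
SKETCH_ constants are made up for illustration (the constant values
mirror <linux/seccomp.h>), and the range check follows the x86_64 glibc
convention linked above, where only raw return values in [-4095, -1]
denote errors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values mirrored from <linux/seccomp.h> so the sketch builds anywhere. */
#define SKETCH_SECCOMP_RET_ERRNO 0x00050000U
#define SKETCH_SECCOMP_RET_DATA  0x0000ffffU

/*
 * Hypothetical helper: the raw value a syscall would return for a
 * SECCOMP_RET_ERRNO filter result, given the mask applied to its data.
 */
long seccomp_errno_result(uint32_t filter_ret, uint32_t data_mask)
{
	return -(long)(filter_ret & data_mask);
}

/*
 * Hypothetical helper modelling the libc side: a raw return value is
 * treated as an error only if it lies in [-4095, -1].
 */
bool libc_sees_error(long raw)
{
	return (unsigned long)raw > -4096UL;
}

/* Illustrative self-check comparing the current and the proposed mask. */
void sketch_self_check(void)
{
	uint32_t ret = SKETCH_SECCOMP_RET_ERRNO | SKETCH_SECCOMP_RET_DATA;

	/* Current 0xffff mask: -65535 is not recognized as an error. */
	assert(!libc_sees_error(seccomp_errno_result(ret, 0xffffU)));

	/* Proposed 0xfff mask: -4095 is recognized as an error. */
	assert(libc_sees_error(seccomp_errno_result(ret, 0x0fffU)));
}
```

With the 0xffff mask the filter above yields a raw return value of
-65535, which falls outside the error range and would be handed to the
caller as if it were a successful result.  With a 0xfff mask the same
filter yields -4095, which is recognized as an error.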


-- 
ldv


