public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Dmitry V. Levin" <ldv@altlinux.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Kees Cook <keescook@chromium.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Will Drewry <wad@chromium.org>, Oleg Nesterov <oleg@redhat.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	Linux MIPS Mailing List <linux-mips@linux-mips.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	linux-security-module <linux-security-module@vger.kernel.org>,
	Alexei Starovoitov <ast@plumgrid.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Frederic Weisbecker <fweisbec@gmail.com>
Subject: Re: [PATCH v5 3/5] x86: Split syscall_trace_enter into two phases
Date: Fri, 6 Feb 2015 05:32:49 +0300	[thread overview]
Message-ID: <20150206023249.GB31540@altlinux.org> (raw)
In-Reply-To: <CALCETrXsCUje+_V=Ud+TB4A2jH2M7yqyoCFMLEyxOD6pd7Di5w@mail.gmail.com>

On Thu, Feb 05, 2015 at 04:09:06PM -0800, Andy Lutomirski wrote:
> On Thu, Feb 5, 2015 at 3:49 PM, Kees Cook <keescook@chromium.org> wrote:
> > On Thu, Feb 5, 2015 at 3:39 PM, Dmitry V. Levin <ldv@altlinux.org> wrote:
[...]
> >> There is a clear difference: before these changes, SECCOMP_RET_ERRNO used
> >> to keep the syscall number unchanged and suppress syscall-exit-stop event,
> >> which was awful because userspace cannot distinguish syscall-enter-stop
> >> from syscall-exit-stop and therefore relies on the kernel that
> >> syscall-enter-stop is followed by syscall-exit-stop (or tracee's death, etc.).
> >>
> >> After these changes, SECCOMP_RET_ERRNO no longer causes syscall-exit-stop
> >> events to be suppressed, but now the syscall number is lost.
> >
> > Ah-ha! Okay, thanks, I understand now. I think this means seccomp
> > phase1 should not treat RET_ERRNO as a "skip" event. Andy, what do you
> > think here?
> 
> I still don't quite see how this change caused this.

I have a test for this at
http://sourceforge.net/p/strace/code/ci/HEAD/~/tree/test/seccomp.c

> I can play with
> it a bit more.  But RET_ERRNO *has* to be some kind of skip event,
> because it needs to skip the syscall.
> 
> We could change this by treating RET_ERRNO as an instruction to enter
> phase 2 and then asking for a skip in phase 2 without changing
> orig_ax, but IMO this is pretty ugly.
> 
> I think this all kind of sucks.  We're trying to run ptrace after
> seccomp, so ptrace is seeing the syscalls as transformed by seccomp.
> That means that if we use RET_TRAP, then ptrace will see the
> possibly-modified syscall, if we use RET_ERRNO, then ptrace is (IMO
> correctly given the current design) showing syscall -1, and if we use
> RET_KILL, then ptrace just sees the process mysteriously die.

Userspace is usually not prepared to see syscall -1.
For example, strace had to be patched, otherwise it just skipped such
syscalls as "not a syscall" events or did other improper things:
http://sourceforge.net/p/strace/code/ci/c3948327717c29b10b5e00a436dc138b4ab1a486
http://sourceforge.net/p/strace/code/ci/8e398b6c4020fb2d33a5b3e40271ebf63199b891

A slightly different but related story: userspace is also not prepared
to handle large errno values produced by seccomp filters like this:
BPF_STMT(BPF_RET, SECCOMP_RET_ERRNO | SECCOMP_RET_DATA)

For example, glibc assumes that syscalls do not return errno values greater than 0xfff:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/x86_64/sysdep.h#l55
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/x86_64/syscall.S#l20

If it isn't too late, I'd recommend changing SECCOMP_RET_DATA mask
applied in SECCOMP_RET_ERRNO case from current 0xffff to 0xfff.


-- 
ldv

  reply	other threads:[~2015-02-06  2:32 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-05 22:13 [PATCH v5 0/5] x86: two-phase syscall tracing and seccomp fastpath Andy Lutomirski
2014-09-05 22:13 ` [PATCH v5 1/5] x86,x32,audit: Fix x32's AUDIT_ARCH wrt audit Andy Lutomirski
2014-09-09  2:43   ` [tip:x86/seccomp] x86, x32, audit: " tip-bot for Andy Lutomirski
2014-09-05 22:13 ` [PATCH v5 2/5] x86,entry: Only call user_exit if TIF_NOHZ Andy Lutomirski
2014-09-09  2:43   ` [tip:x86/seccomp] x86, entry: " tip-bot for Andy Lutomirski
2014-09-05 22:13 ` [PATCH v5 3/5] x86: Split syscall_trace_enter into two phases Andy Lutomirski
2014-09-09  2:44   ` [tip:x86/seccomp] " tip-bot for Andy Lutomirski
2015-02-05 21:19   ` [PATCH v5 3/5] " Dmitry V. Levin
2015-02-05 21:27     ` Kees Cook
2015-02-05 21:40       ` Dmitry V. Levin
2015-02-05 21:52         ` Andy Lutomirski
2015-02-05 23:12           ` Kees Cook
2015-02-05 23:39             ` Dmitry V. Levin
2015-02-05 23:49               ` Kees Cook
2015-02-06  0:09                 ` Andy Lutomirski
2015-02-06  2:32                   ` Dmitry V. Levin [this message]
2015-02-06  2:38                     ` Andy Lutomirski
2015-02-06 19:23                       ` Kees Cook
2015-02-06 19:32                         ` Andy Lutomirski
2015-02-06 20:07                           ` Kees Cook
2015-02-06 20:12                             ` Andy Lutomirski
2015-02-06 20:16                               ` Kees Cook
2015-02-06 20:20                                 ` Andy Lutomirski
2015-02-06 23:17                             ` a method to distinguish between syscall-enter/exit-stop Dmitry V. Levin
2015-02-07  1:07                               ` Kees Cook
2015-02-07  3:04                                 ` Dmitry V. Levin
2015-02-06 20:11                         ` [PATCH v5 3/5] x86: Split syscall_trace_enter into two phases H. Peter Anvin
2014-09-05 22:13 ` [PATCH v5 4/5] x86_64,entry: Treat regs->ax the same in fastpath and slowpath syscalls Andy Lutomirski
2014-09-09  2:44   ` [tip:x86/seccomp] x86_64, entry: Treat regs-> ax " tip-bot for Andy Lutomirski
2014-09-05 22:13 ` [PATCH v5 5/5] x86_64,entry: Use split-phase syscall_trace_enter for 64-bit syscalls Andy Lutomirski
2014-09-09  2:44   ` [tip:x86/seccomp] x86_64, entry: " tip-bot for Andy Lutomirski
2014-09-08 19:29 ` [PATCH v5 0/5] x86: two-phase syscall tracing and seccomp fastpath Kees Cook
2014-09-08 19:49   ` H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150206023249.GB31540@altlinux.org \
    --to=ldv@altlinux.org \
    --cc=ast@plumgrid.com \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@linux-mips.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=oleg@redhat.com \
    --cc=wad@chromium.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox