public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Denys Vlasenko <dvlasenk@redhat.com>
To: David Drysdale <drysdale@google.com>,
	Kees Cook <keescook@chromium.org>,
	Andy Lutomirski <luto@amacapital.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Will Drewry <wad@chromium.org>, Ingo Molnar <mingo@kernel.org>
Cc: Alok Kataria <akataria@vmware.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Borislav Petkov <bp@alien8.de>,
	Alexei Starovoitov <ast@plumgrid.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Oleg Nesterov <oleg@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>, X86 ML <x86@kernel.org>
Subject: Re: [Regression v4.2 ?] 32-bit seccomp-BPF returned errno values wrong in VM?
Date: Thu, 13 Aug 2015 17:17:36 +0200	[thread overview]
Message-ID: <55CCB510.3060807@redhat.com> (raw)
In-Reply-To: <CAHse=S8UeYVJqRBx-_wf=jeDERTJxzM=YRH5izBpQnrCHOxcDA@mail.gmail.com>

On 08/13/2015 10:30 AM, David Drysdale wrote:
> Hi folks,
> 
> I've got an odd regression with the v4.2 rc kernel, and I wondered if anyone
> else could reproduce it.
> 
> The problem occurs with a seccomp-bpf filter program that's set up to return
> an errno value -- an errno of 1 is always returned instead of what's in the
> filter, plus other oddities (selftest output below).
> 
> The problem seems to need a combination of circumstances to occur:
> 
>  - The seccomp-bpf userspace program needs to be 32-bit, running against a
>    64-bit kernel -- I'm testing with seccomp_bpf from
>    tools/testing/selftests/seccomp/, built via 'CFLAGS=-m32 make'.

Does it work correctly when built as 64-bit program?

> 
>  - The kernel needs to be running as a VM guest -- it occurs inside my
>    VMware Fusion host, but not if I run on bare metal.  Kees tells me he
>    cannot repro with a kvm guest though.
> 
> Bisecting indicates that the commit that induces the problem is
> 3f5159a9221f19b0, "x86/asm/entry/32: Update -ENOSYS handling to match the
> 64-bit logic", included in all the v4.2-rc* candidates.
> 
> Apologies if I've just got something odd with my local setup, but the
> bisection was unequivocal enough that I thought it worth reporting...
> 
> Thanks,
> David
> 
> 
> seccomp_bpf failure outputs:
> 
> seccomp_bpf.c:533:global.ERRNO_valid:Expected 7 (7) ==
> (*__errno_location ()) (1)

Test source code:

TEST(ERRNO_valid)
{
        struct sock_filter filter[] = {
                BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
                        offsetof(struct seccomp_data, nr)),
                BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, __NR_read, 0, 1),
                BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ERRNO | E2BIG),
                BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
        };
        struct sock_fprog prog = {
                .len = (unsigned short)ARRAY_SIZE(filter),
                .filter = filter,
        };
        long ret;
        pid_t parent = getppid();

        ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
        ASSERT_EQ(0, ret);

        ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
        ASSERT_EQ(0, ret);

        EXPECT_EQ(parent, syscall(__NR_getppid));
        EXPECT_EQ(-1, read(0, NULL, 0));
        EXPECT_EQ(E2BIG, errno);
}

The last EXPECT expects 7 (E2BIG) but sees 1.


I'm trying to see how that happens.
SECCOMP_RET_ERRNO action is processed as follows:


static u32 __seccomp_phase1_filter(int this_syscall, struct seccomp_data *sd)
{
...
        case SECCOMP_RET_ERRNO:
                /* Set low-order bits as an errno, capped at MAX_ERRNO. */
                if (data > MAX_ERRNO)
                        data = MAX_ERRNO;
                syscall_set_return_value(current, task_pt_regs(current),
                                         -data, 0);
                goto skip;
...
skip:
        audit_seccomp(this_syscall, 0, action);
        return SECCOMP_PHASE1_SKIP;  // "the syscall should not be invoked"
}

The above is called from:

unsigned long syscall_trace_enter_phase1(struct pt_regs *regs, u32 arch)
{
...
        if (work & _TIF_SECCOMP) {
...                ret = seccomp_phase1(&sd);
                if (ret == SECCOMP_PHASE1_SKIP) {
                        regs->orig_ax = -1;
                        ret = 0;
                }
		...
        }
        /* Do our best to finish without phase 2. */
        if (work == 0)
                return ret;  /* seccomp and/or nohz only (ret == 0 here) */
#ifdef CONFIG_AUDITSYSCALL
        if (work == _TIF_SYSCALL_AUDIT) {
                /*
                 * If there is no more work to be done except auditing,
                 * then audit in phase 1.  Phase 2 always audits, so, if
                 * we audit here, then we can't go on to phase 2.
                 */
                do_audit_syscall_entry(regs, arch);
                return 0;
        }
#endif
        return 1;  /* Something is enabled that we can't handle in phase 1 */
}
...
long syscall_trace_enter(struct pt_regs *regs)
{
        u32 arch = is_ia32_task() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
        unsigned long phase1_result = syscall_trace_enter_phase1(regs, arch);

        if (phase1_result == 0)
                return regs->orig_ax;
        else
                return syscall_trace_enter_phase2(regs, arch, phase1_result);
}


End result should be:
pt_regs->ax = -E2BIG (via syscall_set_return_value())
pt_regs->orig_ax = -1 ("skip syscall")
and syscall_trace_enter_phase1() usually returns with 0,
meaning "re-execute syscall at once, no phase2 needed".

This, in turn, is called from .S files, and when it returns there,
execution loops back to syscall dispatch.

Because of orig_ax = -1, syscall dispatch should skip calling syscall.
So -E2BIG should survive and be returned...

  reply	other threads:[~2015-08-13 15:17 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-13  8:30 [Regression v4.2 ?] 32-bit seccomp-BPF returned errno values wrong in VM? David Drysdale
2015-08-13 15:17 ` Denys Vlasenko [this message]
2015-08-13 16:28   ` David Drysdale
2015-08-13 17:15     ` Andy Lutomirski
2015-08-13 17:39       ` David Drysdale
2015-08-13 18:47         ` Kees Cook
2015-08-13 21:35           ` Denys Vlasenko
2015-08-13 21:47             ` Andy Lutomirski
2015-08-13 22:49               ` Linus Torvalds
2015-08-13 22:54                 ` Linus Torvalds
2015-08-13 22:56                   ` Kees Cook
2015-08-13 22:59                     ` Andy Lutomirski
2015-08-13 23:14                       ` Kees Cook
2015-08-13 23:30                       ` Linus Torvalds
2015-08-14 11:58                       ` Denys Vlasenko
2015-08-14 14:27                         ` Andy Lutomirski
2015-08-14  7:33                     ` David Drysdale
2015-08-13 22:58                   ` Andy Lutomirski
2015-08-13 23:25                     ` Linus Torvalds
2015-08-13 22:27             ` Linus Torvalds
2015-08-14 11:20               ` Denys Vlasenko
2015-08-22 10:03                 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55CCB510.3060807@redhat.com \
    --to=dvlasenk@redhat.com \
    --cc=akataria@vmware.com \
    --cc=ast@plumgrid.com \
    --cc=bp@alien8.de \
    --cc=drysdale@google.com \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    --cc=wad@chromium.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox