From: Al Viro <viro@ZenIV.linux.org.uk>
To: Meredydd Luff <meredydd@senatehouse.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
linux-kernel@vger.kernel.org, Kees Cook <keescook@chromium.org>,
Ingo Molnar <mingo@redhat.com>, Jeff Dike <jdike@addtoit.com>,
Richard Weinberger <richard@nod.at>,
Andrew Morton <akpm@linux-foundation.org>,
linux-arch@vger.kernel.org
Subject: Re: [PATCH] [RFC] syscalls,x86: Add execveat() system call (v2)
Date: Thu, 2 Aug 2012 11:30:02 +0100 [thread overview]
Message-ID: <20120802103001.GJ6481@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CAD=T17Gkryua5TQW5h18kCu+yiViuVv6NyxtMw3C9we0RWzmKQ@mail.gmail.com>
On Thu, Aug 02, 2012 at 10:14:53AM +0100, Meredydd Luff wrote:
> On Thu, Aug 2, 2012 at 7:55 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >> This means you need an x32 version of the function -- execve
> >> unfortunately is one of the few system calls which require a special x32
> >> version (although it's a simple wrapper around sys32_execve). See
> >> sys_x32_execve.
> >
> > I *really* strongly object to doing that thing before we sanitize the
> > situation with sys_execve().
>
> "That thing" = "creating an x32 entry stub", or "merging execveat() at all"?
>
> (snip)
> > The thing is, there's essentially no reason to have more than one
> > implementation. What they are (badly) doing is "we need to find
> > pt_regs to pass to do_execve(), the thing we are after has to be near
> > our stack frame, so let's try to get to it that way".
>
> Hang on...it's not just sys_execve that fits that description, is it?
> You seem to be describing every call that needs a pt_regs parameter,
> which at a glance is anything with a stub_ or PTREGSCALL in
> arch/x86/kernel/entry_{32,64}.S. That's: clone, fork, vfork,
> sigaltstack, iopl, execve, sigreturn, rt_sigreturn, vm86, vm86old.
> Most of those are handled by a common PTREGSCALL macro, but there are
> a few that get special treatment (different set on each arch - on
> x86-64 it's execve and rt_sigreturn ; on i386 it's just clone).
>
> Is there's something special about execve in particular, or do you
> want to overhaul all the ptregscalls?
You are looking at that from the wrong side; it's not that this
set of syscalls on x86 has the same kind of wrapper - it's that
on different architectures the kludges are seriously different
and fairly brittle. sigaltstack(), BTW, doesn't need to be
arch-specific at all - if you check what its pt_regs argument
is used for, we just need something like current_user_stack_pointer()
and that's it. It's a different patch series, anyway.
There's a difference between "here's the syscall implementation,
here's its hookup for x86, let other architectures update their
syscall tables" and "here's the implementation in arch/x86 and its
hookup for x86; let other architectures port it in whatever way
they need". The latter is a recipe for breakage - we'd been
through that kind of story quite a few times.
Out of that set, vm86/vm86old/iopl are genuine x86-only syscalls.
sigreturn and rt_sigreturn are weird, subtle and, sadly,
unmergable - syscall restart prevention logics is different
enough to make it not feasible at the moment. It's a serious
source of PITA, BTW - quite a few of those had the same bugs,
years after they'd been discovered and fixed on a subset of
architectures. Been there, fixed some of those, the latest
batch this April... fork/vfork/clone is an interesting
question - IIRC, there are seriously subtle issues of some
sort on sparc, but I don't remember details right now.
Really, go and grep for do_execve; most of the callers will
be in execve(2) implementations. Look at them carefully -
while some get pt_regs from some wrapper, there's a bunch
that does that manually in C. "Ugly" doesn't even begin
to describe what's being done...
FWIW, I've just pushed (completely untested) arm and alpha
parts of what I described into signal.git#execve2; x86 is
next. Note that after that sys_execve() is identical on
converted architectures and can be merged; ditto for
kernel_execve(). After I do x86 counterpart, I'll
take those guys to fs/exec.c under ifdef for new __ARCH_HAS_...
(and define it on already converted ones, obviously).
Then your patch goes there, except that implementation
gets put into fs/exec.c, under the same ifdef. And with
current_pt_regs() used instead of the extra argument,
of course. From that point on it can be used on any converted
architecture. And conversion consists of
* providing ret_from_kernel_execve() that would
put pt_regs where they belong and return to userland.
* providing current_pt_regs(), with default being
just task_pt_regs(current).
* defining that new __ARCH_HAS_...
* removing sys_execve()/kernel_execve() implementations;
the ones in fs/exec.c will work just fine.
Can be done at leisure, architecture by architecture. It's
obviously the next cycle fodder - we have only a couple of
days left of this merge window, anyway.
I really think that this pair of primitives is the right
way to deal with execve mess.
next prev parent reply other threads:[~2012-08-02 10:30 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-01 22:10 [PATCH] [RFC] syscalls,x86: Add execveat() system call (v2) Meredydd Luff
2012-08-01 22:53 ` H. Peter Anvin
2012-08-01 23:09 ` Meredydd Luff
2012-08-01 23:30 ` H. Peter Anvin
2012-08-01 23:32 ` H. Peter Anvin
[not found] ` <CAD=T17GaEMU_PDDvaJ9GVRUtzTZ069AN2yTuX6=rVo6jNO+6oA@mail.gmail.com>
2012-08-02 0:52 ` H. Peter Anvin
2012-08-02 6:55 ` Al Viro
2012-08-02 9:14 ` Meredydd Luff
2012-08-02 10:30 ` Al Viro [this message]
2012-08-02 10:51 ` Meredydd Luff
2012-08-09 19:19 ` Andy Lutomirski
2012-08-09 20:29 ` H. Peter Anvin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120802103001.GJ6481@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=akpm@linux-foundation.org \
--cc=hpa@zytor.com \
--cc=jdike@addtoit.com \
--cc=keescook@chromium.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=meredydd@senatehouse.org \
--cc=mingo@redhat.com \
--cc=richard@nod.at \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.