* DoS on x86_64
@ 2010-01-28 7:34 Mathias Krause
2010-01-28 8:18 ` [Security] " Andrew Morton
2010-01-28 17:10 ` Linus Torvalds
0 siblings, 2 replies; 32+ messages in thread
From: Mathias Krause @ 2010-01-28 7:34 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, linux-mm; +Cc: security
[-- Attachment #1.1: Type: text/plain, Size: 1548 bytes --]
Hello security team,
I found by accident an reliable way to panic the kernel on an x86_64
system. Since this one can be triggered by an unprivileged user I
CCed security@kernel.org. I also haven't found a corresponding bug on
bugzilla.kernel.org. So, what to do to trigger the bug:
1. Enable core dumps
2. Start an 32 bit program that tries to execve() an 64 bit program
3. The 64 bit program cannot be started by the kernel because it
can't find the interpreter, i.e. execve returns with an error
4. Generate a segmentation fault
5. panic
The problem seams to be located in fs/binfmt_elf.c:load_elf_binary().
It calls SET_PERSONALITY() prior checking that the ELF interpreter is
available. This in turn makes the previously 32 bit process a 64 bit
one which would be fine if execve() would succeed. But after the
SET_PERSONALITY() the open_exec() call fails (because it cannot find
the interpreter) and execve() almost instantly returns with an error.
If you now look at /proc/PID/maps you'll see, that it has the
vsyscall page mapped which shouldn't be. But the process is not dead
yet, it's still running. By now generating a segmentation fault and
in turn trying to generate a core dump the kernel just dies. I
haven't yet looked into this code but maybe you guys are much faster
than me and just can fix this problem :)
Test case for this bug is attached. It was tested on a 2.6.26.7 and
2.6.30.10, but I may affect even older kernels. So it may be
interesting for stable, too.
Greetings,
Mathias Krause
[-- Attachment #1.2: amd64_killer.tgz --]
[-- Type: application/octet-stream, Size: 796 bytes --]
[-- Attachment #2: Signierter Teil der Nachricht --]
[-- Type: application/pgp-signature, Size: 186 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 7:34 DoS on x86_64 Mathias Krause
@ 2010-01-28 8:18 ` Andrew Morton
2010-01-28 15:41 ` H. Peter Anvin
2010-01-28 21:31 ` Mathias Krause
2010-01-28 17:10 ` Linus Torvalds
1 sibling, 2 replies; 32+ messages in thread
From: Andrew Morton @ 2010-01-28 8:18 UTC (permalink / raw)
To: Mathias Krause
Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security, H. Peter Anvin,
Mike Waychison
On Thu, 28 Jan 2010 08:34:02 +0100 Mathias Krause <minipli@googlemail.com> wrote:
> I found by accident an reliable way to panic the kernel on an x86_64
> system. Since this one can be triggered by an unprivileged user I
> CCed security@kernel.org. I also haven't found a corresponding bug on
> bugzilla.kernel.org. So, what to do to trigger the bug:
>
> 1. Enable core dumps
> 2. Start an 32 bit program that tries to execve() an 64 bit program
> 3. The 64 bit program cannot be started by the kernel because it
> can't find the interpreter, i.e. execve returns with an error
> 4. Generate a segmentation fault
> 5. panic
hrm, isn't this the same as "failed exec() leaves caller with incorrect
personality", discussed in December? afacit nothing happened as a result
of that.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 8:18 ` [Security] " Andrew Morton
@ 2010-01-28 15:41 ` H. Peter Anvin
2010-01-28 22:33 ` Linus Torvalds
2010-01-28 21:31 ` Mathias Krause
1 sibling, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-28 15:41 UTC (permalink / raw)
To: Andrew Morton
Cc: Mathias Krause, Thomas Gleixner, Ingo Molnar, linux-mm, security,
Mike Waychison, Michael Davidson, Luck, Tony, Roland McGrath,
James Morris
On 01/28/2010 12:18 AM, Andrew Morton wrote:
> On Thu, 28 Jan 2010 08:34:02 +0100 Mathias Krause <minipli@googlemail.com> wrote:
>
>> I found by accident an reliable way to panic the kernel on an x86_64
>> system. Since this one can be triggered by an unprivileged user I
>> CCed security@kernel.org. I also haven't found a corresponding bug on
>> bugzilla.kernel.org. So, what to do to trigger the bug:
>>
>> 1. Enable core dumps
>> 2. Start an 32 bit program that tries to execve() an 64 bit program
>> 3. The 64 bit program cannot be started by the kernel because it
>> can't find the interpreter, i.e. execve returns with an error
>> 4. Generate a segmentation fault
>> 5. panic
>
> hrm, isn't this the same as "failed exec() leaves caller with incorrect
> personality", discussed in December? afacit nothing happened as a result
> of that.
Yes, it is. We closed the ptrace-related hole which made it exploitable
as something more than a DoS, but it got stalled out a bit at that point.
Funny enough I talked to Ralf about the whole situation as late as
yesterday. I did a bunch of digging into this about how to fix it
properly -- the code is infernally screwed up because of the compat
macro layer.
This is what it looks like from my point of view:
- At some point in the past, some personalities would play games with
the filename space in order to provide a separate namespace for
libraries. As a result, we had to at least partially switch
personalities before looking up the interpreter.
- There is no cleanup macro! The personality switch macro is supposed
to use an arch-specific deferred state change in order to handle
irreversible changes, but even setting a deferral bit can be a state
leak which could cause an exec to malfunction later.
- *As far as I have been able to discern*, there aren't actually any
architectures which use personalities which muck with the namespace
anymore. The x86 layer in IA64, in particular, used to do it, but that
code has been dead for a while; similar with the iBCS2 layer in i386.
- In my opinion, we should defer the personality switch until we have
passed the point of no return.
- The actual point of no return in the case of binfmt_elf.c is inside
the subroutine flush_old_exec() [which makes sense - the actual process
switch shouldn't be dependent on the binfmt] which isn't subject to
compat-level macro munging.
- The "right thing" probably is replacing the compat macros with an ops
struct. Replacing the SET_PERSONALITY() macro with a function pointer
would make it possible to pass it as a function pointer to
flush_old_exec() -- the current implementation as macros makes that
impossible.
- The only other realistic option seems to be to have a new macro to
clean up the effects of SET_PERSONALITY() and add it to all failure
paths. This can be done more straightforward than it sounds by moving
SET_PERSONALITY() down to just before flush_old_exec(), and then the
cleanup macro would be executed onto the (retval).
- Either way, this is a panarchitectural change, involving some pretty
grotty code in the form of the compat macros.
I guess I should do the x86 implementation of one of these, but I don't
see any way to fix the actual problem without touching every architecture.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 7:34 DoS on x86_64 Mathias Krause
2010-01-28 8:18 ` [Security] " Andrew Morton
@ 2010-01-28 17:10 ` Linus Torvalds
2010-01-28 21:49 ` Mathias Krause
1 sibling, 1 reply; 32+ messages in thread
From: Linus Torvalds @ 2010-01-28 17:10 UTC (permalink / raw)
To: Mathias Krause; +Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security
On Thu, 28 Jan 2010, Mathias Krause wrote:
>
> 1. Enable core dumps
> 2. Start an 32 bit program that tries to execve() an 64 bit program
> 3. The 64 bit program cannot be started by the kernel because it can't find
> the interpreter, i.e. execve returns with an error
> 4. Generate a segmentation fault
> 5. panic
Hmm. Nothing happens for me when I try this. I just get the expected
SIGSEGV. Can you post the oops/panic message?
I don't get a core-dump, even though it says I do:
[torvalds@nehalem amd64_killer]$ ./run.sh
* look at /proc/22768/maps and press enter to continue...
* executing ./poison...
* that failed (No such file or directory), as expected :)
* look at /proc/22768/maps and press enter to continue...
* fasten your seat belt, generating segmentation fault...
./run.sh: line 6: 22768 Segmentation fault (core dumped) ./amd64_killer ./poison
This is with current -git (I don't have any machines around running older
kernels), so maybe we fixed it already, of course.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 8:18 ` [Security] " Andrew Morton
2010-01-28 15:41 ` H. Peter Anvin
@ 2010-01-28 21:31 ` Mathias Krause
1 sibling, 0 replies; 32+ messages in thread
From: Mathias Krause @ 2010-01-28 21:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security, H. Peter Anvin,
Mike Waychison, Michael Davidson, Luck, Tony, Roland McGrath,
James Morris
[-- Attachment #1: Type: text/plain, Size: 903 bytes --]
Am 28.01.2010 um 09:18 schrieb Andrew Morton:
> On Thu, 28 Jan 2010 08:34:02 +0100 Mathias Krause
> <minipli@googlemail.com> wrote:
>
>> I found by accident an reliable way to panic the kernel on an x86_64
>> system. Since this one can be triggered by an unprivileged user I
>> CCed security@kernel.org. I also haven't found a corresponding bug on
>> bugzilla.kernel.org. So, what to do to trigger the bug:
>>
>> 1. Enable core dumps
>> 2. Start an 32 bit program that tries to execve() an 64 bit program
>> 3. The 64 bit program cannot be started by the kernel because it
>> can't find the interpreter, i.e. execve returns with an error
>> 4. Generate a segmentation fault
>> 5. panic
>
> hrm, isn't this the same as "failed exec() leaves caller with
> incorrect
> personality", discussed in December? afacit nothing happened as a
> result
> of that.
Yes, indeed it is. Thanks for the pointer.
[-- Attachment #2: Signierter Teil der Nachricht --]
[-- Type: application/pgp-signature, Size: 186 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 17:10 ` Linus Torvalds
@ 2010-01-28 21:49 ` Mathias Krause
2010-01-28 21:58 ` Linus Torvalds
0 siblings, 1 reply; 32+ messages in thread
From: Mathias Krause @ 2010-01-28 21:49 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security
[-- Attachment #1: Type: text/plain, Size: 1538 bytes --]
Am 28.01.2010 um 18:10 schrieb Linus Torvalds:
>
>
> On Thu, 28 Jan 2010, Mathias Krause wrote:
>>
>> 1. Enable core dumps
>> 2. Start an 32 bit program that tries to execve() an 64 bit program
>> 3. The 64 bit program cannot be started by the kernel because it
>> can't find
>> the interpreter, i.e. execve returns with an error
>> 4. Generate a segmentation fault
>> 5. panic
>
> Hmm. Nothing happens for me when I try this. I just get the expected
> SIGSEGV. Can you post the oops/panic message?
>
I currently have no access to the actual machine but will send it to
you tomorrow.
> I don't get a core-dump, even though it says I do:
>
> [torvalds@nehalem amd64_killer]$ ./run.sh
> * look at /proc/22768/maps and press enter to continue...
> * executing ./poison...
> * that failed (No such file or directory), as expected :)
> * look at /proc/22768/maps and press enter to continue...
Have you looked at /proc/PID/maps at this point? On our machine the
[vdso] was gone and [vsyscall] was there instead -- at an 64 bit
address of course.
> * fasten your seat belt, generating segmentation fault...
> ./run.sh: line 6: 22768 Segmentation fault (core dumped) ./
> amd64_killer ./poison
>
> This is with current -git (I don't have any machines around running
> older
> kernels), so maybe we fixed it already, of course.
Since this is a production server I would rather stick to a stable
kernel and just pick the commit that fixes the issue. Can you please
tell me which one that may be?
Greets,
Mathias
[-- Attachment #2: Signierter Teil der Nachricht --]
[-- Type: application/pgp-signature, Size: 186 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 21:49 ` Mathias Krause
@ 2010-01-28 21:58 ` Linus Torvalds
2010-01-28 22:08 ` Mathias Krause
0 siblings, 1 reply; 32+ messages in thread
From: Linus Torvalds @ 2010-01-28 21:58 UTC (permalink / raw)
To: Mathias Krause; +Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security
On Thu, 28 Jan 2010, Mathias Krause wrote:
> > I don't get a core-dump, even though it says I do:
> >
> > [torvalds@nehalem amd64_killer]$ ./run.sh
> > * look at /proc/22768/maps and press enter to continue...
> > * executing ./poison...
> > * that failed (No such file or directory), as expected :)
> > * look at /proc/22768/maps and press enter to continue...
>
> Have you looked at /proc/PID/maps at this point? On our machine the [vdso] was
> gone and [vsyscall] was there instead -- at an 64 bit address of course.
Yup. That's the behavior I see - except I see the [vdso] thing in both
cases.
So I agree that it has become a 64-bit process, and that the whole
personality crap is buggy.
I just don't see the crash.
> Since this is a production server I would rather stick to a stable kernel and
> just pick the commit that fixes the issue. Can you please tell me which one
> that may be?
I'd love to be able to say that it's been fixed in so-and-so, but since I
don't know what the oops is, I have a hard time even guessing _whether_ it
has actually been fixed or not, or whether the reason I don't see it is
something else totally unrelated.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 21:58 ` Linus Torvalds
@ 2010-01-28 22:08 ` Mathias Krause
2010-01-28 22:18 ` Linus Torvalds
0 siblings, 1 reply; 32+ messages in thread
From: Mathias Krause @ 2010-01-28 22:08 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security
[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]
Am 28.01.2010 um 22:58 schrieb Linus Torvalds:
>>> I don't get a core-dump, even though it says I do:
>>>
>>> [torvalds@nehalem amd64_killer]$ ./run.sh
>>> * look at /proc/22768/maps and press enter to continue...
>>> * executing ./poison...
>>> * that failed (No such file or directory), as expected :)
>>> * look at /proc/22768/maps and press enter to continue...
>>
>> Have you looked at /proc/PID/maps at this point? On our machine
>> the [vdso] was
>> gone and [vsyscall] was there instead -- at an 64 bit address of
>> course.
>
> Yup. That's the behavior I see - except I see the [vdso] thing in both
> cases.
>
> So I agree that it has become a 64-bit process, and that the whole
> personality crap is buggy.
So it's not really fixed yet :)
> I just don't see the crash.
This at least gives us the hint, the core writing code maybe was
modified in a way it does some additionally check that prevents the
kernel to crash in this case. But the crash should be reproducible on
the latest stable, 2.6.32.6 -- at least this is what I would read out
of the statement of hpa made.
>> Since this is a production server I would rather stick to a stable
>> kernel and
>> just pick the commit that fixes the issue. Can you please tell me
>> which one
>> that may be?
>
> I'd love to be able to say that it's been fixed in so-and-so, but
> since I
> don't know what the oops is, I have a hard time even guessing
> _whether_ it
> has actually been fixed or not, or whether the reason I don't see
> it is
> something else totally unrelated.
I'll look into the last commits to fs/exec.c and see if I can find
something that suits to my assumption.
Greets,
Mathias
[-- Attachment #2: Signierter Teil der Nachricht --]
[-- Type: application/pgp-signature, Size: 186 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 22:08 ` Mathias Krause
@ 2010-01-28 22:18 ` Linus Torvalds
0 siblings, 0 replies; 32+ messages in thread
From: Linus Torvalds @ 2010-01-28 22:18 UTC (permalink / raw)
To: Mathias Krause; +Cc: linux-mm, Thomas Gleixner, Ingo Molnar, security
On Thu, 28 Jan 2010, Mathias Krause wrote:
> >
> > So I agree that it has become a 64-bit process, and that the whole
> > personality crap is buggy.
>
> So it's not really fixed yet :)
Right. Peter looked at it at some point. But there's a big difference
between "we have always had problems with that execve() mess" and "we have
a nasty DoS that can be triggered by regular users", and the two may well
have independent fixes.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 15:41 ` H. Peter Anvin
@ 2010-01-28 22:33 ` Linus Torvalds
2010-01-28 22:47 ` Mathias Krause
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Linus Torvalds @ 2010-01-28 22:33 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
On Thu, 28 Jan 2010, H. Peter Anvin wrote:
>
> - The actual point of no return in the case of binfmt_elf.c is inside
> the subroutine flush_old_exec() [which makes sense - the actual process
> switch shouldn't be dependent on the binfmt] which isn't subject to
> compat-level macro munging.
Why worry about it? We already do that additional
SET_PERSONALITY(loc->elf_ex);
_after_ the flush_old_exec() call anyway in fs/binfmt_elf.c.
So why not just simply remove the whole early SET_PERSONALITY thing, and
only keep that later one? The comment about "lookup of the interpreter" is
known to be irrelevant these days, so why don't we just remove it all?
I have _not_ tested any of this, and maybe there is some crazy reason why
this won't work, but I'm not seeing it.
I think we do have to do that "task_size" thing (which flush_old_exec()
also does), because it depends on the personality exactly the same way
STACK_TOP does. But why isn't the following patch "obviously correct"?
Linus
---
fs/binfmt_elf.c | 26 ++------------------------
1 files changed, 2 insertions(+), 24 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index edd90c4..c62462e 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0')
goto out_free_interp;
- /*
- * The early SET_PERSONALITY here is so that the lookup
- * for the interpreter happens in the namespace of the
- * to-be-execed image. SET_PERSONALITY can select an
- * alternate root.
- *
- * However, SET_PERSONALITY is NOT allowed to switch
- * this task into the new images's memory mapping
- * policy - that is, TASK_SIZE must still evaluate to
- * that which is appropriate to the execing application.
- * This is because exit_mmap() needs to have TASK_SIZE
- * evaluate to the size of the old image.
- *
- * So if (say) a 64-bit application is execing a 32-bit
- * application it is the architecture's responsibility
- * to defer changing the value of TASK_SIZE until the
- * switch really is going to happen - do this in
- * flush_thread(). - akpm
- */
- SET_PERSONALITY(loc->elf_ex);
-
interpreter = open_exec(elf_interpreter);
retval = PTR_ERR(interpreter);
if (IS_ERR(interpreter))
@@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* Verify the interpreter has a valid arch */
if (!elf_check_arch(&loc->interp_elf_ex))
goto out_free_dentry;
- } else {
- /* Executables without an interpreter also need a personality */
- SET_PERSONALITY(loc->elf_ex);
}
/* Flush all traces of the currently running executable */
@@ -747,6 +723,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* Do this immediately, since STACK_TOP as used in setup_arg_pages
may depend on the personality. */
SET_PERSONALITY(loc->elf_ex);
+ current->mm->task_size = TASK_SIZE;
+
if (elf_read_implies_exec(loc->elf_ex, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 22:33 ` Linus Torvalds
@ 2010-01-28 22:47 ` Mathias Krause
2010-01-28 22:47 ` H. Peter Anvin
2010-01-28 23:06 ` [Security] DoS on x86_64 Linus Torvalds
2 siblings, 0 replies; 32+ messages in thread
From: Mathias Krause @ 2010-01-28 22:47 UTC (permalink / raw)
To: Linus Torvalds
Cc: H. Peter Anvin, Andrew Morton, security, Luck, Tony, James Morris,
Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar,
Thomas Gleixner, Roland McGrath
[-- Attachment #1: Type: text/plain, Size: 1408 bytes --]
Am 28.01.2010 um 23:33 schrieb Linus Torvalds:
>>
>> - The actual point of no return in the case of binfmt_elf.c is inside
>> the subroutine flush_old_exec() [which makes sense - the actual
>> process
>> switch shouldn't be dependent on the binfmt] which isn't subject to
>> compat-level macro munging.
>
> Why worry about it? We already do that additional
>
> SET_PERSONALITY(loc->elf_ex);
>
> _after_ the flush_old_exec() call anyway in fs/binfmt_elf.c.
>
> So why not just simply remove the whole early SET_PERSONALITY
> thing, and
> only keep that later one? The comment about "lookup of the
> interpreter" is
> known to be irrelevant these days, so why don't we just remove it all?
>
> I have _not_ tested any of this, and maybe there is some crazy
> reason why
> this won't work, but I'm not seeing it.
>
> I think we do have to do that "task_size" thing (which
> flush_old_exec()
> also does), because it depends on the personality exactly the same way
> STACK_TOP does. But why isn't the following patch "obviously correct"?
Looks good to me because that's almost exactly the thing we already
tried, too. But by doing so we just got another Oops when executing a
32 bit program. But, in fact, we forgot the assignment of TASK_SIZE
which now clearly makes sense. I guess we can try this tomorrow. I'll
keep you informed.
Thanks for the patch. Looks promising :)
Greets,
Mathias
[-- Attachment #2: Signierter Teil der Nachricht --]
[-- Type: application/pgp-signature, Size: 186 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 22:33 ` Linus Torvalds
2010-01-28 22:47 ` Mathias Krause
@ 2010-01-28 22:47 ` H. Peter Anvin
2010-01-28 23:09 ` Linus Torvalds
2010-01-28 23:06 ` [Security] DoS on x86_64 Linus Torvalds
2 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-28 22:47 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
On 01/28/2010 02:33 PM, Linus Torvalds wrote:
>
>
> On Thu, 28 Jan 2010, H. Peter Anvin wrote:
>>
>> - The actual point of no return in the case of binfmt_elf.c is inside
>> the subroutine flush_old_exec() [which makes sense - the actual process
>> switch shouldn't be dependent on the binfmt] which isn't subject to
>> compat-level macro munging.
>
> Why worry about it? We already do that additional
>
> SET_PERSONALITY(loc->elf_ex);
>
> _after_ the flush_old_exec() call anyway in fs/binfmt_elf.c.
>
> So why not just simply remove the whole early SET_PERSONALITY thing, and
> only keep that later one? The comment about "lookup of the interpreter" is
> known to be irrelevant these days, so why don't we just remove it all?
>
> I have _not_ tested any of this, and maybe there is some crazy reason why
> this won't work, but I'm not seeing it.
>
> I think we do have to do that "task_size" thing (which flush_old_exec()
> also does), because it depends on the personality exactly the same way
> STACK_TOP does. But why isn't the following patch "obviously correct"?
>
I was worrying about the use of TASK_SIZE, but I don't see any obvious
uses of it downstream in flush_old_exec() before we return.
So this patch, *plus* removing any delayed side effects from
SET_PERSONALITY() [e.g. the TIF_ABI_PENDING stuff in x86-64 which is
intended to have a forward action from SET_PERSONALITY() to
flush_thread()] might just work. I will try it out.
-hpa
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 22:33 ` Linus Torvalds
2010-01-28 22:47 ` Mathias Krause
2010-01-28 22:47 ` H. Peter Anvin
@ 2010-01-28 23:06 ` Linus Torvalds
2010-01-28 23:14 ` H. Peter Anvin
2 siblings, 1 reply; 32+ messages in thread
From: Linus Torvalds @ 2010-01-28 23:06 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Mathias Krause, security, Luck, Tony, James Morris,
Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar,
Andrew Morton, Thomas Gleixner, Roland McGrath
On Thu, 28 Jan 2010, Linus Torvalds wrote:
>
> I have _not_ tested any of this, and maybe there is some crazy reason why
> this won't work, but I'm not seeing it.
Grr. We also do "arch_pick_mmap_layout()" in "flush_old_exec()".
That whole function is mis-named. It doesn't actually flush the old exec,
it also creates the new one.
However, we then re-do it afterwards in fs/binfmt_elf.c, so again, that
doesn't really matter.
What _does_ matter, however, is the crazy stuff we do in flush_thread()
wrt TIF_ABI_PENDING. That's just crazy.
So no, the trivial patch won't work.
How about splitting up "flush_old_exec()" into two pieces? We'd have a
"flush_old_exec()" and a "setup_new_exec()" piece, and all existing
callers of flush_old_exec() would just be changed to call both?
So here's a new patch. Again, TOTALLY UNTESTED. It may be equally broken,
for some other reason I haven't noticed yet (or because I just screwed
up). I've verified that it compiles for me, but that's it.
Caveat patchor.
Linus
---
arch/sh/kernel/process_64.c | 2 +-
arch/x86/ia32/ia32_aout.c | 1 +
fs/binfmt_aout.c | 1 +
fs/binfmt_elf.c | 27 ++-------------------------
fs/binfmt_elf_fdpic.c | 1 +
fs/binfmt_flat.c | 1 +
fs/binfmt_som.c | 1 +
fs/exec.c | 24 ++++++++++++++----------
include/linux/binfmts.h | 1 +
include/linux/sched.h | 2 +-
10 files changed, 24 insertions(+), 37 deletions(-)
diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c
index 31f80c6..ec79faf 100644
--- a/arch/sh/kernel/process_64.c
+++ b/arch/sh/kernel/process_64.c
@@ -368,7 +368,7 @@ void exit_thread(void)
void flush_thread(void)
{
- /* Called by fs/exec.c (flush_old_exec) to remove traces of a
+ /* Called by fs/exec.c (setup_new_exec) to remove traces of a
* previously running executable. */
#ifdef CONFIG_SH_FPU
if (last_task_used_math == current) {
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 2a4d073..9bc3298 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -307,6 +307,7 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
retval = flush_old_exec(bprm);
if (retval)
return retval;
+ setup_new_exec(bprm);
regs->cs = __USER32_CS;
regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 346b694..19ae369 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -257,6 +257,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs)
retval = flush_old_exec(bprm);
if (retval)
return retval;
+ setup_new_exec();
/* OK, This is the point of no return */
#ifdef __alpha__
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index edd90c4..fd5b2ea 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0')
goto out_free_interp;
- /*
- * The early SET_PERSONALITY here is so that the lookup
- * for the interpreter happens in the namespace of the
- * to-be-execed image. SET_PERSONALITY can select an
- * alternate root.
- *
- * However, SET_PERSONALITY is NOT allowed to switch
- * this task into the new images's memory mapping
- * policy - that is, TASK_SIZE must still evaluate to
- * that which is appropriate to the execing application.
- * This is because exit_mmap() needs to have TASK_SIZE
- * evaluate to the size of the old image.
- *
- * So if (say) a 64-bit application is execing a 32-bit
- * application it is the architecture's responsibility
- * to defer changing the value of TASK_SIZE until the
- * switch really is going to happen - do this in
- * flush_thread(). - akpm
- */
- SET_PERSONALITY(loc->elf_ex);
-
interpreter = open_exec(elf_interpreter);
retval = PTR_ERR(interpreter);
if (IS_ERR(interpreter))
@@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* Verify the interpreter has a valid arch */
if (!elf_check_arch(&loc->interp_elf_ex))
goto out_free_dentry;
- } else {
- /* Executables without an interpreter also need a personality */
- SET_PERSONALITY(loc->elf_ex);
}
/* Flush all traces of the currently running executable */
@@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
current->flags |= PF_RANDOMIZE;
- arch_pick_mmap_layout(current->mm);
+
+ setup_new_exec(bprm);
/* Do this so that we can load the interpreter, if need be. We will
change some of these later */
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index c57d9ce..c38d396 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -314,6 +314,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
retval = flush_old_exec(bprm);
if (retval)
goto error;
+ setup_new_exec(bprm);
/* there's now no turning back... the old userspace image is dead,
* defunct, deceased, etc. after this point we have to exit via
diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index d4a00ea..42c6b4a 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm,
/* OK, This is the point of no return */
set_personality(PER_LINUX_32BIT);
+ setup_new_exec(bprm);
}
/*
diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c
index 2a9b533..cc8560f 100644
--- a/fs/binfmt_som.c
+++ b/fs/binfmt_som.c
@@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs)
/* OK, This is the point of no return */
current->flags &= ~PF_FORKNOEXEC;
current->personality = PER_HPUX;
+ setup_new_exec(bprm);
/* Set the task size for HP-UX processes such that
* the gateway page is outside the address space.
diff --git a/fs/exec.c b/fs/exec.c
index 632b02e..4bc488d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
int flush_old_exec(struct linux_binprm * bprm)
{
- char * name;
- int i, ch, retval;
- char tcomm[sizeof(current->comm)];
+ int retval;
/*
* Make sure we have a private signal table and that
@@ -963,6 +961,18 @@ int flush_old_exec(struct linux_binprm * bprm)
goto out;
bprm->mm = NULL; /* We're using it now */
+ return 0;
+
+out:
+ return retval;
+}
+EXPORT_SYMBOL(flush_old_exec);
+
+void setup_new_exec(struct linux_binprm * bprm)
+{
+ int i, ch;
+ char * name;
+ char tcomm[sizeof(current->comm)];
/* This is the point of no return */
current->sas_ss_sp = current->sas_ss_size = 0;
@@ -1019,14 +1029,8 @@ int flush_old_exec(struct linux_binprm * bprm)
flush_signal_handlers(current, 0);
flush_old_files(current->files);
-
- return 0;
-
-out:
- return retval;
}
-
-EXPORT_SYMBOL(flush_old_exec);
+EXPORT_SYMBOL(setup_new_exec);
/*
* Prepare credentials and lock ->cred_guard_mutex.
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index cd4349b..89c6249 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *);
extern int __must_check remove_arg_zero(struct linux_binprm *);
extern int search_binary_handler(struct linux_binprm *,struct pt_regs *);
extern int flush_old_exec(struct linux_binprm * bprm);
+extern void setup_new_exec(struct linux_binprm * bprm);
extern int suid_dumpable;
#define SUID_DUMP_DISABLE 0 /* No setuid dumping */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6f7bba9..abdfacc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1369,7 +1369,7 @@ struct task_struct {
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- - initialized normally by flush_old_exec */
+ - initialized normally by setup_new_exec */
/* file system info */
int link_count, total_link_count;
#ifdef CONFIG_SYSVIPC
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 22:47 ` H. Peter Anvin
@ 2010-01-28 23:09 ` Linus Torvalds
2010-01-28 23:27 ` H. Peter Anvin
0 siblings, 1 reply; 32+ messages in thread
From: Linus Torvalds @ 2010-01-28 23:09 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
On Thu, 28 Jan 2010, H. Peter Anvin wrote:
>
> So this patch, *plus* removing any delayed side effects from
> SET_PERSONALITY() [e.g. the TIF_ABI_PENDING stuff in x86-64 which is
> intended to have a forward action from SET_PERSONALITY() to
> flush_thread()] might just work. I will try it out.
Yeah, if you do that, then my "split up" patch isn't necessary. And it
would make the code a whole lot more straightforward, and remove that
whole crazy TIF_ABI_PENDING thing.
Getting rid of the whole TIF_ABI_PENDING crap would be wonderful. It would
make SET_PERSONALITY() (and flush_thread()) way more obvious.
So that would be much better than the untested "split up flush_old_exec"
patch I just sent out. So forget that patch, and let's go with your
further cleanup approach instead.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 23:06 ` [Security] DoS on x86_64 Linus Torvalds
@ 2010-01-28 23:14 ` H. Peter Anvin
0 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-28 23:14 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mathias Krause, security, Luck, Tony, James Morris,
Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar,
Andrew Morton, Thomas Gleixner, Roland McGrath
On 01/28/2010 03:06 PM, Linus Torvalds wrote:
>
>
> On Thu, 28 Jan 2010, Linus Torvalds wrote:
>>
>> I have _not_ tested any of this, and maybe there is some crazy reason why
>> this won't work, but I'm not seeing it.
>
> Grr. We also do "arch_pick_mmap_layout()" in "flush_old_exec()".
>
> That whole function is mis-named. It doesn't actually flush the old exec,
> it also creates the new one.
>
> However, we then re-do it afterwards in fs/binfmt_elf.c, so again, that
> doesn't really matter.
>
> What _does_ matter, however, is the crazy stuff we do in flush_thread()
> wrt TIF_ABI_PENDING. That's just crazy.
>
> So no, the trivial patch won't work.
>
> How about splitting up "flush_old_exec()" into two pieces? We'd have a
> "flush_old_exec()" and a "setup_new_exec()" piece, and all existing
> callers of flush_old_exec() would just be changed to call both?
>
Ah yes. This really is a lot better than the track which I originally
was thinking about, which was something like adding a callout from
flush_old_exec().
I will try this... plus remove the TIF_ABI_PENDING stuff from x86, and
see how it works.
-hpa
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 23:09 ` Linus Torvalds
@ 2010-01-28 23:27 ` H. Peter Anvin
2010-01-28 23:46 ` Linus Torvalds
2010-01-29 4:43 ` Linus Torvalds
0 siblings, 2 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-28 23:27 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
On 01/28/2010 03:09 PM, Linus Torvalds wrote:
>
>
> On Thu, 28 Jan 2010, H. Peter Anvin wrote:
>>
>> So this patch, *plus* removing any delayed side effects from
>> SET_PERSONALITY() [e.g. the TIF_ABI_PENDING stuff in x86-64 which is
>> intended to have a forward action from SET_PERSONALITY() to
>> flush_thread()] might just work. I will try it out.
>
> Yeah, if you do that, then my "split up" patch isn't necessary. And it
> would make the code a whole lot more straightforward, and remove that
> whole crazy TIF_ABI_PENDING thing.
>
> Getting rid of the whole TIF_ABI_PENDING crap would be wonderful. It would
> make SET_PERSONALITY() (and flush_thread()) way more obvious.
>
> So that would be much better than the untested "split up flush_old_exec"
> patch I just sent out. So forget that patch, and let's go with your
> further cleanup approach instead.
>
I think your splitup patch might still be a good idea in the sense that
your flush_old_exec() is the parts that can fail.
So I think the splitup patch, plus removing delayed effects, might be
the right thing to do? Testing that approach now...
-hpa
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 23:27 ` H. Peter Anvin
@ 2010-01-28 23:46 ` Linus Torvalds
2010-01-29 4:43 ` Linus Torvalds
1 sibling, 0 replies; 32+ messages in thread
From: Linus Torvalds @ 2010-01-28 23:46 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
On Thu, 28 Jan 2010, H. Peter Anvin wrote:
>
> I think your splitup patch might still be a good idea in the sense that
> your flush_old_exec() is the parts that can fail.
Yeah. And it does have the advantage that then the naming really matches
what it does.
Side note: my splitup was purely by "can fail"/"cannot fail", it wasn't
really by "flush old"/"setup new". So the split could certainly be done
better, even if from a practical perspective it probably doesn't much
matter.
> So I think the splitup patch, plus removing delayed effects, might be
> the right thing to do? Testing that approach now...
Ok, thanks.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-28 23:27 ` H. Peter Anvin
2010-01-28 23:46 ` Linus Torvalds
@ 2010-01-29 4:43 ` Linus Torvalds
2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds
` (2 more replies)
1 sibling, 3 replies; 32+ messages in thread
From: Linus Torvalds @ 2010-01-29 4:43 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
On Thu, 28 Jan 2010, H. Peter Anvin wrote:
>
> I think your splitup patch might still be a good idea in the sense that
> your flush_old_exec() is the parts that can fail.
>
> So I think the splitup patch, plus removing delayed effects, might be
> the right thing to do? Testing that approach now...
So I didn't see any patch from you, so here's my try instead.
I'll follow up with two patches: the first one does the split-up (and I
tried to make it very obvious that it has _no_ semantic changes what-so-
ever and is purely a preparatory patch), and the second actually changes
the ELF loader to do the SET_PERSONALITY() call in the sane spot, and gets
rid of that crazy indirect bit.
Comments?
It looks like ppc/sparc have had similar issues, I have _not_ done those
architectures. I don't imagine that they'll complain much.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions
2010-01-29 4:43 ` Linus Torvalds
@ 2010-01-29 4:43 ` Linus Torvalds
2010-01-29 4:47 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit Linus Torvalds
2010-01-29 5:17 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
2010-01-29 5:05 ` [Security] DoS on x86_64 H. Peter Anvin
2010-01-29 5:29 ` H. Peter Anvin
2 siblings, 2 replies; 32+ messages in thread
From: Linus Torvalds @ 2010-01-29 4:43 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 28 Jan 2010 19:56:16 -0800
Subject: [PATCH 1/2] Split 'flush_old_exec' into two functions
'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed. It doesn't just flush the old executable
environment, it also starts up the new one.
Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.
As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.
This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()). All callers are changed
to trivially comply with the new world order.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/sh/kernel/process_64.c | 2 +-
arch/x86/ia32/ia32_aout.c | 1 +
fs/binfmt_aout.c | 2 ++
fs/binfmt_elf.c | 2 ++
fs/binfmt_elf_fdpic.c | 2 ++
fs/binfmt_flat.c | 1 +
fs/binfmt_som.c | 2 ++
fs/exec.c | 24 ++++++++++++------------
include/linux/binfmts.h | 1 +
include/linux/sched.h | 2 +-
10 files changed, 25 insertions(+), 14 deletions(-)
diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c
index 31f80c6..ec79faf 100644
--- a/arch/sh/kernel/process_64.c
+++ b/arch/sh/kernel/process_64.c
@@ -368,7 +368,7 @@ void exit_thread(void)
void flush_thread(void)
{
- /* Called by fs/exec.c (flush_old_exec) to remove traces of a
+ /* Called by fs/exec.c (setup_new_exec) to remove traces of a
* previously running executable. */
#ifdef CONFIG_SH_FPU
if (last_task_used_math == current) {
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 2a4d073..9bc3298 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -307,6 +307,7 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
retval = flush_old_exec(bprm);
if (retval)
return retval;
+ setup_new_exec(bprm);
regs->cs = __USER32_CS;
regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 346b694..56ef825 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -259,6 +259,8 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs)
return retval;
/* OK, This is the point of no return */
+ setup_new_exec(bprm);
+
#ifdef __alpha__
SET_AOUT_PERSONALITY(bprm, ex);
#else
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index edd90c4..c7e6973 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -741,6 +741,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
goto out_free_dentry;
/* OK, This is the point of no return */
+ setup_new_exec(bprm);
+
current->flags &= ~PF_FORKNOEXEC;
current->mm->def_flags = def_flags;
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index c57d9ce..26d0ba3 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -318,6 +318,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
/* there's now no turning back... the old userspace image is dead,
* defunct, deceased, etc. after this point we have to exit via
* error_kill */
+ setup_new_exec(bprm);
+
set_personality(PER_LINUX_FDPIC);
if (elf_read_implies_exec(&exec_params.hdr, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index d4a00ea..d6a43eb 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -518,6 +518,7 @@ static int load_flat_file(struct linux_binprm * bprm,
}
/* OK, This is the point of no return */
+ setup_new_exec(bprm);
set_personality(PER_LINUX_32BIT);
}
diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c
index 2a9b533..1189fb1 100644
--- a/fs/binfmt_som.c
+++ b/fs/binfmt_som.c
@@ -225,6 +225,8 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs)
goto out_free;
/* OK, This is the point of no return */
+ setup_new_exec(bprm);
+
current->flags &= ~PF_FORKNOEXEC;
current->personality = PER_HPUX;
diff --git a/fs/exec.c b/fs/exec.c
index 632b02e..9e10e6e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
int flush_old_exec(struct linux_binprm * bprm)
{
- char * name;
- int i, ch, retval;
- char tcomm[sizeof(current->comm)];
+ int retval;
/*
* Make sure we have a private signal table and that
@@ -959,8 +957,16 @@ int flush_old_exec(struct linux_binprm * bprm)
* Release all of the old mmap stuff
*/
retval = exec_mmap(bprm->mm);
- if (retval)
- goto out;
+out:
+ return retval;
+}
+EXPORT_SYMBOL(flush_old_exec);
+
+void setup_new_exec(struct linux_binprm * bprm)
+{
+ char * name;
+ int i, ch;
+ char tcomm[sizeof(current->comm)];
bprm->mm = NULL; /* We're using it now */
@@ -1019,14 +1025,8 @@ int flush_old_exec(struct linux_binprm * bprm)
flush_signal_handlers(current, 0);
flush_old_files(current->files);
-
- return 0;
-
-out:
- return retval;
}
-
-EXPORT_SYMBOL(flush_old_exec);
+EXPORT_SYMBOL(setup_new_exec);
/*
* Prepare credentials and lock ->cred_guard_mutex.
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index cd4349b..89c6249 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *);
extern int __must_check remove_arg_zero(struct linux_binprm *);
extern int search_binary_handler(struct linux_binprm *,struct pt_regs *);
extern int flush_old_exec(struct linux_binprm * bprm);
+extern void setup_new_exec(struct linux_binprm * bprm);
extern int suid_dumpable;
#define SUID_DUMP_DISABLE 0 /* No setuid dumping */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6f7bba9..abdfacc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1369,7 +1369,7 @@ struct task_struct {
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- - initialized normally by flush_old_exec */
+ - initialized normally by setup_new_exec */
/* file system info */
int link_count, total_link_count;
#ifdef CONFIG_SYSVIPC
--
1.7.0.rc0.33.g7c3932
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit
2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds
@ 2010-01-29 4:47 ` Linus Torvalds
2010-01-29 5:17 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
1 sibling, 0 replies; 32+ messages in thread
From: Linus Torvalds @ 2010-01-29 4:47 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 28 Jan 2010 20:37:30 -0800
Subject: [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit
Now that the previous commit made it possible to do the personality
setting at the point of no return, we do just that for ELF binaries.
And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
away, and we can just make SET_PERSONALITY() just do the obvious thing
for a 32-bit compat process.
Everything becomes much more straightforward this way.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
This not only makes things more straightforward, it removes more lines of
code than it adds too. Partly because we don't need that comment about the
crazy case any more.
What say you guys? For a change, I've actually tested these two patches. I
never saw the problem that Mathias saw, but I _could_ see the incorrect
64-bit [vsyscall] vma in /proc/xyz/maps after the failed execve, and that
is gone with this patch.
That said, I don't really have any 32-bit binaries. The only 32-bit binary
I tested was Mathias test-case. It worked. The code looks sane. What can I
say?
arch/x86/ia32/ia32_aout.c | 1 -
arch/x86/include/asm/elf.h | 6 ++----
arch/x86/include/asm/thread_info.h | 3 +--
arch/x86/kernel/process.c | 12 ------------
fs/binfmt_elf.c | 26 ++------------------------
fs/exec.c | 2 +-
6 files changed, 6 insertions(+), 44 deletions(-)
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 9bc3298..a6b35c9 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -316,7 +316,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
- clear_thread_flag(TIF_ABI_PENDING);
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index b4501ee..4c17e20 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -183,11 +183,9 @@ void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp);
#define COMPAT_SET_PERSONALITY(ex) \
do { \
- if (test_thread_flag(TIF_IA32)) \
- clear_thread_flag(TIF_ABI_PENDING); \
- else \
- set_thread_flag(TIF_ABI_PENDING); \
+ set_thread_flag(TIF_IA32); \
current->personality |= force_personality32; \
+ current_thread_info()->status |= TS_COMPAT; \
} while (0)
#define COMPAT_ELF_PLATFORM ("i686")
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 375c917..4ce5b15 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -87,7 +87,7 @@ struct thread_info {
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* 32bit process */
#define TIF_FORK 18 /* ret_from_fork */
-#define TIF_ABI_PENDING 19
+ /* 19 - unused */
#define TIF_MEMDIE 20
#define TIF_DEBUG 21 /* uses debug registers */
#define TIF_IO_BITMAP 22 /* uses I/O bitmap */
@@ -112,7 +112,6 @@ struct thread_info {
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_FORK (1 << TIF_FORK)
-#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING)
#define _TIF_DEBUG (1 << TIF_DEBUG)
#define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
#define _TIF_FREEZE (1 << TIF_FREEZE)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 02c3ee0..c9b3522 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -115,18 +115,6 @@ void flush_thread(void)
{
struct task_struct *tsk = current;
-#ifdef CONFIG_X86_64
- if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) {
- clear_tsk_thread_flag(tsk, TIF_ABI_PENDING);
- if (test_tsk_thread_flag(tsk, TIF_IA32)) {
- clear_tsk_thread_flag(tsk, TIF_IA32);
- } else {
- set_tsk_thread_flag(tsk, TIF_IA32);
- current_thread_info()->status |= TS_COMPAT;
- }
- }
-#endif
-
flush_ptrace_hw_breakpoint(tsk);
memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
/*
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index c7e6973..bcf5ddb 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0')
goto out_free_interp;
- /*
- * The early SET_PERSONALITY here is so that the lookup
- * for the interpreter happens in the namespace of the
- * to-be-execed image. SET_PERSONALITY can select an
- * alternate root.
- *
- * However, SET_PERSONALITY is NOT allowed to switch
- * this task into the new images's memory mapping
- * policy - that is, TASK_SIZE must still evaluate to
- * that which is appropriate to the execing application.
- * This is because exit_mmap() needs to have TASK_SIZE
- * evaluate to the size of the old image.
- *
- * So if (say) a 64-bit application is execing a 32-bit
- * application it is the architecture's responsibility
- * to defer changing the value of TASK_SIZE until the
- * switch really is going to happen - do this in
- * flush_thread(). - akpm
- */
- SET_PERSONALITY(loc->elf_ex);
-
interpreter = open_exec(elf_interpreter);
retval = PTR_ERR(interpreter);
if (IS_ERR(interpreter))
@@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* Verify the interpreter has a valid arch */
if (!elf_check_arch(&loc->interp_elf_ex))
goto out_free_dentry;
- } else {
- /* Executables without an interpreter also need a personality */
- SET_PERSONALITY(loc->elf_ex);
}
/* Flush all traces of the currently running executable */
@@ -740,6 +716,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (retval)
goto out_free_dentry;
+ SET_PERSONALITY(loc->elf_ex);
+
/* OK, This is the point of no return */
setup_new_exec(bprm);
diff --git a/fs/exec.c b/fs/exec.c
index 9e10e6e..5835fe2 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -723,7 +723,6 @@ static int exec_mmap(struct mm_struct *mm)
tsk->active_mm = mm;
activate_mm(active_mm, mm);
task_unlock(tsk);
- arch_pick_mmap_layout(mm);
if (old_mm) {
up_read(&old_mm->mmap_sem);
BUG_ON(active_mm != old_mm);
@@ -969,6 +968,7 @@ void setup_new_exec(struct linux_binprm * bprm)
char tcomm[sizeof(current->comm)];
bprm->mm = NULL; /* We're using it now */
+ arch_pick_mmap_layout(current->mm);
/* This is the point of no return */
current->sas_ss_sp = current->sas_ss_size = 0;
--
1.7.0.rc0.33.g7c3932
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-29 4:43 ` Linus Torvalds
2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds
@ 2010-01-29 5:05 ` H. Peter Anvin
2010-01-29 5:29 ` H. Peter Anvin
2 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:05 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
[-- Attachment #1: Type: text/plain, Size: 730 bytes --]
On 01/28/2010 08:43 PM, Linus Torvalds wrote:
>
>
> On Thu, 28 Jan 2010, H. Peter Anvin wrote:
>>
>> I think your splitup patch might still be a good idea in the sense that
>> your flush_old_exec() is the parts that can fail.
>>
>> So I think the splitup patch, plus removing delayed effects, might be
>> the right thing to do? Testing that approach now...
>
> So I didn't see any patch from you, so here's my try instead.
>
Sorry, was chasing bugs. These two patches on top of your original
split patch works for me in testing so far. I'm going to compare the
code with what your two new patches produce.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-the-flush_old_exec-patch-from-Linus.patch --]
[-- Type: text/x-patch; name="0001-Fix-the-flush_old_exec-patch-from-Linus.patch", Size: 0 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/2] Split 'flush_old_exec' into two functions
2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds
2010-01-29 4:47 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit Linus Torvalds
@ 2010-01-29 5:17 ` H. Peter Anvin
1 sibling, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:17 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
On 01/28/2010 08:43 PM, Linus Torvalds wrote:
> --- a/fs/binfmt_elf_fdpic.c
> +++ b/fs/binfmt_elf_fdpic.c
> @@ -318,6 +318,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
> /* there's now no turning back... the old userspace image is dead,
> * defunct, deceased, etc. after this point we have to exit via
> * error_kill */
> + setup_new_exec(bprm);
> +
> set_personality(PER_LINUX_FDPIC);
> if (elf_read_implies_exec(&exec_params.hdr, executable_stack))
> current->personality |= READ_IMPLIES_EXEC;
> diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
> index d4a00ea..d6a43eb 100644
> --- a/fs/binfmt_flat.c
> +++ b/fs/binfmt_flat.c
> @@ -518,6 +518,7 @@ static int load_flat_file(struct linux_binprm * bprm,
> }
>
> /* OK, This is the point of no return */
> + setup_new_exec(bprm);
> set_personality(PER_LINUX_32BIT);
> }
For all of these, wouldn't it make more sense to call set_personality()
*before* calling setup_new_exec()? The sequencing that would seem sane
to me is:
flush_old_exec();
set_personality();
setup_new_exec();
... since setup_new_exec() should be able to depend on the target
personality, including TASK_SIZE and arch_pick_mmap_layout().
Similarly, for binfmt_elf the following sequence would seem to make sense:
/* OK, This is the point of no return */
/* Do this immediately, since STACK_TOP as used in setup_arg_pages
may depend on the personality. */
SET_PERSONALITY(loc->elf_ex);
if (elf_read_implies_exec(loc->elf_ex, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
current->flags |= PF_RANDOMIZE;
setup_new_exec(bprm);
current->flags &= ~PF_FORKNOEXEC;
current->mm->def_flags = def_flags;
... then there shouldn't be a need to call SET_PERSONALITY() and
arch_pick_mmap_layout() twice...
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64
2010-01-29 4:43 ` Linus Torvalds
2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds
2010-01-29 5:05 ` [Security] DoS on x86_64 H. Peter Anvin
@ 2010-01-29 5:29 ` H. Peter Anvin
2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
` (4 more replies)
2 siblings, 5 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:29 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison,
Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner,
Mathias Krause, Roland McGrath
Here is a cleaned-up version of my patches... I pretty much used your
descriptions straight off. The first one is your original split patch
with my (small) changes folded in, the second one is my removal of
TIF_ABI_PENDING.
The main difference, again, is that my variant sets the personality
*before* calling setup_new_exec().
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions
2010-01-29 5:29 ` H. Peter Anvin
@ 2010-01-29 5:34 ` H. Peter Anvin
2010-01-29 5:34 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
2010-01-29 5:36 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
` (3 subsequent siblings)
4 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:34 UTC (permalink / raw)
To: torvalds
Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo,
tglx, minipli, roland, H. Peter Anvin
From: Linus Torvalds <torvalds@linux-foundation.org>
'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed. It doesn't just flush the old executable
environment, it also starts up the new one.
Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.
As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.
This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()). All callers are changed
to trivially comply with the new world order.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/sh/kernel/process_64.c | 2 +-
arch/x86/ia32/ia32_aout.c | 10 ++++++----
fs/binfmt_aout.c | 1 +
fs/binfmt_elf.c | 27 ++-------------------------
fs/binfmt_elf_fdpic.c | 3 +++
fs/binfmt_flat.c | 1 +
fs/binfmt_som.c | 1 +
fs/exec.c | 26 ++++++++++++++++----------
include/linux/binfmts.h | 1 +
include/linux/sched.h | 2 +-
10 files changed, 33 insertions(+), 41 deletions(-)
diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c
index 31f80c6..ec79faf 100644
--- a/arch/sh/kernel/process_64.c
+++ b/arch/sh/kernel/process_64.c
@@ -368,7 +368,7 @@ void exit_thread(void)
void flush_thread(void)
{
- /* Called by fs/exec.c (flush_old_exec) to remove traces of a
+ /* Called by fs/exec.c (setup_new_exec) to remove traces of a
* previously running executable. */
#ifdef CONFIG_SH_FPU
if (last_task_used_math == current) {
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 2a4d073..435d2a5 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -308,15 +308,17 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (retval)
return retval;
- regs->cs = __USER32_CS;
- regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
- regs->r13 = regs->r14 = regs->r15 = 0;
-
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ABI_PENDING);
+ setup_new_exec(bprm);
+
+ regs->cs = __USER32_CS;
+ regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
+ regs->r13 = regs->r14 = regs->r15 = 0;
+
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
current->mm->end_data = ex.a_data +
diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 346b694..fdd3970 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -264,6 +264,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs)
#else
set_personality(PER_LINUX);
#endif
+ setup_new_exec(bprm);
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index edd90c4..fd5b2ea 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0')
goto out_free_interp;
- /*
- * The early SET_PERSONALITY here is so that the lookup
- * for the interpreter happens in the namespace of the
- * to-be-execed image. SET_PERSONALITY can select an
- * alternate root.
- *
- * However, SET_PERSONALITY is NOT allowed to switch
- * this task into the new images's memory mapping
- * policy - that is, TASK_SIZE must still evaluate to
- * that which is appropriate to the execing application.
- * This is because exit_mmap() needs to have TASK_SIZE
- * evaluate to the size of the old image.
- *
- * So if (say) a 64-bit application is execing a 32-bit
- * application it is the architecture's responsibility
- * to defer changing the value of TASK_SIZE until the
- * switch really is going to happen - do this in
- * flush_thread(). - akpm
- */
- SET_PERSONALITY(loc->elf_ex);
-
interpreter = open_exec(elf_interpreter);
retval = PTR_ERR(interpreter);
if (IS_ERR(interpreter))
@@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* Verify the interpreter has a valid arch */
if (!elf_check_arch(&loc->interp_elf_ex))
goto out_free_dentry;
- } else {
- /* Executables without an interpreter also need a personality */
- SET_PERSONALITY(loc->elf_ex);
}
/* Flush all traces of the currently running executable */
@@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
current->flags |= PF_RANDOMIZE;
- arch_pick_mmap_layout(current->mm);
+
+ setup_new_exec(bprm);
/* Do this so that we can load the interpreter, if need be. We will
change some of these later */
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index c57d9ce..18d7729 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -321,6 +321,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
set_personality(PER_LINUX_FDPIC);
if (elf_read_implies_exec(&exec_params.hdr, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
+
+ setup_new_exec(bprm);
+
set_binfmt(&elf_fdpic_format);
current->mm->start_code = 0;
diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index d4a00ea..42c6b4a 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm,
/* OK, This is the point of no return */
set_personality(PER_LINUX_32BIT);
+ setup_new_exec(bprm);
}
/*
diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c
index 2a9b533..cc8560f 100644
--- a/fs/binfmt_som.c
+++ b/fs/binfmt_som.c
@@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs)
/* OK, This is the point of no return */
current->flags &= ~PF_FORKNOEXEC;
current->personality = PER_HPUX;
+ setup_new_exec(bprm);
/* Set the task size for HP-UX processes such that
* the gateway page is outside the address space.
diff --git a/fs/exec.c b/fs/exec.c
index 632b02e..675c3f4 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
int flush_old_exec(struct linux_binprm * bprm)
{
- char * name;
- int i, ch, retval;
- char tcomm[sizeof(current->comm)];
+ int retval;
/*
* Make sure we have a private signal table and that
@@ -963,6 +961,20 @@ int flush_old_exec(struct linux_binprm * bprm)
goto out;
bprm->mm = NULL; /* We're using it now */
+ return 0;
+
+out:
+ return retval;
+}
+EXPORT_SYMBOL(flush_old_exec);
+
+void setup_new_exec(struct linux_binprm * bprm)
+{
+ int i, ch;
+ char * name;
+ char tcomm[sizeof(current->comm)];
+
+ arch_pick_mmap_layout(current->mm);
/* This is the point of no return */
current->sas_ss_sp = current->sas_ss_size = 0;
@@ -1019,14 +1031,8 @@ int flush_old_exec(struct linux_binprm * bprm)
flush_signal_handlers(current, 0);
flush_old_files(current->files);
-
- return 0;
-
-out:
- return retval;
}
-
-EXPORT_SYMBOL(flush_old_exec);
+EXPORT_SYMBOL(setup_new_exec);
/*
* Prepare credentials and lock ->cred_guard_mutex.
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index cd4349b..89c6249 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *);
extern int __must_check remove_arg_zero(struct linux_binprm *);
extern int search_binary_handler(struct linux_binprm *,struct pt_regs *);
extern int flush_old_exec(struct linux_binprm * bprm);
+extern void setup_new_exec(struct linux_binprm * bprm);
extern int suid_dumpable;
#define SUID_DUMP_DISABLE 0 /* No setuid dumping */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6f7bba9..abdfacc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1369,7 +1369,7 @@ struct task_struct {
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- - initialized normally by flush_old_exec */
+ - initialized normally by setup_new_exec */
/* file system info */
int link_count, total_link_count;
#ifdef CONFIG_SYSVIPC
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit
2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
@ 2010-01-29 5:34 ` H. Peter Anvin
0 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:34 UTC (permalink / raw)
To: torvalds
Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo,
tglx, minipli, roland, H. Peter Anvin
Now that the previous commit made it possible to do the personality
setting at the point of no return, we do just that for ELF binaries.
And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
away, and we can just make SET_PERSONALITY() just do the obvious thing
for a 32-bit compat process.
Everything becomes much more straightforward this way.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/x86/ia32/ia32_aout.c | 1 -
arch/x86/include/asm/elf.h | 10 ++--------
arch/x86/include/asm/thread_info.h | 2 --
arch/x86/kernel/process.c | 12 +++---------
arch/x86/kernel/process_64.c | 11 +++++++++++
5 files changed, 16 insertions(+), 20 deletions(-)
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 435d2a5..f9f4724 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -311,7 +311,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
- clear_thread_flag(TIF_ABI_PENDING);
setup_new_exec(bprm);
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index b4501ee..1994d3f 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -181,14 +181,8 @@ do { \
void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp);
#define compat_start_thread start_thread_ia32
-#define COMPAT_SET_PERSONALITY(ex) \
-do { \
- if (test_thread_flag(TIF_IA32)) \
- clear_thread_flag(TIF_ABI_PENDING); \
- else \
- set_thread_flag(TIF_ABI_PENDING); \
- current->personality |= force_personality32; \
-} while (0)
+void set_personality_ia32(void);
+#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32()
#define COMPAT_ELF_PLATFORM ("i686")
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 375c917..e0d2890 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -87,7 +87,6 @@ struct thread_info {
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* 32bit process */
#define TIF_FORK 18 /* ret_from_fork */
-#define TIF_ABI_PENDING 19
#define TIF_MEMDIE 20
#define TIF_DEBUG 21 /* uses debug registers */
#define TIF_IO_BITMAP 22 /* uses I/O bitmap */
@@ -112,7 +111,6 @@ struct thread_info {
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_FORK (1 << TIF_FORK)
-#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING)
#define _TIF_DEBUG (1 << TIF_DEBUG)
#define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
#define _TIF_FREEZE (1 << TIF_FREEZE)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 02c3ee0..7d42304 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -116,15 +116,9 @@ void flush_thread(void)
struct task_struct *tsk = current;
#ifdef CONFIG_X86_64
- if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) {
- clear_tsk_thread_flag(tsk, TIF_ABI_PENDING);
- if (test_tsk_thread_flag(tsk, TIF_IA32)) {
- clear_tsk_thread_flag(tsk, TIF_IA32);
- } else {
- set_tsk_thread_flag(tsk, TIF_IA32);
- current_thread_info()->status |= TS_COMPAT;
- }
- }
+ /* Set up the first "return" to user space */
+ if (test_tsk_thread_flag(tsk, TIF_IA32))
+ current_thread_info()->status |= TS_COMPAT;
#endif
flush_ptrace_hw_breakpoint(tsk);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index f9e0331..41a26a8 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -521,6 +521,17 @@ void set_personality_64bit(void)
current->personality &= ~READ_IMPLIES_EXEC;
}
+void set_personality_ia32(void)
+{
+ /* inherit personality from parent */
+
+ /* Make sure to be in 32bit mode */
+ set_thread_flag(TIF_IA32);
+
+ /* Prepare the first "return" to user space */
+ current_thread_info()->status |= TS_COMPAT;
+}
+
unsigned long get_wchan(struct task_struct *p)
{
unsigned long stack;
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions
2010-01-29 5:29 ` H. Peter Anvin
2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
@ 2010-01-29 5:36 ` H. Peter Anvin
2010-01-29 5:36 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
2010-01-29 5:41 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
` (2 subsequent siblings)
4 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:36 UTC (permalink / raw)
To: torvalds
Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo,
tglx, minipli, roland, H. Peter Anvin
From: Linus Torvalds <torvalds@linux-foundation.org>
'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed. It doesn't just flush the old executable
environment, it also starts up the new one.
Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.
As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.
This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()). All callers are changed
to trivially comply with the new world order.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/sh/kernel/process_64.c | 2 +-
arch/x86/ia32/ia32_aout.c | 10 ++++++----
fs/binfmt_aout.c | 1 +
fs/binfmt_elf.c | 27 ++-------------------------
fs/binfmt_elf_fdpic.c | 3 +++
fs/binfmt_flat.c | 1 +
fs/binfmt_som.c | 1 +
fs/exec.c | 26 ++++++++++++++++----------
include/linux/binfmts.h | 1 +
include/linux/sched.h | 2 +-
10 files changed, 33 insertions(+), 41 deletions(-)
diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c
index 31f80c6..ec79faf 100644
--- a/arch/sh/kernel/process_64.c
+++ b/arch/sh/kernel/process_64.c
@@ -368,7 +368,7 @@ void exit_thread(void)
void flush_thread(void)
{
- /* Called by fs/exec.c (flush_old_exec) to remove traces of a
+ /* Called by fs/exec.c (setup_new_exec) to remove traces of a
* previously running executable. */
#ifdef CONFIG_SH_FPU
if (last_task_used_math == current) {
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 2a4d073..435d2a5 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -308,15 +308,17 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (retval)
return retval;
- regs->cs = __USER32_CS;
- regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
- regs->r13 = regs->r14 = regs->r15 = 0;
-
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ABI_PENDING);
+ setup_new_exec(bprm);
+
+ regs->cs = __USER32_CS;
+ regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
+ regs->r13 = regs->r14 = regs->r15 = 0;
+
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
current->mm->end_data = ex.a_data +
diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 346b694..fdd3970 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -264,6 +264,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs)
#else
set_personality(PER_LINUX);
#endif
+ setup_new_exec(bprm);
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index edd90c4..fd5b2ea 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0')
goto out_free_interp;
- /*
- * The early SET_PERSONALITY here is so that the lookup
- * for the interpreter happens in the namespace of the
- * to-be-execed image. SET_PERSONALITY can select an
- * alternate root.
- *
- * However, SET_PERSONALITY is NOT allowed to switch
- * this task into the new images's memory mapping
- * policy - that is, TASK_SIZE must still evaluate to
- * that which is appropriate to the execing application.
- * This is because exit_mmap() needs to have TASK_SIZE
- * evaluate to the size of the old image.
- *
- * So if (say) a 64-bit application is execing a 32-bit
- * application it is the architecture's responsibility
- * to defer changing the value of TASK_SIZE until the
- * switch really is going to happen - do this in
- * flush_thread(). - akpm
- */
- SET_PERSONALITY(loc->elf_ex);
-
interpreter = open_exec(elf_interpreter);
retval = PTR_ERR(interpreter);
if (IS_ERR(interpreter))
@@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* Verify the interpreter has a valid arch */
if (!elf_check_arch(&loc->interp_elf_ex))
goto out_free_dentry;
- } else {
- /* Executables without an interpreter also need a personality */
- SET_PERSONALITY(loc->elf_ex);
}
/* Flush all traces of the currently running executable */
@@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
current->flags |= PF_RANDOMIZE;
- arch_pick_mmap_layout(current->mm);
+
+ setup_new_exec(bprm);
/* Do this so that we can load the interpreter, if need be. We will
change some of these later */
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index c57d9ce..18d7729 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -321,6 +321,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
set_personality(PER_LINUX_FDPIC);
if (elf_read_implies_exec(&exec_params.hdr, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
+
+ setup_new_exec(bprm);
+
set_binfmt(&elf_fdpic_format);
current->mm->start_code = 0;
diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index d4a00ea..42c6b4a 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm,
/* OK, This is the point of no return */
set_personality(PER_LINUX_32BIT);
+ setup_new_exec(bprm);
}
/*
diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c
index 2a9b533..cc8560f 100644
--- a/fs/binfmt_som.c
+++ b/fs/binfmt_som.c
@@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs)
/* OK, This is the point of no return */
current->flags &= ~PF_FORKNOEXEC;
current->personality = PER_HPUX;
+ setup_new_exec(bprm);
/* Set the task size for HP-UX processes such that
* the gateway page is outside the address space.
diff --git a/fs/exec.c b/fs/exec.c
index 632b02e..675c3f4 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
int flush_old_exec(struct linux_binprm * bprm)
{
- char * name;
- int i, ch, retval;
- char tcomm[sizeof(current->comm)];
+ int retval;
/*
* Make sure we have a private signal table and that
@@ -963,6 +961,20 @@ int flush_old_exec(struct linux_binprm * bprm)
goto out;
bprm->mm = NULL; /* We're using it now */
+ return 0;
+
+out:
+ return retval;
+}
+EXPORT_SYMBOL(flush_old_exec);
+
+void setup_new_exec(struct linux_binprm * bprm)
+{
+ int i, ch;
+ char * name;
+ char tcomm[sizeof(current->comm)];
+
+ arch_pick_mmap_layout(current->mm);
/* This is the point of no return */
current->sas_ss_sp = current->sas_ss_size = 0;
@@ -1019,14 +1031,8 @@ int flush_old_exec(struct linux_binprm * bprm)
flush_signal_handlers(current, 0);
flush_old_files(current->files);
-
- return 0;
-
-out:
- return retval;
}
-
-EXPORT_SYMBOL(flush_old_exec);
+EXPORT_SYMBOL(setup_new_exec);
/*
* Prepare credentials and lock ->cred_guard_mutex.
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index cd4349b..89c6249 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *);
extern int __must_check remove_arg_zero(struct linux_binprm *);
extern int search_binary_handler(struct linux_binprm *,struct pt_regs *);
extern int flush_old_exec(struct linux_binprm * bprm);
+extern void setup_new_exec(struct linux_binprm * bprm);
extern int suid_dumpable;
#define SUID_DUMP_DISABLE 0 /* No setuid dumping */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6f7bba9..abdfacc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1369,7 +1369,7 @@ struct task_struct {
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- - initialized normally by flush_old_exec */
+ - initialized normally by setup_new_exec */
/* file system info */
int link_count, total_link_count;
#ifdef CONFIG_SYSVIPC
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit
2010-01-29 5:36 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
@ 2010-01-29 5:36 ` H. Peter Anvin
0 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:36 UTC (permalink / raw)
To: torvalds
Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo,
tglx, minipli, roland, H. Peter Anvin
Now that the previous commit made it possible to do the personality
setting at the point of no return, we do just that for ELF binaries.
And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
away, and we can just make SET_PERSONALITY() just do the obvious thing
for a 32-bit compat process.
Everything becomes much more straightforward this way.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/x86/ia32/ia32_aout.c | 1 -
arch/x86/include/asm/elf.h | 10 ++--------
arch/x86/include/asm/thread_info.h | 2 --
arch/x86/kernel/process.c | 12 +++---------
arch/x86/kernel/process_64.c | 11 +++++++++++
5 files changed, 16 insertions(+), 20 deletions(-)
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 435d2a5..f9f4724 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -311,7 +311,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
- clear_thread_flag(TIF_ABI_PENDING);
setup_new_exec(bprm);
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index b4501ee..1994d3f 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -181,14 +181,8 @@ do { \
void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp);
#define compat_start_thread start_thread_ia32
-#define COMPAT_SET_PERSONALITY(ex) \
-do { \
- if (test_thread_flag(TIF_IA32)) \
- clear_thread_flag(TIF_ABI_PENDING); \
- else \
- set_thread_flag(TIF_ABI_PENDING); \
- current->personality |= force_personality32; \
-} while (0)
+void set_personality_ia32(void);
+#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32()
#define COMPAT_ELF_PLATFORM ("i686")
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 375c917..e0d2890 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -87,7 +87,6 @@ struct thread_info {
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* 32bit process */
#define TIF_FORK 18 /* ret_from_fork */
-#define TIF_ABI_PENDING 19
#define TIF_MEMDIE 20
#define TIF_DEBUG 21 /* uses debug registers */
#define TIF_IO_BITMAP 22 /* uses I/O bitmap */
@@ -112,7 +111,6 @@ struct thread_info {
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_FORK (1 << TIF_FORK)
-#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING)
#define _TIF_DEBUG (1 << TIF_DEBUG)
#define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
#define _TIF_FREEZE (1 << TIF_FREEZE)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 02c3ee0..7d42304 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -116,15 +116,9 @@ void flush_thread(void)
struct task_struct *tsk = current;
#ifdef CONFIG_X86_64
- if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) {
- clear_tsk_thread_flag(tsk, TIF_ABI_PENDING);
- if (test_tsk_thread_flag(tsk, TIF_IA32)) {
- clear_tsk_thread_flag(tsk, TIF_IA32);
- } else {
- set_tsk_thread_flag(tsk, TIF_IA32);
- current_thread_info()->status |= TS_COMPAT;
- }
- }
+ /* Set up the first "return" to user space */
+ if (test_tsk_thread_flag(tsk, TIF_IA32))
+ current_thread_info()->status |= TS_COMPAT;
#endif
flush_ptrace_hw_breakpoint(tsk);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index f9e0331..41a26a8 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -521,6 +521,17 @@ void set_personality_64bit(void)
current->personality &= ~READ_IMPLIES_EXEC;
}
+void set_personality_ia32(void)
+{
+ /* inherit personality from parent */
+
+ /* Make sure to be in 32bit mode */
+ set_thread_flag(TIF_IA32);
+
+ /* Prepare the first "return" to user space */
+ current_thread_info()->status |= TS_COMPAT;
+}
+
unsigned long get_wchan(struct task_struct *p)
{
unsigned long stack;
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions
2010-01-29 5:29 ` H. Peter Anvin
2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
2010-01-29 5:36 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
@ 2010-01-29 5:41 ` H. Peter Anvin
2010-01-29 5:41 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
2010-01-29 6:14 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
2010-01-29 6:14 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
4 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:41 UTC (permalink / raw)
To: torvalds
Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo,
tglx, minipli, roland, H. Peter Anvin
From: Linus Torvalds <torvalds@linux-foundation.org>
'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed. It doesn't just flush the old executable
environment, it also starts up the new one.
Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.
As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.
This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()). All callers are changed
to trivially comply with the new world order.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/sh/kernel/process_64.c | 2 +-
arch/x86/ia32/ia32_aout.c | 10 ++++++----
fs/binfmt_aout.c | 1 +
fs/binfmt_elf.c | 27 ++-------------------------
fs/binfmt_elf_fdpic.c | 3 +++
fs/binfmt_flat.c | 1 +
fs/binfmt_som.c | 1 +
fs/exec.c | 26 ++++++++++++++++----------
include/linux/binfmts.h | 1 +
include/linux/sched.h | 2 +-
10 files changed, 33 insertions(+), 41 deletions(-)
diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c
index 31f80c6..ec79faf 100644
--- a/arch/sh/kernel/process_64.c
+++ b/arch/sh/kernel/process_64.c
@@ -368,7 +368,7 @@ void exit_thread(void)
void flush_thread(void)
{
- /* Called by fs/exec.c (flush_old_exec) to remove traces of a
+ /* Called by fs/exec.c (setup_new_exec) to remove traces of a
* previously running executable. */
#ifdef CONFIG_SH_FPU
if (last_task_used_math == current) {
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 2a4d073..435d2a5 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -308,15 +308,17 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (retval)
return retval;
- regs->cs = __USER32_CS;
- regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
- regs->r13 = regs->r14 = regs->r15 = 0;
-
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ABI_PENDING);
+ setup_new_exec(bprm);
+
+ regs->cs = __USER32_CS;
+ regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
+ regs->r13 = regs->r14 = regs->r15 = 0;
+
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
current->mm->end_data = ex.a_data +
diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 346b694..fdd3970 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -264,6 +264,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs)
#else
set_personality(PER_LINUX);
#endif
+ setup_new_exec(bprm);
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index edd90c4..fd5b2ea 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0')
goto out_free_interp;
- /*
- * The early SET_PERSONALITY here is so that the lookup
- * for the interpreter happens in the namespace of the
- * to-be-execed image. SET_PERSONALITY can select an
- * alternate root.
- *
- * However, SET_PERSONALITY is NOT allowed to switch
- * this task into the new images's memory mapping
- * policy - that is, TASK_SIZE must still evaluate to
- * that which is appropriate to the execing application.
- * This is because exit_mmap() needs to have TASK_SIZE
- * evaluate to the size of the old image.
- *
- * So if (say) a 64-bit application is execing a 32-bit
- * application it is the architecture's responsibility
- * to defer changing the value of TASK_SIZE until the
- * switch really is going to happen - do this in
- * flush_thread(). - akpm
- */
- SET_PERSONALITY(loc->elf_ex);
-
interpreter = open_exec(elf_interpreter);
retval = PTR_ERR(interpreter);
if (IS_ERR(interpreter))
@@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* Verify the interpreter has a valid arch */
if (!elf_check_arch(&loc->interp_elf_ex))
goto out_free_dentry;
- } else {
- /* Executables without an interpreter also need a personality */
- SET_PERSONALITY(loc->elf_ex);
}
/* Flush all traces of the currently running executable */
@@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
current->flags |= PF_RANDOMIZE;
- arch_pick_mmap_layout(current->mm);
+
+ setup_new_exec(bprm);
/* Do this so that we can load the interpreter, if need be. We will
change some of these later */
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index c57d9ce..18d7729 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -321,6 +321,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
set_personality(PER_LINUX_FDPIC);
if (elf_read_implies_exec(&exec_params.hdr, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
+
+ setup_new_exec(bprm);
+
set_binfmt(&elf_fdpic_format);
current->mm->start_code = 0;
diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index d4a00ea..42c6b4a 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm,
/* OK, This is the point of no return */
set_personality(PER_LINUX_32BIT);
+ setup_new_exec(bprm);
}
/*
diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c
index 2a9b533..cc8560f 100644
--- a/fs/binfmt_som.c
+++ b/fs/binfmt_som.c
@@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs)
/* OK, This is the point of no return */
current->flags &= ~PF_FORKNOEXEC;
current->personality = PER_HPUX;
+ setup_new_exec(bprm);
/* Set the task size for HP-UX processes such that
* the gateway page is outside the address space.
diff --git a/fs/exec.c b/fs/exec.c
index 632b02e..675c3f4 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
int flush_old_exec(struct linux_binprm * bprm)
{
- char * name;
- int i, ch, retval;
- char tcomm[sizeof(current->comm)];
+ int retval;
/*
* Make sure we have a private signal table and that
@@ -963,6 +961,20 @@ int flush_old_exec(struct linux_binprm * bprm)
goto out;
bprm->mm = NULL; /* We're using it now */
+ return 0;
+
+out:
+ return retval;
+}
+EXPORT_SYMBOL(flush_old_exec);
+
+void setup_new_exec(struct linux_binprm * bprm)
+{
+ int i, ch;
+ char * name;
+ char tcomm[sizeof(current->comm)];
+
+ arch_pick_mmap_layout(current->mm);
/* This is the point of no return */
current->sas_ss_sp = current->sas_ss_size = 0;
@@ -1019,14 +1031,8 @@ int flush_old_exec(struct linux_binprm * bprm)
flush_signal_handlers(current, 0);
flush_old_files(current->files);
-
- return 0;
-
-out:
- return retval;
}
-
-EXPORT_SYMBOL(flush_old_exec);
+EXPORT_SYMBOL(setup_new_exec);
/*
* Prepare credentials and lock ->cred_guard_mutex.
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index cd4349b..89c6249 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *);
extern int __must_check remove_arg_zero(struct linux_binprm *);
extern int search_binary_handler(struct linux_binprm *,struct pt_regs *);
extern int flush_old_exec(struct linux_binprm * bprm);
+extern void setup_new_exec(struct linux_binprm * bprm);
extern int suid_dumpable;
#define SUID_DUMP_DISABLE 0 /* No setuid dumping */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6f7bba9..abdfacc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1369,7 +1369,7 @@ struct task_struct {
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- - initialized normally by flush_old_exec */
+ - initialized normally by setup_new_exec */
/* file system info */
int link_count, total_link_count;
#ifdef CONFIG_SYSVIPC
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit
2010-01-29 5:41 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
@ 2010-01-29 5:41 ` H. Peter Anvin
2010-01-29 5:44 ` H. Peter Anvin
0 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:41 UTC (permalink / raw)
To: torvalds
Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo,
tglx, minipli, roland, H. Peter Anvin
Now that the previous commit made it possible to do the personality
setting at the point of no return, we do just that for ELF binaries.
And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
away, and we can just make SET_PERSONALITY() just do the obvious thing
for a 32-bit compat process.
Everything becomes much more straightforward this way.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/x86/ia32/ia32_aout.c | 1 -
arch/x86/include/asm/elf.h | 10 ++--------
arch/x86/include/asm/thread_info.h | 2 --
arch/x86/kernel/process.c | 12 +++---------
arch/x86/kernel/process_64.c | 11 +++++++++++
5 files changed, 16 insertions(+), 20 deletions(-)
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 435d2a5..f9f4724 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -311,7 +311,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
- clear_thread_flag(TIF_ABI_PENDING);
setup_new_exec(bprm);
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index b4501ee..1994d3f 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -181,14 +181,8 @@ do { \
void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp);
#define compat_start_thread start_thread_ia32
-#define COMPAT_SET_PERSONALITY(ex) \
-do { \
- if (test_thread_flag(TIF_IA32)) \
- clear_thread_flag(TIF_ABI_PENDING); \
- else \
- set_thread_flag(TIF_ABI_PENDING); \
- current->personality |= force_personality32; \
-} while (0)
+void set_personality_ia32(void);
+#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32()
#define COMPAT_ELF_PLATFORM ("i686")
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 375c917..e0d2890 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -87,7 +87,6 @@ struct thread_info {
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* 32bit process */
#define TIF_FORK 18 /* ret_from_fork */
-#define TIF_ABI_PENDING 19
#define TIF_MEMDIE 20
#define TIF_DEBUG 21 /* uses debug registers */
#define TIF_IO_BITMAP 22 /* uses I/O bitmap */
@@ -112,7 +111,6 @@ struct thread_info {
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_FORK (1 << TIF_FORK)
-#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING)
#define _TIF_DEBUG (1 << TIF_DEBUG)
#define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
#define _TIF_FREEZE (1 << TIF_FREEZE)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 02c3ee0..7d42304 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -116,15 +116,9 @@ void flush_thread(void)
struct task_struct *tsk = current;
#ifdef CONFIG_X86_64
- if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) {
- clear_tsk_thread_flag(tsk, TIF_ABI_PENDING);
- if (test_tsk_thread_flag(tsk, TIF_IA32)) {
- clear_tsk_thread_flag(tsk, TIF_IA32);
- } else {
- set_tsk_thread_flag(tsk, TIF_IA32);
- current_thread_info()->status |= TS_COMPAT;
- }
- }
+ /* Set up the first "return" to user space */
+ if (test_tsk_thread_flag(tsk, TIF_IA32))
+ current_thread_info()->status |= TS_COMPAT;
#endif
flush_ptrace_hw_breakpoint(tsk);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index f9e0331..41a26a8 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -521,6 +521,17 @@ void set_personality_64bit(void)
current->personality &= ~READ_IMPLIES_EXEC;
}
+void set_personality_ia32(void)
+{
+ /* inherit personality from parent */
+
+ /* Make sure to be in 32bit mode */
+ set_thread_flag(TIF_IA32);
+
+ /* Prepare the first "return" to user space */
+ current_thread_info()->status |= TS_COMPAT;
+}
+
unsigned long get_wchan(struct task_struct *p)
{
unsigned long stack;
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit
2010-01-29 5:41 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
@ 2010-01-29 5:44 ` H. Peter Anvin
0 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 5:44 UTC (permalink / raw)
To: H. Peter Anvin
Cc: torvalds, akpm, security, tony.luck, jmorris, mikew, md, linux-mm,
mingo, tglx, minipli, roland
On 01/28/2010 09:41 PM, H. Peter Anvin wrote:
>
> #ifdef CONFIG_X86_64
> - if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) {
> - clear_tsk_thread_flag(tsk, TIF_ABI_PENDING);
> - if (test_tsk_thread_flag(tsk, TIF_IA32)) {
> - clear_tsk_thread_flag(tsk, TIF_IA32);
> - } else {
> - set_tsk_thread_flag(tsk, TIF_IA32);
> - current_thread_info()->status |= TS_COMPAT;
> - }
> - }
> + /* Set up the first "return" to user space */
> + if (test_tsk_thread_flag(tsk, TIF_IA32))
> + current_thread_info()->status |= TS_COMPAT;
> #endif
>
This chunk should of course have been completely removed... let me do
that and re-test.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions
2010-01-29 5:29 ` H. Peter Anvin
` (2 preceding siblings ...)
2010-01-29 5:41 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
@ 2010-01-29 6:14 ` H. Peter Anvin
2010-01-29 6:14 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
4 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 6:14 UTC (permalink / raw)
To: torvalds
Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo,
tglx, minipli, roland, ralf, H. Peter Anvin
From: Linus Torvalds <torvalds@linux-foundation.org>
'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed. It doesn't just flush the old executable
environment, it also starts up the new one.
Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.
As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.
This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()). All callers are changed
to trivially comply with the new world order.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/sh/kernel/process_64.c | 2 +-
arch/x86/ia32/ia32_aout.c | 10 ++++++----
fs/binfmt_aout.c | 1 +
fs/binfmt_elf.c | 27 ++-------------------------
fs/binfmt_elf_fdpic.c | 3 +++
fs/binfmt_flat.c | 1 +
fs/binfmt_som.c | 1 +
fs/exec.c | 26 ++++++++++++++++----------
include/linux/binfmts.h | 1 +
include/linux/sched.h | 2 +-
10 files changed, 33 insertions(+), 41 deletions(-)
diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c
index 31f80c6..ec79faf 100644
--- a/arch/sh/kernel/process_64.c
+++ b/arch/sh/kernel/process_64.c
@@ -368,7 +368,7 @@ void exit_thread(void)
void flush_thread(void)
{
- /* Called by fs/exec.c (flush_old_exec) to remove traces of a
+ /* Called by fs/exec.c (setup_new_exec) to remove traces of a
* previously running executable. */
#ifdef CONFIG_SH_FPU
if (last_task_used_math == current) {
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 2a4d073..435d2a5 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -308,15 +308,17 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (retval)
return retval;
- regs->cs = __USER32_CS;
- regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
- regs->r13 = regs->r14 = regs->r15 = 0;
-
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ABI_PENDING);
+ setup_new_exec(bprm);
+
+ regs->cs = __USER32_CS;
+ regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 =
+ regs->r13 = regs->r14 = regs->r15 = 0;
+
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
current->mm->end_data = ex.a_data +
diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c
index 346b694..fdd3970 100644
--- a/fs/binfmt_aout.c
+++ b/fs/binfmt_aout.c
@@ -264,6 +264,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs)
#else
set_personality(PER_LINUX);
#endif
+ setup_new_exec(bprm);
current->mm->end_code = ex.a_text +
(current->mm->start_code = N_TXTADDR(ex));
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index edd90c4..fd5b2ea 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0')
goto out_free_interp;
- /*
- * The early SET_PERSONALITY here is so that the lookup
- * for the interpreter happens in the namespace of the
- * to-be-execed image. SET_PERSONALITY can select an
- * alternate root.
- *
- * However, SET_PERSONALITY is NOT allowed to switch
- * this task into the new images's memory mapping
- * policy - that is, TASK_SIZE must still evaluate to
- * that which is appropriate to the execing application.
- * This is because exit_mmap() needs to have TASK_SIZE
- * evaluate to the size of the old image.
- *
- * So if (say) a 64-bit application is execing a 32-bit
- * application it is the architecture's responsibility
- * to defer changing the value of TASK_SIZE until the
- * switch really is going to happen - do this in
- * flush_thread(). - akpm
- */
- SET_PERSONALITY(loc->elf_ex);
-
interpreter = open_exec(elf_interpreter);
retval = PTR_ERR(interpreter);
if (IS_ERR(interpreter))
@@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* Verify the interpreter has a valid arch */
if (!elf_check_arch(&loc->interp_elf_ex))
goto out_free_dentry;
- } else {
- /* Executables without an interpreter also need a personality */
- SET_PERSONALITY(loc->elf_ex);
}
/* Flush all traces of the currently running executable */
@@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
current->flags |= PF_RANDOMIZE;
- arch_pick_mmap_layout(current->mm);
+
+ setup_new_exec(bprm);
/* Do this so that we can load the interpreter, if need be. We will
change some of these later */
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index c57d9ce..18d7729 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -321,6 +321,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
set_personality(PER_LINUX_FDPIC);
if (elf_read_implies_exec(&exec_params.hdr, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
+
+ setup_new_exec(bprm);
+
set_binfmt(&elf_fdpic_format);
current->mm->start_code = 0;
diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index d4a00ea..42c6b4a 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm,
/* OK, This is the point of no return */
set_personality(PER_LINUX_32BIT);
+ setup_new_exec(bprm);
}
/*
diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c
index 2a9b533..cc8560f 100644
--- a/fs/binfmt_som.c
+++ b/fs/binfmt_som.c
@@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs)
/* OK, This is the point of no return */
current->flags &= ~PF_FORKNOEXEC;
current->personality = PER_HPUX;
+ setup_new_exec(bprm);
/* Set the task size for HP-UX processes such that
* the gateway page is outside the address space.
diff --git a/fs/exec.c b/fs/exec.c
index 632b02e..675c3f4 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf)
int flush_old_exec(struct linux_binprm * bprm)
{
- char * name;
- int i, ch, retval;
- char tcomm[sizeof(current->comm)];
+ int retval;
/*
* Make sure we have a private signal table and that
@@ -963,6 +961,20 @@ int flush_old_exec(struct linux_binprm * bprm)
goto out;
bprm->mm = NULL; /* We're using it now */
+ return 0;
+
+out:
+ return retval;
+}
+EXPORT_SYMBOL(flush_old_exec);
+
+void setup_new_exec(struct linux_binprm * bprm)
+{
+ int i, ch;
+ char * name;
+ char tcomm[sizeof(current->comm)];
+
+ arch_pick_mmap_layout(current->mm);
/* This is the point of no return */
current->sas_ss_sp = current->sas_ss_size = 0;
@@ -1019,14 +1031,8 @@ int flush_old_exec(struct linux_binprm * bprm)
flush_signal_handlers(current, 0);
flush_old_files(current->files);
-
- return 0;
-
-out:
- return retval;
}
-
-EXPORT_SYMBOL(flush_old_exec);
+EXPORT_SYMBOL(setup_new_exec);
/*
* Prepare credentials and lock ->cred_guard_mutex.
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index cd4349b..89c6249 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *);
extern int __must_check remove_arg_zero(struct linux_binprm *);
extern int search_binary_handler(struct linux_binprm *,struct pt_regs *);
extern int flush_old_exec(struct linux_binprm * bprm);
+extern void setup_new_exec(struct linux_binprm * bprm);
extern int suid_dumpable;
#define SUID_DUMP_DISABLE 0 /* No setuid dumping */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6f7bba9..abdfacc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1369,7 +1369,7 @@ struct task_struct {
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- - initialized normally by flush_old_exec */
+ - initialized normally by setup_new_exec */
/* file system info */
int link_count, total_link_count;
#ifdef CONFIG_SYSVIPC
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit
2010-01-29 5:29 ` H. Peter Anvin
` (3 preceding siblings ...)
2010-01-29 6:14 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
@ 2010-01-29 6:14 ` H. Peter Anvin
4 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2010-01-29 6:14 UTC (permalink / raw)
To: torvalds
Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo,
tglx, minipli, roland, ralf, H. Peter Anvin
Now that the previous commit made it possible to do the personality
setting at the point of no return, we do just that for ELF binaries.
And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
away, and we can just make SET_PERSONALITY() just do the obvious thing
for a 32-bit compat process.
Everything becomes much more straightforward this way.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/x86/ia32/ia32_aout.c | 1 -
arch/x86/include/asm/elf.h | 10 ++--------
arch/x86/include/asm/thread_info.h | 2 --
arch/x86/kernel/process.c | 12 ------------
arch/x86/kernel/process_64.c | 11 +++++++++++
5 files changed, 13 insertions(+), 23 deletions(-)
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index 435d2a5..f9f4724 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -311,7 +311,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs)
/* OK, This is the point of no return */
set_personality(PER_LINUX);
set_thread_flag(TIF_IA32);
- clear_thread_flag(TIF_ABI_PENDING);
setup_new_exec(bprm);
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index b4501ee..1994d3f 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -181,14 +181,8 @@ do { \
void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp);
#define compat_start_thread start_thread_ia32
-#define COMPAT_SET_PERSONALITY(ex) \
-do { \
- if (test_thread_flag(TIF_IA32)) \
- clear_thread_flag(TIF_ABI_PENDING); \
- else \
- set_thread_flag(TIF_ABI_PENDING); \
- current->personality |= force_personality32; \
-} while (0)
+void set_personality_ia32(void);
+#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32()
#define COMPAT_ELF_PLATFORM ("i686")
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 375c917..e0d2890 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -87,7 +87,6 @@ struct thread_info {
#define TIF_NOTSC 16 /* TSC is not accessible in userland */
#define TIF_IA32 17 /* 32bit process */
#define TIF_FORK 18 /* ret_from_fork */
-#define TIF_ABI_PENDING 19
#define TIF_MEMDIE 20
#define TIF_DEBUG 21 /* uses debug registers */
#define TIF_IO_BITMAP 22 /* uses I/O bitmap */
@@ -112,7 +111,6 @@ struct thread_info {
#define _TIF_NOTSC (1 << TIF_NOTSC)
#define _TIF_IA32 (1 << TIF_IA32)
#define _TIF_FORK (1 << TIF_FORK)
-#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING)
#define _TIF_DEBUG (1 << TIF_DEBUG)
#define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
#define _TIF_FREEZE (1 << TIF_FREEZE)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 02c3ee0..c9b3522 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -115,18 +115,6 @@ void flush_thread(void)
{
struct task_struct *tsk = current;
-#ifdef CONFIG_X86_64
- if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) {
- clear_tsk_thread_flag(tsk, TIF_ABI_PENDING);
- if (test_tsk_thread_flag(tsk, TIF_IA32)) {
- clear_tsk_thread_flag(tsk, TIF_IA32);
- } else {
- set_tsk_thread_flag(tsk, TIF_IA32);
- current_thread_info()->status |= TS_COMPAT;
- }
- }
-#endif
-
flush_ptrace_hw_breakpoint(tsk);
memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
/*
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index f9e0331..41a26a8 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -521,6 +521,17 @@ void set_personality_64bit(void)
current->personality &= ~READ_IMPLIES_EXEC;
}
+void set_personality_ia32(void)
+{
+ /* inherit personality from parent */
+
+ /* Make sure to be in 32bit mode */
+ set_thread_flag(TIF_IA32);
+
+ /* Prepare the first "return" to user space */
+ current_thread_info()->status |= TS_COMPAT;
+}
+
unsigned long get_wchan(struct task_struct *p)
{
unsigned long stack;
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
end of thread, other threads:[~2010-01-29 6:23 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-28 7:34 DoS on x86_64 Mathias Krause
2010-01-28 8:18 ` [Security] " Andrew Morton
2010-01-28 15:41 ` H. Peter Anvin
2010-01-28 22:33 ` Linus Torvalds
2010-01-28 22:47 ` Mathias Krause
2010-01-28 22:47 ` H. Peter Anvin
2010-01-28 23:09 ` Linus Torvalds
2010-01-28 23:27 ` H. Peter Anvin
2010-01-28 23:46 ` Linus Torvalds
2010-01-29 4:43 ` Linus Torvalds
2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds
2010-01-29 4:47 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit Linus Torvalds
2010-01-29 5:17 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
2010-01-29 5:05 ` [Security] DoS on x86_64 H. Peter Anvin
2010-01-29 5:29 ` H. Peter Anvin
2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
2010-01-29 5:34 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
2010-01-29 5:36 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
2010-01-29 5:36 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
2010-01-29 5:41 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
2010-01-29 5:41 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
2010-01-29 5:44 ` H. Peter Anvin
2010-01-29 6:14 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin
2010-01-29 6:14 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin
2010-01-28 23:06 ` [Security] DoS on x86_64 Linus Torvalds
2010-01-28 23:14 ` H. Peter Anvin
2010-01-28 21:31 ` Mathias Krause
2010-01-28 17:10 ` Linus Torvalds
2010-01-28 21:49 ` Mathias Krause
2010-01-28 21:58 ` Linus Torvalds
2010-01-28 22:08 ` Mathias Krause
2010-01-28 22:18 ` Linus Torvalds
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).