* DoS on x86_64 @ 2010-01-28 7:34 Mathias Krause 2010-01-28 8:18 ` [Security] " Andrew Morton 2010-01-28 17:10 ` Linus Torvalds 0 siblings, 2 replies; 32+ messages in thread From: Mathias Krause @ 2010-01-28 7:34 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, linux-mm; +Cc: security [-- Attachment #1.1: Type: text/plain, Size: 1548 bytes --] Hello security team, I found by accident an reliable way to panic the kernel on an x86_64 system. Since this one can be triggered by an unprivileged user I CCed security@kernel.org. I also haven't found a corresponding bug on bugzilla.kernel.org. So, what to do to trigger the bug: 1. Enable core dumps 2. Start an 32 bit program that tries to execve() an 64 bit program 3. The 64 bit program cannot be started by the kernel because it can't find the interpreter, i.e. execve returns with an error 4. Generate a segmentation fault 5. panic The problem seams to be located in fs/binfmt_elf.c:load_elf_binary(). It calls SET_PERSONALITY() prior checking that the ELF interpreter is available. This in turn makes the previously 32 bit process a 64 bit one which would be fine if execve() would succeed. But after the SET_PERSONALITY() the open_exec() call fails (because it cannot find the interpreter) and execve() almost instantly returns with an error. If you now look at /proc/PID/maps you'll see, that it has the vsyscall page mapped which shouldn't be. But the process is not dead yet, it's still running. By now generating a segmentation fault and in turn trying to generate a core dump the kernel just dies. I haven't yet looked into this code but maybe you guys are much faster than me and just can fix this problem :) Test case for this bug is attached. It was tested on a 2.6.26.7 and 2.6.30.10, but I may affect even older kernels. So it may be interesting for stable, too. Greetings, Mathias Krause [-- Attachment #1.2: amd64_killer.tgz --] [-- Type: application/octet-stream, Size: 796 bytes --] [-- Attachment #2: Signierter Teil der Nachricht --] [-- Type: application/pgp-signature, Size: 186 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 7:34 DoS on x86_64 Mathias Krause @ 2010-01-28 8:18 ` Andrew Morton 2010-01-28 15:41 ` H. Peter Anvin 2010-01-28 21:31 ` Mathias Krause 2010-01-28 17:10 ` Linus Torvalds 1 sibling, 2 replies; 32+ messages in thread From: Andrew Morton @ 2010-01-28 8:18 UTC (permalink / raw) To: Mathias Krause Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security, H. Peter Anvin, Mike Waychison On Thu, 28 Jan 2010 08:34:02 +0100 Mathias Krause <minipli@googlemail.com> wrote: > I found by accident an reliable way to panic the kernel on an x86_64 > system. Since this one can be triggered by an unprivileged user I > CCed security@kernel.org. I also haven't found a corresponding bug on > bugzilla.kernel.org. So, what to do to trigger the bug: > > 1. Enable core dumps > 2. Start an 32 bit program that tries to execve() an 64 bit program > 3. The 64 bit program cannot be started by the kernel because it > can't find the interpreter, i.e. execve returns with an error > 4. Generate a segmentation fault > 5. panic hrm, isn't this the same as "failed exec() leaves caller with incorrect personality", discussed in December? afacit nothing happened as a result of that. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 8:18 ` [Security] " Andrew Morton @ 2010-01-28 15:41 ` H. Peter Anvin 2010-01-28 22:33 ` Linus Torvalds 2010-01-28 21:31 ` Mathias Krause 1 sibling, 1 reply; 32+ messages in thread From: H. Peter Anvin @ 2010-01-28 15:41 UTC (permalink / raw) To: Andrew Morton Cc: Mathias Krause, Thomas Gleixner, Ingo Molnar, linux-mm, security, Mike Waychison, Michael Davidson, Luck, Tony, Roland McGrath, James Morris On 01/28/2010 12:18 AM, Andrew Morton wrote: > On Thu, 28 Jan 2010 08:34:02 +0100 Mathias Krause <minipli@googlemail.com> wrote: > >> I found by accident an reliable way to panic the kernel on an x86_64 >> system. Since this one can be triggered by an unprivileged user I >> CCed security@kernel.org. I also haven't found a corresponding bug on >> bugzilla.kernel.org. So, what to do to trigger the bug: >> >> 1. Enable core dumps >> 2. Start an 32 bit program that tries to execve() an 64 bit program >> 3. The 64 bit program cannot be started by the kernel because it >> can't find the interpreter, i.e. execve returns with an error >> 4. Generate a segmentation fault >> 5. panic > > hrm, isn't this the same as "failed exec() leaves caller with incorrect > personality", discussed in December? afacit nothing happened as a result > of that. Yes, it is. We closed the ptrace-related hole which made it exploitable as something more than a DoS, but it got stalled out a bit at that point. Funny enough I talked to Ralf about the whole situation as late as yesterday. I did a bunch of digging into this about how to fix it properly -- the code is infernally screwed up because of the compat macro layer. This is what it looks like from my point of view: - At some point in the past, some personalities would play games with the filename space in order to provide a separate namespace for libraries. As a result, we had to at least partially switch personalities before looking up the interpreter. - There is no cleanup macro! The personality switch macro is supposed to use an arch-specific deferred state change in order to handle irreversible changes, but even setting a deferral bit can be a state leak which could cause an exec to malfunction later. - *As far as I have been able to discern*, there aren't actually any architectures which use personalities which muck with the namespace anymore. The x86 layer in IA64, in particular, used to do it, but that code has been dead for a while; similar with the iBCS2 layer in i386. - In my opinion, we should defer the personality switch until we have passed the point of no return. - The actual point of no return in the case of binfmt_elf.c is inside the subroutine flush_old_exec() [which makes sense - the actual process switch shouldn't be dependent on the binfmt] which isn't subject to compat-level macro munging. - The "right thing" probably is replacing the compat macros with an ops struct. Replacing the SET_PERSONALITY() macro with a function pointer would make it possible to pass it as a function pointer to flush_old_exec() -- the current implementation as macros makes that impossible. - The only other realistic option seems to be to have a new macro to clean up the effects of SET_PERSONALITY() and add it to all failure paths. This can be done more straightforward than it sounds by moving SET_PERSONALITY() down to just before flush_old_exec(), and then the cleanup macro would be executed onto the (retval). - Either way, this is a panarchitectural change, involving some pretty grotty code in the form of the compat macros. I guess I should do the x86 implementation of one of these, but I don't see any way to fix the actual problem without touching every architecture. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 15:41 ` H. Peter Anvin @ 2010-01-28 22:33 ` Linus Torvalds 2010-01-28 22:47 ` Mathias Krause ` (2 more replies) 0 siblings, 3 replies; 32+ messages in thread From: Linus Torvalds @ 2010-01-28 22:33 UTC (permalink / raw) To: H. Peter Anvin Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath On Thu, 28 Jan 2010, H. Peter Anvin wrote: > > - The actual point of no return in the case of binfmt_elf.c is inside > the subroutine flush_old_exec() [which makes sense - the actual process > switch shouldn't be dependent on the binfmt] which isn't subject to > compat-level macro munging. Why worry about it? We already do that additional SET_PERSONALITY(loc->elf_ex); _after_ the flush_old_exec() call anyway in fs/binfmt_elf.c. So why not just simply remove the whole early SET_PERSONALITY thing, and only keep that later one? The comment about "lookup of the interpreter" is known to be irrelevant these days, so why don't we just remove it all? I have _not_ tested any of this, and maybe there is some crazy reason why this won't work, but I'm not seeing it. I think we do have to do that "task_size" thing (which flush_old_exec() also does), because it depends on the personality exactly the same way STACK_TOP does. But why isn't the following patch "obviously correct"? Linus --- fs/binfmt_elf.c | 26 ++------------------------ 1 files changed, 2 insertions(+), 24 deletions(-) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index edd90c4..c62462e 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0') goto out_free_interp; - /* - * The early SET_PERSONALITY here is so that the lookup - * for the interpreter happens in the namespace of the - * to-be-execed image. SET_PERSONALITY can select an - * alternate root. - * - * However, SET_PERSONALITY is NOT allowed to switch - * this task into the new images's memory mapping - * policy - that is, TASK_SIZE must still evaluate to - * that which is appropriate to the execing application. - * This is because exit_mmap() needs to have TASK_SIZE - * evaluate to the size of the old image. - * - * So if (say) a 64-bit application is execing a 32-bit - * application it is the architecture's responsibility - * to defer changing the value of TASK_SIZE until the - * switch really is going to happen - do this in - * flush_thread(). - akpm - */ - SET_PERSONALITY(loc->elf_ex); - interpreter = open_exec(elf_interpreter); retval = PTR_ERR(interpreter); if (IS_ERR(interpreter)) @@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* Verify the interpreter has a valid arch */ if (!elf_check_arch(&loc->interp_elf_ex)) goto out_free_dentry; - } else { - /* Executables without an interpreter also need a personality */ - SET_PERSONALITY(loc->elf_ex); } /* Flush all traces of the currently running executable */ @@ -747,6 +723,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* Do this immediately, since STACK_TOP as used in setup_arg_pages may depend on the personality. */ SET_PERSONALITY(loc->elf_ex); + current->mm->task_size = TASK_SIZE; + if (elf_read_implies_exec(loc->elf_ex, executable_stack)) current->personality |= READ_IMPLIES_EXEC; -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 22:33 ` Linus Torvalds @ 2010-01-28 22:47 ` Mathias Krause 2010-01-28 22:47 ` H. Peter Anvin 2010-01-28 23:06 ` [Security] DoS on x86_64 Linus Torvalds 2 siblings, 0 replies; 32+ messages in thread From: Mathias Krause @ 2010-01-28 22:47 UTC (permalink / raw) To: Linus Torvalds Cc: H. Peter Anvin, Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Roland McGrath [-- Attachment #1: Type: text/plain, Size: 1408 bytes --] Am 28.01.2010 um 23:33 schrieb Linus Torvalds: >> >> - The actual point of no return in the case of binfmt_elf.c is inside >> the subroutine flush_old_exec() [which makes sense - the actual >> process >> switch shouldn't be dependent on the binfmt] which isn't subject to >> compat-level macro munging. > > Why worry about it? We already do that additional > > SET_PERSONALITY(loc->elf_ex); > > _after_ the flush_old_exec() call anyway in fs/binfmt_elf.c. > > So why not just simply remove the whole early SET_PERSONALITY > thing, and > only keep that later one? The comment about "lookup of the > interpreter" is > known to be irrelevant these days, so why don't we just remove it all? > > I have _not_ tested any of this, and maybe there is some crazy > reason why > this won't work, but I'm not seeing it. > > I think we do have to do that "task_size" thing (which > flush_old_exec() > also does), because it depends on the personality exactly the same way > STACK_TOP does. But why isn't the following patch "obviously correct"? Looks good to me because that's almost exactly the thing we already tried, too. But by doing so we just got another Oops when executing a 32 bit program. But, in fact, we forgot the assignment of TASK_SIZE which now clearly makes sense. I guess we can try this tomorrow. I'll keep you informed. Thanks for the patch. Looks promising :) Greets, Mathias [-- Attachment #2: Signierter Teil der Nachricht --] [-- Type: application/pgp-signature, Size: 186 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 22:33 ` Linus Torvalds 2010-01-28 22:47 ` Mathias Krause @ 2010-01-28 22:47 ` H. Peter Anvin 2010-01-28 23:09 ` Linus Torvalds 2010-01-28 23:06 ` [Security] DoS on x86_64 Linus Torvalds 2 siblings, 1 reply; 32+ messages in thread From: H. Peter Anvin @ 2010-01-28 22:47 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath On 01/28/2010 02:33 PM, Linus Torvalds wrote: > > > On Thu, 28 Jan 2010, H. Peter Anvin wrote: >> >> - The actual point of no return in the case of binfmt_elf.c is inside >> the subroutine flush_old_exec() [which makes sense - the actual process >> switch shouldn't be dependent on the binfmt] which isn't subject to >> compat-level macro munging. > > Why worry about it? We already do that additional > > SET_PERSONALITY(loc->elf_ex); > > _after_ the flush_old_exec() call anyway in fs/binfmt_elf.c. > > So why not just simply remove the whole early SET_PERSONALITY thing, and > only keep that later one? The comment about "lookup of the interpreter" is > known to be irrelevant these days, so why don't we just remove it all? > > I have _not_ tested any of this, and maybe there is some crazy reason why > this won't work, but I'm not seeing it. > > I think we do have to do that "task_size" thing (which flush_old_exec() > also does), because it depends on the personality exactly the same way > STACK_TOP does. But why isn't the following patch "obviously correct"? > I was worrying about the use of TASK_SIZE, but I don't see any obvious uses of it downstream in flush_old_exec() before we return. So this patch, *plus* removing any delayed side effects from SET_PERSONALITY() [e.g. the TIF_ABI_PENDING stuff in x86-64 which is intended to have a forward action from SET_PERSONALITY() to flush_thread()] might just work. I will try it out. -hpa -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 22:47 ` H. Peter Anvin @ 2010-01-28 23:09 ` Linus Torvalds 2010-01-28 23:27 ` H. Peter Anvin 0 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2010-01-28 23:09 UTC (permalink / raw) To: H. Peter Anvin Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath On Thu, 28 Jan 2010, H. Peter Anvin wrote: > > So this patch, *plus* removing any delayed side effects from > SET_PERSONALITY() [e.g. the TIF_ABI_PENDING stuff in x86-64 which is > intended to have a forward action from SET_PERSONALITY() to > flush_thread()] might just work. I will try it out. Yeah, if you do that, then my "split up" patch isn't necessary. And it would make the code a whole lot more straightforward, and remove that whole crazy TIF_ABI_PENDING thing. Getting rid of the whole TIF_ABI_PENDING crap would be wonderful. It would make SET_PERSONALITY() (and flush_thread()) way more obvious. So that would be much better than the untested "split up flush_old_exec" patch I just sent out. So forget that patch, and let's go with your further cleanup approach instead. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 23:09 ` Linus Torvalds @ 2010-01-28 23:27 ` H. Peter Anvin 2010-01-28 23:46 ` Linus Torvalds 2010-01-29 4:43 ` Linus Torvalds 0 siblings, 2 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-28 23:27 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath On 01/28/2010 03:09 PM, Linus Torvalds wrote: > > > On Thu, 28 Jan 2010, H. Peter Anvin wrote: >> >> So this patch, *plus* removing any delayed side effects from >> SET_PERSONALITY() [e.g. the TIF_ABI_PENDING stuff in x86-64 which is >> intended to have a forward action from SET_PERSONALITY() to >> flush_thread()] might just work. I will try it out. > > Yeah, if you do that, then my "split up" patch isn't necessary. And it > would make the code a whole lot more straightforward, and remove that > whole crazy TIF_ABI_PENDING thing. > > Getting rid of the whole TIF_ABI_PENDING crap would be wonderful. It would > make SET_PERSONALITY() (and flush_thread()) way more obvious. > > So that would be much better than the untested "split up flush_old_exec" > patch I just sent out. So forget that patch, and let's go with your > further cleanup approach instead. > I think your splitup patch might still be a good idea in the sense that your flush_old_exec() is the parts that can fail. So I think the splitup patch, plus removing delayed effects, might be the right thing to do? Testing that approach now... -hpa -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 23:27 ` H. Peter Anvin @ 2010-01-28 23:46 ` Linus Torvalds 2010-01-29 4:43 ` Linus Torvalds 1 sibling, 0 replies; 32+ messages in thread From: Linus Torvalds @ 2010-01-28 23:46 UTC (permalink / raw) To: H. Peter Anvin Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath On Thu, 28 Jan 2010, H. Peter Anvin wrote: > > I think your splitup patch might still be a good idea in the sense that > your flush_old_exec() is the parts that can fail. Yeah. And it does have the advantage that then the naming really matches what it does. Side note: my splitup was purely by "can fail"/"cannot fail", it wasn't really by "flush old"/"setup new". So the split could certainly be done better, even if from a practical perspective it probably doesn't much matter. > So I think the splitup patch, plus removing delayed effects, might be > the right thing to do? Testing that approach now... Ok, thanks. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 23:27 ` H. Peter Anvin 2010-01-28 23:46 ` Linus Torvalds @ 2010-01-29 4:43 ` Linus Torvalds 2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds ` (2 more replies) 1 sibling, 3 replies; 32+ messages in thread From: Linus Torvalds @ 2010-01-29 4:43 UTC (permalink / raw) To: H. Peter Anvin Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath On Thu, 28 Jan 2010, H. Peter Anvin wrote: > > I think your splitup patch might still be a good idea in the sense that > your flush_old_exec() is the parts that can fail. > > So I think the splitup patch, plus removing delayed effects, might be > the right thing to do? Testing that approach now... So I didn't see any patch from you, so here's my try instead. I'll follow up with two patches: the first one does the split-up (and I tried to make it very obvious that it has _no_ semantic changes what-so- ever and is purely a preparatory patch), and the second actually changes the ELF loader to do the SET_PERSONALITY() call in the sane spot, and gets rid of that crazy indirect bit. Comments? It looks like ppc/sparc have had similar issues, I have _not_ done those architectures. I don't imagine that they'll complain much. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions 2010-01-29 4:43 ` Linus Torvalds @ 2010-01-29 4:43 ` Linus Torvalds 2010-01-29 4:47 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit Linus Torvalds 2010-01-29 5:17 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 2010-01-29 5:05 ` [Security] DoS on x86_64 H. Peter Anvin 2010-01-29 5:29 ` H. Peter Anvin 2 siblings, 2 replies; 32+ messages in thread From: Linus Torvalds @ 2010-01-29 4:43 UTC (permalink / raw) To: H. Peter Anvin Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath From: Linus Torvalds <torvalds@linux-foundation.org> Date: Thu, 28 Jan 2010 19:56:16 -0800 Subject: [PATCH 1/2] Split 'flush_old_exec' into two functions 'flush_old_exec()' is the point of no return when doing an execve(), and it is pretty badly misnamed. It doesn't just flush the old executable environment, it also starts up the new one. Which is very inconvenient for things like setting up the new personality, because we want the new personality to affect the starting of the new environment, but at the same time we do _not_ want the new personality to take effect if flushing the old one fails. As a result, the x86-64 '32-bit' personality is actually done using this insane "I'm going to change the ABI, but I haven't done it yet" bit (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the personality, but just the "pending" bit, so that "flush_thread()" can do the actual personality magic. This patch in no way changes any of that insanity, but it does split the 'flush_old_exec()' function up into a preparatory part that can fail (still called flush_old_exec()), and a new part that will actually set up the new exec environment (setup_new_exec()). All callers are changed to trivially comply with the new world order. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> --- arch/sh/kernel/process_64.c | 2 +- arch/x86/ia32/ia32_aout.c | 1 + fs/binfmt_aout.c | 2 ++ fs/binfmt_elf.c | 2 ++ fs/binfmt_elf_fdpic.c | 2 ++ fs/binfmt_flat.c | 1 + fs/binfmt_som.c | 2 ++ fs/exec.c | 24 ++++++++++++------------ include/linux/binfmts.h | 1 + include/linux/sched.h | 2 +- 10 files changed, 25 insertions(+), 14 deletions(-) diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c index 31f80c6..ec79faf 100644 --- a/arch/sh/kernel/process_64.c +++ b/arch/sh/kernel/process_64.c @@ -368,7 +368,7 @@ void exit_thread(void) void flush_thread(void) { - /* Called by fs/exec.c (flush_old_exec) to remove traces of a + /* Called by fs/exec.c (setup_new_exec) to remove traces of a * previously running executable. */ #ifdef CONFIG_SH_FPU if (last_task_used_math == current) { diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 2a4d073..9bc3298 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -307,6 +307,7 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) retval = flush_old_exec(bprm); if (retval) return retval; + setup_new_exec(bprm); regs->cs = __USER32_CS; regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index 346b694..56ef825 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -259,6 +259,8 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs) return retval; /* OK, This is the point of no return */ + setup_new_exec(bprm); + #ifdef __alpha__ SET_AOUT_PERSONALITY(bprm, ex); #else diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index edd90c4..c7e6973 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -741,6 +741,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) goto out_free_dentry; /* OK, This is the point of no return */ + setup_new_exec(bprm); + current->flags &= ~PF_FORKNOEXEC; current->mm->def_flags = def_flags; diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index c57d9ce..26d0ba3 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -318,6 +318,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm, /* there's now no turning back... the old userspace image is dead, * defunct, deceased, etc. after this point we have to exit via * error_kill */ + setup_new_exec(bprm); + set_personality(PER_LINUX_FDPIC); if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) current->personality |= READ_IMPLIES_EXEC; diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c index d4a00ea..d6a43eb 100644 --- a/fs/binfmt_flat.c +++ b/fs/binfmt_flat.c @@ -518,6 +518,7 @@ static int load_flat_file(struct linux_binprm * bprm, } /* OK, This is the point of no return */ + setup_new_exec(bprm); set_personality(PER_LINUX_32BIT); } diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c index 2a9b533..1189fb1 100644 --- a/fs/binfmt_som.c +++ b/fs/binfmt_som.c @@ -225,6 +225,8 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs) goto out_free; /* OK, This is the point of no return */ + setup_new_exec(bprm); + current->flags &= ~PF_FORKNOEXEC; current->personality = PER_HPUX; diff --git a/fs/exec.c b/fs/exec.c index 632b02e..9e10e6e 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf) int flush_old_exec(struct linux_binprm * bprm) { - char * name; - int i, ch, retval; - char tcomm[sizeof(current->comm)]; + int retval; /* * Make sure we have a private signal table and that @@ -959,8 +957,16 @@ int flush_old_exec(struct linux_binprm * bprm) * Release all of the old mmap stuff */ retval = exec_mmap(bprm->mm); - if (retval) - goto out; +out: + return retval; +} +EXPORT_SYMBOL(flush_old_exec); + +void setup_new_exec(struct linux_binprm * bprm) +{ + char * name; + int i, ch; + char tcomm[sizeof(current->comm)]; bprm->mm = NULL; /* We're using it now */ @@ -1019,14 +1025,8 @@ int flush_old_exec(struct linux_binprm * bprm) flush_signal_handlers(current, 0); flush_old_files(current->files); - - return 0; - -out: - return retval; } - -EXPORT_SYMBOL(flush_old_exec); +EXPORT_SYMBOL(setup_new_exec); /* * Prepare credentials and lock ->cred_guard_mutex. diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index cd4349b..89c6249 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *); extern int __must_check remove_arg_zero(struct linux_binprm *); extern int search_binary_handler(struct linux_binprm *,struct pt_regs *); extern int flush_old_exec(struct linux_binprm * bprm); +extern void setup_new_exec(struct linux_binprm * bprm); extern int suid_dumpable; #define SUID_DUMP_DISABLE 0 /* No setuid dumping */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 6f7bba9..abdfacc 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1369,7 +1369,7 @@ struct task_struct { char comm[TASK_COMM_LEN]; /* executable name excluding path - access with [gs]et_task_comm (which lock it with task_lock()) - - initialized normally by flush_old_exec */ + - initialized normally by setup_new_exec */ /* file system info */ int link_count, total_link_count; #ifdef CONFIG_SYSVIPC -- 1.7.0.rc0.33.g7c3932 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit 2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds @ 2010-01-29 4:47 ` Linus Torvalds 2010-01-29 5:17 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 1 sibling, 0 replies; 32+ messages in thread From: Linus Torvalds @ 2010-01-29 4:47 UTC (permalink / raw) To: H. Peter Anvin Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath From: Linus Torvalds <torvalds@linux-foundation.org> Date: Thu, 28 Jan 2010 20:37:30 -0800 Subject: [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit Now that the previous commit made it possible to do the personality setting at the point of no return, we do just that for ELF binaries. And suddenly all the reasons for that insane TIF_ABI_PENDING bit go away, and we can just make SET_PERSONALITY() just do the obvious thing for a 32-bit compat process. Everything becomes much more straightforward this way. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> --- This not only makes things more straightforward, it removes more lines of code than it adds too. Partly because we don't need that comment about the crazy case any more. What say you guys? For a change, I've actually tested these two patches. I never saw the problem that Mathias saw, but I _could_ see the incorrect 64-bit [vsyscall] vma in /proc/xyz/maps after the failed execve, and that is gone with this patch. That said, I don't really have any 32-bit binaries. The only 32-bit binary I tested was Mathias test-case. It worked. The code looks sane. What can I say? arch/x86/ia32/ia32_aout.c | 1 - arch/x86/include/asm/elf.h | 6 ++---- arch/x86/include/asm/thread_info.h | 3 +-- arch/x86/kernel/process.c | 12 ------------ fs/binfmt_elf.c | 26 ++------------------------ fs/exec.c | 2 +- 6 files changed, 6 insertions(+), 44 deletions(-) diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 9bc3298..a6b35c9 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -316,7 +316,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); - clear_thread_flag(TIF_ABI_PENDING); current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h index b4501ee..4c17e20 100644 --- a/arch/x86/include/asm/elf.h +++ b/arch/x86/include/asm/elf.h @@ -183,11 +183,9 @@ void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp); #define COMPAT_SET_PERSONALITY(ex) \ do { \ - if (test_thread_flag(TIF_IA32)) \ - clear_thread_flag(TIF_ABI_PENDING); \ - else \ - set_thread_flag(TIF_ABI_PENDING); \ + set_thread_flag(TIF_IA32); \ current->personality |= force_personality32; \ + current_thread_info()->status |= TS_COMPAT; \ } while (0) #define COMPAT_ELF_PLATFORM ("i686") diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 375c917..4ce5b15 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -87,7 +87,7 @@ struct thread_info { #define TIF_NOTSC 16 /* TSC is not accessible in userland */ #define TIF_IA32 17 /* 32bit process */ #define TIF_FORK 18 /* ret_from_fork */ -#define TIF_ABI_PENDING 19 + /* 19 - unused */ #define TIF_MEMDIE 20 #define TIF_DEBUG 21 /* uses debug registers */ #define TIF_IO_BITMAP 22 /* uses I/O bitmap */ @@ -112,7 +112,6 @@ struct thread_info { #define _TIF_NOTSC (1 << TIF_NOTSC) #define _TIF_IA32 (1 << TIF_IA32) #define _TIF_FORK (1 << TIF_FORK) -#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING) #define _TIF_DEBUG (1 << TIF_DEBUG) #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP) #define _TIF_FREEZE (1 << TIF_FREEZE) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 02c3ee0..c9b3522 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -115,18 +115,6 @@ void flush_thread(void) { struct task_struct *tsk = current; -#ifdef CONFIG_X86_64 - if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) { - clear_tsk_thread_flag(tsk, TIF_ABI_PENDING); - if (test_tsk_thread_flag(tsk, TIF_IA32)) { - clear_tsk_thread_flag(tsk, TIF_IA32); - } else { - set_tsk_thread_flag(tsk, TIF_IA32); - current_thread_info()->status |= TS_COMPAT; - } - } -#endif - flush_ptrace_hw_breakpoint(tsk); memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array)); /* diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index c7e6973..bcf5ddb 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0') goto out_free_interp; - /* - * The early SET_PERSONALITY here is so that the lookup - * for the interpreter happens in the namespace of the - * to-be-execed image. SET_PERSONALITY can select an - * alternate root. - * - * However, SET_PERSONALITY is NOT allowed to switch - * this task into the new images's memory mapping - * policy - that is, TASK_SIZE must still evaluate to - * that which is appropriate to the execing application. - * This is because exit_mmap() needs to have TASK_SIZE - * evaluate to the size of the old image. - * - * So if (say) a 64-bit application is execing a 32-bit - * application it is the architecture's responsibility - * to defer changing the value of TASK_SIZE until the - * switch really is going to happen - do this in - * flush_thread(). - akpm - */ - SET_PERSONALITY(loc->elf_ex); - interpreter = open_exec(elf_interpreter); retval = PTR_ERR(interpreter); if (IS_ERR(interpreter)) @@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* Verify the interpreter has a valid arch */ if (!elf_check_arch(&loc->interp_elf_ex)) goto out_free_dentry; - } else { - /* Executables without an interpreter also need a personality */ - SET_PERSONALITY(loc->elf_ex); } /* Flush all traces of the currently running executable */ @@ -740,6 +716,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (retval) goto out_free_dentry; + SET_PERSONALITY(loc->elf_ex); + /* OK, This is the point of no return */ setup_new_exec(bprm); diff --git a/fs/exec.c b/fs/exec.c index 9e10e6e..5835fe2 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -723,7 +723,6 @@ static int exec_mmap(struct mm_struct *mm) tsk->active_mm = mm; activate_mm(active_mm, mm); task_unlock(tsk); - arch_pick_mmap_layout(mm); if (old_mm) { up_read(&old_mm->mmap_sem); BUG_ON(active_mm != old_mm); @@ -969,6 +968,7 @@ void setup_new_exec(struct linux_binprm * bprm) char tcomm[sizeof(current->comm)]; bprm->mm = NULL; /* We're using it now */ + arch_pick_mmap_layout(current->mm); /* This is the point of no return */ current->sas_ss_sp = current->sas_ss_size = 0; -- 1.7.0.rc0.33.g7c3932 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 1/2] Split 'flush_old_exec' into two functions 2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds 2010-01-29 4:47 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit Linus Torvalds @ 2010-01-29 5:17 ` H. Peter Anvin 1 sibling, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:17 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath On 01/28/2010 08:43 PM, Linus Torvalds wrote: > --- a/fs/binfmt_elf_fdpic.c > +++ b/fs/binfmt_elf_fdpic.c > @@ -318,6 +318,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm, > /* there's now no turning back... the old userspace image is dead, > * defunct, deceased, etc. after this point we have to exit via > * error_kill */ > + setup_new_exec(bprm); > + > set_personality(PER_LINUX_FDPIC); > if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) > current->personality |= READ_IMPLIES_EXEC; > diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c > index d4a00ea..d6a43eb 100644 > --- a/fs/binfmt_flat.c > +++ b/fs/binfmt_flat.c > @@ -518,6 +518,7 @@ static int load_flat_file(struct linux_binprm * bprm, > } > > /* OK, This is the point of no return */ > + setup_new_exec(bprm); > set_personality(PER_LINUX_32BIT); > } For all of these, wouldn't it make more sense to call set_personality() *before* calling setup_new_exec()? The sequencing that would seem sane to me is: flush_old_exec(); set_personality(); setup_new_exec(); ... since setup_new_exec() should be able to depend on the target personality, including TASK_SIZE and arch_pick_mmap_layout(). Similarly, for binfmt_elf the following sequence would seem to make sense: /* OK, This is the point of no return */ /* Do this immediately, since STACK_TOP as used in setup_arg_pages may depend on the personality. */ SET_PERSONALITY(loc->elf_ex); if (elf_read_implies_exec(loc->elf_ex, executable_stack)) current->personality |= READ_IMPLIES_EXEC; if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) current->flags |= PF_RANDOMIZE; setup_new_exec(bprm); current->flags &= ~PF_FORKNOEXEC; current->mm->def_flags = def_flags; ... then there shouldn't be a need to call SET_PERSONALITY() and arch_pick_mmap_layout() twice... -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-29 4:43 ` Linus Torvalds 2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds @ 2010-01-29 5:05 ` H. Peter Anvin 2010-01-29 5:29 ` H. Peter Anvin 2 siblings, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:05 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath [-- Attachment #1: Type: text/plain, Size: 730 bytes --] On 01/28/2010 08:43 PM, Linus Torvalds wrote: > > > On Thu, 28 Jan 2010, H. Peter Anvin wrote: >> >> I think your splitup patch might still be a good idea in the sense that >> your flush_old_exec() is the parts that can fail. >> >> So I think the splitup patch, plus removing delayed effects, might be >> the right thing to do? Testing that approach now... > > So I didn't see any patch from you, so here's my try instead. > Sorry, was chasing bugs. These two patches on top of your original split patch works for me in testing so far. I'm going to compare the code with what your two new patches produce. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Fix-the-flush_old_exec-patch-from-Linus.patch --] [-- Type: text/x-patch; name="0001-Fix-the-flush_old_exec-patch-from-Linus.patch", Size: 0 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-29 4:43 ` Linus Torvalds 2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds 2010-01-29 5:05 ` [Security] DoS on x86_64 H. Peter Anvin @ 2010-01-29 5:29 ` H. Peter Anvin 2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin ` (4 more replies) 2 siblings, 5 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:29 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Thomas Gleixner, Mathias Krause, Roland McGrath Here is a cleaned-up version of my patches... I pretty much used your descriptions straight off. The first one is your original split patch with my (small) changes folded in, the second one is my removal of TIF_ABI_PENDING. The main difference, again, is that my variant sets the personality *before* calling setup_new_exec(). -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions 2010-01-29 5:29 ` H. Peter Anvin @ 2010-01-29 5:34 ` H. Peter Anvin 2010-01-29 5:34 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 2010-01-29 5:36 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin ` (3 subsequent siblings) 4 siblings, 1 reply; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:34 UTC (permalink / raw) To: torvalds Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland, H. Peter Anvin From: Linus Torvalds <torvalds@linux-foundation.org> 'flush_old_exec()' is the point of no return when doing an execve(), and it is pretty badly misnamed. It doesn't just flush the old executable environment, it also starts up the new one. Which is very inconvenient for things like setting up the new personality, because we want the new personality to affect the starting of the new environment, but at the same time we do _not_ want the new personality to take effect if flushing the old one fails. As a result, the x86-64 '32-bit' personality is actually done using this insane "I'm going to change the ABI, but I haven't done it yet" bit (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the personality, but just the "pending" bit, so that "flush_thread()" can do the actual personality magic. This patch in no way changes any of that insanity, but it does split the 'flush_old_exec()' function up into a preparatory part that can fail (still called flush_old_exec()), and a new part that will actually set up the new exec environment (setup_new_exec()). All callers are changed to trivially comply with the new world order. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> --- arch/sh/kernel/process_64.c | 2 +- arch/x86/ia32/ia32_aout.c | 10 ++++++---- fs/binfmt_aout.c | 1 + fs/binfmt_elf.c | 27 ++------------------------- fs/binfmt_elf_fdpic.c | 3 +++ fs/binfmt_flat.c | 1 + fs/binfmt_som.c | 1 + fs/exec.c | 26 ++++++++++++++++---------- include/linux/binfmts.h | 1 + include/linux/sched.h | 2 +- 10 files changed, 33 insertions(+), 41 deletions(-) diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c index 31f80c6..ec79faf 100644 --- a/arch/sh/kernel/process_64.c +++ b/arch/sh/kernel/process_64.c @@ -368,7 +368,7 @@ void exit_thread(void) void flush_thread(void) { - /* Called by fs/exec.c (flush_old_exec) to remove traces of a + /* Called by fs/exec.c (setup_new_exec) to remove traces of a * previously running executable. */ #ifdef CONFIG_SH_FPU if (last_task_used_math == current) { diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 2a4d073..435d2a5 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -308,15 +308,17 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (retval) return retval; - regs->cs = __USER32_CS; - regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = - regs->r13 = regs->r14 = regs->r15 = 0; - /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); clear_thread_flag(TIF_ABI_PENDING); + setup_new_exec(bprm); + + regs->cs = __USER32_CS; + regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = + regs->r13 = regs->r14 = regs->r15 = 0; + current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); current->mm->end_data = ex.a_data + diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index 346b694..fdd3970 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -264,6 +264,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs) #else set_personality(PER_LINUX); #endif + setup_new_exec(bprm); current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index edd90c4..fd5b2ea 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0') goto out_free_interp; - /* - * The early SET_PERSONALITY here is so that the lookup - * for the interpreter happens in the namespace of the - * to-be-execed image. SET_PERSONALITY can select an - * alternate root. - * - * However, SET_PERSONALITY is NOT allowed to switch - * this task into the new images's memory mapping - * policy - that is, TASK_SIZE must still evaluate to - * that which is appropriate to the execing application. - * This is because exit_mmap() needs to have TASK_SIZE - * evaluate to the size of the old image. - * - * So if (say) a 64-bit application is execing a 32-bit - * application it is the architecture's responsibility - * to defer changing the value of TASK_SIZE until the - * switch really is going to happen - do this in - * flush_thread(). - akpm - */ - SET_PERSONALITY(loc->elf_ex); - interpreter = open_exec(elf_interpreter); retval = PTR_ERR(interpreter); if (IS_ERR(interpreter)) @@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* Verify the interpreter has a valid arch */ if (!elf_check_arch(&loc->interp_elf_ex)) goto out_free_dentry; - } else { - /* Executables without an interpreter also need a personality */ - SET_PERSONALITY(loc->elf_ex); } /* Flush all traces of the currently running executable */ @@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) current->flags |= PF_RANDOMIZE; - arch_pick_mmap_layout(current->mm); + + setup_new_exec(bprm); /* Do this so that we can load the interpreter, if need be. We will change some of these later */ diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index c57d9ce..18d7729 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -321,6 +321,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm, set_personality(PER_LINUX_FDPIC); if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) current->personality |= READ_IMPLIES_EXEC; + + setup_new_exec(bprm); + set_binfmt(&elf_fdpic_format); current->mm->start_code = 0; diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c index d4a00ea..42c6b4a 100644 --- a/fs/binfmt_flat.c +++ b/fs/binfmt_flat.c @@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm, /* OK, This is the point of no return */ set_personality(PER_LINUX_32BIT); + setup_new_exec(bprm); } /* diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c index 2a9b533..cc8560f 100644 --- a/fs/binfmt_som.c +++ b/fs/binfmt_som.c @@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs) /* OK, This is the point of no return */ current->flags &= ~PF_FORKNOEXEC; current->personality = PER_HPUX; + setup_new_exec(bprm); /* Set the task size for HP-UX processes such that * the gateway page is outside the address space. diff --git a/fs/exec.c b/fs/exec.c index 632b02e..675c3f4 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf) int flush_old_exec(struct linux_binprm * bprm) { - char * name; - int i, ch, retval; - char tcomm[sizeof(current->comm)]; + int retval; /* * Make sure we have a private signal table and that @@ -963,6 +961,20 @@ int flush_old_exec(struct linux_binprm * bprm) goto out; bprm->mm = NULL; /* We're using it now */ + return 0; + +out: + return retval; +} +EXPORT_SYMBOL(flush_old_exec); + +void setup_new_exec(struct linux_binprm * bprm) +{ + int i, ch; + char * name; + char tcomm[sizeof(current->comm)]; + + arch_pick_mmap_layout(current->mm); /* This is the point of no return */ current->sas_ss_sp = current->sas_ss_size = 0; @@ -1019,14 +1031,8 @@ int flush_old_exec(struct linux_binprm * bprm) flush_signal_handlers(current, 0); flush_old_files(current->files); - - return 0; - -out: - return retval; } - -EXPORT_SYMBOL(flush_old_exec); +EXPORT_SYMBOL(setup_new_exec); /* * Prepare credentials and lock ->cred_guard_mutex. diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index cd4349b..89c6249 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *); extern int __must_check remove_arg_zero(struct linux_binprm *); extern int search_binary_handler(struct linux_binprm *,struct pt_regs *); extern int flush_old_exec(struct linux_binprm * bprm); +extern void setup_new_exec(struct linux_binprm * bprm); extern int suid_dumpable; #define SUID_DUMP_DISABLE 0 /* No setuid dumping */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 6f7bba9..abdfacc 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1369,7 +1369,7 @@ struct task_struct { char comm[TASK_COMM_LEN]; /* executable name excluding path - access with [gs]et_task_comm (which lock it with task_lock()) - - initialized normally by flush_old_exec */ + - initialized normally by setup_new_exec */ /* file system info */ int link_count, total_link_count; #ifdef CONFIG_SYSVIPC -- 1.6.2.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit 2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin @ 2010-01-29 5:34 ` H. Peter Anvin 0 siblings, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:34 UTC (permalink / raw) To: torvalds Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland, H. Peter Anvin Now that the previous commit made it possible to do the personality setting at the point of no return, we do just that for ELF binaries. And suddenly all the reasons for that insane TIF_ABI_PENDING bit go away, and we can just make SET_PERSONALITY() just do the obvious thing for a 32-bit compat process. Everything becomes much more straightforward this way. Signed-off-by: H. Peter Anvin <hpa@zytor.com> --- arch/x86/ia32/ia32_aout.c | 1 - arch/x86/include/asm/elf.h | 10 ++-------- arch/x86/include/asm/thread_info.h | 2 -- arch/x86/kernel/process.c | 12 +++--------- arch/x86/kernel/process_64.c | 11 +++++++++++ 5 files changed, 16 insertions(+), 20 deletions(-) diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 435d2a5..f9f4724 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -311,7 +311,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); - clear_thread_flag(TIF_ABI_PENDING); setup_new_exec(bprm); diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h index b4501ee..1994d3f 100644 --- a/arch/x86/include/asm/elf.h +++ b/arch/x86/include/asm/elf.h @@ -181,14 +181,8 @@ do { \ void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp); #define compat_start_thread start_thread_ia32 -#define COMPAT_SET_PERSONALITY(ex) \ -do { \ - if (test_thread_flag(TIF_IA32)) \ - clear_thread_flag(TIF_ABI_PENDING); \ - else \ - set_thread_flag(TIF_ABI_PENDING); \ - current->personality |= force_personality32; \ -} while (0) +void set_personality_ia32(void); +#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32() #define COMPAT_ELF_PLATFORM ("i686") diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 375c917..e0d2890 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -87,7 +87,6 @@ struct thread_info { #define TIF_NOTSC 16 /* TSC is not accessible in userland */ #define TIF_IA32 17 /* 32bit process */ #define TIF_FORK 18 /* ret_from_fork */ -#define TIF_ABI_PENDING 19 #define TIF_MEMDIE 20 #define TIF_DEBUG 21 /* uses debug registers */ #define TIF_IO_BITMAP 22 /* uses I/O bitmap */ @@ -112,7 +111,6 @@ struct thread_info { #define _TIF_NOTSC (1 << TIF_NOTSC) #define _TIF_IA32 (1 << TIF_IA32) #define _TIF_FORK (1 << TIF_FORK) -#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING) #define _TIF_DEBUG (1 << TIF_DEBUG) #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP) #define _TIF_FREEZE (1 << TIF_FREEZE) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 02c3ee0..7d42304 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -116,15 +116,9 @@ void flush_thread(void) struct task_struct *tsk = current; #ifdef CONFIG_X86_64 - if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) { - clear_tsk_thread_flag(tsk, TIF_ABI_PENDING); - if (test_tsk_thread_flag(tsk, TIF_IA32)) { - clear_tsk_thread_flag(tsk, TIF_IA32); - } else { - set_tsk_thread_flag(tsk, TIF_IA32); - current_thread_info()->status |= TS_COMPAT; - } - } + /* Set up the first "return" to user space */ + if (test_tsk_thread_flag(tsk, TIF_IA32)) + current_thread_info()->status |= TS_COMPAT; #endif flush_ptrace_hw_breakpoint(tsk); diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index f9e0331..41a26a8 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -521,6 +521,17 @@ void set_personality_64bit(void) current->personality &= ~READ_IMPLIES_EXEC; } +void set_personality_ia32(void) +{ + /* inherit personality from parent */ + + /* Make sure to be in 32bit mode */ + set_thread_flag(TIF_IA32); + + /* Prepare the first "return" to user space */ + current_thread_info()->status |= TS_COMPAT; +} + unsigned long get_wchan(struct task_struct *p) { unsigned long stack; -- 1.6.2.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions 2010-01-29 5:29 ` H. Peter Anvin 2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin @ 2010-01-29 5:36 ` H. Peter Anvin 2010-01-29 5:36 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 2010-01-29 5:41 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin ` (2 subsequent siblings) 4 siblings, 1 reply; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:36 UTC (permalink / raw) To: torvalds Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland, H. Peter Anvin From: Linus Torvalds <torvalds@linux-foundation.org> 'flush_old_exec()' is the point of no return when doing an execve(), and it is pretty badly misnamed. It doesn't just flush the old executable environment, it also starts up the new one. Which is very inconvenient for things like setting up the new personality, because we want the new personality to affect the starting of the new environment, but at the same time we do _not_ want the new personality to take effect if flushing the old one fails. As a result, the x86-64 '32-bit' personality is actually done using this insane "I'm going to change the ABI, but I haven't done it yet" bit (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the personality, but just the "pending" bit, so that "flush_thread()" can do the actual personality magic. This patch in no way changes any of that insanity, but it does split the 'flush_old_exec()' function up into a preparatory part that can fail (still called flush_old_exec()), and a new part that will actually set up the new exec environment (setup_new_exec()). All callers are changed to trivially comply with the new world order. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> --- arch/sh/kernel/process_64.c | 2 +- arch/x86/ia32/ia32_aout.c | 10 ++++++---- fs/binfmt_aout.c | 1 + fs/binfmt_elf.c | 27 ++------------------------- fs/binfmt_elf_fdpic.c | 3 +++ fs/binfmt_flat.c | 1 + fs/binfmt_som.c | 1 + fs/exec.c | 26 ++++++++++++++++---------- include/linux/binfmts.h | 1 + include/linux/sched.h | 2 +- 10 files changed, 33 insertions(+), 41 deletions(-) diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c index 31f80c6..ec79faf 100644 --- a/arch/sh/kernel/process_64.c +++ b/arch/sh/kernel/process_64.c @@ -368,7 +368,7 @@ void exit_thread(void) void flush_thread(void) { - /* Called by fs/exec.c (flush_old_exec) to remove traces of a + /* Called by fs/exec.c (setup_new_exec) to remove traces of a * previously running executable. */ #ifdef CONFIG_SH_FPU if (last_task_used_math == current) { diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 2a4d073..435d2a5 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -308,15 +308,17 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (retval) return retval; - regs->cs = __USER32_CS; - regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = - regs->r13 = regs->r14 = regs->r15 = 0; - /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); clear_thread_flag(TIF_ABI_PENDING); + setup_new_exec(bprm); + + regs->cs = __USER32_CS; + regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = + regs->r13 = regs->r14 = regs->r15 = 0; + current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); current->mm->end_data = ex.a_data + diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index 346b694..fdd3970 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -264,6 +264,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs) #else set_personality(PER_LINUX); #endif + setup_new_exec(bprm); current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index edd90c4..fd5b2ea 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0') goto out_free_interp; - /* - * The early SET_PERSONALITY here is so that the lookup - * for the interpreter happens in the namespace of the - * to-be-execed image. SET_PERSONALITY can select an - * alternate root. - * - * However, SET_PERSONALITY is NOT allowed to switch - * this task into the new images's memory mapping - * policy - that is, TASK_SIZE must still evaluate to - * that which is appropriate to the execing application. - * This is because exit_mmap() needs to have TASK_SIZE - * evaluate to the size of the old image. - * - * So if (say) a 64-bit application is execing a 32-bit - * application it is the architecture's responsibility - * to defer changing the value of TASK_SIZE until the - * switch really is going to happen - do this in - * flush_thread(). - akpm - */ - SET_PERSONALITY(loc->elf_ex); - interpreter = open_exec(elf_interpreter); retval = PTR_ERR(interpreter); if (IS_ERR(interpreter)) @@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* Verify the interpreter has a valid arch */ if (!elf_check_arch(&loc->interp_elf_ex)) goto out_free_dentry; - } else { - /* Executables without an interpreter also need a personality */ - SET_PERSONALITY(loc->elf_ex); } /* Flush all traces of the currently running executable */ @@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) current->flags |= PF_RANDOMIZE; - arch_pick_mmap_layout(current->mm); + + setup_new_exec(bprm); /* Do this so that we can load the interpreter, if need be. We will change some of these later */ diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index c57d9ce..18d7729 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -321,6 +321,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm, set_personality(PER_LINUX_FDPIC); if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) current->personality |= READ_IMPLIES_EXEC; + + setup_new_exec(bprm); + set_binfmt(&elf_fdpic_format); current->mm->start_code = 0; diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c index d4a00ea..42c6b4a 100644 --- a/fs/binfmt_flat.c +++ b/fs/binfmt_flat.c @@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm, /* OK, This is the point of no return */ set_personality(PER_LINUX_32BIT); + setup_new_exec(bprm); } /* diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c index 2a9b533..cc8560f 100644 --- a/fs/binfmt_som.c +++ b/fs/binfmt_som.c @@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs) /* OK, This is the point of no return */ current->flags &= ~PF_FORKNOEXEC; current->personality = PER_HPUX; + setup_new_exec(bprm); /* Set the task size for HP-UX processes such that * the gateway page is outside the address space. diff --git a/fs/exec.c b/fs/exec.c index 632b02e..675c3f4 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf) int flush_old_exec(struct linux_binprm * bprm) { - char * name; - int i, ch, retval; - char tcomm[sizeof(current->comm)]; + int retval; /* * Make sure we have a private signal table and that @@ -963,6 +961,20 @@ int flush_old_exec(struct linux_binprm * bprm) goto out; bprm->mm = NULL; /* We're using it now */ + return 0; + +out: + return retval; +} +EXPORT_SYMBOL(flush_old_exec); + +void setup_new_exec(struct linux_binprm * bprm) +{ + int i, ch; + char * name; + char tcomm[sizeof(current->comm)]; + + arch_pick_mmap_layout(current->mm); /* This is the point of no return */ current->sas_ss_sp = current->sas_ss_size = 0; @@ -1019,14 +1031,8 @@ int flush_old_exec(struct linux_binprm * bprm) flush_signal_handlers(current, 0); flush_old_files(current->files); - - return 0; - -out: - return retval; } - -EXPORT_SYMBOL(flush_old_exec); +EXPORT_SYMBOL(setup_new_exec); /* * Prepare credentials and lock ->cred_guard_mutex. diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index cd4349b..89c6249 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *); extern int __must_check remove_arg_zero(struct linux_binprm *); extern int search_binary_handler(struct linux_binprm *,struct pt_regs *); extern int flush_old_exec(struct linux_binprm * bprm); +extern void setup_new_exec(struct linux_binprm * bprm); extern int suid_dumpable; #define SUID_DUMP_DISABLE 0 /* No setuid dumping */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 6f7bba9..abdfacc 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1369,7 +1369,7 @@ struct task_struct { char comm[TASK_COMM_LEN]; /* executable name excluding path - access with [gs]et_task_comm (which lock it with task_lock()) - - initialized normally by flush_old_exec */ + - initialized normally by setup_new_exec */ /* file system info */ int link_count, total_link_count; #ifdef CONFIG_SYSVIPC -- 1.6.2.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit 2010-01-29 5:36 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin @ 2010-01-29 5:36 ` H. Peter Anvin 0 siblings, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:36 UTC (permalink / raw) To: torvalds Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland, H. Peter Anvin Now that the previous commit made it possible to do the personality setting at the point of no return, we do just that for ELF binaries. And suddenly all the reasons for that insane TIF_ABI_PENDING bit go away, and we can just make SET_PERSONALITY() just do the obvious thing for a 32-bit compat process. Everything becomes much more straightforward this way. Signed-off-by: H. Peter Anvin <hpa@zytor.com> --- arch/x86/ia32/ia32_aout.c | 1 - arch/x86/include/asm/elf.h | 10 ++-------- arch/x86/include/asm/thread_info.h | 2 -- arch/x86/kernel/process.c | 12 +++--------- arch/x86/kernel/process_64.c | 11 +++++++++++ 5 files changed, 16 insertions(+), 20 deletions(-) diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 435d2a5..f9f4724 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -311,7 +311,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); - clear_thread_flag(TIF_ABI_PENDING); setup_new_exec(bprm); diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h index b4501ee..1994d3f 100644 --- a/arch/x86/include/asm/elf.h +++ b/arch/x86/include/asm/elf.h @@ -181,14 +181,8 @@ do { \ void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp); #define compat_start_thread start_thread_ia32 -#define COMPAT_SET_PERSONALITY(ex) \ -do { \ - if (test_thread_flag(TIF_IA32)) \ - clear_thread_flag(TIF_ABI_PENDING); \ - else \ - set_thread_flag(TIF_ABI_PENDING); \ - current->personality |= force_personality32; \ -} while (0) +void set_personality_ia32(void); +#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32() #define COMPAT_ELF_PLATFORM ("i686") diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 375c917..e0d2890 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -87,7 +87,6 @@ struct thread_info { #define TIF_NOTSC 16 /* TSC is not accessible in userland */ #define TIF_IA32 17 /* 32bit process */ #define TIF_FORK 18 /* ret_from_fork */ -#define TIF_ABI_PENDING 19 #define TIF_MEMDIE 20 #define TIF_DEBUG 21 /* uses debug registers */ #define TIF_IO_BITMAP 22 /* uses I/O bitmap */ @@ -112,7 +111,6 @@ struct thread_info { #define _TIF_NOTSC (1 << TIF_NOTSC) #define _TIF_IA32 (1 << TIF_IA32) #define _TIF_FORK (1 << TIF_FORK) -#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING) #define _TIF_DEBUG (1 << TIF_DEBUG) #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP) #define _TIF_FREEZE (1 << TIF_FREEZE) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 02c3ee0..7d42304 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -116,15 +116,9 @@ void flush_thread(void) struct task_struct *tsk = current; #ifdef CONFIG_X86_64 - if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) { - clear_tsk_thread_flag(tsk, TIF_ABI_PENDING); - if (test_tsk_thread_flag(tsk, TIF_IA32)) { - clear_tsk_thread_flag(tsk, TIF_IA32); - } else { - set_tsk_thread_flag(tsk, TIF_IA32); - current_thread_info()->status |= TS_COMPAT; - } - } + /* Set up the first "return" to user space */ + if (test_tsk_thread_flag(tsk, TIF_IA32)) + current_thread_info()->status |= TS_COMPAT; #endif flush_ptrace_hw_breakpoint(tsk); diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index f9e0331..41a26a8 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -521,6 +521,17 @@ void set_personality_64bit(void) current->personality &= ~READ_IMPLIES_EXEC; } +void set_personality_ia32(void) +{ + /* inherit personality from parent */ + + /* Make sure to be in 32bit mode */ + set_thread_flag(TIF_IA32); + + /* Prepare the first "return" to user space */ + current_thread_info()->status |= TS_COMPAT; +} + unsigned long get_wchan(struct task_struct *p) { unsigned long stack; -- 1.6.2.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions 2010-01-29 5:29 ` H. Peter Anvin 2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 2010-01-29 5:36 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin @ 2010-01-29 5:41 ` H. Peter Anvin 2010-01-29 5:41 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 2010-01-29 6:14 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 2010-01-29 6:14 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 4 siblings, 1 reply; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:41 UTC (permalink / raw) To: torvalds Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland, H. Peter Anvin From: Linus Torvalds <torvalds@linux-foundation.org> 'flush_old_exec()' is the point of no return when doing an execve(), and it is pretty badly misnamed. It doesn't just flush the old executable environment, it also starts up the new one. Which is very inconvenient for things like setting up the new personality, because we want the new personality to affect the starting of the new environment, but at the same time we do _not_ want the new personality to take effect if flushing the old one fails. As a result, the x86-64 '32-bit' personality is actually done using this insane "I'm going to change the ABI, but I haven't done it yet" bit (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the personality, but just the "pending" bit, so that "flush_thread()" can do the actual personality magic. This patch in no way changes any of that insanity, but it does split the 'flush_old_exec()' function up into a preparatory part that can fail (still called flush_old_exec()), and a new part that will actually set up the new exec environment (setup_new_exec()). All callers are changed to trivially comply with the new world order. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> --- arch/sh/kernel/process_64.c | 2 +- arch/x86/ia32/ia32_aout.c | 10 ++++++---- fs/binfmt_aout.c | 1 + fs/binfmt_elf.c | 27 ++------------------------- fs/binfmt_elf_fdpic.c | 3 +++ fs/binfmt_flat.c | 1 + fs/binfmt_som.c | 1 + fs/exec.c | 26 ++++++++++++++++---------- include/linux/binfmts.h | 1 + include/linux/sched.h | 2 +- 10 files changed, 33 insertions(+), 41 deletions(-) diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c index 31f80c6..ec79faf 100644 --- a/arch/sh/kernel/process_64.c +++ b/arch/sh/kernel/process_64.c @@ -368,7 +368,7 @@ void exit_thread(void) void flush_thread(void) { - /* Called by fs/exec.c (flush_old_exec) to remove traces of a + /* Called by fs/exec.c (setup_new_exec) to remove traces of a * previously running executable. */ #ifdef CONFIG_SH_FPU if (last_task_used_math == current) { diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 2a4d073..435d2a5 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -308,15 +308,17 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (retval) return retval; - regs->cs = __USER32_CS; - regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = - regs->r13 = regs->r14 = regs->r15 = 0; - /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); clear_thread_flag(TIF_ABI_PENDING); + setup_new_exec(bprm); + + regs->cs = __USER32_CS; + regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = + regs->r13 = regs->r14 = regs->r15 = 0; + current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); current->mm->end_data = ex.a_data + diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index 346b694..fdd3970 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -264,6 +264,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs) #else set_personality(PER_LINUX); #endif + setup_new_exec(bprm); current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index edd90c4..fd5b2ea 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0') goto out_free_interp; - /* - * The early SET_PERSONALITY here is so that the lookup - * for the interpreter happens in the namespace of the - * to-be-execed image. SET_PERSONALITY can select an - * alternate root. - * - * However, SET_PERSONALITY is NOT allowed to switch - * this task into the new images's memory mapping - * policy - that is, TASK_SIZE must still evaluate to - * that which is appropriate to the execing application. - * This is because exit_mmap() needs to have TASK_SIZE - * evaluate to the size of the old image. - * - * So if (say) a 64-bit application is execing a 32-bit - * application it is the architecture's responsibility - * to defer changing the value of TASK_SIZE until the - * switch really is going to happen - do this in - * flush_thread(). - akpm - */ - SET_PERSONALITY(loc->elf_ex); - interpreter = open_exec(elf_interpreter); retval = PTR_ERR(interpreter); if (IS_ERR(interpreter)) @@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* Verify the interpreter has a valid arch */ if (!elf_check_arch(&loc->interp_elf_ex)) goto out_free_dentry; - } else { - /* Executables without an interpreter also need a personality */ - SET_PERSONALITY(loc->elf_ex); } /* Flush all traces of the currently running executable */ @@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) current->flags |= PF_RANDOMIZE; - arch_pick_mmap_layout(current->mm); + + setup_new_exec(bprm); /* Do this so that we can load the interpreter, if need be. We will change some of these later */ diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index c57d9ce..18d7729 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -321,6 +321,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm, set_personality(PER_LINUX_FDPIC); if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) current->personality |= READ_IMPLIES_EXEC; + + setup_new_exec(bprm); + set_binfmt(&elf_fdpic_format); current->mm->start_code = 0; diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c index d4a00ea..42c6b4a 100644 --- a/fs/binfmt_flat.c +++ b/fs/binfmt_flat.c @@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm, /* OK, This is the point of no return */ set_personality(PER_LINUX_32BIT); + setup_new_exec(bprm); } /* diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c index 2a9b533..cc8560f 100644 --- a/fs/binfmt_som.c +++ b/fs/binfmt_som.c @@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs) /* OK, This is the point of no return */ current->flags &= ~PF_FORKNOEXEC; current->personality = PER_HPUX; + setup_new_exec(bprm); /* Set the task size for HP-UX processes such that * the gateway page is outside the address space. diff --git a/fs/exec.c b/fs/exec.c index 632b02e..675c3f4 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf) int flush_old_exec(struct linux_binprm * bprm) { - char * name; - int i, ch, retval; - char tcomm[sizeof(current->comm)]; + int retval; /* * Make sure we have a private signal table and that @@ -963,6 +961,20 @@ int flush_old_exec(struct linux_binprm * bprm) goto out; bprm->mm = NULL; /* We're using it now */ + return 0; + +out: + return retval; +} +EXPORT_SYMBOL(flush_old_exec); + +void setup_new_exec(struct linux_binprm * bprm) +{ + int i, ch; + char * name; + char tcomm[sizeof(current->comm)]; + + arch_pick_mmap_layout(current->mm); /* This is the point of no return */ current->sas_ss_sp = current->sas_ss_size = 0; @@ -1019,14 +1031,8 @@ int flush_old_exec(struct linux_binprm * bprm) flush_signal_handlers(current, 0); flush_old_files(current->files); - - return 0; - -out: - return retval; } - -EXPORT_SYMBOL(flush_old_exec); +EXPORT_SYMBOL(setup_new_exec); /* * Prepare credentials and lock ->cred_guard_mutex. diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index cd4349b..89c6249 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *); extern int __must_check remove_arg_zero(struct linux_binprm *); extern int search_binary_handler(struct linux_binprm *,struct pt_regs *); extern int flush_old_exec(struct linux_binprm * bprm); +extern void setup_new_exec(struct linux_binprm * bprm); extern int suid_dumpable; #define SUID_DUMP_DISABLE 0 /* No setuid dumping */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 6f7bba9..abdfacc 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1369,7 +1369,7 @@ struct task_struct { char comm[TASK_COMM_LEN]; /* executable name excluding path - access with [gs]et_task_comm (which lock it with task_lock()) - - initialized normally by flush_old_exec */ + - initialized normally by setup_new_exec */ /* file system info */ int link_count, total_link_count; #ifdef CONFIG_SYSVIPC -- 1.6.2.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit 2010-01-29 5:41 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin @ 2010-01-29 5:41 ` H. Peter Anvin 2010-01-29 5:44 ` H. Peter Anvin 0 siblings, 1 reply; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:41 UTC (permalink / raw) To: torvalds Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland, H. Peter Anvin Now that the previous commit made it possible to do the personality setting at the point of no return, we do just that for ELF binaries. And suddenly all the reasons for that insane TIF_ABI_PENDING bit go away, and we can just make SET_PERSONALITY() just do the obvious thing for a 32-bit compat process. Everything becomes much more straightforward this way. Signed-off-by: H. Peter Anvin <hpa@zytor.com> --- arch/x86/ia32/ia32_aout.c | 1 - arch/x86/include/asm/elf.h | 10 ++-------- arch/x86/include/asm/thread_info.h | 2 -- arch/x86/kernel/process.c | 12 +++--------- arch/x86/kernel/process_64.c | 11 +++++++++++ 5 files changed, 16 insertions(+), 20 deletions(-) diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 435d2a5..f9f4724 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -311,7 +311,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); - clear_thread_flag(TIF_ABI_PENDING); setup_new_exec(bprm); diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h index b4501ee..1994d3f 100644 --- a/arch/x86/include/asm/elf.h +++ b/arch/x86/include/asm/elf.h @@ -181,14 +181,8 @@ do { \ void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp); #define compat_start_thread start_thread_ia32 -#define COMPAT_SET_PERSONALITY(ex) \ -do { \ - if (test_thread_flag(TIF_IA32)) \ - clear_thread_flag(TIF_ABI_PENDING); \ - else \ - set_thread_flag(TIF_ABI_PENDING); \ - current->personality |= force_personality32; \ -} while (0) +void set_personality_ia32(void); +#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32() #define COMPAT_ELF_PLATFORM ("i686") diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 375c917..e0d2890 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -87,7 +87,6 @@ struct thread_info { #define TIF_NOTSC 16 /* TSC is not accessible in userland */ #define TIF_IA32 17 /* 32bit process */ #define TIF_FORK 18 /* ret_from_fork */ -#define TIF_ABI_PENDING 19 #define TIF_MEMDIE 20 #define TIF_DEBUG 21 /* uses debug registers */ #define TIF_IO_BITMAP 22 /* uses I/O bitmap */ @@ -112,7 +111,6 @@ struct thread_info { #define _TIF_NOTSC (1 << TIF_NOTSC) #define _TIF_IA32 (1 << TIF_IA32) #define _TIF_FORK (1 << TIF_FORK) -#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING) #define _TIF_DEBUG (1 << TIF_DEBUG) #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP) #define _TIF_FREEZE (1 << TIF_FREEZE) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 02c3ee0..7d42304 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -116,15 +116,9 @@ void flush_thread(void) struct task_struct *tsk = current; #ifdef CONFIG_X86_64 - if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) { - clear_tsk_thread_flag(tsk, TIF_ABI_PENDING); - if (test_tsk_thread_flag(tsk, TIF_IA32)) { - clear_tsk_thread_flag(tsk, TIF_IA32); - } else { - set_tsk_thread_flag(tsk, TIF_IA32); - current_thread_info()->status |= TS_COMPAT; - } - } + /* Set up the first "return" to user space */ + if (test_tsk_thread_flag(tsk, TIF_IA32)) + current_thread_info()->status |= TS_COMPAT; #endif flush_ptrace_hw_breakpoint(tsk); diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index f9e0331..41a26a8 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -521,6 +521,17 @@ void set_personality_64bit(void) current->personality &= ~READ_IMPLIES_EXEC; } +void set_personality_ia32(void) +{ + /* inherit personality from parent */ + + /* Make sure to be in 32bit mode */ + set_thread_flag(TIF_IA32); + + /* Prepare the first "return" to user space */ + current_thread_info()->status |= TS_COMPAT; +} + unsigned long get_wchan(struct task_struct *p) { unsigned long stack; -- 1.6.2.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit 2010-01-29 5:41 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin @ 2010-01-29 5:44 ` H. Peter Anvin 0 siblings, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 5:44 UTC (permalink / raw) To: H. Peter Anvin Cc: torvalds, akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland On 01/28/2010 09:41 PM, H. Peter Anvin wrote: > > #ifdef CONFIG_X86_64 > - if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) { > - clear_tsk_thread_flag(tsk, TIF_ABI_PENDING); > - if (test_tsk_thread_flag(tsk, TIF_IA32)) { > - clear_tsk_thread_flag(tsk, TIF_IA32); > - } else { > - set_tsk_thread_flag(tsk, TIF_IA32); > - current_thread_info()->status |= TS_COMPAT; > - } > - } > + /* Set up the first "return" to user space */ > + if (test_tsk_thread_flag(tsk, TIF_IA32)) > + current_thread_info()->status |= TS_COMPAT; > #endif > This chunk should of course have been completely removed... let me do that and re-test. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/2] Split 'flush_old_exec' into two functions 2010-01-29 5:29 ` H. Peter Anvin ` (2 preceding siblings ...) 2010-01-29 5:41 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin @ 2010-01-29 6:14 ` H. Peter Anvin 2010-01-29 6:14 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 4 siblings, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 6:14 UTC (permalink / raw) To: torvalds Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland, ralf, H. Peter Anvin From: Linus Torvalds <torvalds@linux-foundation.org> 'flush_old_exec()' is the point of no return when doing an execve(), and it is pretty badly misnamed. It doesn't just flush the old executable environment, it also starts up the new one. Which is very inconvenient for things like setting up the new personality, because we want the new personality to affect the starting of the new environment, but at the same time we do _not_ want the new personality to take effect if flushing the old one fails. As a result, the x86-64 '32-bit' personality is actually done using this insane "I'm going to change the ABI, but I haven't done it yet" bit (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the personality, but just the "pending" bit, so that "flush_thread()" can do the actual personality magic. This patch in no way changes any of that insanity, but it does split the 'flush_old_exec()' function up into a preparatory part that can fail (still called flush_old_exec()), and a new part that will actually set up the new exec environment (setup_new_exec()). All callers are changed to trivially comply with the new world order. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> --- arch/sh/kernel/process_64.c | 2 +- arch/x86/ia32/ia32_aout.c | 10 ++++++---- fs/binfmt_aout.c | 1 + fs/binfmt_elf.c | 27 ++------------------------- fs/binfmt_elf_fdpic.c | 3 +++ fs/binfmt_flat.c | 1 + fs/binfmt_som.c | 1 + fs/exec.c | 26 ++++++++++++++++---------- include/linux/binfmts.h | 1 + include/linux/sched.h | 2 +- 10 files changed, 33 insertions(+), 41 deletions(-) diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c index 31f80c6..ec79faf 100644 --- a/arch/sh/kernel/process_64.c +++ b/arch/sh/kernel/process_64.c @@ -368,7 +368,7 @@ void exit_thread(void) void flush_thread(void) { - /* Called by fs/exec.c (flush_old_exec) to remove traces of a + /* Called by fs/exec.c (setup_new_exec) to remove traces of a * previously running executable. */ #ifdef CONFIG_SH_FPU if (last_task_used_math == current) { diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 2a4d073..435d2a5 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -308,15 +308,17 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (retval) return retval; - regs->cs = __USER32_CS; - regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = - regs->r13 = regs->r14 = regs->r15 = 0; - /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); clear_thread_flag(TIF_ABI_PENDING); + setup_new_exec(bprm); + + regs->cs = __USER32_CS; + regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = + regs->r13 = regs->r14 = regs->r15 = 0; + current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); current->mm->end_data = ex.a_data + diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index 346b694..fdd3970 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -264,6 +264,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs) #else set_personality(PER_LINUX); #endif + setup_new_exec(bprm); current->mm->end_code = ex.a_text + (current->mm->start_code = N_TXTADDR(ex)); diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index edd90c4..fd5b2ea 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0') goto out_free_interp; - /* - * The early SET_PERSONALITY here is so that the lookup - * for the interpreter happens in the namespace of the - * to-be-execed image. SET_PERSONALITY can select an - * alternate root. - * - * However, SET_PERSONALITY is NOT allowed to switch - * this task into the new images's memory mapping - * policy - that is, TASK_SIZE must still evaluate to - * that which is appropriate to the execing application. - * This is because exit_mmap() needs to have TASK_SIZE - * evaluate to the size of the old image. - * - * So if (say) a 64-bit application is execing a 32-bit - * application it is the architecture's responsibility - * to defer changing the value of TASK_SIZE until the - * switch really is going to happen - do this in - * flush_thread(). - akpm - */ - SET_PERSONALITY(loc->elf_ex); - interpreter = open_exec(elf_interpreter); retval = PTR_ERR(interpreter); if (IS_ERR(interpreter)) @@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* Verify the interpreter has a valid arch */ if (!elf_check_arch(&loc->interp_elf_ex)) goto out_free_dentry; - } else { - /* Executables without an interpreter also need a personality */ - SET_PERSONALITY(loc->elf_ex); } /* Flush all traces of the currently running executable */ @@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) current->flags |= PF_RANDOMIZE; - arch_pick_mmap_layout(current->mm); + + setup_new_exec(bprm); /* Do this so that we can load the interpreter, if need be. We will change some of these later */ diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index c57d9ce..18d7729 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -321,6 +321,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm, set_personality(PER_LINUX_FDPIC); if (elf_read_implies_exec(&exec_params.hdr, executable_stack)) current->personality |= READ_IMPLIES_EXEC; + + setup_new_exec(bprm); + set_binfmt(&elf_fdpic_format); current->mm->start_code = 0; diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c index d4a00ea..42c6b4a 100644 --- a/fs/binfmt_flat.c +++ b/fs/binfmt_flat.c @@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm, /* OK, This is the point of no return */ set_personality(PER_LINUX_32BIT); + setup_new_exec(bprm); } /* diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c index 2a9b533..cc8560f 100644 --- a/fs/binfmt_som.c +++ b/fs/binfmt_som.c @@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs) /* OK, This is the point of no return */ current->flags &= ~PF_FORKNOEXEC; current->personality = PER_HPUX; + setup_new_exec(bprm); /* Set the task size for HP-UX processes such that * the gateway page is outside the address space. diff --git a/fs/exec.c b/fs/exec.c index 632b02e..675c3f4 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf) int flush_old_exec(struct linux_binprm * bprm) { - char * name; - int i, ch, retval; - char tcomm[sizeof(current->comm)]; + int retval; /* * Make sure we have a private signal table and that @@ -963,6 +961,20 @@ int flush_old_exec(struct linux_binprm * bprm) goto out; bprm->mm = NULL; /* We're using it now */ + return 0; + +out: + return retval; +} +EXPORT_SYMBOL(flush_old_exec); + +void setup_new_exec(struct linux_binprm * bprm) +{ + int i, ch; + char * name; + char tcomm[sizeof(current->comm)]; + + arch_pick_mmap_layout(current->mm); /* This is the point of no return */ current->sas_ss_sp = current->sas_ss_size = 0; @@ -1019,14 +1031,8 @@ int flush_old_exec(struct linux_binprm * bprm) flush_signal_handlers(current, 0); flush_old_files(current->files); - - return 0; - -out: - return retval; } - -EXPORT_SYMBOL(flush_old_exec); +EXPORT_SYMBOL(setup_new_exec); /* * Prepare credentials and lock ->cred_guard_mutex. diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index cd4349b..89c6249 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *); extern int __must_check remove_arg_zero(struct linux_binprm *); extern int search_binary_handler(struct linux_binprm *,struct pt_regs *); extern int flush_old_exec(struct linux_binprm * bprm); +extern void setup_new_exec(struct linux_binprm * bprm); extern int suid_dumpable; #define SUID_DUMP_DISABLE 0 /* No setuid dumping */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 6f7bba9..abdfacc 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1369,7 +1369,7 @@ struct task_struct { char comm[TASK_COMM_LEN]; /* executable name excluding path - access with [gs]et_task_comm (which lock it with task_lock()) - - initialized normally by flush_old_exec */ + - initialized normally by setup_new_exec */ /* file system info */ int link_count, total_link_count; #ifdef CONFIG_SYSVIPC -- 1.6.2.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit 2010-01-29 5:29 ` H. Peter Anvin ` (3 preceding siblings ...) 2010-01-29 6:14 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin @ 2010-01-29 6:14 ` H. Peter Anvin 4 siblings, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-29 6:14 UTC (permalink / raw) To: torvalds Cc: akpm, security, tony.luck, jmorris, mikew, md, linux-mm, mingo, tglx, minipli, roland, ralf, H. Peter Anvin Now that the previous commit made it possible to do the personality setting at the point of no return, we do just that for ELF binaries. And suddenly all the reasons for that insane TIF_ABI_PENDING bit go away, and we can just make SET_PERSONALITY() just do the obvious thing for a 32-bit compat process. Everything becomes much more straightforward this way. Signed-off-by: H. Peter Anvin <hpa@zytor.com> --- arch/x86/ia32/ia32_aout.c | 1 - arch/x86/include/asm/elf.h | 10 ++-------- arch/x86/include/asm/thread_info.h | 2 -- arch/x86/kernel/process.c | 12 ------------ arch/x86/kernel/process_64.c | 11 +++++++++++ 5 files changed, 13 insertions(+), 23 deletions(-) diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 435d2a5..f9f4724 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -311,7 +311,6 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* OK, This is the point of no return */ set_personality(PER_LINUX); set_thread_flag(TIF_IA32); - clear_thread_flag(TIF_ABI_PENDING); setup_new_exec(bprm); diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h index b4501ee..1994d3f 100644 --- a/arch/x86/include/asm/elf.h +++ b/arch/x86/include/asm/elf.h @@ -181,14 +181,8 @@ do { \ void start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp); #define compat_start_thread start_thread_ia32 -#define COMPAT_SET_PERSONALITY(ex) \ -do { \ - if (test_thread_flag(TIF_IA32)) \ - clear_thread_flag(TIF_ABI_PENDING); \ - else \ - set_thread_flag(TIF_ABI_PENDING); \ - current->personality |= force_personality32; \ -} while (0) +void set_personality_ia32(void); +#define COMPAT_SET_PERSONALITY(ex) set_personality_ia32() #define COMPAT_ELF_PLATFORM ("i686") diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 375c917..e0d2890 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -87,7 +87,6 @@ struct thread_info { #define TIF_NOTSC 16 /* TSC is not accessible in userland */ #define TIF_IA32 17 /* 32bit process */ #define TIF_FORK 18 /* ret_from_fork */ -#define TIF_ABI_PENDING 19 #define TIF_MEMDIE 20 #define TIF_DEBUG 21 /* uses debug registers */ #define TIF_IO_BITMAP 22 /* uses I/O bitmap */ @@ -112,7 +111,6 @@ struct thread_info { #define _TIF_NOTSC (1 << TIF_NOTSC) #define _TIF_IA32 (1 << TIF_IA32) #define _TIF_FORK (1 << TIF_FORK) -#define _TIF_ABI_PENDING (1 << TIF_ABI_PENDING) #define _TIF_DEBUG (1 << TIF_DEBUG) #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP) #define _TIF_FREEZE (1 << TIF_FREEZE) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 02c3ee0..c9b3522 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -115,18 +115,6 @@ void flush_thread(void) { struct task_struct *tsk = current; -#ifdef CONFIG_X86_64 - if (test_tsk_thread_flag(tsk, TIF_ABI_PENDING)) { - clear_tsk_thread_flag(tsk, TIF_ABI_PENDING); - if (test_tsk_thread_flag(tsk, TIF_IA32)) { - clear_tsk_thread_flag(tsk, TIF_IA32); - } else { - set_tsk_thread_flag(tsk, TIF_IA32); - current_thread_info()->status |= TS_COMPAT; - } - } -#endif - flush_ptrace_hw_breakpoint(tsk); memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array)); /* diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index f9e0331..41a26a8 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -521,6 +521,17 @@ void set_personality_64bit(void) current->personality &= ~READ_IMPLIES_EXEC; } +void set_personality_ia32(void) +{ + /* inherit personality from parent */ + + /* Make sure to be in 32bit mode */ + set_thread_flag(TIF_IA32); + + /* Prepare the first "return" to user space */ + current_thread_info()->status |= TS_COMPAT; +} + unsigned long get_wchan(struct task_struct *p) { unsigned long stack; -- 1.6.2.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 22:33 ` Linus Torvalds 2010-01-28 22:47 ` Mathias Krause 2010-01-28 22:47 ` H. Peter Anvin @ 2010-01-28 23:06 ` Linus Torvalds 2010-01-28 23:14 ` H. Peter Anvin 2 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2010-01-28 23:06 UTC (permalink / raw) To: H. Peter Anvin Cc: Mathias Krause, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Andrew Morton, Thomas Gleixner, Roland McGrath On Thu, 28 Jan 2010, Linus Torvalds wrote: > > I have _not_ tested any of this, and maybe there is some crazy reason why > this won't work, but I'm not seeing it. Grr. We also do "arch_pick_mmap_layout()" in "flush_old_exec()". That whole function is mis-named. It doesn't actually flush the old exec, it also creates the new one. However, we then re-do it afterwards in fs/binfmt_elf.c, so again, that doesn't really matter. What _does_ matter, however, is the crazy stuff we do in flush_thread() wrt TIF_ABI_PENDING. That's just crazy. So no, the trivial patch won't work. How about splitting up "flush_old_exec()" into two pieces? We'd have a "flush_old_exec()" and a "setup_new_exec()" piece, and all existing callers of flush_old_exec() would just be changed to call both? So here's a new patch. Again, TOTALLY UNTESTED. It may be equally broken, for some other reason I haven't noticed yet (or because I just screwed up). I've verified that it compiles for me, but that's it. Caveat patchor. Linus --- arch/sh/kernel/process_64.c | 2 +- arch/x86/ia32/ia32_aout.c | 1 + fs/binfmt_aout.c | 1 + fs/binfmt_elf.c | 27 ++------------------------- fs/binfmt_elf_fdpic.c | 1 + fs/binfmt_flat.c | 1 + fs/binfmt_som.c | 1 + fs/exec.c | 24 ++++++++++++++---------- include/linux/binfmts.h | 1 + include/linux/sched.h | 2 +- 10 files changed, 24 insertions(+), 37 deletions(-) diff --git a/arch/sh/kernel/process_64.c b/arch/sh/kernel/process_64.c index 31f80c6..ec79faf 100644 --- a/arch/sh/kernel/process_64.c +++ b/arch/sh/kernel/process_64.c @@ -368,7 +368,7 @@ void exit_thread(void) void flush_thread(void) { - /* Called by fs/exec.c (flush_old_exec) to remove traces of a + /* Called by fs/exec.c (setup_new_exec) to remove traces of a * previously running executable. */ #ifdef CONFIG_SH_FPU if (last_task_used_math == current) { diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 2a4d073..9bc3298 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -307,6 +307,7 @@ static int load_aout_binary(struct linux_binprm *bprm, struct pt_regs *regs) retval = flush_old_exec(bprm); if (retval) return retval; + setup_new_exec(bprm); regs->cs = __USER32_CS; regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index 346b694..19ae369 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -257,6 +257,7 @@ static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs) retval = flush_old_exec(bprm); if (retval) return retval; + setup_new_exec(); /* OK, This is the point of no return */ #ifdef __alpha__ diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index edd90c4..fd5b2ea 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -662,27 +662,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (elf_interpreter[elf_ppnt->p_filesz - 1] != '\0') goto out_free_interp; - /* - * The early SET_PERSONALITY here is so that the lookup - * for the interpreter happens in the namespace of the - * to-be-execed image. SET_PERSONALITY can select an - * alternate root. - * - * However, SET_PERSONALITY is NOT allowed to switch - * this task into the new images's memory mapping - * policy - that is, TASK_SIZE must still evaluate to - * that which is appropriate to the execing application. - * This is because exit_mmap() needs to have TASK_SIZE - * evaluate to the size of the old image. - * - * So if (say) a 64-bit application is execing a 32-bit - * application it is the architecture's responsibility - * to defer changing the value of TASK_SIZE until the - * switch really is going to happen - do this in - * flush_thread(). - akpm - */ - SET_PERSONALITY(loc->elf_ex); - interpreter = open_exec(elf_interpreter); retval = PTR_ERR(interpreter); if (IS_ERR(interpreter)) @@ -730,9 +709,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) /* Verify the interpreter has a valid arch */ if (!elf_check_arch(&loc->interp_elf_ex)) goto out_free_dentry; - } else { - /* Executables without an interpreter also need a personality */ - SET_PERSONALITY(loc->elf_ex); } /* Flush all traces of the currently running executable */ @@ -752,7 +728,8 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) current->flags |= PF_RANDOMIZE; - arch_pick_mmap_layout(current->mm); + + setup_new_exec(bprm); /* Do this so that we can load the interpreter, if need be. We will change some of these later */ diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index c57d9ce..c38d396 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -314,6 +314,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm, retval = flush_old_exec(bprm); if (retval) goto error; + setup_new_exec(bprm); /* there's now no turning back... the old userspace image is dead, * defunct, deceased, etc. after this point we have to exit via diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c index d4a00ea..42c6b4a 100644 --- a/fs/binfmt_flat.c +++ b/fs/binfmt_flat.c @@ -519,6 +519,7 @@ static int load_flat_file(struct linux_binprm * bprm, /* OK, This is the point of no return */ set_personality(PER_LINUX_32BIT); + setup_new_exec(bprm); } /* diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c index 2a9b533..cc8560f 100644 --- a/fs/binfmt_som.c +++ b/fs/binfmt_som.c @@ -227,6 +227,7 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs) /* OK, This is the point of no return */ current->flags &= ~PF_FORKNOEXEC; current->personality = PER_HPUX; + setup_new_exec(bprm); /* Set the task size for HP-UX processes such that * the gateway page is outside the address space. diff --git a/fs/exec.c b/fs/exec.c index 632b02e..4bc488d 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -941,9 +941,7 @@ void set_task_comm(struct task_struct *tsk, char *buf) int flush_old_exec(struct linux_binprm * bprm) { - char * name; - int i, ch, retval; - char tcomm[sizeof(current->comm)]; + int retval; /* * Make sure we have a private signal table and that @@ -963,6 +961,18 @@ int flush_old_exec(struct linux_binprm * bprm) goto out; bprm->mm = NULL; /* We're using it now */ + return 0; + +out: + return retval; +} +EXPORT_SYMBOL(flush_old_exec); + +void setup_new_exec(struct linux_binprm * bprm) +{ + int i, ch; + char * name; + char tcomm[sizeof(current->comm)]; /* This is the point of no return */ current->sas_ss_sp = current->sas_ss_size = 0; @@ -1019,14 +1029,8 @@ int flush_old_exec(struct linux_binprm * bprm) flush_signal_handlers(current, 0); flush_old_files(current->files); - - return 0; - -out: - return retval; } - -EXPORT_SYMBOL(flush_old_exec); +EXPORT_SYMBOL(setup_new_exec); /* * Prepare credentials and lock ->cred_guard_mutex. diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index cd4349b..89c6249 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -109,6 +109,7 @@ extern int prepare_binprm(struct linux_binprm *); extern int __must_check remove_arg_zero(struct linux_binprm *); extern int search_binary_handler(struct linux_binprm *,struct pt_regs *); extern int flush_old_exec(struct linux_binprm * bprm); +extern void setup_new_exec(struct linux_binprm * bprm); extern int suid_dumpable; #define SUID_DUMP_DISABLE 0 /* No setuid dumping */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 6f7bba9..abdfacc 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1369,7 +1369,7 @@ struct task_struct { char comm[TASK_COMM_LEN]; /* executable name excluding path - access with [gs]et_task_comm (which lock it with task_lock()) - - initialized normally by flush_old_exec */ + - initialized normally by setup_new_exec */ /* file system info */ int link_count, total_link_count; #ifdef CONFIG_SYSVIPC -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 23:06 ` [Security] DoS on x86_64 Linus Torvalds @ 2010-01-28 23:14 ` H. Peter Anvin 0 siblings, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2010-01-28 23:14 UTC (permalink / raw) To: Linus Torvalds Cc: Mathias Krause, security, Luck, Tony, James Morris, Mike Waychison, Michael Davidson, linux-mm, Ingo Molnar, Andrew Morton, Thomas Gleixner, Roland McGrath On 01/28/2010 03:06 PM, Linus Torvalds wrote: > > > On Thu, 28 Jan 2010, Linus Torvalds wrote: >> >> I have _not_ tested any of this, and maybe there is some crazy reason why >> this won't work, but I'm not seeing it. > > Grr. We also do "arch_pick_mmap_layout()" in "flush_old_exec()". > > That whole function is mis-named. It doesn't actually flush the old exec, > it also creates the new one. > > However, we then re-do it afterwards in fs/binfmt_elf.c, so again, that > doesn't really matter. > > What _does_ matter, however, is the crazy stuff we do in flush_thread() > wrt TIF_ABI_PENDING. That's just crazy. > > So no, the trivial patch won't work. > > How about splitting up "flush_old_exec()" into two pieces? We'd have a > "flush_old_exec()" and a "setup_new_exec()" piece, and all existing > callers of flush_old_exec() would just be changed to call both? > Ah yes. This really is a lot better than the track which I originally was thinking about, which was something like adding a callout from flush_old_exec(). I will try this... plus remove the TIF_ABI_PENDING stuff from x86, and see how it works. -hpa -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 8:18 ` [Security] " Andrew Morton 2010-01-28 15:41 ` H. Peter Anvin @ 2010-01-28 21:31 ` Mathias Krause 1 sibling, 0 replies; 32+ messages in thread From: Mathias Krause @ 2010-01-28 21:31 UTC (permalink / raw) To: Andrew Morton Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security, H. Peter Anvin, Mike Waychison, Michael Davidson, Luck, Tony, Roland McGrath, James Morris [-- Attachment #1: Type: text/plain, Size: 903 bytes --] Am 28.01.2010 um 09:18 schrieb Andrew Morton: > On Thu, 28 Jan 2010 08:34:02 +0100 Mathias Krause > <minipli@googlemail.com> wrote: > >> I found by accident an reliable way to panic the kernel on an x86_64 >> system. Since this one can be triggered by an unprivileged user I >> CCed security@kernel.org. I also haven't found a corresponding bug on >> bugzilla.kernel.org. So, what to do to trigger the bug: >> >> 1. Enable core dumps >> 2. Start an 32 bit program that tries to execve() an 64 bit program >> 3. The 64 bit program cannot be started by the kernel because it >> can't find the interpreter, i.e. execve returns with an error >> 4. Generate a segmentation fault >> 5. panic > > hrm, isn't this the same as "failed exec() leaves caller with > incorrect > personality", discussed in December? afacit nothing happened as a > result > of that. Yes, indeed it is. Thanks for the pointer. [-- Attachment #2: Signierter Teil der Nachricht --] [-- Type: application/pgp-signature, Size: 186 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 7:34 DoS on x86_64 Mathias Krause 2010-01-28 8:18 ` [Security] " Andrew Morton @ 2010-01-28 17:10 ` Linus Torvalds 2010-01-28 21:49 ` Mathias Krause 1 sibling, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2010-01-28 17:10 UTC (permalink / raw) To: Mathias Krause; +Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security On Thu, 28 Jan 2010, Mathias Krause wrote: > > 1. Enable core dumps > 2. Start an 32 bit program that tries to execve() an 64 bit program > 3. The 64 bit program cannot be started by the kernel because it can't find > the interpreter, i.e. execve returns with an error > 4. Generate a segmentation fault > 5. panic Hmm. Nothing happens for me when I try this. I just get the expected SIGSEGV. Can you post the oops/panic message? I don't get a core-dump, even though it says I do: [torvalds@nehalem amd64_killer]$ ./run.sh * look at /proc/22768/maps and press enter to continue... * executing ./poison... * that failed (No such file or directory), as expected :) * look at /proc/22768/maps and press enter to continue... * fasten your seat belt, generating segmentation fault... ./run.sh: line 6: 22768 Segmentation fault (core dumped) ./amd64_killer ./poison This is with current -git (I don't have any machines around running older kernels), so maybe we fixed it already, of course. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 17:10 ` Linus Torvalds @ 2010-01-28 21:49 ` Mathias Krause 2010-01-28 21:58 ` Linus Torvalds 0 siblings, 1 reply; 32+ messages in thread From: Mathias Krause @ 2010-01-28 21:49 UTC (permalink / raw) To: Linus Torvalds; +Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security [-- Attachment #1: Type: text/plain, Size: 1538 bytes --] Am 28.01.2010 um 18:10 schrieb Linus Torvalds: > > > On Thu, 28 Jan 2010, Mathias Krause wrote: >> >> 1. Enable core dumps >> 2. Start an 32 bit program that tries to execve() an 64 bit program >> 3. The 64 bit program cannot be started by the kernel because it >> can't find >> the interpreter, i.e. execve returns with an error >> 4. Generate a segmentation fault >> 5. panic > > Hmm. Nothing happens for me when I try this. I just get the expected > SIGSEGV. Can you post the oops/panic message? > I currently have no access to the actual machine but will send it to you tomorrow. > I don't get a core-dump, even though it says I do: > > [torvalds@nehalem amd64_killer]$ ./run.sh > * look at /proc/22768/maps and press enter to continue... > * executing ./poison... > * that failed (No such file or directory), as expected :) > * look at /proc/22768/maps and press enter to continue... Have you looked at /proc/PID/maps at this point? On our machine the [vdso] was gone and [vsyscall] was there instead -- at an 64 bit address of course. > * fasten your seat belt, generating segmentation fault... > ./run.sh: line 6: 22768 Segmentation fault (core dumped) ./ > amd64_killer ./poison > > This is with current -git (I don't have any machines around running > older > kernels), so maybe we fixed it already, of course. Since this is a production server I would rather stick to a stable kernel and just pick the commit that fixes the issue. Can you please tell me which one that may be? Greets, Mathias [-- Attachment #2: Signierter Teil der Nachricht --] [-- Type: application/pgp-signature, Size: 186 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 21:49 ` Mathias Krause @ 2010-01-28 21:58 ` Linus Torvalds 2010-01-28 22:08 ` Mathias Krause 0 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2010-01-28 21:58 UTC (permalink / raw) To: Mathias Krause; +Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security On Thu, 28 Jan 2010, Mathias Krause wrote: > > I don't get a core-dump, even though it says I do: > > > > [torvalds@nehalem amd64_killer]$ ./run.sh > > * look at /proc/22768/maps and press enter to continue... > > * executing ./poison... > > * that failed (No such file or directory), as expected :) > > * look at /proc/22768/maps and press enter to continue... > > Have you looked at /proc/PID/maps at this point? On our machine the [vdso] was > gone and [vsyscall] was there instead -- at an 64 bit address of course. Yup. That's the behavior I see - except I see the [vdso] thing in both cases. So I agree that it has become a 64-bit process, and that the whole personality crap is buggy. I just don't see the crash. > Since this is a production server I would rather stick to a stable kernel and > just pick the commit that fixes the issue. Can you please tell me which one > that may be? I'd love to be able to say that it's been fixed in so-and-so, but since I don't know what the oops is, I have a hard time even guessing _whether_ it has actually been fixed or not, or whether the reason I don't see it is something else totally unrelated. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 21:58 ` Linus Torvalds @ 2010-01-28 22:08 ` Mathias Krause 2010-01-28 22:18 ` Linus Torvalds 0 siblings, 1 reply; 32+ messages in thread From: Mathias Krause @ 2010-01-28 22:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: Thomas Gleixner, Ingo Molnar, linux-mm, security [-- Attachment #1: Type: text/plain, Size: 1692 bytes --] Am 28.01.2010 um 22:58 schrieb Linus Torvalds: >>> I don't get a core-dump, even though it says I do: >>> >>> [torvalds@nehalem amd64_killer]$ ./run.sh >>> * look at /proc/22768/maps and press enter to continue... >>> * executing ./poison... >>> * that failed (No such file or directory), as expected :) >>> * look at /proc/22768/maps and press enter to continue... >> >> Have you looked at /proc/PID/maps at this point? On our machine >> the [vdso] was >> gone and [vsyscall] was there instead -- at an 64 bit address of >> course. > > Yup. That's the behavior I see - except I see the [vdso] thing in both > cases. > > So I agree that it has become a 64-bit process, and that the whole > personality crap is buggy. So it's not really fixed yet :) > I just don't see the crash. This at least gives us the hint, the core writing code maybe was modified in a way it does some additionally check that prevents the kernel to crash in this case. But the crash should be reproducible on the latest stable, 2.6.32.6 -- at least this is what I would read out of the statement of hpa made. >> Since this is a production server I would rather stick to a stable >> kernel and >> just pick the commit that fixes the issue. Can you please tell me >> which one >> that may be? > > I'd love to be able to say that it's been fixed in so-and-so, but > since I > don't know what the oops is, I have a hard time even guessing > _whether_ it > has actually been fixed or not, or whether the reason I don't see > it is > something else totally unrelated. I'll look into the last commits to fs/exec.c and see if I can find something that suits to my assumption. Greets, Mathias [-- Attachment #2: Signierter Teil der Nachricht --] [-- Type: application/pgp-signature, Size: 186 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Security] DoS on x86_64 2010-01-28 22:08 ` Mathias Krause @ 2010-01-28 22:18 ` Linus Torvalds 0 siblings, 0 replies; 32+ messages in thread From: Linus Torvalds @ 2010-01-28 22:18 UTC (permalink / raw) To: Mathias Krause; +Cc: linux-mm, Thomas Gleixner, Ingo Molnar, security On Thu, 28 Jan 2010, Mathias Krause wrote: > > > > So I agree that it has become a 64-bit process, and that the whole > > personality crap is buggy. > > So it's not really fixed yet :) Right. Peter looked at it at some point. But there's a big difference between "we have always had problems with that execve() mess" and "we have a nasty DoS that can be triggered by regular users", and the two may well have independent fixes. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2010-01-29 6:23 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-28 7:34 DoS on x86_64 Mathias Krause 2010-01-28 8:18 ` [Security] " Andrew Morton 2010-01-28 15:41 ` H. Peter Anvin 2010-01-28 22:33 ` Linus Torvalds 2010-01-28 22:47 ` Mathias Krause 2010-01-28 22:47 ` H. Peter Anvin 2010-01-28 23:09 ` Linus Torvalds 2010-01-28 23:27 ` H. Peter Anvin 2010-01-28 23:46 ` Linus Torvalds 2010-01-29 4:43 ` Linus Torvalds 2010-01-29 4:43 ` [PATCH 1/2] Split 'flush_old_exec' into two functions Linus Torvalds 2010-01-29 4:47 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit Linus Torvalds 2010-01-29 5:17 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 2010-01-29 5:05 ` [Security] DoS on x86_64 H. Peter Anvin 2010-01-29 5:29 ` H. Peter Anvin 2010-01-29 5:34 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 2010-01-29 5:34 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 2010-01-29 5:36 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 2010-01-29 5:36 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 2010-01-29 5:41 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 2010-01-29 5:41 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 2010-01-29 5:44 ` H. Peter Anvin 2010-01-29 6:14 ` [PATCH 1/2] Split 'flush_old_exec' into two functions H. Peter Anvin 2010-01-29 6:14 ` [PATCH 2/2] x86: get rid of the insane TIF_ABI_PENDING bit H. Peter Anvin 2010-01-28 23:06 ` [Security] DoS on x86_64 Linus Torvalds 2010-01-28 23:14 ` H. Peter Anvin 2010-01-28 21:31 ` Mathias Krause 2010-01-28 17:10 ` Linus Torvalds 2010-01-28 21:49 ` Mathias Krause 2010-01-28 21:58 ` Linus Torvalds 2010-01-28 22:08 ` Mathias Krause 2010-01-28 22:18 ` Linus Torvalds
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).