* [PATCH] x86: ptrace: sign-extend eax with orig_eax>=0 @ 2009-09-23 0:46 Roland McGrath 2009-09-23 1:31 ` Linus Torvalds 0 siblings, 1 reply; 11+ messages in thread From: Roland McGrath @ 2009-09-23 0:46 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin Cc: jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton, Linus Torvalds The 32-bit ptrace syscall on a 64-bit kernel (32-bit debugger on 32-bit task) behaves differently than a native 32-bit kernel. When setting a register state of orig_eax>=0 and eax=-ERESTART* when the debugged task is NOT on its way out of a 32-bit syscall, the task will fail to do the syscall restart logic that it should do. Test case available at http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/erestartsys-trap.c?cvsroot=systemtap This happens because the 32-bit ptrace syscall sets eax=0xffffffff when it sets orig_eax>=0. The resuming task will not sign-extend this for the -ERESTART* check because TS_COMPAT is not set. (So the task thinks it is restarting after a 64-bit syscall, not a 32-bit one.) The fix is to have 32-bit ptrace calls sign-extend eax when orig_eax>=0. The long comment in the change explains the scenarios and caveats fully. Reported-by: Jan.Kratochvil@redhat.com Signed-off-by: Roland McGrath <roland@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> --- arch/x86/kernel/ptrace.c | 55 +++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 54 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 8d7d5c9..ecb7a49 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -1120,7 +1120,6 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 value) R32(edi, di); R32(esi, si); R32(ebp, bp); - R32(eax, ax); R32(eip, ip); R32(esp, sp); @@ -1130,6 +1129,60 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 value) * causes (long)orig_ax < 0 tests to fire correctly. */ regs->orig_ax = (long) (s32) value; + + /* + * Whenever setting orig_eax to indicate a system call in + * progress, make sure an eax value set by the debugger gets + * sign-extended so that any ax == -ERESTART* tests fire + * correctly. + * + * When those tests (in handle_signal) are done directly + * after an actual 32-bit syscall, then TS_COMPAT is set and + * so syscall_get_error() does sign-extension. However, the + * debugger sometimes saves that state and then restores it + * later with the intent of picking up the old thread state + * that can be about to do syscall restart. + * + * When it's a 32-bit debugger, that truncates ax to 32 bits. + * If the debugger restores thread state and resumes after a + * ptrace stop when the child was not doing a new syscall, it + * will not have TS_COMPAT set to make syscall_get_error() + * notice and do the sign-extension. + * + * We can't have syscall_get_error() always sign-extend, + * since that's wrong for 64-bit syscalls. We want it to + * check TS_COMPAT rather than TIF_IA32 to avoid a false + * positive in the oddball case of a 32-bit task doing a + * syscall from a 64-bit code segment. In the "restored + * thread state" case, it has no way to know whether the + * restored state refers to a 32-bit or 64-bit syscall. + * + * So we can't win 'em all. We assume that if you are using + * a 32-bit debugger, you don't really care about arcane + * interference with a child trying to use 64-bit syscalls. + * (Just use a 64-bit debugger on it instead!) What we do + * here makes a 32-bit debugger fiddling a 32-bit task + * consistent with what happens on a native 32-bit kernel. + * + * NOTE! Since we have no similar logic in putreg(), we + * just expect a 64-bit debugger to save/restore the full + * 64 bits. If a 64-bit debugger were to treat a 32-bit + * task differently and save/restore only 32 bits per + * register, it would have to grok orig_eax >= 0 and know + * to sign-extend its saved eax when setting it as 64 bits. + */ + if (regs->orig_ax >= 0) + regs->ax = (long) (s32) regs->ax; + break; + + case offsetof(struct user32, regs.eax): + /* + * As above, for either order of setting both ax and orig_ax. + */ + if (regs->orig_ax >= 0) + regs->ax = (long) (s32) value; + else + regs->ax = value; break; case offsetof(struct user32, regs.eflags): ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] x86: ptrace: sign-extend eax with orig_eax>=0 2009-09-23 0:46 [PATCH] x86: ptrace: sign-extend eax with orig_eax>=0 Roland McGrath @ 2009-09-23 1:31 ` Linus Torvalds 2009-09-23 2:40 ` Roland McGrath 0 siblings, 1 reply; 11+ messages in thread From: Linus Torvalds @ 2009-09-23 1:31 UTC (permalink / raw) To: Roland McGrath Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton On Tue, 22 Sep 2009, Roland McGrath wrote: > > The 32-bit ptrace syscall on a 64-bit kernel (32-bit debugger on > 32-bit task) behaves differently than a native 32-bit kernel. When > setting a register state of orig_eax>=0 and eax=-ERESTART* when the > debugged task is NOT on its way out of a 32-bit syscall, the task will > fail to do the syscall restart logic that it should do. Hmm. This really smells extremely hacky to me. I see what you're doing, and I understand why, but it all makes me shudder. I think we already do the wrong thing for 'orig_ax', and that we should probably simply test just the low 32 bits of it (looks to be easily done by just making the return type of 'syscall_get_nr()' be 'int') rather than have that special "let's sign-extend orig_ax" code in ptrace. I also get the feeling that the TS_COMPAT testing is just hacky to begin with. I think it was broken to do things that way, and I have this gut feel that we really should have hidden the "am I a 32-bit or 64-bit syscall" in that orig_ax field instead (ie make the 64-bit system calls set a bit or whatever). That's the field we've always used for system call flagging, and TS_COMPAT was broken. In this example, the problem for ptrace seems to be that TS_COMPAT ends up being that "extra" bit of information that isn't visible in the register set. But that's a bigger separate cleanup. In the meantime, I really get the feeling that your patch is nasty, and could be replaced with something like the following instead.. It just sets the TS_COMPAT bit if we use a 32-bit interface to set the system call number to positive (ie "inside system call"). THE BELOW IS TOTALLY UNTESTED! I haven't really thought it through. I just reacted very negatively to your patch, and my gut feel is that the code is fundamentally doing something wrong. The below may not work, and may be seriously broken and miss the point, but I think it conceptually comes closer to how things _should_ work. Linus --- arch/x86/include/asm/syscall.h | 2 +- arch/x86/kernel/ptrace.c | 8 +++++++- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h index d82f39b..f2a0631 100644 --- a/arch/x86/include/asm/syscall.h +++ b/arch/x86/include/asm/syscall.h @@ -16,7 +16,7 @@ #include <linux/sched.h> #include <linux/err.h> -static inline long syscall_get_nr(struct task_struct *task, +static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) { /* diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 8d7d5c9..90b67db 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -1124,13 +1124,19 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 value) R32(eip, ip); R32(esp, sp); - case offsetof(struct user32, regs.orig_eax): + case offsetof(struct user32, regs.orig_eax): { + struct thread_info *thread = task_thread_info(child); /* * Sign-extend the value so that orig_eax = -1 * causes (long)orig_ax < 0 tests to fire correctly. */ regs->orig_ax = (long) (s32) value; + if ((s32) value >= 0) + thread->status |= TS_COMPAT; + else + thread->status &= ~TS_COMPAT; break; + } case offsetof(struct user32, regs.eflags): return set_flags(child, value); ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] x86: ptrace: sign-extend eax with orig_eax>=0 2009-09-23 1:31 ` Linus Torvalds @ 2009-09-23 2:40 ` Roland McGrath 2009-09-23 6:18 ` [PATCH 0/4] x86: clean up orig_ax handling Roland McGrath 0 siblings, 1 reply; 11+ messages in thread From: Roland McGrath @ 2009-09-23 2:40 UTC (permalink / raw) To: Linus Torvalds Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton > Hmm. This really smells extremely hacky to me. Agreed. Anything with a half-page comment about why it's two-thirds right and only one-third wrong is not what we'd choose when we could think of better options. > I see what you're doing, and I understand why, but it all makes me > shudder. I think we already do the wrong thing for 'orig_ax', and that we > should probably simply test just the low 32 bits of it (looks to be easily > done by just making the return type of 'syscall_get_nr()' be 'int') rather > than have that special "let's sign-extend orig_ax" code in ptrace. Yes, that seems mostly equivalent. By having orig_ax sign-extended in ptrace unconditionally (i.e. even for 64-bit tasks), you've already said that the semantics of orig_ax are that you only get 32 bits of meaningful value in there. In fact, that's what you said explicitly when we discussed that change. There is no reason for syscall_get_nr() not to return int. I said "mostly equivalent". The only difference I can see is that today 64-bit ptrace callers (and core dumps, etc.), can see the high bits of orig_ax and probably expect to see 64 1 bits in the "no syscall" case. So if we punt all the existing sign-extension and say that orig_ax has only 32 meaningful bits internally in all uses, then the ABI-preserving change is to make getreg() always sign-extend the 32-bit orig_ax value so -1 is -1LL in userland. I wonder if you want to say the same thing about all machines generally, that the syscall number can only have 32 meaningful bits. That seems reasonable to me. Then it would be appropriate to change the type in asm-generic/syscall.h; that file is never really used, but it sets the function signatures that other arch's files are written to match. > I also get the feeling that the TS_COMPAT testing is just hacky to begin > with. I think it was broken to do things that way, and I have this gut > feel that we really should have hidden the "am I a 32-bit or 64-bit > syscall" in that orig_ax field instead (ie make the 64-bit system calls > set a bit or whatever). That's the field we've always used for system call > flagging, and TS_COMPAT was broken. On the whole I agree. hpa and I talked about this very idea last year. We never got around to doing anything about it, since the previous hack at the time had fixed the known bug. Doing this internally is probably pretty easy and is nice for all the other cases with TS_COMPAT checks now. Unfortunately, it's a userland ABI issue to change this such that any new special meaning of some orig_ax bits can be seen by userland so as to help this latest case. What hpa and I discussed in fact was more than just a 32/64 flag, but a complete "which entry path" indicator. i.e. int80 vs sysenter vs AMD 32-bit "syscall" (if we ever supported that) vs 64-bit "syscall" vs non-syscall (exception/interrupt) entries. That would have a bit or two of information even for 32-bit. (We figured that limiting an actual syscall # to 24 bits or so would be fine.) But the utility of distinguishing those paths (aside from 32 vs 64 and syscall vs not) is purely theoretical, which is to say I don't think we even have any contrived idea what you'd use it for. > In this example, the problem for ptrace seems to be that TS_COMPAT ends up > being that "extra" bit of information that isn't visible in the register > set. Agreed. All else being equal, I would prefer to have this bit appear in the regset layout somewhere. Unfortunately I don't see how we can both make userland implicitly win by saving and restoring it blindly, and be ABI-compatible with existing userland that knows the orig_ax meaning today. > But that's a bigger separate cleanup. In the meantime, I really get the > feeling that your patch is nasty, and could be replaced with something > like the following instead.. It just sets the TS_COMPAT bit if we use a > 32-bit interface to set the system call number to positive (ie "inside > system call"). I think I may have been tempted toward that but didn't really consider it. The only real reason is that TS_* bits are "thread-synchronous" and I thought there was no precedent for touching them on any task != current. (Given I understand how ptrace is the special situation where it could be safe.) But, in fact, ptrace can already fiddle TS_USEDFPU. So it's no worse than that, anyway. So it's better than bad, it's good! > THE BELOW IS TOTALLY UNTESTED! I haven't really thought it through. I just > reacted very negatively to your patch, and my gut feel is that the code is > fundamentally doing something wrong. The below may not work, and may be > seriously broken and miss the point, but I think it conceptually comes > closer to how things _should_ work. I agree. Thanks, Roland ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 0/4] x86: clean up orig_ax handling 2009-09-23 2:40 ` Roland McGrath @ 2009-09-23 6:18 ` Roland McGrath 2009-09-23 6:19 ` [PATCH 1/4] asm-generic: syscall_get_nr returns int Roland McGrath ` (4 more replies) 0 siblings, 5 replies; 11+ messages in thread From: Roland McGrath @ 2009-09-23 6:18 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin Cc: Linus Torvalds, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton Here is the aforementioned other tack on this. I said earlier that getreg() (i.e. 64-bit ptrace/core-dump fetches) should sign-extend the low 32 bits of orig_ax up. But I've changed my mind. It's true that today you can store 0xffffffff via either 64-bit ptrace or 32-bit ptrace and then read back -1 via 64-bit ptrace. (This wasn't always so, and so we can hope that no debugger really depends on it.) What seems more important is that tracing and core dumps correctly show the full orig_ax value incoming in %rax from userland, since %rax=__NR_foo|(1UL<<32) behaves differently (i.e. -ENOSYS) than %eax=_NR_foo in actual fact when user-mode does "syscall" with those values. In a bogon case like that, you would like to have traces/dumps tell you why the task is not making a proper syscall rather than lie about what register bits it entered the kernel with. Patches 1-3 change no ptrace-tests outcomes, i.e. don't regress on the test cases that went with the original sign-extension changes. They reintroduce e.g. the ability to blindly read and write back the whole regset when at syscall-entry tracing with %rax=__NR_foo|(1UL<<32) and have that fail with -ENOSYS as it would without tracing rather than perturb the tracee to call sys_foo instead. (Not that this is useful.) Patch 4 does Linus's fix for the outstanding bug. I've verified it works. Thanks, Roland --- The following changes since commit 7fa07729e439a6184bd824746d06a49cca553f15: Linus Torvalds (1): Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/.../tip/linux-2.6-tip are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland.git x86/orig_ax Roland McGrath (4): asm-generic: syscall_get_nr returns int x86: syscall_get_nr returns int x86: ptrace: do not sign-extend orig_ax on write x86: ptrace: set TS_COMPAT when 32-bit ptrace sets orig_eax>=0 arch/x86/include/asm/syscall.h | 14 +++++++------- arch/x86/kernel/ptrace.c | 21 ++++++++------------- include/asm-generic/syscall.h | 8 ++++++-- 3 files changed, 21 insertions(+), 22 deletions(-) ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] asm-generic: syscall_get_nr returns int 2009-09-23 6:18 ` [PATCH 0/4] x86: clean up orig_ax handling Roland McGrath @ 2009-09-23 6:19 ` Roland McGrath 2009-09-23 6:20 ` [PATCH 2/4] x86: " Roland McGrath ` (3 subsequent siblings) 4 siblings, 0 replies; 11+ messages in thread From: Roland McGrath @ 2009-09-23 6:19 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin Cc: Linus Torvalds, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton Only 32 bits of system call number are meaningful, so make the specification for syscall_get_nr() be to return int, not long. Signed-off-by: Roland McGrath <roland@redhat.com> --- include/asm-generic/syscall.h | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/include/asm-generic/syscall.h b/include/asm-generic/syscall.h index ea8087b..5c122ae 100644 --- a/include/asm-generic/syscall.h +++ b/include/asm-generic/syscall.h @@ -1,7 +1,7 @@ /* * Access to user system call parameters and results * - * Copyright (C) 2008 Red Hat, Inc. All rights reserved. + * Copyright (C) 2008-2009 Red Hat, Inc. All rights reserved. * * This copyrighted material is made available to anyone wishing to use, * modify, copy, or redistribute it subject to the terms and conditions @@ -32,9 +32,13 @@ struct pt_regs; * If @task is not executing a system call, i.e. it's blocked * inside the kernel for a fault or signal, returns -1. * + * Note this returns int even on 64-bit machines. Only 32 bits of + * system call number can be meaningful. If the actual arch value + * is 64 bits, this truncates to 32 bits so 0xffffffff means -1. + * * It's only valid to call this when @task is known to be blocked. */ -long syscall_get_nr(struct task_struct *task, struct pt_regs *regs); +int syscall_get_nr(struct task_struct *task, struct pt_regs *regs); /** * syscall_rollback - roll back registers after an aborted system call ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/4] x86: syscall_get_nr returns int 2009-09-23 6:18 ` [PATCH 0/4] x86: clean up orig_ax handling Roland McGrath 2009-09-23 6:19 ` [PATCH 1/4] asm-generic: syscall_get_nr returns int Roland McGrath @ 2009-09-23 6:20 ` Roland McGrath 2009-09-23 6:20 ` [PATCH 3/4] x86: ptrace: do not sign-extend orig_ax on write Roland McGrath ` (2 subsequent siblings) 4 siblings, 0 replies; 11+ messages in thread From: Roland McGrath @ 2009-09-23 6:20 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin Cc: Linus Torvalds, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton Make syscall_get_nr() return int, so we always sign-extend the low 32 bits of orig_ax in checks. Signed-off-by: Roland McGrath <roland@redhat.com> --- arch/x86/include/asm/syscall.h | 14 +++++++------- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h index d82f39b..8d33bc5 100644 --- a/arch/x86/include/asm/syscall.h +++ b/arch/x86/include/asm/syscall.h @@ -1,7 +1,7 @@ /* * Access to user system call parameters and results * - * Copyright (C) 2008 Red Hat, Inc. All rights reserved. + * Copyright (C) 2008-2009 Red Hat, Inc. All rights reserved. * * This copyrighted material is made available to anyone wishing to use, * modify, copy, or redistribute it subject to the terms and conditions @@ -16,13 +16,13 @@ #include <linux/sched.h> #include <linux/err.h> -static inline long syscall_get_nr(struct task_struct *task, - struct pt_regs *regs) +/* + * Only the low 32 bits of orig_ax are meaningful, so we return int. + * This importantly ignores the high bits on 64-bit, so comparisons + * sign-extend the low 32 bits. + */ +static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) { - /* - * We always sign-extend a -1 value being set here, - * so this is always either -1L or a syscall number. - */ return regs->orig_ax; } ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] x86: ptrace: do not sign-extend orig_ax on write 2009-09-23 6:18 ` [PATCH 0/4] x86: clean up orig_ax handling Roland McGrath 2009-09-23 6:19 ` [PATCH 1/4] asm-generic: syscall_get_nr returns int Roland McGrath 2009-09-23 6:20 ` [PATCH 2/4] x86: " Roland McGrath @ 2009-09-23 6:20 ` Roland McGrath 2009-09-23 6:21 ` [PATCH 4/4] x86: ptrace: set TS_COMPAT when 32-bit ptrace sets orig_eax>=0 Roland McGrath 2009-09-23 15:41 ` [PATCH 0/4] x86: clean up orig_ax handling Ingo Molnar 4 siblings, 0 replies; 11+ messages in thread From: Roland McGrath @ 2009-09-23 6:20 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin Cc: Linus Torvalds, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton The high 32 bits of orig_ax will be ignored when it matters, so don't fiddle them when setting it. Signed-off-by: Roland McGrath <roland@redhat.com> --- arch/x86/kernel/ptrace.c | 19 +------------------ 1 files changed, 1 insertions(+), 18 deletions(-) diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 8d7d5c9..52222fa 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -325,16 +325,6 @@ static int putreg(struct task_struct *child, return set_flags(child, value); #ifdef CONFIG_X86_64 - /* - * Orig_ax is really just a flag with small positive and - * negative values, so make sure to always sign-extend it - * from 32 bits so that it works correctly regardless of - * whether we come from a 32-bit environment or not. - */ - case offsetof(struct user_regs_struct, orig_ax): - value = (long) (s32) value; - break; - case offsetof(struct user_regs_struct,fs_base): if (value >= TASK_SIZE_OF(child)) return -EIO; @@ -1121,17 +1111,10 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 value) R32(esi, si); R32(ebp, bp); R32(eax, ax); + R32(orig_eax, orig_ax); R32(eip, ip); R32(esp, sp); - case offsetof(struct user32, regs.orig_eax): - /* - * Sign-extend the value so that orig_eax = -1 - * causes (long)orig_ax < 0 tests to fire correctly. - */ - regs->orig_ax = (long) (s32) value; - break; - case offsetof(struct user32, regs.eflags): return set_flags(child, value); ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] x86: ptrace: set TS_COMPAT when 32-bit ptrace sets orig_eax>=0 2009-09-23 6:18 ` [PATCH 0/4] x86: clean up orig_ax handling Roland McGrath ` (2 preceding siblings ...) 2009-09-23 6:20 ` [PATCH 3/4] x86: ptrace: do not sign-extend orig_ax on write Roland McGrath @ 2009-09-23 6:21 ` Roland McGrath 2009-09-23 15:41 ` [PATCH 0/4] x86: clean up orig_ax handling Ingo Molnar 4 siblings, 0 replies; 11+ messages in thread From: Roland McGrath @ 2009-09-23 6:21 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin Cc: Linus Torvalds, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton The 32-bit ptrace syscall on a 64-bit kernel (32-bit debugger on 32-bit task) behaves differently than a native 32-bit kernel. When setting a register state of orig_eax>=0 and eax=-ERESTART* when the debugged task is NOT on its way out of a 32-bit syscall, the task will fail to do the syscall restart logic that it should do. Test case available at http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/erestartsys-trap.c?cvsroot=systemtap This happens because the 32-bit ptrace syscall sets eax=0xffffffff when it sets orig_eax>=0. The resuming task will not sign-extend this for the -ERESTART* check because TS_COMPAT is not set. (So the task thinks it is restarting after a 64-bit syscall, not a 32-bit one.) The fix is to have 32-bit ptrace calls set TS_COMPAT when setting orig_eax>=0. This ensures that the 32-bit syscall restart logic will apply when the child resumes. Signed-off-by: Roland McGrath <roland@redhat.com> --- arch/x86/kernel/ptrace.c | 14 +++++++++++++- 1 files changed, 13 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 52222fa..7b058a2 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -1111,10 +1111,22 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 value) R32(esi, si); R32(ebp, bp); R32(eax, ax); - R32(orig_eax, orig_ax); R32(eip, ip); R32(esp, sp); + case offsetof(struct user32, regs.orig_eax): + /* + * A 32-bit debugger setting orig_eax means to restore + * the state of the task restarting a 32-bit syscall. + * Make sure we interpret the -ERESTART* codes correctly + * in case the task is not actually still sitting at the + * exit from a 32-bit syscall with TS_COMPAT still set. + */ + regs->orig_ax = value; + if (syscall_get_nr(child, regs) >= 0) + task_thread_info(child)->status |= TS_COMPAT; + break; + case offsetof(struct user32, regs.eflags): return set_flags(child, value); ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4] x86: clean up orig_ax handling 2009-09-23 6:18 ` [PATCH 0/4] x86: clean up orig_ax handling Roland McGrath ` (3 preceding siblings ...) 2009-09-23 6:21 ` [PATCH 4/4] x86: ptrace: set TS_COMPAT when 32-bit ptrace sets orig_eax>=0 Roland McGrath @ 2009-09-23 15:41 ` Ingo Molnar 2009-09-23 15:53 ` Linus Torvalds 4 siblings, 1 reply; 11+ messages in thread From: Ingo Molnar @ 2009-09-23 15:41 UTC (permalink / raw) To: Roland McGrath Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Linus Torvalds, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton * Roland McGrath <roland@redhat.com> wrote: > Here is the aforementioned other tack on this. > > I said earlier that getreg() (i.e. 64-bit ptrace/core-dump fetches) > should sign-extend the low 32 bits of orig_ax up. But I've changed my > mind. It's true that today you can store 0xffffffff via either 64-bit > ptrace or 32-bit ptrace and then read back -1 via 64-bit ptrace. (This > wasn't always so, and so we can hope that no debugger really depends on > it.) What seems more important is that tracing and core dumps correctly > show the full orig_ax value incoming in %rax from userland, since > %rax=__NR_foo|(1UL<<32) behaves differently (i.e. -ENOSYS) than > %eax=_NR_foo in actual fact when user-mode does "syscall" with those > values. In a bogon case like that, you would like to have traces/dumps > tell you why the task is not making a proper syscall rather than lie > about what register bits it entered the kernel with. > > Patches 1-3 change no ptrace-tests outcomes, i.e. don't regress on the > test cases that went with the original sign-extension changes. They > reintroduce e.g. the ability to blindly read and write back the whole > regset when at syscall-entry tracing with %rax=__NR_foo|(1UL<<32) and > have that fail with -ENOSYS as it would without tracing rather than > perturb the tracee to call sys_foo instead. (Not that this is useful.) > > Patch 4 does Linus's fix for the outstanding bug. I've verified it works. > > > Thanks, > Roland > --- > The following changes since commit 7fa07729e439a6184bd824746d06a49cca553f15: > Linus Torvalds (1): > Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/.../tip/linux-2.6-tip > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland.git x86/orig_ax > > Roland McGrath (4): > asm-generic: syscall_get_nr returns int > x86: syscall_get_nr returns int > x86: ptrace: do not sign-extend orig_ax on write > x86: ptrace: set TS_COMPAT when 32-bit ptrace sets orig_eax>=0 > > arch/x86/include/asm/syscall.h | 14 +++++++------- > arch/x86/kernel/ptrace.c | 21 ++++++++------------- > include/asm-generic/syscall.h | 8 ++++++-- > 3 files changed, 21 insertions(+), 22 deletions(-) Linus, this series looks good to me. Do you want to pull this directly or should we test this for a few days in the x86 tree? (either solution is fine to me) Ingo ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4] x86: clean up orig_ax handling 2009-09-23 15:41 ` [PATCH 0/4] x86: clean up orig_ax handling Ingo Molnar @ 2009-09-23 15:53 ` Linus Torvalds 2009-09-23 16:21 ` Ingo Molnar 0 siblings, 1 reply; 11+ messages in thread From: Linus Torvalds @ 2009-09-23 15:53 UTC (permalink / raw) To: Ingo Molnar Cc: Roland McGrath, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton On Wed, 23 Sep 2009, Ingo Molnar wrote: > > Linus, this series looks good to me. Do you want to pull this directly > or should we test this for a few days in the x86 tree? (either solution > is fine to me) I already pulled it, since I'm used to pulling ptrace fixes from Roland, Linus ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4] x86: clean up orig_ax handling 2009-09-23 15:53 ` Linus Torvalds @ 2009-09-23 16:21 ` Ingo Molnar 0 siblings, 0 replies; 11+ messages in thread From: Ingo Molnar @ 2009-09-23 16:21 UTC (permalink / raw) To: Linus Torvalds Cc: Roland McGrath, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, jan.kratochvil, Oleg Nesterov, x86, linux-kernel, Andrew Morton * Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, 23 Sep 2009, Ingo Molnar wrote: > > > > Linus, this series looks good to me. Do you want to pull this > > directly or should we test this for a few days in the x86 tree? > > (either solution is fine to me) > > I already pulled it, since I'm used to pulling ptrace fixes from > Roland, Ok, thanks! Ingo ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2009-09-23 16:22 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-09-23 0:46 [PATCH] x86: ptrace: sign-extend eax with orig_eax>=0 Roland McGrath 2009-09-23 1:31 ` Linus Torvalds 2009-09-23 2:40 ` Roland McGrath 2009-09-23 6:18 ` [PATCH 0/4] x86: clean up orig_ax handling Roland McGrath 2009-09-23 6:19 ` [PATCH 1/4] asm-generic: syscall_get_nr returns int Roland McGrath 2009-09-23 6:20 ` [PATCH 2/4] x86: " Roland McGrath 2009-09-23 6:20 ` [PATCH 3/4] x86: ptrace: do not sign-extend orig_ax on write Roland McGrath 2009-09-23 6:21 ` [PATCH 4/4] x86: ptrace: set TS_COMPAT when 32-bit ptrace sets orig_eax>=0 Roland McGrath 2009-09-23 15:41 ` [PATCH 0/4] x86: clean up orig_ax handling Ingo Molnar 2009-09-23 15:53 ` Linus Torvalds 2009-09-23 16:21 ` Ingo Molnar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox