* TIF_SINGLESTEP bug?
@ 2015-04-10 21:30 Andy Lutomirski
2015-04-11 13:32 ` Oleg Nesterov
0 siblings, 1 reply; 3+ messages in thread
From: Andy Lutomirski @ 2015-04-10 21:30 UTC (permalink / raw)
To: Oleg Nesterov, X86 ML, linux-kernel@vger.kernel.org,
Borislav Petkov, Denys Vlasenko
[-- Attachment #1: Type: text/plain, Size: 1089 bytes --]
Hi all-
AFAICS there are several things wrong with our magical do_debug
handling of single-stepping through the kernel. They boil down to two
issues:
1. do_debug seems to be overly permissive in terms of what faults in
kernel space it thinks are okay. This isn't obviously a problem
except that it obfuscates what's going on. AFAICT the *only*
acceptable case is TF set on sysenter. All the mentions of syscalls
are garbage -- both int80 and syscall clear TF.
2. I think this is wrong:
set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
regs->flags &= ~X86_EFLAGS_TF;
TIF_SINGLESTEP doesn't mean "set TF in this sysenter's saved flags and
then clear TIF_SINGLESTEP". It means something complicated.
The upshot AFAICT is that the attached program blows up if you build
it with -m32 and run it on an Intel machine.
Help? The best option I see is to add a new ti flag that tells the
syscall entry hook to clear that flag and set TF in regs->flags. But
I don't understand ptrace or kgdb very well.
--Andy
--
Andy Lutomirski
AMA Capital Management, LLC
[-- Attachment #2: single_step_syscall.c --]
[-- Type: text/x-csrc, Size: 2960 bytes --]
#define _GNU_SOURCE
#include <sys/time.h>
#include <time.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
#include <sys/mman.h>
#include <sys/signal.h>
#include <sys/ucontext.h>
#include <asm/ldt.h>
#include <err.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdbool.h>
#include <sys/ptrace.h>
#include <sys/user.h>
static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
int flags)
{
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_sigaction = handler;
sa.sa_flags = SA_SIGINFO | flags;
sigemptyset(&sa.sa_mask);
if (sigaction(sig, &sa, 0))
err(1, "sigaction");
}
static volatile sig_atomic_t sig_traps;
#ifdef __x86_64__
# define REG_IP REG_RIP
# define WIDTH "q"
#else
# define REG_IP REG_EIP
# define WIDTH "l"
#endif
static unsigned long get_eflags(void)
{
unsigned long eflags;
asm volatile ("pushf" WIDTH "\n\tpop" WIDTH " %0" : "=rm" (eflags));
return eflags;
}
static void set_eflags(unsigned long eflags)
{
asm volatile ("push" WIDTH " %0\n\tpopf" WIDTH
: : "rm" (eflags) : "flags");
}
#define X86_EFLAGS_TF (1UL << 8)
static void sigtrap(int sig, siginfo_t *info, void *ctx_void)
{
ucontext_t *ctx = (ucontext_t*)ctx_void;
if (get_eflags() & X86_EFLAGS_TF) {
set_eflags(get_eflags() & ~X86_EFLAGS_TF);
printf("[WARN]\tSIGTRAP handler had TF set\n");
_exit(1);
}
sig_traps++;
if (sig_traps == 10000 || sig_traps == 10001) {
printf("[WARN]\tHit %d SIGTRAPs with si_addr 0x%lx, ip 0x%lx\n",
(int)sig_traps,
(unsigned long)info->si_addr,
(unsigned long)ctx->uc_mcontext.gregs[REG_IP]);
}
}
static void check_result(void)
{
unsigned long new_eflags = get_eflags();
set_eflags(new_eflags & ~X86_EFLAGS_TF);
if (!sig_traps) {
printf("[FAIL]\tNo SIGTRAP\n");
exit(1);
}
if (!(new_eflags & X86_EFLAGS_TF)) {
printf("[FAIL]\tTF was cleared\n");
exit(1);
}
printf("[OK]\tSurvived with TF set and %d traps\n", (int)sig_traps);
sig_traps = 0;
}
int main()
{
int tmp;
sethandler(SIGTRAP, sigtrap, 0);
printf("[RUN]\tSet TF and check nop\n");
set_eflags(get_eflags() | X86_EFLAGS_TF);
asm volatile ("nop");
check_result();
#ifdef __x86_64__
printf("[RUN]\tSet TF and check syscall-less opportunistic sysret\n");
set_eflags(get_eflags() | X86_EFLAGS_TF);
extern unsigned char post_nop[];
asm volatile ("pushf" WIDTH "\n\t"
"pop" WIDTH " %%r11\n\t"
"nop\n\t"
"post_nop:"
: : "c" (post_nop) : "r11");
check_result();
#endif
printf("[RUN]\tSet TF and check int80\n");
set_eflags(get_eflags() | X86_EFLAGS_TF);
asm volatile ("int $0x80" : "=a" (tmp) : "a" (SYS_getpid));
check_result();
syscall(SYS_getpid); /* Force symbol binding without TF set. */
printf("[RUN]\tSet TF and check a fast syscall\n");
set_eflags(get_eflags() | X86_EFLAGS_TF);
syscall(SYS_getpid);
check_result();
return 0;
}
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: TIF_SINGLESTEP bug? 2015-04-10 21:30 TIF_SINGLESTEP bug? Andy Lutomirski @ 2015-04-11 13:32 ` Oleg Nesterov 2015-04-13 22:59 ` Andy Lutomirski 0 siblings, 1 reply; 3+ messages in thread From: Oleg Nesterov @ 2015-04-11 13:32 UTC (permalink / raw) To: Andy Lutomirski Cc: X86 ML, linux-kernel@vger.kernel.org, Borislav Petkov, Denys Vlasenko On 04/10, Andy Lutomirski wrote: > > Hi all- > > AFAICS there are several things wrong with our magical do_debug > handling of single-stepping through the kernel. They boil down to two > issues: > > 1. do_debug seems to be overly permissive in terms of what faults in > kernel space it thinks are okay. This isn't obviously a problem > except that it obfuscates what's going on. AFAICT the *only* > acceptable case is TF set on sysenter. All the mentions of syscalls > are garbage -- both int80 and syscall clear TF. > > 2. I think this is wrong: > > set_tsk_thread_flag(tsk, TIF_SINGLESTEP); > regs->flags &= ~X86_EFLAGS_TF; > > TIF_SINGLESTEP doesn't mean "set TF in this sysenter's saved flags and > then clear TIF_SINGLESTEP". It means something complicated. Yes, plus TIF_FORCED_TF adds more confusion... And to me another problem is that these flags are not cleared in/after otrace_stop(). > The upshot AFAICT is that the attached program blows up if you build > it with -m32 and run it on an Intel machine. I thinks the patch below should help, but most probably there are other isssues. Actually I am sure there are other issues, but I forgot the problems I found when I tried to understand this logic in details some time ago. The patch is already in -mm. I'll try to check if it actually helps when I have the access to my testing machine (I need it to compile with -m32, my user-space environment is broken ;). Oleg. ------------------------------------------------------------------------------- Subject: [PATCH 2nd RESEND] ptrace/x86: fix the TIF_FORCED_TF logic in handle_signal() When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. Reported-by: Evan Teran <eteran@alum.rit.edu> Reported-by: Pedro Alves <palves@redhat.com> Tested-By: Andres Freund <andres@anarazel.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> --- arch/x86/kernel/signal.c | 22 +++++++++++----------- 1 files changed, 11 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c index ed37a76..9d3a15b 100644 --- a/arch/x86/kernel/signal.c +++ b/arch/x86/kernel/signal.c @@ -629,7 +629,8 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) static void handle_signal(struct ksignal *ksig, struct pt_regs *regs) { - bool failed; + bool stepping, failed; + /* Are we from a system call? */ if (syscall_get_nr(current, regs) >= 0) { /* If so, check system call restarting.. */ @@ -653,12 +654,13 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) } /* - * If TF is set due to a debugger (TIF_FORCED_TF), clear the TF - * flag so that register information in the sigcontext is correct. + * If TF is set due to a debugger (TIF_FORCED_TF), clear TF now + * so that register information in the sigcontext is correct and + * then notify the tracer before entering the signal handler. */ - if (unlikely(regs->flags & X86_EFLAGS_TF) && - likely(test_and_clear_thread_flag(TIF_FORCED_TF))) - regs->flags &= ~X86_EFLAGS_TF; + stepping = test_thread_flag(TIF_SINGLESTEP); + if (stepping) + user_disable_single_step(current); failed = (setup_rt_frame(ksig, regs) < 0); if (!failed) { @@ -669,10 +671,8 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) * it might disable possible debug exception from the * signal handler. * - * Clear TF when entering the signal handler, but - * notify any tracer that was single-stepping it. - * The tracer may want to single-step inside the - * handler too. + * Clear TF for the case when it wasn't set by debugger to + * avoid the recursive send_sigtrap() in SIGTRAP handler. */ regs->flags &= ~(X86_EFLAGS_DF|X86_EFLAGS_RF|X86_EFLAGS_TF); /* @@ -681,7 +681,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) if (used_math()) drop_init_fpu(current); } - signal_setup_done(failed, ksig, test_thread_flag(TIF_SINGLESTEP)); + signal_setup_done(failed, ksig, stepping); } #ifdef CONFIG_X86_32 -- 1.5.5.1 ^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: TIF_SINGLESTEP bug? 2015-04-11 13:32 ` Oleg Nesterov @ 2015-04-13 22:59 ` Andy Lutomirski 0 siblings, 0 replies; 3+ messages in thread From: Andy Lutomirski @ 2015-04-13 22:59 UTC (permalink / raw) To: Oleg Nesterov Cc: X86 ML, linux-kernel@vger.kernel.org, Borislav Petkov, Denys Vlasenko On Sat, Apr 11, 2015 at 6:32 AM, Oleg Nesterov <oleg@redhat.com> wrote: > On 04/10, Andy Lutomirski wrote: >> >> Hi all- >> >> AFAICS there are several things wrong with our magical do_debug >> handling of single-stepping through the kernel. They boil down to two >> issues: >> >> 1. do_debug seems to be overly permissive in terms of what faults in >> kernel space it thinks are okay. This isn't obviously a problem >> except that it obfuscates what's going on. AFAICT the *only* >> acceptable case is TF set on sysenter. All the mentions of syscalls >> are garbage -- both int80 and syscall clear TF. >> >> 2. I think this is wrong: >> >> set_tsk_thread_flag(tsk, TIF_SINGLESTEP); >> regs->flags &= ~X86_EFLAGS_TF; >> >> TIF_SINGLESTEP doesn't mean "set TF in this sysenter's saved flags and >> then clear TIF_SINGLESTEP". It means something complicated. > > Yes, plus TIF_FORCED_TF adds more confusion... > > And to me another problem is that these flags are not cleared in/after > otrace_stop(). > >> The upshot AFAICT is that the attached program blows up if you build >> it with -m32 and run it on an Intel machine. > > I thinks the patch below should help, but most probably there are other > isssues. Actually I am sure there are other issues, but I forgot the > problems I found when I tried to understand this logic in details some > time ago. > > The patch is already in -mm. I'll try to check if it actually helps > when I have the access to my testing machine (I need it to compile > with -m32, my user-space environment is broken ;). That fixes my test. Thanks! (It still seems like this is more twisted than it deserves to be. The do_debug special case IMO shouldn't interact at all with any of the weird stuff in step.c if for no other reason than that it's specific to sysenter, and IMO sysenter should work the same as syscall to the extent possible.) --Andy > > Oleg. > > ------------------------------------------------------------------------------- > Subject: [PATCH 2nd RESEND] ptrace/x86: fix the TIF_FORCED_TF logic in handle_signal() > > When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() > clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. > > If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets > X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent > PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the > wrong SIGTRAP. > > Test-case (needs -O2 to avoid prologue insns in signal handler): > > #include <unistd.h> > #include <stdio.h> > #include <sys/ptrace.h> > #include <sys/wait.h> > #include <sys/user.h> > #include <assert.h> > #include <stddef.h> > > void handler(int n) > { > asm("nop"); > } > > int child(void) > { > assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); > signal(SIGALRM, handler); > kill(getpid(), SIGALRM); > return 0x23; > } > > void *getip(int pid) > { > return (void*)ptrace(PTRACE_PEEKUSER, pid, > offsetof(struct user, regs.rip), 0); > } > > int main(void) > { > int pid, status; > > pid = fork(); > if (!pid) > return child(); > > assert(wait(&status) == pid); > assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); > > assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); > assert(wait(&status) == pid); > assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); > assert((getip(pid) - (void*)handler) == 0); > > assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); > assert(wait(&status) == pid); > assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); > assert((getip(pid) - (void*)handler) == 1); > > assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); > assert(wait(&status) == pid); > assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); > > return 0; > } > > The last assert() fails because PTRACE_CONT wrongly triggers another > single-step and X86_EFLAGS_TF can't be cleared by debugger until the > tracee does sys_rt_sigreturn(). > > Change handle_signal() to do user_disable_single_step() if stepping, > we do not need to preserve TIF_SINGLESTEP because we are going to do > ptrace_notify(), and it is simply wrong to leak this bit. > > While at it, change the comment to explain why we also need to clear > TF unconditionally after setup_rt_frame(). > > Note: in the longer term we should probably change setup_sigcontext() > to use get_flags() and then just remove this user_disable_single_step(). > And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() > which can set/clear TF, this needs another fix. > > Reported-by: Evan Teran <eteran@alum.rit.edu> > Reported-by: Pedro Alves <palves@redhat.com> > Tested-By: Andres Freund <andres@anarazel.de> > Signed-off-by: Oleg Nesterov <oleg@redhat.com> > --- > arch/x86/kernel/signal.c | 22 +++++++++++----------- > 1 files changed, 11 insertions(+), 11 deletions(-) > > diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c > index ed37a76..9d3a15b 100644 > --- a/arch/x86/kernel/signal.c > +++ b/arch/x86/kernel/signal.c > @@ -629,7 +629,8 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) > static void > handle_signal(struct ksignal *ksig, struct pt_regs *regs) > { > - bool failed; > + bool stepping, failed; > + > /* Are we from a system call? */ > if (syscall_get_nr(current, regs) >= 0) { > /* If so, check system call restarting.. */ > @@ -653,12 +654,13 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) > } > > /* > - * If TF is set due to a debugger (TIF_FORCED_TF), clear the TF > - * flag so that register information in the sigcontext is correct. > + * If TF is set due to a debugger (TIF_FORCED_TF), clear TF now > + * so that register information in the sigcontext is correct and > + * then notify the tracer before entering the signal handler. > */ > - if (unlikely(regs->flags & X86_EFLAGS_TF) && > - likely(test_and_clear_thread_flag(TIF_FORCED_TF))) > - regs->flags &= ~X86_EFLAGS_TF; > + stepping = test_thread_flag(TIF_SINGLESTEP); > + if (stepping) > + user_disable_single_step(current); > > failed = (setup_rt_frame(ksig, regs) < 0); > if (!failed) { > @@ -669,10 +671,8 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) > * it might disable possible debug exception from the > * signal handler. > * > - * Clear TF when entering the signal handler, but > - * notify any tracer that was single-stepping it. > - * The tracer may want to single-step inside the > - * handler too. > + * Clear TF for the case when it wasn't set by debugger to > + * avoid the recursive send_sigtrap() in SIGTRAP handler. > */ > regs->flags &= ~(X86_EFLAGS_DF|X86_EFLAGS_RF|X86_EFLAGS_TF); > /* > @@ -681,7 +681,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) > if (used_math()) > drop_init_fpu(current); > } > - signal_setup_done(failed, ksig, test_thread_flag(TIF_SINGLESTEP)); > + signal_setup_done(failed, ksig, stepping); > } > > #ifdef CONFIG_X86_32 > -- > 1.5.5.1 > > > -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2015-04-13 22:59 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-04-10 21:30 TIF_SINGLESTEP bug? Andy Lutomirski 2015-04-11 13:32 ` Oleg Nesterov 2015-04-13 22:59 ` Andy Lutomirski
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.