* Re: More trouble with i386 EFLAGS and ptrace
@ 2005-03-14 4:06 Jesse Allen
0 siblings, 0 replies; 15+ messages in thread
From: Jesse Allen @ 2005-03-14 4:06 UTC (permalink / raw)
To: linux-kernel
Roland, Daniel,
I was the one that reported the ptrace problems which caused all that
hoopla in Nov and Dec. I have tested your new patches and find that
there are no new regressions.
Jesse
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: More trouble with i386 EFLAGS and ptrace
@ 2005-03-14 4:12 Jesse Allen
0 siblings, 0 replies; 15+ messages in thread
From: Jesse Allen @ 2005-03-14 4:12 UTC (permalink / raw)
To: linux-kernel
Andi Kleen wrote:
> > Once we're sure about the i386 state, we should update the x86_64 code to match.
>
> I have it fixed in my tree, but not pushed out yet because of lack of testing.
Andi,
I have someone who thinks they might be experiencing the same problem
that I reported but now on x86-64. Could I get the patches from your
tree so they can be tested?
Jesse
^ permalink raw reply [flat|nested] 15+ messages in thread
* More trouble with i386 EFLAGS and ptrace
@ 2005-03-06 19:38 Daniel Jacobowitz
2005-03-06 20:03 ` Linus Torvalds
2005-03-06 20:26 ` Daniel Jacobowitz
0 siblings, 2 replies; 15+ messages in thread
From: Daniel Jacobowitz @ 2005-03-06 19:38 UTC (permalink / raw)
To: Linus Torvalds, linux-kernel; +Cc: Andrew Cagney
It looks like the changes to preserve eflags when single-stepping don't work
right with signals. Take this test case:
<snip>
#include <signal.h>
#include <unistd.h>
volatile int done;
void handler (int sig)
{
done = 1;
}
int main()
{
while (1)
{
done = 0;
signal (SIGALRM, handler);
alarm (1);
while (!done);
}
}
<snip>
And this GDB session:
(gdb) b 18
Breakpoint 1 at 0x804840d: file test.c, line 18.
(gdb) r
Starting program: /home/drow/eflags/test
Breakpoint 1, main () at test.c:18
18 while (!done);
(gdb) p/x $eflags
$1 = 0x200217
(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048414 in main () at test.c:18
18 while (!done);
(gdb) p/x $eflags
$2 = 0x200302
There's an implied delay before the "c" which is long enough for the signal
handler to become pending.
The reason this happens is that when the inferior hits a breakpoint, the
first thing GDB will do is remove the breakpoint, single-step past it, and
reinsert it. So GDB does a PTRACE_SINGLESTEP, and the kernel invokes the
signal handler (without single-step - good so far). When the signal handler
returns, we've lost track of the fact that ptrace set the single-step flag,
however. So the single-step completes and returns SIGTRAP to GDB. GDB is
expecting a SIGTRAP and reinserts the breakpoint. Then it resumes the
inferior, but now the trap flag is set in $eflags. So, oops, the continue
acts like a step instead.
What to do? We need to know when we restore the trap bit in sigreturn
whether it was set by ptrace or by the application (possibly including by
the signal handler).
Andrew, serious kudos for GDB's sigstep.exp, which uncovered this problem
(through a much more complicated test - I may add the smaller one).
--
Daniel Jacobowitz
CodeSourcery, LLC
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: More trouble with i386 EFLAGS and ptrace
2005-03-06 19:38 Daniel Jacobowitz
@ 2005-03-06 20:03 ` Linus Torvalds
2005-03-06 21:14 ` Daniel Jacobowitz
2005-03-06 21:22 ` Roland McGrath
2005-03-06 20:26 ` Daniel Jacobowitz
1 sibling, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2005-03-06 20:03 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Kernel Mailing List, Andrew Cagney, Roland McGrath
On Sun, 6 Mar 2005, Daniel Jacobowitz wrote:
>
> The reason this happens is that when the inferior hits a breakpoint, the
> first thing GDB will do is remove the breakpoint, single-step past it, and
> reinsert it. So GDB does a PTRACE_SINGLESTEP, and the kernel invokes the
> signal handler (without single-step - good so far).
No, not good so far.
Yes, it cleared TF, but it saved eflags with TF set on-stack, even though
the TF was due to a TIF_SINGLESTEP (and was thus "temporary", not a real
flag). That's why you see the bogus SIGTRAP after returning from the
signal handler.
Now, we actually get this _right_ if the signal is due to a single-step,
because do_debug() will do:
if (likely(tsk->ptrace & PT_DTRACE)) {
tsk->ptrace &= ~PT_DTRACE;
regs->eflags &= ~TF_MASK;
}
but that's the only place we do that. So any _other_ signal won't do this,
and we'll enter the signal handler without checking the PT_DTRACE flag to
see if TF was a temporary thing from debugger rather than something the
program actually did on its own.
I _think_ your test-case would work right if you just moved that code from
the special-case in do_debug(), and moved it to the top of
setup_sigcontext() instead. I've not tested it, though, and haven't really
given it any "deep thought". Maybe somebody smarter can say "yeah, that's
obviously the right thing to do" or "no, that won't work because.."
Linus
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: More trouble with i386 EFLAGS and ptrace
2005-03-06 20:03 ` Linus Torvalds
@ 2005-03-06 21:14 ` Daniel Jacobowitz
2005-03-07 0:46 ` Linus Torvalds
2005-03-06 21:22 ` Roland McGrath
1 sibling, 1 reply; 15+ messages in thread
From: Daniel Jacobowitz @ 2005-03-06 21:14 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List, Andrew Cagney, Roland McGrath
On Sun, Mar 06, 2005 at 12:03:22PM -0800, Linus Torvalds wrote:
> I _think_ your test-case would work right if you just moved that code from
> the special-case in do_debug(), and moved it to the top of
> setup_sigcontext() instead. I've not tested it, though, and haven't really
> given it any "deep thought". Maybe somebody smarter can say "yeah, that's
> obviously the right thing to do" or "no, that won't work because.."
I bought it, but the GDB testsuite didn't. Both copies seem to be
necessary; there's generally no signal handler for SIGTRAP, so moving
it disables the test in the most common case. I didn't poke at it long
enough to figure out what the failing case was, but it introduced a
different situation which could leave TF enabled. This, however,
worked:
If a debugger set the TF bit, make sure to clear it when creating a
signal context. Otherwise, TF will be incorrectly restored by
sigreturn.
Signed-off-by: Daniel Jacobowitz <dan@debian.org>
===== arch/i386/kernel/signal.c 1.53 vs edited =====
--- 1.53/arch/i386/kernel/signal.c 2005-01-31 01:20:14 -05:00
+++ edited/arch/i386/kernel/signal.c 2005-03-06 15:36:41 -05:00
@@ -277,6 +277,18 @@
{
int tmp, err = 0;
+ /*
+ * If TF is set due to a debugger (PT_DTRACE), clear the TF
+ * flag so that register information in the sigcontext is
+ * correct.
+ */
+ if (unlikely(regs->eflags & TF_MASK)) {
+ if (likely(current->ptrace & PT_DTRACE)) {
+ current->ptrace &= ~PT_DTRACE;
+ regs->eflags &= ~TF_MASK;
+ }
+ }
+
tmp = 0;
__asm__("movl %%gs,%0" : "=r"(tmp): "0"(tmp));
err |= __put_user(tmp, (unsigned int __user *)&sc->gs);
--
Daniel Jacobowitz
CodeSourcery, LLC
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: More trouble with i386 EFLAGS and ptrace
2005-03-06 21:14 ` Daniel Jacobowitz
@ 2005-03-07 0:46 ` Linus Torvalds
0 siblings, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2005-03-07 0:46 UTC (permalink / raw)
To: Daniel Jacobowitz; +Cc: Kernel Mailing List, Andrew Cagney, Roland McGrath
On Sun, 6 Mar 2005, Daniel Jacobowitz wrote:
>
> I bought it, but the GDB testsuite didn't. Both copies seem to be
> necessary; there's generally no signal handler for SIGTRAP
Ahh, duh, yes. I was looking at it and saying "debug fault always
generates a sigtrap", but you're right, it obviously doesn't - usually
it's caught by the tracer.
Your patch looks fine.
Linus
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: More trouble with i386 EFLAGS and ptrace
2005-03-06 20:03 ` Linus Torvalds
2005-03-06 21:14 ` Daniel Jacobowitz
@ 2005-03-06 21:22 ` Roland McGrath
2005-03-06 22:13 ` Daniel Jacobowitz
2005-03-07 19:13 ` Andi Kleen
1 sibling, 2 replies; 15+ messages in thread
From: Roland McGrath @ 2005-03-06 21:22 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Daniel Jacobowitz, Kernel Mailing List, Andrew Cagney
> I _think_ your test-case would work right if you just moved that code from
> the special-case in do_debug(), and moved it to the top of
> setup_sigcontext() instead. I've not tested it, though, and haven't really
> given it any "deep thought". Maybe somebody smarter can say "yeah, that's
> obviously the right thing to do" or "no, that won't work because.."
Indeed, this is what my original changes for this did, before you started
cleaning things up to be nice to TF users other than PTRACE_SINGLESTEP.
I note, btw, that the x86_64 code is still at that prior stage. So I think
it doesn't have this new wrinkle, but it also doesn't have the advantages
of the more recent i386 changes. Once we're sure about the i386 state, we
should update the x86_64 code to match.
I'm not sure what kind of smart this makes me, but I'll say that your plan
would work and no, it's obviously not the right thing to do. ;-) I haven't
tested the following, not having tracked down the specific problem case you
folks are talking about. But I think this is the right solution. The
difference is that when we stop for some signal and report to the debugger,
the debugger looking at our registers will see TF clear instead of set,
before it decides whether to continue us with the signal or what to do.
With the change yo suggested, (I think) if the debugger decides to eat the
signal and resume, we would get a spurious single-step trap after executing
the next instruction, instead of resuming normally as requested.
Thanks,
Roland
Signed-off-by: Roland McGrath <roland@redhat.com>
--- linux-2.6/include/asm-i386/signal.h
+++ linux-2.6/include/asm-i386/signal.h
@@ -223,7 +223,14 @@ static __inline__ int sigfindinword(unsi
struct pt_regs;
extern int FASTCALL(do_signal(struct pt_regs *regs, sigset_t *oldset));
-#define ptrace_signal_deliver(regs, cookie) do { } while (0)
+
+#define ptrace_signal_deliver(regs, cookie) \
+ do { \
+ if (current->ptrace & PT_DTRACE) { \
+ current->ptrace &= ~PT_DTRACE; \
+ (regs)->eflags &= ~TF_MASK; \
+ } \
+ } while (0)
#endif /* __KERNEL__ */
--- linux-2.6/arch/i386/kernel/traps.c
+++ linux-2.6/arch/i386/kernel/traps.c
@@ -707,8 +707,6 @@ fastcall void do_debug(struct pt_regs *
/*
* Single-stepping through TF: make sure we ignore any events in
* kernel space (but re-enable TF when returning to user mode).
- * And if the event was due to a debugger (PT_DTRACE), clear the
- * TF flag so that register information is correct.
*/
if (condition & DR_STEP) {
/*
@@ -718,11 +716,6 @@ fastcall void do_debug(struct pt_regs *
*/
if ((regs->xcs & 3) == 0)
goto clear_TF_reenable;
-
- if (likely(tsk->ptrace & PT_DTRACE)) {
- tsk->ptrace &= ~PT_DTRACE;
- regs->eflags &= ~TF_MASK;
- }
}
/* Ok, finally something we can handle */
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: More trouble with i386 EFLAGS and ptrace
2005-03-06 21:22 ` Roland McGrath
@ 2005-03-06 22:13 ` Daniel Jacobowitz
[not found] ` <200503070316.j273Gb4G027048@magilla.sf.frob.com>
2005-03-07 19:13 ` Andi Kleen
1 sibling, 1 reply; 15+ messages in thread
From: Daniel Jacobowitz @ 2005-03-06 22:13 UTC (permalink / raw)
To: Roland McGrath; +Cc: Linus Torvalds, Kernel Mailing List, Andrew Cagney
On Sun, Mar 06, 2005 at 01:22:25PM -0800, Roland McGrath wrote:
> > I _think_ your test-case would work right if you just moved that code from
> > the special-case in do_debug(), and moved it to the top of
> > setup_sigcontext() instead. I've not tested it, though, and haven't really
> > given it any "deep thought". Maybe somebody smarter can say "yeah, that's
> > obviously the right thing to do" or "no, that won't work because.."
>
> Indeed, this is what my original changes for this did, before you started
> cleaning things up to be nice to TF users other than PTRACE_SINGLESTEP.
>
> I note, btw, that the x86_64 code is still at that prior stage. So I think
> it doesn't have this new wrinkle, but it also doesn't have the advantages
> of the more recent i386 changes. Once we're sure about the i386 state, we
> should update the x86_64 code to match.
>
> I'm not sure what kind of smart this makes me, but I'll say that your plan
> would work and no, it's obviously not the right thing to do. ;-) I haven't
> tested the following, not having tracked down the specific problem case you
> folks are talking about. But I think this is the right solution. The
> difference is that when we stop for some signal and report to the debugger,
> the debugger looking at our registers will see TF clear instead of set,
> before it decides whether to continue us with the signal or what to do.
> With the change yo suggested, (I think) if the debugger decides to eat the
> signal and resume, we would get a spurious single-step trap after executing
> the next instruction, instead of resuming normally as requested.
Roland, the sigstep.exp test in the GDB testsuite will show this
problem; if your patch monotonically improves GDB HEAD testsuite
results and removes all the FAILs for sigstep.exp, then it's probably
equivalent to the one I just posted for this testcase.
I think mine is more correct; the problem doesn't occur because the
debugger cancelled a signal, it occurs because a bogus TF bit was saved
to the signal context. I like keeping solutions close to their
problems. But that's just aesthetic.
--
Daniel Jacobowitz
CodeSourcery, LLC
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: More trouble with i386 EFLAGS and ptrace
2005-03-06 21:22 ` Roland McGrath
2005-03-06 22:13 ` Daniel Jacobowitz
@ 2005-03-07 19:13 ` Andi Kleen
1 sibling, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2005-03-07 19:13 UTC (permalink / raw)
To: Roland McGrath; +Cc: Daniel Jacobowitz, Kernel Mailing List, Andrew Cagney
Roland McGrath <roland@redhat.com> writes:
>
> I note, btw, that the x86_64 code is still at that prior stage. So I think
> it doesn't have this new wrinkle, but it also doesn't have the advantages
> of the more recent i386 changes. Once we're sure about the i386 state, we
> should update the x86_64 code to match.
I have it fixed in my tree, but not pushed out yet because of lack of
testing.
In general I track all i386 changes that make sense for modern systems
and 64bit, but sometimes merging takes some time.
However given the flood of bugs in the i386 code I'm starting to have
doubts again if all these changes were really a good idea.
However x86-64 has to be bug-to-bug compatible to i386, so there
is not much choice.
-Andi
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: More trouble with i386 EFLAGS and ptrace
2005-03-06 19:38 Daniel Jacobowitz
2005-03-06 20:03 ` Linus Torvalds
@ 2005-03-06 20:26 ` Daniel Jacobowitz
1 sibling, 0 replies; 15+ messages in thread
From: Daniel Jacobowitz @ 2005-03-06 20:26 UTC (permalink / raw)
To: Linus Torvalds, linux-kernel, Andrew Cagney
On Sun, Mar 06, 2005 at 02:38:41PM -0500, Daniel Jacobowitz wrote:
> The reason this happens is that when the inferior hits a breakpoint, the
> first thing GDB will do is remove the breakpoint, single-step past it, and
> reinsert it. So GDB does a PTRACE_SINGLESTEP, and the kernel invokes the
> signal handler (without single-step - good so far). When the signal handler
> returns, we've lost track of the fact that ptrace set the single-step flag,
> however. So the single-step completes and returns SIGTRAP to GDB. GDB is
> expecting a SIGTRAP and reinserts the breakpoint. Then it resumes the
> inferior, but now the trap flag is set in $eflags. So, oops, the continue
> acts like a step instead.
Eh, I got the event sequence wrong as usual, but the basic description
is right.
- Original SIGTRAP at breakpoint
- user says "cont"
- GDB tries to singlestep past the breakpoint - PTRACE_SINGLESTEP, no
signal
- GDB receives SIGALRM at the same PC
- GDB tries to singlestep past the breakpoint - PTRACE_SINGLESTEP,
SIGALRM
- GDB receives SIGTRAP at the first instruction of the handler
- GDB reinserts the breakpoint at line 18. This is a "step-resume"
breakpoint - we were stepping, we were interrupted by a signal.
- GDB issues PTRACE_CONT, no signal
- GDB receives SIGTRAP at the sigreturn location - this is the
step-resume breakpoint.
- GDB remove that and issues PTRACE_SINGLESTEP, no signal - It
is trying again to get past the breakpoint location so that it
can honor the user's "cont" request.
- GDB receives SIGTRAP at the instruction after the breakpoint.
- GDB reinserts the original breakpoint and issues PTRACE_CONTINUE.
All of this is what's supposed to happen. The executable be running
free now until it hits the breakpoint again.
- GDB receives an unexpected SIGTRAP at the next instruction (the
second instruction after the original breakpoint).
If your compiler uses only two instructions for the loop, you might not
see this. gcc -O0 will use three by default. Just stick something
else in the loop.
> What to do? We need to know when we restore the trap bit in sigreturn
> whether it was set by ptrace or by the application (possibly including by
> the signal handler).
If I'm following this right, then the saved value of eflags in the
signal handler should not contain the trap bit at this point. It does,
though. It's hard to see this in GDB, because the CFI does not express
%eflags, so "print $eflags" won't track up the stack. I don't think
there's a handy dwarf register number for it at the moment. But you
can print out the struct sigcontext by hand once you locate it on the
stack.
--
Daniel Jacobowitz
CodeSourcery, LLC
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2005-03-14 5:08 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-03-14 4:06 More trouble with i386 EFLAGS and ptrace Jesse Allen
-- strict thread matches above, loose matches on Subject: below --
2005-03-14 4:12 Jesse Allen
2005-03-06 19:38 Daniel Jacobowitz
2005-03-06 20:03 ` Linus Torvalds
2005-03-06 21:14 ` Daniel Jacobowitz
2005-03-07 0:46 ` Linus Torvalds
2005-03-06 21:22 ` Roland McGrath
2005-03-06 22:13 ` Daniel Jacobowitz
[not found] ` <200503070316.j273Gb4G027048@magilla.sf.frob.com>
2005-03-07 4:49 ` Daniel Jacobowitz
2005-03-07 21:29 ` Roland McGrath
2005-03-09 0:12 ` Daniel Jacobowitz
2005-03-13 8:27 ` Roland McGrath
2005-03-13 20:06 ` Daniel Jacobowitz
2005-03-07 19:13 ` Andi Kleen
2005-03-06 20:26 ` Daniel Jacobowitz
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox