public inbox for linux-kernel@vger.kernel.org
* ptrace bugs and related problems
@ 2006-07-27  6:55 Albert Cahalan
  2006-07-27  7:19 ` David Miller
  2006-07-27 20:31 ` Daniel Jacobowitz
  0 siblings, 2 replies; 15+ messages in thread
From: Albert Cahalan @ 2006-07-27  6:55 UTC
  To: torvalds, alan, ak, mingo, arjan, akpm, linux-kernel, roland

Many of these bugs are generic, some are pure i386, some are for
i386 binaries on the x86-64 kernel, and some apply to a bit more.
Some bugs may involve race conditions: I use a 2-core AMD system.
Kernels vary, but are generally quite recent. (stock 2.6.17.7,
FC5's latest update, etc.)

There is a ptrace option to follow vfork, and an option to get a
message when the parent is released by the child. In kernel/fork.c
there is a bad attempt at optimization which prevents the release
message (PTRACE_EVENT_VFORK_DONE) from being sent unless the ptrace
user also chose the option to follow the vfork child.
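
For reference, a minimal sketch of how a tracer might pull the event
code out of a waitpid() status. The helper name ptrace_event_of and the
EV_* constants are made up for illustration; the numeric values are
the PTRACE_EVENT_* codes from <linux/ptrace.h>. It only decodes the
status word and says nothing about whether the kernel sent the event.

```c
#include <signal.h>
#include <sys/wait.h>

/* Event codes as defined in <linux/ptrace.h>; repeated here so the
 * sketch does not depend on a particular libc header. */
enum {
    EV_NONE       = 0,
    EV_FORK       = 1,
    EV_VFORK      = 2,
    EV_CLONE      = 3,
    EV_EXEC       = 4,
    EV_VFORK_DONE = 5,
    EV_EXIT       = 6
};

/* Hypothetical helper: extract the ptrace event from a waitpid() status.
 * Returns EV_NONE for anything that is not a SIGTRAP stop. */
int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status))
        return EV_NONE;
    if (WSTOPSIG(status) != SIGTRAP)
        return EV_NONE;
    /* The kernel reports (event << 8) | SIGTRAP as the stop code, so the
     * event number ends up in bits 16..23 of the wait status. */
    return (status >> 16) & 0xff;
}
```

A tracer that asked for the vfork-done option would expect this to
return EV_VFORK_DONE once the parent is released.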

System call restart does not appear to nest. It stores its state
in the thread info rather than on the user stack.

Both i386 and x86-64 PTRACE_SINGLESTEP only check for popf, not iret.
Yes, really, iret can be used by normal apps. There is also no check
for failure, as when the popf or iret takes an alignment exception
or hits an unmapped page. The signal handler could fix that up, but
the kernel still thinks that the popf must have set TF in eflags and
thus writes a messed-up eflags into the sigcontext.

There is the pushf problem. Single-stepping this simple code
does not work:   pushf ; popf

A debugger can set or get the siginfo. Great. Signal handlers also
have sigcontext/ucontext data. Besides being generally very useful,
this is the only place where the cr2 register and trap error data
can be found. Looking on the stack only works once the signal is
allowed to be delivered, which may be too late for the debugger.

x86-64 has big problems single-stepping in the vdso's signal
return path. Suppose I breakpoint the pop (this is in the path
that goes pop,mov,syscall or pop,mov,sysenter). If I then try to
single step, the process runs free. The i386 arch works fine.

I can't even set the hardware breakpoints:

(gdb) hbreak __kernel_sigreturn
Hardware assisted breakpoint 1 at 0xffffe500
(gdb) hbreak __kernel_rt_sigreturn
Hardware assisted breakpoint 2 at 0xffffe600
(gdb) continue
Continuing.
Couldn't write debug register: Input/output error.

The debugger has no way to reliably stop a process without causing
confusion. The SIGSTOP signal is not queued. The app under debug might
use SIGSTOP and rely on SIGSTOP to work. The debugger can't steal this.
Any signal that could be queued can also be blocked. The debugger has
no way to get notice when a signal has merely been queued, can not
see into the queue, and can not reasonably adjust the signal mask.

The is_at_popf function on x86-64 fails to account for instruction
set differences. Many prefixes are only valid in 32-bit mode, and
many others are only valid in 64-bit mode. The name is of course
wrong too; see above note about iret and other problems.

The PTRACE_EVENT_EXEC messages are just plain unreliable. They don't
always arrive. Things get especially ugly when a non-leader task
does an execve.

A debugger has little reasonable access to x86 segment info.
Given an arbitrary segment number, I can not generally look it
up in the context of the target process. I can special case
the typical ones, separately for i386 and x86-64. I can "know"
that specific segments are the context switched ones, then ask
the kernel about those.

A debugger needs to read the vdso page. A debugger might want to use
either /proc/*/mem or PTRACE_PEEK. One of the architectures can't do
both. If I remember right, x86-64 can't PTRACE_PEEK.
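
A minimal sketch of the /proc/*/mem route, assuming the caller is
already attached to the target (or is reading its own memory). The
helper name read_proc_mem is hypothetical. It uses plain lseek and
read; a 32-bit tracer would likely need 64-bit file offsets to reach
high addresses, which this sketch does not handle.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy len bytes at addr in process pid into buf.
 * Returns the number of bytes read, or -1 on error. For a process other
 * than the caller, the caller must already be ptrace-attached to pid. */
long read_proc_mem(pid_t pid, unsigned long addr, void *buf, size_t len)
{
    char path[64];
    long got;
    int fd;

    snprintf(path, sizeof(path), "/proc/%ld/mem", (long)pid);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Seek to the target virtual address, then read from there. */
    if (lseek(fd, (off_t)addr, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    got = (long)read(fd, buf, len);
    close(fd);
    return got;
}
```

Pointing addr at the start of the vdso mapping would fetch the page in
one read, where PTRACE_PEEK needs one call per word.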

Suppose my debugger has a few threads. PTRACE_ATTACH will not share.
All ptrace calls fail for all threads other than the one that attached.
It really sucks to have to funnel everything through one thread.

BTW, not bugs exactly, but... Getting ptrace events via waitpid is
horrible. Events arrive in some arbitrary order, with no peeking ahead
either within a single target process or even across multiple target
processes. Messages from successful clone/fork/exec may arrive before
or after the child stops, making for some lovely non-deterministic
behavior. Also, it's no fun to mix waitpid with signals or select.
Writing a reliable debugger with ptrace on Linux is absurdly painful.

* Re: ptrace bugs and related problems
@ 2006-07-28 20:07 Chuck Ebbert
  0 siblings, 0 replies; 15+ messages in thread
From: Chuck Ebbert @ 2006-07-28 20:07 UTC
  To: Daniel Jacobowitz; +Cc: Albert Cahalan, linux-kernel

In-Reply-To: <20060728034741.GA3372@nevyn.them.org>

(cc: trimmed)

On Thu, 27 Jul 2006 23:47:41 -0400, Daniel Jacobowitz wrote:
> 
> > In do_fork, the result of fork_traceflag is assigned
> > to the "trace" variable. Note that PT_TRACE_VFORK_DONE
> > does not cause "trace" to be non-zero.
> > 
> > Then we hit this code:
> > 
> >                if (unlikely (trace)) {
> >                        current->ptrace_message = nr;
> >                        ptrace_notify ((trace << 8) | SIGTRAP);
> >                }
> > 
> > That doesn't run. The ptrace_message is thus not set when
> > ptrace_notify is called to send the PTRACE_EVENT_VFORK_DONE
> > message. You get random stale data from a previous message.
> 
> Why do you want the message data anyway?
> 
> FORK/VFORK/CLONE events have a message: it says what the new process's
> PID is.  VFORK_DONE doesn't have a message, because it only indicates
> that the current process is about to resume; it's an event that only
> has one process associated with it.
> 
> I really don't think this is a bug.

Maybe not a bug, but this would be a nice enhancement.  It would cost
exactly one line of code.  I looked at user code I had written and it
assumed the message was available (it was, because I was also tracing
EVENT_VFORK and it happens to be left over from that.)  If we make this
a part of the API, future kernel changes wouldn't break this (erroneous)
assumption, which otherwise might give someone a nasty surprise in
currently-working code.

Otherwise we should zero it out and see what breaks. :)
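
Until that is settled, user code can avoid the assumption by caching
the PID itself. A minimal sketch, with hypothetical names (struct
tracee, vfork_child_for): the value from PTRACE_GETEVENTMSG is trusted
only at EVENT_VFORK and ignored at EVENT_VFORK_DONE.

```c
/* Event codes from <linux/ptrace.h>. */
enum { EV_VFORK = 2, EV_VFORK_DONE = 5 };

/* Hypothetical per-tracee bookkeeping kept by the debugger. */
struct tracee {
    long vfork_child;   /* PID reported at the last EVENT_VFORK, or 0 */
};

/* Record or look up the vfork child for one event stop.
 * msg is whatever PTRACE_GETEVENTMSG returned at this stop; it is
 * trusted only for EVENT_VFORK. Returns the child PID associated with
 * the event, or 0 if none is known. */
long vfork_child_for(struct tracee *t, int event, unsigned long msg)
{
    long child;

    switch (event) {
    case EV_VFORK:
        t->vfork_child = (long)msg;
        return t->vfork_child;
    case EV_VFORK_DONE:
        child = t->vfork_child;   /* ignore msg: it may be stale */
        t->vfork_child = 0;
        return child;
    default:
        return 0;
    }
}
```

This only helps a tracer that also traces EVENT_VFORK; one that asks
for VFORK_DONE alone never sees the child PID and gets 0 back.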

-- 
Chuck


* Re: ptrace bugs and related problems
@ 2006-07-31  6:21 Chuck Ebbert
  2006-08-01  0:30 ` Albert Cahalan
  0 siblings, 1 reply; 15+ messages in thread
From: Chuck Ebbert @ 2006-07-31  6:21 UTC
  To: Albert Cahalan
  Cc: Linus Torvalds, Alan Cox, Andi Kleen, Arjan van de Ven,
	Andrew Morton, Ingo Molnar, linux-kernel, Roland Dreier

In-Reply-To: <787b0d920607262355x3f669f0ap544e3166be2dca21@mail.gmail.com>

On Thu, 27 Jul 2006 02:55:17 -0400, Albert Cahalan wrote:

> Both i386 and x86-64 PTRACE_SINGLESTEP only check for popf, not iret.
> Yes, really, iret can be used by normal apps.

Well there's a FIXME in the x86_64 code for that, anyway. (lahf/sahf
can't cause problems, so iret is the only remaining problem.)

> There is also no check
> for failure, as when the popf or iret takes an alignment exception
> or hits an unmapped page.

Can that happen?  Singlestep traps happen after the instruction has
already executed.  Or are you talking about starting to singlestep
after hitting a code breakpoint fault?

> There is the pushf problem. Single-stepping this simple code
> does not work:   pushf ; popf

The debugger needs to mask TF in the pushed flags.  Read the comment
in is_at_popf().
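
A rough sketch of what that masking could look like on the debugger
side. The helper names are hypothetical. The debugger would fetch the
word at the tracee's stack pointer with PTRACE_PEEKDATA after stepping
over a pushf, pass it through fix_pushed_flags, and store it back with
PTRACE_POKEDATA; only the bit manipulation is shown here.

```c
#define X86_EFLAGS_TF 0x100UL   /* trap flag, bit 8 of EFLAGS */
#define OPCODE_PUSHF  0x9c      /* pushf/pushfq, prefixes not considered */

/* Did the instruction that was just stepped start with the pushf opcode? */
int stepped_over_pushf(unsigned char opcode)
{
    return opcode == OPCODE_PUSHF;
}

/* Return the flags image the application should see on its stack.
 * app_wanted_tf is nonzero only if the application itself had set TF
 * before the debugger started stepping. */
unsigned long fix_pushed_flags(unsigned long pushed, int app_wanted_tf)
{
    if (app_wanted_tf)
        return pushed | X86_EFLAGS_TF;
    return pushed & ~X86_EFLAGS_TF;
}
```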

> The is_at_popf function on x86-64 fails to account for instruction
> set differences. Many prefixes are only valid in 32-bit mode, and
> many others are only valid in 64-bit mode.

I only see one bug here: the REX prefixes are 'inc' instructions
in compatibility mode.  Otherwise, prefixes that are only valid in
32-bit mode are ignored in 64-bit mode.
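
To make the mode dependence concrete, here is a user-space sketch of a
mode-aware prefix scan. It is not the kernel's is_at_popf(); the name
insn_loads_eflags and the long_mode parameter are invented for the
example. It assumes lock (0xf0) is never a valid prefix here, and it
simplifies by accepting a REX byte anywhere in the prefix run.

```c
#include <stddef.h>

#define MAX_INSN_LEN 15   /* architectural limit on x86 instruction length */

/* Hypothetical user-space check: does the instruction at code[] load
 * EFLAGS from the stack (popf or iret)? long_mode is nonzero when the
 * tracee runs 64-bit code, zero for 32-bit (compatibility mode) code. */
int insn_loads_eflags(const unsigned char *code, size_t len, int long_mode)
{
    size_t i;

    if (len > MAX_INSN_LEN)
        len = MAX_INSN_LEN;

    for (i = 0; i < len; i++) {
        unsigned char b = code[i];

        switch (b) {
        case 0x9d:              /* popf / popfq */
        case 0xcf:              /* iret / iretq */
            return 1;
        case 0x66: case 0x67:   /* operand-size, address-size */
        case 0x26: case 0x2e:   /* segment overrides */
        case 0x36: case 0x3e:
        case 0x64: case 0x65:
        case 0xf2: case 0xf3:   /* repeat prefixes */
            continue;           /* skip the prefix, look at the next byte */
        default:
            /* 0x40..0x4f: REX prefix in 64-bit mode, inc/dec otherwise */
            if (long_mode && b >= 0x40 && b <= 0x4f)
                continue;
            return 0;
        }
    }
    return 0;
}
```

With this, the bytes 48 cf count as iretq for a 64-bit tracee but as
"dec %eax" followed by something else for a 32-bit one.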

> The debugger has no way to reliably stop a process without causing
> confusion. The SIGSTOP signal is not queued. The app under debug might
> use SIGSTOP and rely on SIGSTOP to work. The debugger can't steal this.

I sort of got this working some time ago but I forget what the
problems were.  The idea was to decide whether a SIGSTOP was meant for
the debugger, and to forward the unwanted ones to the
app.  But yeah, the interface really sucks and that probably can't be
made to work.
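
A sketch of that bookkeeping, with hypothetical names throughout. It
counts the SIGSTOPs the debugger sent and swallows the next observed
SIGSTOP if any are outstanding. Because SIGSTOP does not queue, a stop
sent by the application at the same moment merges with the debugger's
and is lost, which is the flaw in the approach.

```c
#include <signal.h>

/* Hypothetical per-tracee state kept by the debugger. */
struct stop_tracker {
    int debugger_stops_pending;   /* SIGSTOPs the debugger itself sent */
};

/* Call right after the debugger sends SIGSTOP to the tracee. */
void note_debugger_stop(struct stop_tracker *t)
{
    t->debugger_stops_pending++;
}

/* Decide what to pass as the signal argument of PTRACE_CONT.
 * Returns 0 to swallow the signal, or the signal number to forward. */
int signal_to_deliver(struct stop_tracker *t, int stopsig)
{
    if (stopsig == SIGSTOP && t->debugger_stops_pending > 0) {
        /* SIGSTOP is not queued: several sends can yield one stop,
         * so one observed stop clears all outstanding requests. */
        t->debugger_stops_pending = 0;
        return 0;
    }
    return stopsig;
}
```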

What Linux needs is a fresh new design for a debugging interface to
sit on top of utrace, one that solves the current inherent problems.
Just making a list of these problems is probably the place to start.

-- 
Chuck


* Re: ptrace bugs and related problems
@ 2006-08-01  5:52 Chuck Ebbert
  0 siblings, 0 replies; 15+ messages in thread
From: Chuck Ebbert @ 2006-08-01  5:52 UTC
  To: Albert Cahalan
  Cc: Roland Dreier, linux-kernel, Ingo Molnar, Andrew Morton,
	Arjan van de Ven, Andi Kleen, Alan Cox, Linus Torvalds

In-Reply-To: <787b0d920607311730s5a951a5cv38eea7db03c759c8@mail.gmail.com>

On Mon, 31 Jul 2006 20:30:07 -0400, Albert Cahalan wrote:
> 
> > > There is also no check
> > > for failure, as when the popf or iret takes an alignment exception
> > > or hits an unmapped page.
> >
> > Can that happen?
> 
> You're at a popf that can not complete.
> You single-step.
> The kernel sets TF.
> The kernel notes the popf.
> The kernel assumes that TF will be determined by the popf.
> The kernel tries to run the popf.
> The popf faults, leaving TF unmodified.
> The kernel fails to clear TF.

That can be fixed, but it won't be easy.
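
The quoted sequence can be written down as a toy model. This is not
kernel code: struct step_model and the three functions are invented
for illustration, and the case where the application already had TF
set is ignored. The model assumes the kernel clears TF after the step
only when it considers itself the owner of the bit.

```c
#define X86_EFLAGS_TF 0x100UL   /* trap flag, bit 8 of EFLAGS */

/* Toy model of the single-step bookkeeping described above. */
struct step_model {
    unsigned long eflags;
    int kernel_owns_tf;   /* kernel set TF and must clear it afterwards */
};

/* Start a single step: set TF, but disown it if the next instruction
 * is a popf, on the assumption that popf will decide the final TF. */
void begin_step(struct step_model *m, int at_popf)
{
    m->eflags |= X86_EFLAGS_TF;
    m->kernel_owns_tf = !at_popf;
}

/* Execute the popf. If it faults (popf_completed == 0), EFLAGS keeps
 * the TF bit that begin_step() set. */
void run_popf(struct step_model *m, int popf_completed, unsigned long popped)
{
    if (popf_completed)
        m->eflags = popped;
}

/* Finish the step: clear TF only if the kernel believes it owns it. */
void end_step(struct step_model *m)
{
    if (m->kernel_owns_tf)
        m->eflags &= ~X86_EFLAGS_TF;
}
```

Running begin_step with at_popf set, then a faulting run_popf, then
end_step leaves TF set in eflags, which is the value that would end up
in the sigcontext.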

> > > There is the pushf problem. Single-stepping this simple code
> > > does not work:   pushf ; popf
> >
> > The debugger needs to mask TF in the pushed flags.  Read the comment
> > in is_at_popf().
> 
> I think the term is "known bug".

Well at least it's known. :)

> > > The is_at_popf function on x86-64 fails to account for instruction
> > > set differences. Many prefixes are only valid in 32-bit mode, and
> > > many others are only valid in 64-bit mode.
> 
> There is a problem with instruction length though.
> The buffer is 16 bytes long, but should be only 15.

OK.

> The 0xf0 (lock) prefix is not valid for popf or iret.

I think it is OK on really old processors (maybe only 386?).  If we fix
the above problem with faulting instructions then the fault this would
cause on newer CPUs should not be a problem.
-- 
Chuck



Thread overview: 15+ messages
2006-07-27  6:55 ptrace bugs and related problems Albert Cahalan
2006-07-27  7:19 ` David Miller
2006-07-27 20:31 ` Daniel Jacobowitz
2006-07-28  1:17   ` Albert Cahalan
2006-07-28  3:47     ` Daniel Jacobowitz
2006-07-28 22:28       ` Albert Cahalan
2006-07-28 22:36         ` David Miller
2006-07-31 19:00         ` Daniel Jacobowitz
2006-08-01  0:08           ` Albert Cahalan
2006-08-01  1:37             ` Daniel Jacobowitz
2006-08-01  5:22               ` Albert Cahalan
  -- strict thread matches above, loose matches on Subject: below --
2006-07-28 20:07 Chuck Ebbert
2006-07-31  6:21 Chuck Ebbert
2006-08-01  0:30 ` Albert Cahalan
2006-08-01  5:52 Chuck Ebbert