All of lore.kernel.org
 help / color / mirror / Atom feed
From: "K.Prasad" <prasad@linux.vnet.ibm.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: Ingo Molnar <mingo@elte.hu>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Benjamin Herrenschmidt <benh@au1.ibm.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	maneesh@linux.vnet.ibm.com, Roland McGrath <roland@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [Patch 00/11] Hardware Breakpoint interfaces
Date: Tue, 7 Apr 2009 13:52:24 +0530	[thread overview]
Message-ID: <20090407082224.GA22500@in.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.44L0.0904011057100.3736-100000@iolanthe.rowland.org>

On Wed, Apr 01, 2009 at 12:16:26PM -0400, Alan Stern wrote:
> Sorry for not replying sooner; I was away on a short vacation.
> 
> On Sat, 28 Mar 2009, K.Prasad wrote:
> 
> > > There are some serious issues involving userspace breakpoints and the
> > > legacy ptrace interface.  It all comes down to this: At what point
> > > is a breakpoint registered for a ptrace caller?
> > > 
> > > Remember, to set up a breakpoint a debugger needs to call ptrace
> > > twice: once to put the address in one of the DR0-DR3 registers and
> > > once to set up DR7.  So when does the task own the breakpoint?
> > > 
> > > Logically, we should wait until DR7 gets set, because until then the
> > > breakpoint is not active.  But then how do we let the caller know that
> > > one of his breakpoints conflicts with a kernel breakpoint?
> > > 
> > > If we report an error during an attempt to set DR0-DR3 then at least
> > > it's unambiguous.  But then how do we know when the task is _finished_
> > > using the breakpoint?  It's under no obligation to set the register
> > > back to 0.
> > > 
> > > Related to this is the question of how to store the task's versions of
> > > DR0-DR3 when there is no associated active breakpoint.  Maybe it would
> > > be best to keep the existing registers in the thread structure.
> > > 
> > 
> > These are profound questions and I believe that it is upto the process in
> > user-space to answer them.
> > 
> > What we could ensure from the kernel-space is to retain the
> > existing behaviour of ptrace i.e. return success when a write is done on
> > DR0-DR3 (almost unconditionally) and reserve error returns only when DR7
> > is written into.
> > 
> > The patch in question could possibly return an -ENOMEM (even when write
> > is done on DR0-DR3) but I will change the behaviour as stated above.
> > 
> > 
> > A DR0 - DR3 return will do a:
> > 	thread->debugreg[n] = val;
> > 	return 0;
> > 
> > while all error returns are reserved for:
> > 	rc = ptrace_write_dr7(tsk, val);
> 
> That does seem to be the most logical approach.  The problem with it is
> that it doesn't give the caller much information about the cause of the
> problem or how to fix it.  (Not that existing programs would know how
> to interpret this information anyway...)
> 

A slight change though...writes to DR0-DR3 may fail if the address is
invalid. This behaviour is true even in existing implementation of
ptrace_set_debugreg().

I am removing some of the comments below, which I have addressed in the
patch. Like using switch-case and consolidating updation of kernel
breakpoints into one function: arch_update_kernel_hw_breakpoints(), etc.

> > > > +int arch_modify_user_hw_breakpoint(int pos, struct hw_breakpoint *bp,
> > > > +		struct task_struct *tsk)
> > > > +{
> > > > +	struct thread_struct *thread = &(tsk->thread);
> > > > +
> > > > +	/* Check if the register to be modified was enabled by the thread */
> > > > +	if (!(thread->dr7 & (1 << (pos * DR_ENABLE_SIZE))))
> > > > +		return -EINVAL;
> > > > +
> > > > +	thread->dr7 &= ~dr7_masks[pos];
> > > > +	thread->dr7 |= encode_dr7(pos, bp->info.len, bp->info.type);
> > > > +
> > > > +	return 0;
> > > > +}
> > > 
> > > It might be possible to eliminate this rather awkward code, once the
> > > DR0-DR3 values are added back into the thread structure.
> > > 
> > 
> > I'm sorry I don't see how. Can you explain?
> 
> This gets back to those tricky questions about integrating ptrace with
> hw_breakpoint.  In theory we could avoid allocating hw_breakpoint
> structures for ptrace breakpoints and treat them completely
> independently, but overall it's probably better to do things uniformly.
> 
> Regardless, we are still left with the problem that it's not easy to
> capture the ptrace interface using an hw_breakpoint structure, because
> ptrace breakpoints are set up in two stages: one to save the address in
> DRn (0 <= n <= 3) and one to save the type and length in DR7.  What's
> the best way to handle it when task being debugged isn't running and
> the debugger changes the breakpoint address?  Or changes the
> length/type fields in DR7?  I wrote modify_user_hw_breakpoint() to
> handle this, but it was just a kludge.
> 
> If we store debugreg[0..3] in the thread structure, and if 
> __register_user_hw_breakpoint() is written properly, then maybe ptrace 
> can install modifications to existing breakpoints simply by calling 
> __register_user_hw_breakpoint() and re-using the old "pos" value.
> 

I've re-written __modify_user_hw_breakpoint() to invoke the common
(new) worker routine arch_update_user_hw_breakpoint(). Let me know if
you think the new code still needs re-work.

> > 
> > This code is in hw_breakpoint_handler() which we don't intend to enter 
> > if single-stepping bit is set (through kprobes) and hence the
> > NOTIFY_DONE.
> 
> I don't see why we shouldn't enter even in this case.  Suppose somebody
> single-steps over an instruction that accesses a variable with an
> associated data breakpoint?
>

I think this is an important case that I missed, which made me add the
early return in hw_breakpoint_handler() for (DR6 & DR_STEP). I have
changed hw_breakpoint_handler() code to address all of your comments
except making dr6 a pointer. I will talk about my concerns about it 
in the subsequent mail.
 
> > 
> > 	/* Lazy debug register switching */
> > 	if (per_cpu(last_debugged_task, get_cpu()) != current) {
> > 		switch_to_none_hw_breakpoint();
> > 		put_cpu_no_resched();
> > 	}
> 
> I just noticed that the lines saving DR7 and setting it to 0 need to
> come here.  Otherwise switch_to_none_hw_breakpoint() might set DR7 back
> to a nonzero value, and it might not match the value stored in dr7.
> 

arch_uninstall_thread_hw_breakpoint()<--switch_to_none_hw_breakpoint()
will store 'kdr7' (which contains all kernel-space breakpoints in
encoded format) to DR7 physical register. Given that the current()
process should not have TIF_DEBUG() set (if it were set,
switch_to_thread_hw_breakpoint() would have been invoked to set
last_debugged_task), we will wipe out all user-space breakpoints and
store only kdr7.

> > > > @@ -530,13 +530,14 @@ asmlinkage __kprobes struct pt_regs *syn
> > > >  dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
> > > >  {
> > > >  	struct task_struct *tsk = current;
> > > > -	unsigned long condition;
> > > > +	unsigned long dr6;
> > > >  	int si_code;
> > > >  
> > > > -	get_debugreg(condition, 6);
> > > > +	get_debugreg(dr6, 6);
> > > > +	set_debugreg(0, 6);	/* DR6 may or may not be cleared by the CPU */
> > > >  
> > > >  	/* Catch kmemcheck conditions first of all! */
> > > > -	if (condition & DR_STEP && kmemcheck_trap(regs))
> > > > +	if (dr6 & DR_STEP && kmemcheck_trap(regs))
> > > >  		return;
> > > 
> > > Are you sure this is right?  Is it possible for any of the DR_TRAPn bits
> > > to be set as well when this happens?
> > > 
> > > 
> > 
> > I did not look at this check before. But the (dr6 & DR_STEP) condition
> > should make sure no HW breakpoint exceptions are set (since we don't
> > allow instruction breakpoints in kernel-space yet, as explained above).
> 
> What does kmemcheck_trap() do?
>

As I said before, the fact that I ignored a case where we could
single-step an instruction accessing a data being monitored by a
breakpoint, made me ignore all (dr6 & DR_STEP) cases.

kmemcheck_trap()'s functionality can be nicely understood from the
"Technical description" section in Documentation/kmemcheck.txt.
Basically it uses single-stepping to 'hide' a page immediately after
the instruction, say i, using the page has finished execution. This
is to cause page-fault deliberately, which is used by kmemcheck.

However, as you rightly pointed, the next instruction (i + 1) could be
accessing a data monitored by the debug registers and TRAP<n> bits could
be set. We shouldn't allow return in such a case. I will modify the code
in do_debug() also.

Thanks,
K.Prasad


  reply	other threads:[~2009-04-07  8:23 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-24 15:24 [Patch 00/11] Hardware Breakpoint interfaces K.Prasad
2009-03-25 19:48 ` Alan Stern
2009-03-27 22:06   ` K.Prasad
2009-04-01 16:16     ` Alan Stern
2009-04-07  8:22       ` K.Prasad [this message]
2009-04-09 20:50         ` Alan Stern
2009-03-28  8:46   ` K.Prasad
2009-04-01 16:22     ` Alan Stern
2009-04-07  8:22       ` K.Prasad
  -- strict thread matches above, loose matches on Subject: below --
2009-04-07  6:34 K.Prasad
2009-04-16 21:19 ` Alan Stern
2009-04-17  3:12   ` K.Prasad
2009-04-17 14:37     ` Alan Stern
2009-04-24  5:56       ` K.Prasad
2009-04-24 14:16         ` Alan Stern
2009-04-24 15:57           ` K.Prasad
2009-04-24 16:16             ` Alan Stern
2009-03-07  5:04 [Patch 00/11] Hardware Breakpoint Interfaces prasad
2009-03-05  4:37 [patch 00/11] Hardware Breakpoint interfaces prasad
2009-03-10 13:46 ` Ingo Molnar
2009-03-11 12:11   ` K.Prasad
2009-03-11 16:34     ` Alan Stern
2009-03-11 17:25       ` K.Prasad
2009-03-11 17:30         ` Ingo Molnar
2009-03-10 13:51 ` Ingo Molnar
2009-03-10 14:24   ` Alan Stern
2009-03-10 14:54     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090407082224.GA22500@in.ibm.com \
    --to=prasad@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@au1.ibm.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maneesh@linux.vnet.ibm.com \
    --cc=mingo@elte.hu \
    --cc=roland@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.