All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: "Metzger, Markus T" <markus.t.metzger@intel.com>
Cc: "Kleen, Andi" <andi.kleen@intel.com>, Ingo Molnar <mingo@elte.hu>,
	Roland McGrath <roland@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [rfc] x86, bts: fix crash
Date: Thu, 26 Mar 2009 02:58:01 +0100	[thread overview]
Message-ID: <20090326015801.GA451@redhat.com> (raw)
In-Reply-To: <928CFBE8E7CB0040959E56B4EA41A77E9260843D@irsmsx504.ger.corp.intel.com>

Metzger, let's move this discussion to lkml, I also cc'ed Roland.
Imho, this problem (which I don't fully understand ;) needs more
eyes.

On 03/25, Metzger, Markus T wrote:
>
> The attached patch

Please don't use attachments ;)

> I would appreciate any comment or directions you have if you think that this is the
>
> wrong approach to tackle this problem.

Metzger, this all is a black magic to me, I don't even know what bts does.
But at least some bits doesn't look right to me.

> +static void ptrace_bts_exit_tracer(void)
>  {
> +	struct task_struct *child, *n;
> +
>  	/*
> -	 * Ptrace_detach() races with ptrace_untrace() in case
> -	 * the child dies and is reaped by another thread.
> +	 * We must not hold the tasklist_lock when we release the bts
> +	 * tracer, since we need to make sure the cpu executing the
> +	 * tracee stops tracing before we may free the trace buffer.
>  	 *
> -	 * We only do the memory accounting at this point and
> -	 * leave the buffer deallocation and the bts tracer
> -	 * release to ptrace_bts_untrace() which will be called
> -	 * later on with tasklist_lock held.
> +	 * read_lock(&tasklist_lock) and smp_call_function() may
> +	 * deadlock with write_lock_irq(&tasklist_lock).
> +	 *
> +	 * As long as we're the only one modifying our list of traced
> +	 * children, we should be safe, since we're currently busy.
>  	 */
> -	release_locked_buffer(child->bts_buffer, child->bts_size);
> +	list_for_each_entry_safe(child, n, &current->ptraced, ptrace_entry) {

It is not safe to iterate over current->ptraced lockless, the comment
is not right. Anoter subthread can reap the dead tracee, or at least
do ptrace_unlink() if the tracee is not the child of our thread group.

> +static void ptrace_bts_exit_tracee(void)
> +{
> +	struct mm_struct *mm = NULL;
> +
> +	/*
> +	 * This code may race with ptrace reparenting.
> +	 *
> +	 * We hold the tasklist lock while we check whether we're
> +	 * ptraced and grab our ptracer's mm.
> +	 *
> +	 * It does not really matter if we're reparented,
> +	 * afterwards. We hold onto the correct mm.
> +	 *
> +	 * If we're reparented before we get the tasklist_lock, our
> +	 * former ptrace parent will release the bts tracer.
> +	 */
> +	write_lock_irq(&tasklist_lock);
> +	if (current->ptrace)
> +		mm = get_task_mm(current->parent);

We can't take task_lock() (called by get_task_mm) under write_lock(tasklist),
this is deadlockable. Afaics, read_lock() is enough here, in that case it is
safe to call get_task_mm().

> @@ -752,6 +752,14 @@ void ds_release_bts(struct bts_tracer *t
>
>  	ds_suspend_bts(tracer);
>
> +	/*
> +	 * We must wait for the suspend to take effect before we may
> +	 * free the tracer and the ds configuration.
> +	 */
> +	if (tracer->ds.context->task &&
> +	    (tracer->ds.context->task != current))
> +		wait_task_inactive(tracer->ds.context->task, 0);

I am not sure I understand the problem. From the changelog:

	If the children are currently executing, the buffer
	may be freed while the hardware is still tracing.
	This might cause the hardware to overwrite memory.

So, the problem is that ds.context->task must not be running before we
can start to disable/free ds, yes? Something like ds_switch_to() should
be completed, right?

In that case I don't really understand how wait_task_inactive() can help.
If the task is killed it can be scheduled again, right after
wait_task_inactive() returns.

Also. This function is called from ptrace_bts_exit_tracer(), when the
tracee is not stopped. In this case wait_task_inactive() can spin forever.
For example, if the tracee simply does "for (;;) ;" it never succeeds.


If my understanding of the problem is wrong, could you please explain
it for dummies?

Oleg.


       reply	other threads:[~2009-03-26  2:01 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <928CFBE8E7CB0040959E56B4EA41A77E9260843D@irsmsx504.ger.corp.intel.com>
2009-03-26  1:58 ` Oleg Nesterov [this message]
2009-03-27 15:01   ` [rfc] x86, bts: fix crash Metzger, Markus T
2009-03-27 16:50     ` Oleg Nesterov
2009-03-27 17:33       ` Markus Metzger
2009-03-27 21:29         ` Oleg Nesterov
2009-03-30  7:24           ` Metzger, Markus T
2009-03-30 11:29             ` Metzger, Markus T
2009-03-30 13:29               ` Oleg Nesterov
2009-03-30 13:55                 ` Metzger, Markus T

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090326015801.GA451@redhat.com \
    --to=oleg@redhat.com \
    --cc=andi.kleen@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=markus.t.metzger@intel.com \
    --cc=mingo@elte.hu \
    --cc=roland@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.