All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>, Paul Mackerras <paulus@samba.org>,
	Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH,RFC] perf: panic due to inclied cpu context task_ctx value
Date: Tue, 29 Mar 2011 21:01:58 +0200	[thread overview]
Message-ID: <1301425318.2250.485.camel@laptop> (raw)
In-Reply-To: <20110329162851.GA6317@redhat.com>

On Tue, 2011-03-29 at 18:28 +0200, Oleg Nesterov wrote:
> On 03/29, Peter Zijlstra wrote:
> >
> > On Mon, 2011-03-28 at 18:56 +0200, Oleg Nesterov wrote:
> > >
> > > jump_label_dec:
> > >
> > > 	if (atomic_dec_and_test(key))
> > > 		jump_label_disable(key);
> > >
> > > Another thread can create the PERF_ATTACH_TASK event in between
> > > and call jump_label_update(JUMP_LABEL_ENABLE) first. Looks like,
> > > jump_label_update() should ensure that "type" matches the state
> > > of the "*key" under jump_label_lock().
> >
> > No I think you're right, and I think we fixed that but it looks like
> > Ingo still didn't merge the new jump-label patches :/
> 
> OK. To remind, we have another problem, perf_install can race with exit.
> But lets ignore this for now...

Yay! ;-)

> 
> You know, I honestly tried to convince myself I can understand your
> patch. All I can say, I'll try to read it again ;) But the main idea
> is clear, we give more respect to ->nr_events and once it is zero
> task_ctx must not be active.

Right, except I'm leaking an ->is_active and was seriously considering
your clever idea of splitting the sched_in and sched_out jump-labels.

> > @@ -2114,8 +2107,19 @@ static void perf_event_context_sched_in(
> >  	struct perf_cpu_context *cpuctx;
> >
> >  	cpuctx = __get_cpu_context(ctx);
> > -	if (cpuctx->task_ctx == ctx)
> > +	raw_spin_lock(&ctx->lock);
> > +	/*
> > +	 * Serialize against perf_install_in_context(), the interesting case
> > +	 * is where perf_install_in_context() finds the context inactive and
> > +	 * another cpu is just about to schedule the task in. In that case
> > +	 * we need to avoid observing a stale ctx->nr_events.
> 
> I don't understand the comment... We can't race __perf_install_in_context,
> it can only run on the same CPU but we are called with irqs disabled.

Ah, I meant a race with perf_install_in_context() where task_function()
fails because !task_curr(), in that case we'll attempt
add_event_to_ctx() from the remote cpu. If we race wrong with sched_in
nobody might schedule the event.

> > +	ctx->is_active = 1;
> > +	if (cpuctx->task_ctx == ctx || !ctx->nr_events) {
> > +		raw_spin_lock(&ctx->lock);
> >  		return;
> 
> I guess you meant _unlock.

Yeah, utter fail day today :/

> But now I don't understand what ->is_active means. We make it true,
> but doesn't set cpuctx->task_ctx = ctx. Why __perf_event_release()
> clears ->is_active then?

Yeha, that's the big problem I have with this.

> It seems to me, instead we should change ctx_sched_in() to check
> nr_events and do nothing if it is zero.

you mean, not set ->is_active?

> > +	cpuctx->task_ctx = ctx;
> > +
> >  	ctx_sched_in(ctx, cpuctx, EVENT_PINNED, task);
> 
> But we already dropped ctx->lock, __perf_event_release() can remove
> the last event before ctx_sched_in() takes it again.
> 
> This should be moved into ctx_sched_in() afaics, but this is not simple.
> 
> So, perhaps we can take ctx->lock and check nr_events after the 2nd
> ctx_sched_in(). If it is zero, we should clear task_ctx/is_active.
> 
> Perhaps. Right now I got lost.

Yeah, you and me both.. I went to look at something else because I
simply confused myself more. Hopefully tomorrow will bring some sanity.



  reply	other threads:[~2011-03-29 18:59 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-24 16:44 [PATCH,RFC] perf: panic due to inclied cpu context task_ctx value Jiri Olsa
2011-03-25 19:10 ` Oleg Nesterov
2011-03-26 15:37 ` Peter Zijlstra
2011-03-26 16:13   ` Oleg Nesterov
2011-03-26 16:38     ` Peter Zijlstra
2011-03-26 17:09       ` Oleg Nesterov
2011-03-26 17:35         ` Oleg Nesterov
2011-03-26 18:29           ` Peter Zijlstra
2011-03-26 18:49             ` Oleg Nesterov
2011-03-28 13:30             ` Oleg Nesterov
2011-03-28 14:57               ` Peter Zijlstra
2011-03-28 15:00                 ` Peter Zijlstra
2011-03-28 15:15                 ` Oleg Nesterov
2011-03-28 16:27                   ` Peter Zijlstra
2011-03-28 15:39                     ` Oleg Nesterov
2011-03-28 15:49                 ` Peter Zijlstra
2011-03-28 16:56                   ` Oleg Nesterov
2011-03-29  8:32                     ` Peter Zijlstra
2011-03-29 10:49                       ` Peter Zijlstra
2011-03-29 16:28                       ` Oleg Nesterov
2011-03-29 19:01                         ` Peter Zijlstra [this message]
2011-03-30 13:09                     ` Jiri Olsa
2011-03-30 14:51                       ` Peter Zijlstra
2011-03-30 16:37                         ` Oleg Nesterov
2011-03-30 18:30                           ` Paul E. McKenney
2011-03-30 19:53                             ` Oleg Nesterov
2011-03-30 21:26                           ` Peter Zijlstra
2011-03-30 21:35                             ` Oleg Nesterov
2011-03-31 10:32                             ` Jiri Olsa
2011-03-31 12:41                             ` [tip:perf/urgent] perf: Fix task context scheduling tip-bot for Peter Zijlstra
2011-03-31 13:28                         ` [PATCH,RFC] perf: panic due to inclied cpu context task_ctx value Oleg Nesterov
2011-03-31 13:51                           ` Peter Zijlstra
2011-03-31 14:10                             ` Oleg Nesterov
2011-04-04 16:20                             ` Oleg Nesterov
2011-03-30 15:32                       ` Oleg Nesterov
2011-03-30 15:40                         ` Peter Zijlstra
2011-03-30 15:52                           ` Oleg Nesterov
2011-03-30 15:57                             ` Peter Zijlstra
2011-03-30 16:11                         ` Peter Zijlstra
2011-03-30 17:13                           ` Oleg Nesterov
2011-03-26 17:09       ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1301425318.2250.485.camel@laptop \
    --to=peterz@infradead.org \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=oleg@redhat.com \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.