public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Stephane Eranian <eranian@google.com>
Cc: mingo@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org,
	tglx@linutronix.de, mingo@elte.hu,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:perf/core] perf: Add cgroup support
Date: Thu, 17 Feb 2011 16:50:43 +0100
Message-ID: <1297957843.2413.1911.camel@twins>
In-Reply-To: <AANLkTi=0psOuX7kd=GH80+dEpziaTghQxjUTW82DhCC6@mail.gmail.com>

On Thu, 2011-02-17 at 15:45 +0100, Stephane Eranian wrote:

> > CONFIG_PROVE_RCU=y, it's a bit of a shiny feature but most of the false
> > positives are gone these days I think.
> >
> I have this one enabled, yet no message.

Hmm, Ingo triggered it, not sure what he did.


> >> > @@ -5794,9 +5795,14 @@ static void task_clock_event_read(struct perf_event *event)
> >> >        u64 time;
> >> >
> >> >        if (!in_nmi()) {
> >> > -               update_context_time(event->ctx);
> >> > +               struct perf_event_context *ctx = event->ctx;
> >> > +               unsigned long flags;
> >> > +
> >> > +               spin_lock_irqsave(&ctx->lock, flags);
> >> > +               update_context_time(ctx);
> >> >                update_cgrp_time_from_event(event);
> >> > -               time = event->ctx->time;
> >> > +               time = ctx->time;
> >> > +               spin_unlock_irqrestore(&ctx->lock, flags);
> >> >        } else {
> >> >                u64 now = perf_clock();
> >> >                u64 delta = now - event->ctx->timestamp;
> >
> > I just thought we should probably kill the !in_nmi branch, I'm not quite
> > sure why that exists..
> 
> I don't quite understand what this event is supposed to count in system-wide
> mode. This function adds a time delta. It may be using the wrong time source
> in cgroup mode.
> 
> Having said that, it seems to me like we may not even need the call to
> update_cgrp_time_from_event() there. It is not even used to compute
> the time delta in that function. Yet, we do get correct timings in cgroup
> mode. Thus, I suspect the timing is taken care of by callers already whenever
> needed. I looked at the pmu->read() callers, and it seems they do exactly
> that. In summary, I believe we may be able to drop this call.

ok, nice!

> >> > I then realized that the events themselves pin the cgroup, so it's all
> >> > cosmetic at best, but then I already had the below patch...
> >> >
> >> I assume by 'pin the group' you mean the cgroup cannot disappear
> >> while there is at least one event pointing to it. That is indeed true
> >> thanks to refcounting (css_get()).
> >
> > Right, that's what I was thinking, but now I think that's not
> > sufficient: we can have cgroups without events but with tasks in them,
> > for which the races are still valid.
> >
> But in that case, no perf_event code should be fiddling with cgroups.
> I think there are guards for that, either is_cgroup_event() or ctx->nr_cgroups.
> 
> But it seems perf_cgroup_from_event() is the one exception. So maybe
> we could rewrite it:
> 
> static inline void update_cgrp_time_from_event(struct perf_event *event)
> {
>         struct perf_cgroup *cgrp;
> 
>         if (!is_cgroup_event(event))
>                 return;
> 
>         cgrp = perf_cgroup_from_task(current);
>         /*
>          * do not update time when cgroup is not active
>          */
>         if (cgrp != event->cgrp)
>                 return;
> 
>         __update_cgrp_time(event->cgrp);
> }

That might indeed work. We'd still need to shut up that RCU warning,
though; we can annotate it away using task_subsys_state(.c=1) and put
in a comment explaining things.
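
To make the two guards in the proposed rewrite concrete, here is a
user-space model of its control flow. It is only a sketch: the model_*
structs and functions are hypothetical stand-ins, not kernel types, and
the current_cgrp argument stands in for perf_cgroup_from_task(current).
The RCU annotation is not modelled.

```c
/* User-space model only; simplified stand-ins, not the kernel's structs. */
#include <stddef.h>
#include <stdint.h>

struct model_cgroup {
	uint64_t time;		/* accumulated active time */
	uint64_t timestamp;	/* when 'time' was last brought up to date */
};

struct model_event {
	struct model_cgroup *cgrp;	/* NULL models a non-cgroup event */
};

/* Stand-in for __update_cgrp_time(): fold the elapsed delta into the cgroup. */
static void model_update_cgrp_time(struct model_cgroup *cgrp, uint64_t now)
{
	cgrp->time += now - cgrp->timestamp;
	cgrp->timestamp = now;
}

/*
 * Stand-in for the proposed update_cgrp_time_from_event().
 * Returns 1 if the cgroup time was updated, 0 if a guard skipped it.
 */
static int model_update_cgrp_time_from_event(struct model_event *event,
					     struct model_cgroup *current_cgrp,
					     uint64_t now)
{
	/* Guard 1: models !is_cgroup_event(event). */
	if (event->cgrp == NULL)
		return 0;

	/* Guard 2: do not update time when the event's cgroup is not active. */
	if (current_cgrp != event->cgrp)
		return 0;

	model_update_cgrp_time(event->cgrp, now);
	return 1;
}
```

With the first guard in place, a non-cgroup event returns before the
lookup result is ever used, which is the point of the rewrite.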

> @@ -1613,7 +1614,7 @@ static int __perf_event_enable(void *info)
>        /*
>         * set current task's cgroup time reference point
>         */
> -       perf_cgroup_set_timestamp(current, perf_clock());
> +       perf_cgroup_set_timestamp(current, ctx);

That part ended up avoiding a perf_clock() call; we could write that as:

  perf_cgroup_set_timestamp(current, ctx->timestamp);

since ctx->timestamp has just been set to perf_clock().
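
A small user-space model of the difference, as a sketch only: all
model_* names are hypothetical, and the fake clock just counts reads
and advances on every call so that two reads never return the same
value.

```c
/* User-space model only; not kernel code. */
#include <stdint.h>

static uint64_t model_clock_now;	/* fake monotonic clock value */
static unsigned int model_clock_reads;	/* number of clock reads so far */

/* Stand-in for perf_clock(): every read observes a later time. */
static uint64_t model_perf_clock(void)
{
	model_clock_reads++;
	return model_clock_now++;
}

struct model_ctx {
	uint64_t timestamp;
};

struct model_cgroup_info {
	uint64_t timestamp;
};

/* Models the removed line: a second, separate clock read. */
static void model_enable_two_reads(struct model_ctx *ctx,
				   struct model_cgroup_info *info)
{
	ctx->timestamp = model_perf_clock();
	info->timestamp = model_perf_clock();
}

/* Models passing ctx->timestamp: one read, both reference points agree. */
static void model_enable_reuse(struct model_ctx *ctx,
			       struct model_cgroup_info *info)
{
	ctx->timestamp = model_perf_clock();
	info->timestamp = ctx->timestamp;
}
```

In this model the reuse variant saves one clock read and also keeps the
context and cgroup reference points identical.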

Could you send a nice set of patches addressing all concerns?
