From: Peter Zijlstra <peterz@infradead.org>
To: Stephane Eranian <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu, paulus@samba.org,
davem@davemloft.net, fweisbec@gmail.com, robert.richter@amd.com,
perfmon2-devel@lists.sf.net, eranian@gmail.com
Subject: Re: [PATCH] perf_events: AMD event scheduling (v2)
Date: Thu, 04 Feb 2010 17:19:35 +0100 [thread overview]
Message-ID: <1265300375.22001.102.camel@laptop> (raw)
In-Reply-To: <bd4cb8901002040805k759ae239q80e42f8ac4cc0e0e@mail.gmail.com>
On Thu, 2010-02-04 at 17:05 +0100, Stephane Eranian wrote:
> >> @@ -2250,10 +2261,144 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event
> >> return &unconstrained;
> >> }
> >>
> >> +/*
> >> + * AMD64 events are detected based on their event codes.
> >> + */
> >> +static inline int amd_is_nb_event(struct hw_perf_event *hwc)
> >> +{
> >> + u64 val = hwc->config;
> >
> > & K7_EVNTSEL_EVENT_MASK ?
>
> Yes, except that:
> -#define K7_EVNTSEL_EVENT_MASK 0x7000000FFULL
> +#define K7_EVNTSEL_EVENT_MASK 0xF000000FFULL
Ah, indeed.
> >> + *
> >> + * try to alllcate same counter as before if
> >> + * event has already been assigned once. Otherwise,
> >> + * try to use free counter k obtained during the 1st
> >> + * pass above.
> >> + */
> >> + i = j = hwc->idx != -1 ? hwc->idx : (k == -1 ? 0 : k);
> >
> > That's patently unreadable, and I'm not sure what happens if we failed
> > to find an eligible spot in the above loop, should we not somehow jump
> > out and return emptyconstraint?
> >
> I have clarified the nested if. The goal of the first loop is to check
> if the event is already present (explained why this is possible in the
> comment). We may not have scanned all the slots. Furthermore, there
> may be concurrent scans going on on other CPUs. The first pass tries
> to find an empty slot, it does not reserve it. The second loop is the actual
> allocation. We speculate the slot we found in the first pass is still available.
> If the second loops fails, then we return emptyconstraints.
Ah, now I see, yes indeed.
> >> + do {
> >> + old = cmpxchg(nb->owners+i, NULL, event);
> >> + if (!old)
> >> + break;
> >> + if (++i == x86_pmu.num_events)
> >> + i = 0;
> >> + } while (i != j);
> >> +skip:
> >> + if (!old)
> >> + return &nb->event_constraints[i];
> >> + return &emptyconstraint;
> >> }
> >> +
> >> + /*
> >> + * function may be called too early in the
> >> + * boot process, in which case nb_id is bogus
> >> + *
> >> + * for BSP, there is an explicit call from
> >> + * amd_pmu_init()
> >> + */
> >
> > I keep getting flash-backs to doom's graphics engine every time I see
> > BSP..
> >
> So how do you call the initial boot processor?
boot cpu, or x86 specific, cpu0.
>
> >> + nb_id = amd_get_nb_id(cpu);
> >> + if (nb_id == BAD_APICID)
> >> + return;
> >> +
> >> + cpu1 = &per_cpu(cpu_hw_events, cpu);
> >> + cpu1->amd_nb = NULL;
> >> +
> >> + raw_spin_lock(&amd_nb_lock);
> >> +
> >> + for_each_online_cpu(i) {
> >> + cpu2 = &per_cpu(cpu_hw_events, i);
> >> + nb = cpu2->amd_nb;
> >> + if (!nb)
> >> + continue;
> >> + if (nb->nb_id == nb_id)
> >> + goto found;
> >> + }
> >> +
> >> + nb = amd_alloc_nb(cpu, nb_id);
> >> + if (!nb) {
> >> + pr_err("perf_events: failed to allocate NB storage for CPU%d\n", cpu);
> >> + raw_spin_unlock(&amd_nb_lock);
> >> + return;
> >> + }
> >> +found:
> >> + nb->refcnt++;
> >> + cpu1->amd_nb = nb;
> >> +
> >> + raw_spin_unlock(&amd_nb_lock);
> >
> > Can't this be simplified by using the cpu to node mask?
>
> You mean to find the NB that corresponds to a CPU?
Yep, saves having to poke at all cpus.
> > Also, I think this is buggy in that:
> >
> > perf_disable();
> > event->pmu->disable(event);
> > ...
> > event->pmu->enable(event);
> > perf_enable();
> >
> > can now fail, I think we need to move the put_event_constraint() from
> > x86_pmu_disable() into x86_perf_enable() or something.
>
> Constraints are reserved during x86_scheduling(), not during enable().
> So if you had a conflict it was detected earlier than that.
What I'm worried about is:
CPU A and B are of the same NorthBridge and all node counters are taken.
CPU-A CPU-B
perf_disable();
event->pmu->disable(event)
x86_pmu.put_event_constraint() /* free slot n */
event->pmu->enable(event);
x86_schedule_events();
x86_pmu.get_event_constraint() /* grab slot n */
event->pmu->enable(event)
x86_schedule_events()
x86_pmu.get_event_constraint() /* FAIL */
perf_enable();
That means you cannot disable/enable the same event within a
perf_disable/enable section.
next prev parent reply other threads:[~2010-02-04 16:19 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-01 20:15 [PATCH] perf_events: AMD event scheduling (v2) Stephane Eranian
2010-02-04 14:55 ` Peter Zijlstra
2010-02-04 14:59 ` Peter Zijlstra
2010-02-04 15:04 ` Stephane Eranian
2010-02-04 16:05 ` Stephane Eranian
2010-02-04 16:19 ` Peter Zijlstra [this message]
2010-02-04 16:32 ` Stephane Eranian
2010-02-04 16:35 ` Stephane Eranian
2010-02-25 18:29 ` Stephane Eranian
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1265300375.22001.102.camel@laptop \
--to=peterz@infradead.org \
--cc=davem@davemloft.net \
--cc=eranian@gmail.com \
--cc=eranian@google.com \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=paulus@samba.org \
--cc=perfmon2-devel@lists.sf.net \
--cc=robert.richter@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.