From: Stephane Eranian <eranian@google.com>
To: Robert Richter <robert.richter@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/4] perf, x86: Fix event scheduler to solve complex scheduling problems
Date: Sat, 16 Apr 2011 08:52:54 -0700 [thread overview]
Message-ID: <BANLkTikg5mJNDTqNMRz3AtjfnAa6mLdtHw@mail.gmail.com> (raw)
In-Reply-To: <1302913676-14352-5-git-send-email-robert.richter@amd.com>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 7153 bytes --]
On Fri, Apr 15, 2011 at 5:27 PM, Robert Richter <robert.richter@amd.com> wrote:
> The current x86 event scheduler fails to resolve scheduling problems
> of certain combinations of events and constraints. This happens esp.
> for events with complex constraints such as those of the AMD family
> 15h pmu. The scheduler does not find then an existing solution.
> Examples are:
>
>     event code    counter     failure     possible solution
>
> 1) Â Â Â 0x043 Â Â Â Â Â PMC[2:0] Â Â Â Â 0 Â Â Â Â Â Â Â 1
> Â Â Â Â 0x02E Â Â Â Â Â PMC[3,0] Â Â Â Â 3 Â Â Â Â Â Â Â 0
> Â Â Â Â 0x003 Â Â Â Â Â PMC3 Â Â Â Â Â Â FAIL Â Â Â Â Â Â 3
>
I am not sure I understand this failure case. If I recall
the scheduler looks at the weight of each event first:
weight
1) Â Â Â 0x043 Â Â Â Â Â PMC[2:0] Â 3
    0x02E      PMC[3,0]  2
    0x003      PMC3     1
Then, it schedules in increasing weight order. So it will
schedule weight 1, 2, 3. For weight 1, it will find counter3,
for weight 2, it will take counter0, for weight 3, it will
take counter 1 given 0 is already used.
Or am I reading your example the wrong way?
The fact that counter have overlapping constraints
should not matter. In fact this is what happens with
counters without constraints.
> 2) Â Â Â 0x02E Â Â Â Â Â PMC[3,0] Â Â Â Â 0 Â Â Â Â Â Â Â 3
> Â Â Â Â 0x043 Â Â Â Â Â PMC[2:0] Â Â Â Â 1 Â Â Â Â Â Â Â 0
> Â Â Â Â 0x045 Â Â Â Â Â PMC[2:0] Â Â Â Â 2 Â Â Â Â Â Â Â 1
> Â Â Â Â 0x046 Â Â Â Â Â PMC[2:0] Â Â Â Â FAIL Â Â Â Â Â Â 2
>
> Scheduling events on counters is a Hamiltonian path problem. To find a
> possible solution we must traverse all existing paths. This patch
> implements this.
>
> We need to save all states of already walked paths. If we fail to
> schedule an event we now rollback the previous state and try to use
> another free counter until we have analysed all paths.
>
> We might consider to later remove the constraint weight implementation
> completely, but I left this out as this is a much bigger and more
> risky change than this fix.
>
> Cc: Stephane Eranian <eranian@google.com>
> Signed-off-by: Robert Richter <robert.richter@amd.com>
> ---
> Â arch/x86/kernel/cpu/perf_event.c | Â 48 +++++++++++++++++++++++++++++++------
> Â 1 files changed, 40 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 224a84f..887a500 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -770,11 +770,19 @@ static inline int is_x86_event(struct perf_event *event)
> Â Â Â Â return event->pmu == &pmu;
> Â }
>
> +struct sched_log
> +{
> +    int   i;
> +    int   w;
> +    int   idx;
> +};
> +
> Â static int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
> Â {
> Â Â Â Â struct event_constraint *c, *constraints[X86_PMC_IDX_MAX];
> Â Â Â Â unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
> - Â Â Â int i, j, w, wmax, num = 0;
> + Â Â Â struct sched_log sched_log[X86_PMC_IDX_MAX];
> + Â Â Â int i, idx, w, wmax, num = 0, scheduled = 0;
> Â Â Â Â struct hw_perf_event *hwc;
>
> Â Â Â Â bitmap_zero(used_mask, X86_PMC_IDX_MAX);
> @@ -815,6 +823,7 @@ static int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
> Â Â Â Â */
>
> Â Â Â Â bitmap_zero(used_mask, X86_PMC_IDX_MAX);
> + Â Â Â memset(&sched_log, 0, sizeof(sched_log));
>
> Â Â Â Â /*
> Â Â Â Â * weight = number of possible counters
> @@ -838,25 +847,48 @@ static int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
> Â Â Â Â for (w = 1, num = n; num && w <= wmax; w++) {
> Â Â Â Â Â Â Â Â /* for each event */
> Â Â Â Â Â Â Â Â for (i = 0; num && i < n; i++) {
> + Â Â Â Â Â Â Â redo:
> Â Â Â Â Â Â Â Â Â Â Â Â c = constraints[i];
> Â Â Â Â Â Â Â Â Â Â Â Â hwc = &cpuc->event_list[i]->hw;
>
> Â Â Â Â Â Â Â Â Â Â Â Â if (c->weight != w)
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â continue;
>
> - Â Â Â Â Â Â Â Â Â Â Â for_each_set_bit(j, c->idxmsk, X86_PMC_IDX_MAX) {
> - Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â if (!test_bit(j, used_mask))
> + Â Â Â Â Â Â Â Â Â Â Â idx = sched_log[scheduled].idx;
> + Â Â Â Â Â Â Â Â Â Â Â /* for each bit in idxmsk starting from idx */
> + Â Â Â Â Â Â Â Â Â Â Â while (idx < X86_PMC_IDX_MAX) {
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â idx = find_next_bit(c->idxmsk, X86_PMC_IDX_MAX,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â idx);
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â if (idx == X86_PMC_IDX_MAX)
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â break;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â if (!__test_and_set_bit(idx, used_mask))
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â break;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â idx++;
> Â Â Â Â Â Â Â Â Â Â Â Â }
>
> - Â Â Â Â Â Â Â Â Â Â Â if (j == X86_PMC_IDX_MAX)
> - Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â break;
> -
> - Â Â Â Â Â Â Â Â Â Â Â __set_bit(j, used_mask);
> + Â Â Â Â Â Â Â Â Â Â Â if (idx >= X86_PMC_IDX_MAX) {
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â /* roll back and try next free counter */
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â if (!scheduled)
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â /* no free counters anymore */
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â break;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â sched_log[scheduled].idx = 0;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â scheduled--;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â num++;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â clear_bit(sched_log[scheduled].idx, used_mask);
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â i = sched_log[scheduled].i;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â w = sched_log[scheduled].w;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â sched_log[scheduled].idx++;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â goto redo;
> + Â Â Â Â Â Â Â Â Â Â Â }
>
> Â Â Â Â Â Â Â Â Â Â Â Â if (assign)
> - Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â assign[i] = j;
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â assign[i] = idx;
> +
> Â Â Â Â Â Â Â Â Â Â Â Â num--;
> + Â Â Â Â Â Â Â Â Â Â Â sched_log[scheduled].i = i;
> + Â Â Â Â Â Â Â Â Â Â Â sched_log[scheduled].w = w;
> + Â Â Â Â Â Â Â Â Â Â Â sched_log[scheduled].idx = idx;
> + Â Â Â Â Â Â Â Â Â Â Â scheduled++;
> Â Â Â Â Â Â Â Â }
> Â Â Â Â }
> Â done:
> --
> 1.7.3.4
>
>
>
ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥
next prev parent reply other threads:[~2011-04-16 15:53 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-16 0:27 [PATCH 0/4] perf, x86: Fixes for v2.6.39 Robert Richter
2011-04-16 0:27 ` [PATCH 1/4] perf, x86: Fix pre-defined cache-misses event for AMD family 15h cpus Robert Richter
2011-04-19 12:03 ` [tip:perf/urgent] " tip-bot for Andre Przywara
2011-04-16 0:27 ` [PATCH 2/4] perf, x86: Fix AMD family 15h FPU event constraints Robert Richter
2011-04-19 12:04 ` [tip:perf/urgent] " tip-bot for Robert Richter
2011-04-16 0:27 ` [PATCH 3/4] perf, x86: Use ALTERNATIVE() to check for X86_FEATURE_PERFCTR_CORE Robert Richter
2011-04-18 20:00 ` Andi Kleen
2011-04-19 10:39 ` Robert Richter
2011-04-19 18:21 ` Andi Kleen
2011-04-19 12:04 ` [tip:perf/core] " tip-bot for Robert Richter
2011-04-16 0:27 ` [PATCH 4/4] perf, x86: Fix event scheduler to solve complex scheduling problems Robert Richter
2011-04-16 8:51 ` Peter Zijlstra
2011-04-16 9:43 ` Ingo Molnar
2011-04-16 10:08 ` Peter Zijlstra
2011-04-16 10:14 ` Ingo Molnar
2011-04-16 10:15 ` Peter Zijlstra
2011-04-16 14:26 ` Valdis.Kletnieks
2011-04-17 8:15 ` Robert Richter
2011-04-17 8:18 ` Ingo Molnar
2011-04-17 8:53 ` Peter Zijlstra
2011-04-17 11:23 ` Robert Richter
2011-04-18 8:17 ` Robert Richter
2011-04-16 15:52 ` Stephane Eranian [this message]
2011-04-17 8:44 ` Robert Richter
2011-04-17 9:05 ` Stephane Eranian
2011-04-19 10:26 ` [PATCH v2] perf, x86: Fix event scheduler for constraints with overlapping counters Robert Richter
2011-04-19 11:29 ` Ingo Molnar
2011-04-19 13:55 ` Robert Richter
2011-04-28 9:50 ` Robert Richter
2011-05-18 21:16 ` Peter Zijlstra
2011-05-18 21:20 ` Ingo Molnar
2011-05-18 21:36 ` Peter Zijlstra
2011-05-19 10:49 ` Robert Richter
2011-05-19 18:06 ` Ingo Molnar
2011-05-20 3:18 ` Robert Richter
2011-09-01 12:56 ` Peter Zijlstra
2011-09-01 14:12 ` Robert Richter
2011-09-01 16:37 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=BANLkTikg5mJNDTqNMRz3AtjfnAa6mLdtHw@mail.gmail.com \
--to=eranian@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=robert.richter@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).