Re: [PATCH] perf_events: fix and improve x86 event scheduling

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <peterz@infradead.org>
To: Stephane Eranian <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, robert.richter@amd.com,
	mingo@elte.hu, ming.m.lin@intel.com, ak@linux.intel.com
Subject: Re: [PATCH] perf_events: fix and improve x86 event scheduling
Date: Mon, 07 Nov 2011 12:55:48 +0100	[thread overview]
Message-ID: <1320666948.18053.17.camel@twins> (raw)
In-Reply-To: <20111107110149.GA5177@quad>

On Mon, 2011-11-07 at 12:01 +0100, Stephane Eranian wrote:
> Major rewrite of the x86 event scheduling routine.
> The routine is shared between AMD and Intel.
> 
> The existing code had an issue with certain combinations
> of constraints. As Robert Richter @ AMD demonstrated on
> AMD Bulldozer:
> 
> e0 [3,0]
> e1 [2:0]
> e2 [2:0]
> e3 [2:0]
> 
> With this list of events, the existing scheduling algorithm
> would never schedule e3. That's because it would always choose
> counter 0 for e0:
>   e0 -> 0
>   e1 -> 1
>   e2 -> 2
>   e3 -> X
> 
> Robert Richter proposed a fix to the algorithm which was based
> on tagging constraints that overlap. He was using rollbacks to
> simulate recursion.
> 
> We propose an alternate solution which is simpler, faster. This
> is a small modification to the existing algorithm. It adds some
> smart in how a counter is chosen for a given event. The first
> available counter is not systemtically selected. Instead, the

systematically

> algorithm checks how the constraints between the events overlap.
> For each counter, it assigns a weight which is the number of events
> which it can count for the event set. For each event, the algorithm
> assigns the counter with the smallest weight first.
> 
> In the example above, the new algoritm assigns:

algorithm

>   e0 -> 3
>   e1 -> 0
>   e2 -> 1
>   e3 -> 2
> 
> Because:
>   counter 0, weight = 4
>   counter 1, weight = 3
>   counter 2, weight = 3
>   counter 3, weight = 1
> 
> We maximize PMU usage and ensure all events are measured.

Might be good to mention this is O(n^3) in time; also is there still a
case where this algorithm will fail and the O(n!) one will succeed?

The complexities seem to suggest there is, and I suspect it would be
where there is a tie in constraint weight. Please think about this and
tell us why we don't care ;-)

Few nits on the code:

> +int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
> +{
> +	struct event_constraint **c;
> +	struct event_constraint *constraints[X86_PMC_IDX_MAX];
> +	struct perf_event *events[X86_PMC_IDX_MAX];
> +	unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
> +	int i;
> +	int num_c = 0, num_u = 0;
> +	struct hw_perf_event *hwc;
> +
> +	/*
> +	 * collect event constraints
> +	 * separate constrained vs. unconstrained events
> +	 *
> +	 * map_idx = actual index in event_list and assign arrays
> +	 *
> +	 * we add constrained events from index 0 to n
> +	 * we add unconstrained events from index n - 1 to 0
> +	 * that way we save stack memory by not defining two arrays
> +	 */
> +	for (i = 0; i < n; i++) {
> +		constraints[i] =
> +		    x86_pmu.get_event_constraints(cpuc, cpuc->event_list[i]);
> +		if (constraints[i] != &unconstrained) {
> +			events[num_c] = cpuc->event_list[i];
> +			events[num_c]->hw.map_idx = i;
> +			num_c++;
> +		} else {
> +			events[n - num_u - 1] = cpuc->event_list[i];
> +			events[n - num_u - 1]->hw.map_idx = i;
> +			num_u++;
>  		}
>  	}
> +	/*
> +	 * fastpath: try to reuse previous assignments
> +	 */
> +	for (i = 0, c = constraints; i < n; i++, c++) {
> +		hwc = &cpuc->event_list[i]->hw;
> +
> +		/* never assigned, must go to slow path */
> +		if (hwc->idx == -1)
> +			break;
> +
> +		/* constraint still honored */
> +		if (!test_bit(hwc->idx, (*c)->idxmsk))
> +			break;
> +
> +		/* counter not already used */
> +		if (test_bit(hwc->idx, used_mask))
> +			break;
> +
> +		__set_bit(hwc->idx, used_mask);
> +		if (assign)
> +			assign[i] = hwc->idx;
> +	}
> +	/* was able to reuse every counter, we're done */
> +	if (i == n) {
> +		num_u = num_c = 0;
> +		goto done;
> +	}
> +
> +	/*
> +	 * begin slow path
> +	 */
> +	bitmap_zero(used_mask, X86_PMC_IDX_MAX);
> +
> +	/* schedule constrained events */
> +	if (num_c)
> +		num_c = x86_sched_constrained(events,
> +					      num_c,
> +					      constraints,
> +					      n, used_mask, assign);
> +
> +	/* schedule unconstrained events */
> +	if (num_u)
> +		num_u = x86_sched_unconstrained(events,
> +						num_u,
> +						n, used_mask, assign);

For both branches, please wrap in (superfluous) curly braces.

>  done:
>  	/*
>  	 * scheduling failed or is just a simulation,
>  	 * free resources if necessary
>  	 */
> -	if (!assign || num) {
> +	if (!assign || num_u || num_c) {
>  		for (i = 0; i < n; i++) {
>  			if (x86_pmu.put_event_constraints)
>  				x86_pmu.put_event_constraints(cpuc, cpuc->event_list[i]);
>  		}
>  	}
> -	return num ? -ENOSPC : 0;
> +	return num_c || num_u ? -ENOSPC : 0;
>  }

We really should change that -ENOSPC... :-)

>  
>  /*
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 1e9ebe5..2605244 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -564,6 +564,7 @@ struct hw_perf_event {
>  			int		idx;
>  			int		last_cpu;
>  			struct hw_perf_event_extra extra_reg;
> +			int		map_idx;
>  		};
>  		struct { /* software */
>  			struct hrtimer	hrtimer;

Somewhat sad we need to add a field there, I think its ok we the
software side of things is still the largest so it doesn't matter but do
check.

next prev parent reply	other threads:[~2011-11-07 11:56 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-07 11:01 [PATCH] perf_events: fix and improve x86 event scheduling Stephane Eranian
2011-11-07 11:55 ` Peter Zijlstra [this message]
2011-11-07 12:10 ` Peter Zijlstra
2011-11-07 13:52   ` Stephane Eranian
2011-11-07 14:56     ` Peter Zijlstra
2011-11-10 14:37 ` Peter Zijlstra
2011-11-10 15:09   ` Stephane Eranian
2011-11-10 16:41     ` Gleb Natapov
2011-11-10 16:59       ` Stephane Eranian
2011-11-10 17:13         ` Gleb Natapov
2011-11-10 16:59     ` Robert Richter
2011-11-10 18:31     ` Peter Zijlstra
2011-11-10 18:03 ` Robert Richter
2011-11-10 18:41   ` Peter Zijlstra
2011-11-10 18:52   ` Peter Zijlstra
2011-11-11 14:29     ` Robert Richter
2011-11-14 17:51       ` [PATCH v3 0/2] perf, x86: handle overlapping counters Robert Richter
2011-11-14 17:51         ` [PATCH v3 1/2] perf, x86: Implement event scheduler helper functions Robert Richter
2011-11-16 16:02           ` Peter Zijlstra
2011-11-16 19:23             ` Robert Richter
2011-11-14 17:51         ` [PATCH v3 2/2] perf, x86: Fix event scheduler for constraints with overlapping counters Robert Richter
2011-11-14 12:55   ` [PATCH] perf_events: fix and improve x86 event scheduling Stephane Eranian
2011-11-14 14:12     ` Peter Zijlstra
2011-11-14 14:26       ` Stephane Eranian
2011-11-14 16:00         ` Peter Zijlstra
2011-11-14 17:39           ` Stephane Eranian
2011-11-14 21:43             ` Peter Zijlstra
2011-11-16 10:28               ` Stephane Eranian
2011-11-14 22:16             ` Peter Zijlstra
2011-11-16 10:06               ` Stephane Eranian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1320666948.18053.17.camel@twins \
    --to=peterz@infradead.org \
    --cc=ak@linux.intel.com \
    --cc=eranian@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.m.lin@intel.com \
    --cc=mingo@elte.hu \
    --cc=robert.richter@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox