Re: [PATCH] perf_events: fix and improve x86 event scheduling

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <peterz@infradead.org>
To: Stephane Eranian <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, robert.richter@amd.com,
	mingo@elte.hu, ming.m.lin@intel.com, ak@linux.intel.com
Subject: Re: [PATCH] perf_events: fix and improve x86 event scheduling
Date: Mon, 07 Nov 2011 12:55:48 +0100	[thread overview]
Message-ID: <1320666948.18053.17.camel@twins> (raw)
In-Reply-To: <20111107110149.GA5177@quad>

On Mon, 2011-11-07 at 12:01 +0100, Stephane Eranian wrote:
> Major rewrite of the x86 event scheduling routine.
> The routine is shared between AMD and Intel.
> 
> The existing code had an issue with certain combinations
> of constraints. As Robert Richter @ AMD demonstrated on
> AMD Bulldozer:
> 
> e0 [3,0]
> e1 [2:0]
> e2 [2:0]
> e3 [2:0]
> 
> With this list of events, the existing scheduling algorithm
> would never schedule e3. That's because it would always choose
> counter 0 for e0:
>   e0 -> 0
>   e1 -> 1
>   e2 -> 2
>   e3 -> X
> 
> Robert Richter proposed a fix to the algorithm which was based
> on tagging constraints that overlap. He was using rollbacks to
> simulate recursion.
> 
> We propose an alternate solution which is simpler, faster. This
> is a small modification to the existing algorithm. It adds some
> smart in how a counter is chosen for a given event. The first
> available counter is not systemtically selected. Instead, the

systematically

> algorithm checks how the constraints between the events overlap.
> For each counter, it assigns a weight which is the number of events
> which it can count for the event set. For each event, the algorithm
> assigns the counter with the smallest weight first.
> 
> In the example above, the new algoritm assigns:

algorithm

>   e0 -> 3
>   e1 -> 0
>   e2 -> 1
>   e3 -> 2
> 
> Because:
>   counter 0, weight = 4
>   counter 1, weight = 3
>   counter 2, weight = 3
>   counter 3, weight = 1
> 
> We maximize PMU usage and ensure all events are measured.

Might be good to mention this is O(n^3) in time; also is there still a
case where this algorithm will fail and the O(n!) one will succeed?

The complexities seem to suggest there is, and I suspect it would be
where there is a tie in constraint weight. Please think about this and
tell us why we don't care ;-)

Few nits on the code:

> +int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
> +{
> +	struct event_constraint **c;
> +	struct event_constraint *constraints[X86_PMC_IDX_MAX];
> +	struct perf_event *events[X86_PMC_IDX_MAX];
> +	unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
> +	int i;
> +	int num_c = 0, num_u = 0;
> +	struct hw_perf_event *hwc;
> +
> +	/*
> +	 * collect event constraints
> +	 * separate constrained vs. unconstrained events
> +	 *
> +	 * map_idx = actual index in event_list and assign arrays
> +	 *
> +	 * we add constrained events from index 0 to n
> +	 * we add unconstrained events from index n - 1 to 0
> +	 * that way we save stack memory by not defining two arrays
> +	 */
> +	for (i = 0; i < n; i++) {
> +		constraints[i] =
> +		    x86_pmu.get_event_constraints(cpuc, cpuc->event_list[i]);
> +		if (constraints[i] != &unconstrained) {
> +			events[num_c] = cpuc->event_list[i];
> +			events[num_c]->hw.map_idx = i;
> +			num_c++;
> +		} else {
> +			events[n - num_u - 1] = cpuc->event_list[i];
> +			events[n - num_u - 1]->hw.map_idx = i;
> +			num_u++;
>  		}
>  	}
> +	/*
> +	 * fastpath: try to reuse previous assignments
> +	 */
> +	for (i = 0, c = constraints; i < n; i++, c++) {
> +		hwc = &cpuc->event_list[i]->hw;
> +
> +		/* never assigned, must go to slow path */
> +		if (hwc->idx == -1)
> +			break;
> +
> +		/* constraint still honored */
> +		if (!test_bit(hwc->idx, (*c)->idxmsk))
> +			break;
> +
> +		/* counter not already used */
> +		if (test_bit(hwc->idx, used_mask))
> +			break;
> +
> +		__set_bit(hwc->idx, used_mask);
> +		if (assign)
> +			assign[i] = hwc->idx;
> +	}
> +	/* was able to reuse every counter, we're done */
> +	if (i == n) {
> +		num_u = num_c = 0;
> +		goto done;
> +	}
> +
> +	/*
> +	 * begin slow path
> +	 */
> +	bitmap_zero(used_mask, X86_PMC_IDX_MAX);
> +
> +	/* schedule constrained events */
> +	if (num_c)
> +		num_c = x86_sched_constrained(events,
> +					      num_c,
> +					      constraints,
> +					      n, used_mask, assign);
> +
> +	/* schedule unconstrained events */
> +	if (num_u)
> +		num_u = x86_sched_unconstrained(events,
> +						num_u,
> +						n, used_mask, assign);

For both branches, please wrap in (superfluous) curly braces.

>  done:
>  	/*
>  	 * scheduling failed or is just a simulation,
>  	 * free resources if necessary
>  	 */
> -	if (!assign || num) {
> +	if (!assign || num_u || num_c) {
>  		for (i = 0; i < n; i++) {
>  			if (x86_pmu.put_event_constraints)
>  				x86_pmu.put_event_constraints(cpuc, cpuc->event_list[i]);
>  		}
>  	}
> -	return num ? -ENOSPC : 0;
> +	return num_c || num_u ? -ENOSPC : 0;
>  }

We really should change that -ENOSPC... :-)

>  
>  /*
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 1e9ebe5..2605244 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -564,6 +564,7 @@ struct hw_perf_event {
>  			int		idx;
>  			int		last_cpu;
>  			struct hw_perf_event_extra extra_reg;
> +			int		map_idx;
>  		};
>  		struct { /* software */
>  			struct hrtimer	hrtimer;

Somewhat sad we need to add a field there, I think its ok we the
software side of things is still the largest so it doesn't matter but do
check.

next prev parent reply	other threads:[~2011-11-07 11:56 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-07 11:01 [PATCH] perf_events: fix and improve x86 event scheduling Stephane Eranian
2011-11-07 11:55 ` Peter Zijlstra [this message]
2011-11-07 12:10 ` Peter Zijlstra
2011-11-07 13:52   ` Stephane Eranian
2011-11-07 14:56     ` Peter Zijlstra
2011-11-10 14:37 ` Peter Zijlstra
2011-11-10 15:09   ` Stephane Eranian
2011-11-10 16:41     ` Gleb Natapov
2011-11-10 16:59       ` Stephane Eranian
2011-11-10 17:13         ` Gleb Natapov
2011-11-10 16:59     ` Robert Richter
2011-11-10 18:31     ` Peter Zijlstra
2011-11-10 18:03 ` Robert Richter
2011-11-10 18:41   ` Peter Zijlstra
2011-11-10 18:52   ` Peter Zijlstra
2011-11-11 14:29     ` Robert Richter
2011-11-14 17:51       ` [PATCH v3 0/2] perf, x86: handle overlapping counters Robert Richter
2011-11-14 17:51         ` [PATCH v3 1/2] perf, x86: Implement event scheduler helper functions Robert Richter
2011-11-16 16:02           ` Peter Zijlstra
2011-11-16 19:23             ` Robert Richter
2011-11-14 17:51         ` [PATCH v3 2/2] perf, x86: Fix event scheduler for constraints with overlapping counters Robert Richter
2011-11-14 12:55   ` [PATCH] perf_events: fix and improve x86 event scheduling Stephane Eranian
2011-11-14 14:12     ` Peter Zijlstra
2011-11-14 14:26       ` Stephane Eranian
2011-11-14 16:00         ` Peter Zijlstra
2011-11-14 17:39           ` Stephane Eranian
2011-11-14 21:43             ` Peter Zijlstra
2011-11-16 10:28               ` Stephane Eranian
2011-11-14 22:16             ` Peter Zijlstra
2011-11-16 10:06               ` Stephane Eranian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1320666948.18053.17.camel@twins \
    --to=peterz@infradead.org \
    --cc=ak@linux.intel.com \
    --cc=eranian@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.m.lin@intel.com \
    --cc=mingo@elte.hu \
    --cc=robert.richter@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.