From: Peter Zijlstra <peterz@infradead.org>
To: Jiri Olsa <jolsa@kernel.org>
Cc: Vince Weaver <vince@deater.net>, Robert Richter <rric@kernel.org>,
	Yan Zheng <zheng.z.yan@intel.com>,
	lkml <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>, Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug
Date: Tue, 8 Nov 2016 13:20:39 +0100
Message-ID: <20161108122039.GP3142@twins.programming.kicks-ass.net>
In-Reply-To: <1478015068-14052-1-git-send-email-jolsa@kernel.org>

On Tue, Nov 01, 2016 at 04:44:28PM +0100, Jiri Olsa wrote:
> My fuzzer testing hits following warning in the counter scheduling code:
> 
>   WARNING: CPU: 0 PID: 0 at arch/x86/events/core.c:718 perf_assign_events+0x2ae/0x2c0
>   Call Trace:
>    <IRQ>
>    dump_stack+0x68/0x93
>    __warn+0xcb/0xf0
>    warn_slowpath_null+0x1d/0x20
>    perf_assign_events+0x2ae/0x2c0
>    uncore_assign_events+0x1a7/0x250 [intel_uncore]
>    uncore_pmu_event_add+0x7a/0x3c0 [intel_uncore]
>    event_sched_in.isra.104+0xf6/0x2e0
>    group_sched_in+0x6e/0x190
>    ...
> 
> The reason is that the counter scheduling code assumes
> overlap constraints with mask weight < SCHED_STATES_MAX.
> 
> This assumption is broken with uncore cbox constraint
> added for snbep in:


>   3b19e4c98c03 perf/x86: Fix event constraint for SandyBridge-EP C-Box

   3b19e4c98c03 ("perf/x86: Fix event constraint for SandyBridge-EP C-Box")

Is the right form.

> It's also easily triggered by running following perf command
> on snbep box:
>    # perf stat -e 'uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/' -a
> 
> Fixing this by increasing the SCHED_STATES_MAX to 3 and adding build
> check for EVENT_CONSTRAINT_OVERLAP macro.


> -#define	SCHED_STATES_MAX	2
> +#define SCHED_STATES_MAX 3

Us having to increase this is ff'ing sad :/ That's seriously challenged
hardware :/

> +
> +/* Check we dont overlap beyond the states max. */
> +#define OVERLAP_CHECK(n)   (!!sizeof(char[1 - 2*!!(HWEIGHT(n) > SCHED_STATES_MAX)]))
> +#define OVERLAP_HWEIGHT(n) (OVERLAP_CHECK(n)*HWEIGHT(n))

I'm not sure I get how this is correct at all. You cannot tell by a
single mask what the overlap is. You need all the masks.
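
For reference, the quoted check is the usual negative-array-size build
assertion. Below is a minimal standalone sketch of it, assuming an 8-bit
stand-in (HWEIGHT8, a name made up here) for the kernel's constant HWEIGHT()
and the SCHED_STATES_MAX value of 3 from the patch. It only ever looks at the
weight of the one mask it is handed:

```c
#include <assert.h>

#define SCHED_STATES_MAX 3	/* value proposed in the quoted patch */

/* Stand-in for the kernel's constant-expression HWEIGHT(); 8 bits only. */
#define HWEIGHT8(w)						\
	((!!((w) & 0x01)) + (!!((w) & 0x02)) +			\
	 (!!((w) & 0x04)) + (!!((w) & 0x08)) +			\
	 (!!((w) & 0x10)) + (!!((w) & 0x20)) +			\
	 (!!((w) & 0x40)) + (!!((w) & 0x80)))

/*
 * If the weight exceeds SCHED_STATES_MAX the array size becomes -1 and the
 * build fails; otherwise the expression evaluates to 1.
 */
#define OVERLAP_CHECK(n) \
	(!!sizeof(char[1 - 2*!!(HWEIGHT8(n) > SCHED_STATES_MAX)]))
#define OVERLAP_HWEIGHT(n) (OVERLAP_CHECK(n) * HWEIGHT8(n))

/* Builds: 0x0e has three bits set, which does not exceed the limit. */
enum { WEIGHT_0E = OVERLAP_HWEIGHT(0x0e) };
static_assert(WEIGHT_0E == 3, "0x0e has three counter bits");

/*
 * Would not build (negative array size), since 0x0f has four bits set:
 * enum { WEIGHT_0F = OVERLAP_HWEIGHT(0x0f) };
 */
```

Nothing in that expression knows about the other constraints in the table.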

The point is that that PMU has constraints like:

 0x01 - 0001
 0x03 - 0011
 0x0e - 1110
 0x0c - 1100

Which gets us a total of 4 overlapping counter masks, and that would
indeed lead to max 3 retries I think.

Now, I would much rather solve this by changing the constraint like the
below, that yields:

 0x01 - 0001
 0x03 - 0011

 0x0c - 1100

Which is two distinct groups, only one of which has overlap. And the one
with overlap only has 2 overlapping masks, giving a max of 1 retry.
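
The grouping argument above can be sketched in plain C. This is not kernel
code: largest_overlap_group() and MAX_MASKS are names made up for the
illustration, and "group" here simply means masks linked, directly or through
other masks, by a shared counter bit. Going by the estimate above, the retry
bound would then be the largest group size minus one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MASKS 16	/* arbitrary bound for this sketch */

/*
 * Hypothetical helper: partition counter masks into groups of masks that
 * share counter bits (transitively). Returns the size of the largest group
 * and, if ngroups is non-NULL, stores the number of groups there.
 */
static size_t largest_overlap_group(const uint8_t *masks, size_t n,
				    size_t *ngroups)
{
	uint8_t merged[MAX_MASKS];	/* union of counter bits per group */
	size_t size[MAX_MASKS];		/* number of masks per group */
	size_t groups = 0, best = 0;

	assert(n <= MAX_MASKS);

	for (size_t i = 0; i < n; i++) {
		uint8_t bits = masks[i];
		size_t count = 1;
		size_t g = 0;

		/* Absorb every existing group that shares a bit with us. */
		while (g < groups) {
			if (merged[g] & bits) {
				bits |= merged[g];
				count += size[g];
				/* Drop group g; move the last one into its slot. */
				groups--;
				merged[g] = merged[groups];
				size[g] = size[groups];
			} else {
				g++;
			}
		}
		merged[groups] = bits;
		size[groups] = count;
		groups++;
	}

	for (size_t g = 0; g < groups; g++)
		if (size[g] > best)
			best = size[g];

	if (ngroups)
		*ngroups = groups;
	return best;
}
```

Fed { 0x01, 0x03, 0x0e, 0x0c } this reports one group of 4 masks; fed
{ 0x01, 0x03, 0x0c } it reports two groups, the larger holding 2 masks.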


diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 272427700d48..71bc348736bd 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
 	UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
 	UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
 	UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
-	EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
+	UNCORE_EVENT_CONSTRAINT(0x1f, 0xc), /* should be 0x0e but that gives scheduling pain */
 	UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x31, 0x3),

Thread overview: 14+ messages
2016-11-01 15:44 [PATCH] perf/x86: Fix overlap counter scheduling bug Jiri Olsa
2016-11-08 12:20 ` Peter Zijlstra [this message]
2016-11-08 13:14   ` Jiri Olsa
2016-11-08 15:09   ` Andi Kleen
2016-11-08 16:22     ` Liang, Kan
2016-11-08 16:57       ` Peter Zijlstra
2016-11-08 17:25         ` Liang, Kan
2016-11-08 18:27           ` Peter Zijlstra
2016-11-09 14:25             ` Robert Richter
2016-11-09 15:51               ` Peter Zijlstra
2016-11-10  8:00                 ` Ingo Molnar
2016-11-10 16:41                   ` Peter Zijlstra
2016-12-14 15:59                 ` Jiri Olsa
2016-12-22 16:50                 ` [tip:perf/urgent] " tip-bot for Peter Zijlstra
