public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: eranian@gmail.com
Cc: Vince Weaver <vincent.weaver@maine.edu>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	kan.liang@intel.com
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()
Date: Wed, 15 Jul 2015 14:35:46 +0200	[thread overview]
Message-ID: <20150715123546.GK2859@worktop.programming.kicks-ass.net> (raw)
In-Reply-To: <CAMsRxfJeV6JYJ-jke863EgJA0sFo0sZUTX8a4X3RhKaCvc_UEw@mail.gmail.com>

On Wed, Jul 15, 2015 at 08:42:50AM +0200, Stephane Eranian wrote:
> On Fri, Jul 3, 2015 at 9:49 PM, Vince Weaver <vincent.weaver@maine.edu> wrote:
> > On Fri, 3 Jul 2015, Peter Zijlstra wrote:
> >
> >> That said, its far too warm and I might just not be making sense.
> >
> > you need to come visit Maine!  Although I am not sure the cooler weather
> > necessarily improves my kernel debugging skills.
> >
> > I managed to lock the machine (again this is with the patch applied).
> >
> I can reproduce the problem on my HSW running the fuzzer.
> 
> I can see why this could be happening if you are mixing PEBS and non PEBS events
> in the bottom 4 counters. I suspect:
>         for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
>                 if ((counts[bit] == 0) && (error[bit] == 0))
>                         continue;
> 
> This test is not correct when you have non-PEBS events mixed with PEBS
> events and
> they overflow at the same time. They will have counts[i] != 0 but
> error[i] == 0, and thus
> you fall thru the loop and hit the assert. Or it is something along those lines.
> 

The only way I can make this work is if ->status only has !PEBS events
set, because if it has both set we'll take that slow path which masks
out the !PEBS bits.

After masking there are 3 options:

 - there is one bit set, and its @bit, we increment counts[bit].
 - there are multiple bits set, we increment error[] for each set bit,
   we do not increment counts[].
 - there are no bits set, we do nothing.

The intent was to never increment counts[] for !PEBS events.

Now if we start out with only a single !PEBS event set, we'll pass the
test and increment counts[] for a !PEBS and hit the warn.

The below patch modifies the code such that it can deal with that
particular issue. Can you try?

---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 71fc40238843..68d0ced1d229 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -1142,6 +1142,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
 
 	for (at = base; at < top; at += x86_pmu.pebs_record_size) {
 		struct pebs_record_nhm *p = at;
+		u64 pebs_status;
 
 		/* PEBS v3 has accurate status bits */
 		if (x86_pmu.intel_cap.pebs_format >= 3) {
@@ -1152,12 +1153,14 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
 			continue;
 		}
 
-		bit = find_first_bit((unsigned long *)&p->status,
+		pebs_status = p->status & cpuc->pebs_enabled;
+		pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
+
+		bit = find_first_bit((unsigned long *)&pebs_status,
 					x86_pmu.max_pebs_events);
 		if (bit >= x86_pmu.max_pebs_events)
 			continue;
-		if (!test_bit(bit, cpuc->active_mask))
-			continue;
+
 		/*
 		 * The PEBS hardware does not deal well with the situation
 		 * when events happen near to each other and multiple bits
@@ -1172,27 +1175,21 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
 		 * one, and it's not possible to reconstruct all events
 		 * that caused the PEBS record. It's called collision.
 		 * If collision happened, the record will be dropped.
-		 *
 		 */
-		if (p->status != (1 << bit)) {
-			u64 pebs_status;
-
-			/* slow path */
-			pebs_status = p->status & cpuc->pebs_enabled;
-			pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
-			if (pebs_status != (1 << bit)) {
-				for_each_set_bit(i, (unsigned long *)&pebs_status,
-						 MAX_PEBS_EVENTS)
-					error[i]++;
-				continue;
-			}
+		if (p->status != (1ULL << bit)) {
+			for_each_set_bit(i, (unsigned long *)&pebs_status,
+					 x86_pmu.max_pebs_events)
+				error[i]++;
+			continue;
 		}
+
 		counts[bit]++;
 	}
 
 	for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
 		if ((counts[bit] == 0) && (error[bit] == 0))
 			continue;
+
 		event = cpuc->events[bit];
 		WARN_ON_ONCE(!event);
 		WARN_ON_ONCE(!event->attr.precise_ip);

  reply	other threads:[~2015-07-15 12:35 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-02 15:18 perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm() Vince Weaver
2015-07-02 19:43 ` Peter Zijlstra
2015-07-03 13:13 ` Peter Zijlstra
2015-07-03 18:56   ` Stephane Eranian
2015-07-03 19:04     ` Peter Zijlstra
2015-07-03 19:49       ` Vince Weaver
2015-07-15  6:42         ` Stephane Eranian
2015-07-15 12:35           ` Peter Zijlstra [this message]
2015-07-16  6:02             ` Stephane Eranian
2015-07-16  7:15               ` Peter Zijlstra
2015-07-16  7:30                 ` Stephane Eranian
2015-07-16 21:12                   ` Stephane Eranian
2015-07-03 19:03   ` Vince Weaver
2015-07-03 20:08   ` Liang, Kan
2015-07-06 10:55     ` Peter Zijlstra
2015-07-06 13:47       ` Liang, Kan
2015-07-06 16:22         ` Vince Weaver
2015-07-06 16:23           ` Liang, Kan
2015-07-06 16:51             ` Vince Weaver
2015-08-04  9:01     ` [tip:perf/core] perf/x86/intel/pebs: Fix event disable PEBS buffer drain tip-bot for Liang, Kan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150715123546.GK2859@worktop.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=acme@kernel.org \
    --cc=eranian@gmail.com \
    --cc=kan.liang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=vincent.weaver@maine.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox