public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@kernel.org>
To: eranian@gmail.com
Cc: Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Arnaldo Carvalho de Melo <acme@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andi Kleen <andi@firstfloor.org>
Subject: Re: PEBS bug on HSW: "Unexpected number of pebs records 10" (was: Re: [GIT PULL] perf changes for v3.12)
Date: Tue, 10 Sep 2013 15:38:45 +0200
Message-ID: <20130910133845.GB7537@gmail.com>
In-Reply-To: <CAMsRxfLvbExOzjz8tQu7AchQgKBh5S4b7VMQmFtr1RxK4ksAvA@mail.gmail.com>


* Stephane Eranian <eranian@googlemail.com> wrote:

> Hi,
> 
> Ok, so I am able to reproduce the problem using a simpler
> test case with a simple multithreaded program where
> #threads >> #CPUs.

Does it go away if you use 'perf record --all-cpus'?

> [ 2229.021934] WARNING: CPU: 6 PID: 17496 at
> arch/x86/kernel/cpu/perf_event_intel_ds.c:1003
> intel_pmu_drain_pebs_hsw+0xa8/0xc0()
> [ 2229.021936] Unexpected number of pebs records 21
> 
> [ 2229.021966] Call Trace:
> [ 2229.021967]  <NMI>  [<ffffffff8159dcd6>] dump_stack+0x46/0x58
> [ 2229.021976]  [<ffffffff8108dfdc>] warn_slowpath_common+0x8c/0xc0
> [ 2229.021979]  [<ffffffff8108e0c6>] warn_slowpath_fmt+0x46/0x50
> [ 2229.021982]  [<ffffffff810646c8>] intel_pmu_drain_pebs_hsw+0xa8/0xc0
> [ 2229.021986]  [<ffffffff810668f0>] intel_pmu_handle_irq+0x220/0x380
> [ 2229.021991]  [<ffffffff810c1d35>] ? sched_clock_cpu+0xc5/0x120
> [ 2229.021995]  [<ffffffff815a5a84>] perf_event_nmi_handler+0x34/0x60
> [ 2229.021998]  [<ffffffff815a52b8>] nmi_handle.isra.3+0x88/0x180
> [ 2229.022001]  [<ffffffff815a5490>] do_nmi+0xe0/0x330
> [ 2229.022004]  [<ffffffff815a48f7>] end_repeat_nmi+0x1e/0x2e
> [ 2229.022008]  [<ffffffff810652b3>] ? intel_pmu_pebs_enable_all+0x33/0x40
> [ 2229.022011]  [<ffffffff810652b3>] ? intel_pmu_pebs_enable_all+0x33/0x40
> [ 2229.022015]  [<ffffffff810652b3>] ? intel_pmu_pebs_enable_all+0x33/0x40
> [ 2229.022016]  <<EOE>>  [<ffffffff810659f3>] intel_pmu_enable_all+0x23/0xa0
> [ 2229.022021]  [<ffffffff8105ff84>] x86_pmu_enable+0x274/0x310
> [ 2229.022025]  [<ffffffff81141927>] perf_pmu_enable+0x27/0x30
> [ 2229.022029]  [<ffffffff81143219>] perf_event_context_sched_in+0x79/0xc0
> 
> Could be a HW race whereby the PEBS of each HT thread gets mixed up.

Yes, that seems plausible and would explain why the overrun is usually a 
small integer. We set up the DS with PEBS_BUFFER_SIZE == 4096, so with a 
record size of 192 bytes on HSW at most 21 records fit (4096 / 192 = 21.33), 
i.e. we should see record counts of 0-21.

That fits within the range of values reported so far.
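
As a quick sanity check of that arithmetic, here is a minimal standalone C 
sketch. It is not kernel code: the helper names pebs_max_records() and 
pebs_tail_bytes() are made up for illustration, and the only inputs are the 
buffer size and record size quoted above.

```c
#include <assert.h>

/*
 * Hypothetical helpers, not kernel functions: they only redo the
 * buffer-capacity arithmetic from the message.
 */

/* Number of whole records of rec_size bytes that fit in buf_size bytes. */
unsigned int pebs_max_records(unsigned int buf_size, unsigned int rec_size)
{
	assert(rec_size != 0);
	return buf_size / rec_size;
}

/* Bytes at the end of the buffer that cannot hold another full record. */
unsigned int pebs_tail_bytes(unsigned int buf_size, unsigned int rec_size)
{
	return buf_size - pebs_max_records(buf_size, rec_size) * rec_size;
}
```

With the values from this thread, pebs_max_records(4096, 192) gives 21 and 
pebs_tail_bytes(4096, 192) gives 64, so a reported count of 21 is still 
within what the buffer can physically hold.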

> [...] I will add a couple more checks to verify that. The intr_thres 
> should not have changed. Yet looks like we have a situation where the 
> index is way past the threshold.

Btw., it would also be nice to add a check of ds->pebs_index against 
ds->pebs_absolute_maximum, to make sure the PEBS record index never goes 
outside the DS area. I.e. to protect against random corruption.

Right now we do only half a check:

        n = top - at;
        if (n <= 0)
                return;

This still allows an upwards overflow. We do check against 
x86_pmu.max_pebs_events, but then only warn and let it continue:

        WARN_ONCE(n > x86_pmu.max_pebs_events,
                  "Unexpected number of pebs records %d\n", n);

        return __intel_pmu_drain_pebs_nhm(iregs, at, top);

Instead it should be something more robust, like:

	if (WARN_ONCE(n > max ...)) {
		/* Drain the PEBS buffer: */
		ds->pebs_index = ds->pebs_buffer_base;
		return;
	}
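
To make the combined idea concrete (range-check the index against the DS 
area, then reset instead of processing when the count is implausible), here 
is a userspace model in plain C. It is only a sketch under stated 
assumptions: struct ds_model and pebs_checked_count() are hypothetical names, 
the struct carries just the three DS fields mentioned above, and the real 
handler would use WARN_ONCE() rather than a silent return value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three DS fields named in the message. */
struct ds_model {
	uint64_t pebs_buffer_base;
	uint64_t pebs_index;
	uint64_t pebs_absolute_maximum;
};

/*
 * Return the number of pending records, or -1 if the index is implausible.
 * In the -1 case the index is reset to the buffer base, so the stale or
 * corrupted records are discarded rather than processed.
 */
long pebs_checked_count(struct ds_model *ds, uint64_t record_size,
			long max_records)
{
	long n = -1;

	assert(record_size != 0);

	/* The index must lie inside [base, absolute_maximum]. */
	if (ds->pebs_index >= ds->pebs_buffer_base &&
	    ds->pebs_index <= ds->pebs_absolute_maximum)
		n = (long)((ds->pebs_index - ds->pebs_buffer_base) / record_size);

	/* Out of the DS area, or more records than we ever expect. */
	if (n < 0 || n > max_records) {
		ds->pebs_index = ds->pebs_buffer_base;
		return -1;
	}

	return n;
}
```

The caller would only walk the records when the return value is positive; 
both the "index outside the DS area" case and the "too many records" case 
end with the buffer reset to its base.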

Thanks,

	Ingo
