From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Don Zickus <dzickus@redhat.com>
Cc: mingo@elte.hu, peterz@infradead.org, robert.richter@amd.com,
fweisbec@gmail.com, linux-kernel@vger.kernel.org,
ying.huang@intel.com, ming.m.lin@intel.com, yinghai@kernel.org,
andi@firstfloor.org, eranian@google.com
Subject: Re: [PATCH 1/3] perf, x86: Fix accidentally ack'ing a second event on intel perf counter
Date: Thu, 2 Sep 2010 23:26:27 +0400 [thread overview]
Message-ID: <20100902192627.GB5538@lenovo> (raw)
In-Reply-To: <1283454469-1909-2-git-send-email-dzickus@redhat.com>
On Thu, Sep 02, 2010 at 03:07:47PM -0400, Don Zickus wrote:
> During testing of a patch to stop having the perf subsytem swallow nmis,
> it was uncovered that Nehalem boxes were randomly getting unknown nmis
> when using the perf tool.
>
> Moving the ack'ing of the PMI closer to when we get the status allows
> the hardware to properly re-set the PMU bit signaling another PMI was
> triggered during the processing of the first PMI. This allows the new
> logic for dealing with the shortcomings of multiple PMIs to handle the
> extra NMI by 'eat'ing it later.
>
> Now one can wonder why are we getting a second PMI when we disable all
> the PMUs in the begining of the NMI handler to prevent such a case, for
> that I do not know. But I know the fix below helps deal with this quirk.
>
> Tested on multiple Nehalems where the problem was occuring. With the
> patch, the code now loops a second time to handle the second PMI (whereas
> before it was not).
>
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
> arch/x86/kernel/cpu/perf_event_intel.c | 6 ++----
> 1 files changed, 2 insertions(+), 4 deletions(-)
>
Hi Don,
I might be missing something (I'm sure I'm actually) so enlighten me
a bit please
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index d8d86d0..1297bf1 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -712,7 +712,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
> struct perf_sample_data data;
> struct cpu_hw_events *cpuc;
> int bit, loops;
> - u64 ack, status;
> + u64 status;
>
Lets assume 1 counters is triggered and global bit is set as well
we have here
status = intel_pmu_get_status();
> perf_sample_data_init(&data, 0);
>
> @@ -728,6 +728,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>
> loops = 0;
> again:
> + intel_pmu_ack_status(status);
So here we write just being read value back to CTRL register and _if_ new
overflow happened in this window we've cleared it without processing.
> if (++loops > 100) {
> WARN_ONCE(1, "perfevents: irq loop stuck!\n");
> perf_event_print_debug();
> @@ -736,7 +737,6 @@ again:
> }
>
> inc_irq_stat(apic_perf_irqs);
> - ack = status;
>
> intel_pmu_lbr_read();
>
> @@ -761,8 +761,6 @@ again:
> x86_pmu_stop(event);
> }
>
> - intel_pmu_ack_status(ack);
> -
Here we cleared bits in "status" variable and then we read
status register again without cleaning bits in real physical
register which confuses me.
> /*
> * Repeat if there is more work to be done:
> */
> --
> 1.7.2.2
>
-- Cyrill
next prev parent reply other threads:[~2010-09-02 19:26 UTC|newest]
Thread overview: 117+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-09-02 19:07 [PATCH 0/3 v2] nmi perf fixes Don Zickus
2010-09-02 19:07 ` [PATCH 1/3] perf, x86: Fix accidentally ack'ing a second event on intel perf counter Don Zickus
2010-09-02 19:26 ` Cyrill Gorcunov [this message]
2010-09-02 20:00 ` Don Zickus
2010-09-02 20:36 ` Cyrill Gorcunov
2010-09-03 7:10 ` [tip:perf/urgent] " tip-bot for Don Zickus
2010-09-03 7:39 ` Yinghai Lu
2010-09-03 15:00 ` Don Zickus
2010-09-03 17:15 ` Yinghai Lu
2010-09-03 18:35 ` Don Zickus
2010-09-03 19:24 ` Yinghai Lu
2010-09-03 20:10 ` Don Zickus
2010-10-04 23:24 ` Yinghai Lu
2010-10-11 20:25 ` Don Zickus
2010-09-02 19:07 ` [PATCH 2/3] perf, x86: Try to handle unknown nmis with an enabled PMU Don Zickus
2010-09-03 7:11 ` [tip:perf/urgent] " tip-bot for Robert Richter
2010-09-02 19:07 ` [PATCH 3/3] perf, x86: Fix handle_irq return values Don Zickus
2010-09-03 7:10 ` [tip:perf/urgent] " tip-bot for Peter Zijlstra
2010-09-10 11:41 ` [PATCH 0/3 v2] nmi perf fixes Peter Zijlstra
2010-09-10 12:10 ` Stephane Eranian
2010-09-10 12:13 ` Stephane Eranian
2010-09-10 13:27 ` Don Zickus
2010-09-10 14:46 ` Ingo Molnar
2010-09-10 15:17 ` Robert Richter
2010-09-10 15:58 ` Peter Zijlstra
2010-09-10 16:41 ` Ingo Molnar
2010-09-10 16:42 ` Ingo Molnar
2010-09-10 16:37 ` Ingo Molnar
2010-09-10 16:51 ` Ingo Molnar
2010-09-10 15:56 ` [PATCH] x86: fix duplicate calls of the nmi handler Robert Richter
2010-09-10 16:15 ` Peter Zijlstra
2010-09-11 9:41 ` Ingo Molnar
2010-09-11 11:44 ` Robert Richter
2010-09-11 12:45 ` Ingo Molnar
2010-09-12 9:52 ` Robert Richter
2010-09-13 14:37 ` Robert Richter
2010-09-14 17:41 ` Robert Richter
2010-09-15 16:20 ` [PATCH] perf, x86: catch spurious interrupts after disabling counters Robert Richter
2010-09-15 16:36 ` Stephane Eranian
2010-09-15 17:00 ` Robert Richter
2010-09-15 17:32 ` Stephane Eranian
2010-09-15 18:44 ` Robert Richter
2010-09-15 19:34 ` Cyrill Gorcunov
2010-09-15 20:21 ` Stephane Eranian
2010-09-15 20:39 ` Cyrill Gorcunov
2010-09-15 22:27 ` Robert Richter
2010-09-16 14:51 ` Frederic Weisbecker
2010-09-15 16:46 ` Cyrill Gorcunov
2010-09-15 16:47 ` Stephane Eranian
2010-09-15 17:02 ` Cyrill Gorcunov
2010-09-15 17:28 ` Robert Richter
2010-09-15 17:40 ` Cyrill Gorcunov
2010-09-15 22:10 ` Robert Richter
2010-09-16 6:53 ` Cyrill Gorcunov
2010-09-16 17:34 ` Peter Zijlstra
2010-09-17 8:51 ` Robert Richter
2010-09-17 9:14 ` Peter Zijlstra
2010-09-17 13:06 ` Stephane Eranian
2010-09-20 8:41 ` Robert Richter
2010-09-24 0:02 ` Don Zickus
2010-09-24 3:18 ` Don Zickus
2010-09-24 10:03 ` Robert Richter
2010-09-24 13:38 ` Stephane Eranian
2010-09-30 12:33 ` Peter Zijlstra
2010-09-24 18:11 ` Don Zickus
2010-09-24 10:41 ` [tip:perf/urgent] perf, x86: Catch " tip-bot for Robert Richter
2010-09-29 12:26 ` Stephane Eranian
2010-09-29 12:53 ` Robert Richter
2010-09-29 12:54 ` Robert Richter
2010-09-29 13:13 ` Stephane Eranian
2010-09-29 13:28 ` Stephane Eranian
2010-09-29 15:01 ` Robert Richter
2010-09-29 15:12 ` Robert Richter
2010-09-29 15:27 ` Cyrill Gorcunov
2010-09-29 15:33 ` Stephane Eranian
2010-09-29 15:45 ` Cyrill Gorcunov
2010-09-29 15:51 ` Cyrill Gorcunov
2010-09-29 16:32 ` Robert Richter
2010-09-29 16:48 ` Cyrill Gorcunov
2010-09-29 16:00 ` Stephane Eranian
2010-09-29 17:09 ` Robert Richter
2010-09-29 17:41 ` Cyrill Gorcunov
2010-09-29 18:12 ` Don Zickus
2010-09-29 19:42 ` Stephane Eranian
2010-09-29 20:03 ` Don Zickus
2010-09-30 9:12 ` Robert Richter
2010-09-30 19:44 ` Don Zickus
2010-10-01 7:17 ` Robert Richter
[not found] ` <AANLkTimUyLaVaBigjm0-CwRsdh4UXWDiss2ffX53S+k_@mail.gmail.com>
2010-10-01 11:53 ` Stephane Eranian
2010-10-02 9:35 ` Robert Richter
2010-10-04 8:53 ` Stephane Eranian
2010-10-04 9:07 ` Andi Kleen
2010-10-04 17:28 ` Stephane Eranian
2010-09-29 16:31 ` Robert Richter
2010-09-29 16:22 ` Robert Richter
2010-09-29 19:01 ` Don Zickus
2010-09-29 13:39 ` Robert Richter
2010-09-29 13:56 ` Stephane Eranian
2010-09-29 14:00 ` Stephane Eranian
2010-10-02 9:50 ` Robert Richter
2010-10-02 17:40 ` Stephane Eranian
2010-09-29 15:02 ` Cyrill Gorcunov
2010-09-16 17:42 ` [PATCH] x86: fix duplicate calls of the nmi handler Peter Zijlstra
2010-09-16 20:18 ` Stephane Eranian
2010-09-17 7:09 ` Peter Zijlstra
2010-09-17 0:13 ` Huang Ying
2010-09-17 7:52 ` Peter Zijlstra
2010-09-17 8:13 ` Robert Richter
2010-09-17 8:37 ` Cyrill Gorcunov
2010-09-17 8:47 ` Huang Ying
2010-09-10 13:34 ` [PATCH 0/3 v2] nmi perf fixes Peter Zijlstra
2010-09-10 13:52 ` Peter Zijlstra
2010-09-13 8:55 ` Cyrill Gorcunov
2010-09-13 9:54 ` Stephane Eranian
2010-09-13 10:07 ` Cyrill Gorcunov
2010-09-13 10:10 ` Stephane Eranian
2010-09-13 10:12 ` Cyrill Gorcunov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100902192627.GB5538@lenovo \
--to=gorcunov@gmail.com \
--cc=andi@firstfloor.org \
--cc=dzickus@redhat.com \
--cc=eranian@google.com \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=ming.m.lin@intel.com \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=robert.richter@amd.com \
--cc=ying.huang@intel.com \
--cc=yinghai@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox