From: Robert Richter <robert.richter@amd.com>
To: Don Zickus <dzickus@redhat.com>
Cc: "x86@kernel.org" <x86@kernel.org>,
Andi Kleen <andi@firstfloor.org>,
Peter Zijlstra <peterz@infradead.org>,
"ying.huang@intel.com" <ying.huang@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
"paulmck@linux.vnet.ibm.com" <paulmck@linux.vnet.ibm.com>,
"avi@redhat.com" <avi@redhat.com>,
"jeremy@goop.org" <jeremy@goop.org>
Subject: Re: [V4][PATCH 4/6] x86, nmi: add in logic to handle multiple events and unknown NMIs
Date: Thu, 15 Sep 2011 18:55:43 +0200
Message-ID: <20110915165543.GO6063@erda.amd.com>
In-Reply-To: <20110914201612.GK6063@erda.amd.com>
On 14.09.11 22:16:12, Robert Richter wrote:
> On 14.09.11 13:58:09, Don Zickus wrote:
> > On Wed, Sep 14, 2011 at 06:26:53PM +0200, Robert Richter wrote:
> > > On 13.09.11 16:58:27, Don Zickus wrote:
> > > > @@ -87,6 +87,16 @@ static int notrace __kprobes nmi_handle(unsigned int type, struct pt_regs *regs)
> > > >
> > > > handled += a->handler(type, regs);
> > > >
> > > > + /*
> > > > + * Optimization: only loop once if this is not a
> > > > + * back-to-back NMI. The idea is nothing is dropped
> > > > + * on the first NMI, only on the second of a back-to-back
> > > > + * NMI. No need to waste cycles going through all the
> > > > + * handlers.
> > > > + */
> > > > + if (!b2b && handled)
> > > > + break;
> > >
> > > Don, if I am not missing something, this actually does not work
> > > because perfctr NMIs do not re-trigger. Suppose a handler running
> > > before perfctr. It sets 'handled' and the chain is stopped here. To
> > > run through the perfctr handler the NMI must retrigger which it
> > > doesn't.
> >
> > Your patch is incorrect. Your dummy handler does not handle a _real_ NMI.
> > Which means no _real_ NMI was ever generated. Of course perf won't work.
> > You just swallowed its NMI.
> >
> > The change I made is for nmi handlers that actually have an NMI associated
> > with them. The idea is if somebody generated an NMI, it will get handled
> > by a handler. If perf comes along and generates another NMI, it should
> > get latched. Upon handling the first NMI, the perf NMI should be sitting
> > queued up and cause the back-to-back NMI. In this case all the handlers
> > will be executed (to handle dropped NMIs).
>
> Yes, your thought about the latched NMI could work. Though I better
> test this with some real nmis from different sources. Unfortunately
> this is much harder to trigger. Will give it a try. It would be a
> pretty nice optimization then.
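For anyone following the argument, the early-exit rule discussed above can be modeled in user space. This is only a rough sketch under stated assumptions, not the kernel code: nmi_handle_sketch(), the nmi_handler_fn type and the three sample handlers are hypothetical names made up for illustration, and real handlers take (type, regs) arguments that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered NMI handler: returns the
 * number of events it handled, 0 if the NMI was not for it. */
typedef int (*nmi_handler_fn)(void);

/* Sample handlers, assumptions for illustration only. */
static int handler_idle(void)    { return 0; }	/* nothing pending */
static int handler_ibs(void)     { return 1; }	/* one IBS event */
static int handler_perfctr(void) { return 1; }	/* one perfctr event */

/*
 * Walk the chain in order.  On a normal NMI, stop at the first handler
 * that claims it; on a back-to-back NMI, run every handler because an
 * event may have been dropped.  If 'invoked' is non-NULL it receives
 * the number of handlers that were actually called.
 */
static int nmi_handle_sketch(const nmi_handler_fn *chain, size_t n,
			     bool b2b, size_t *invoked)
{
	int handled = 0;
	size_t calls = 0;

	assert(chain != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		handled += chain[i]();
		calls++;
		if (!b2b && handled)
			break;
	}

	if (invoked)
		*invoked = calls;
	return handled;
}
```

With a chain of { handler_idle, handler_ibs, handler_perfctr }, a normal NMI stops after the IBS handler, so the perfctr event is only picked up if its own NMI was latched and arrives as the next, back-to-back NMI. That is the assumption the tests below try to exercise.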
Don,
I did some tests today with parallel IBS and perfctr sessions running.
I see cases for all combinations of NMI back-to-back sequences (see
below for the traces, the patch and attached log file):
[1] IBS - IBS
[2] IBS - perfctr
[3] perfctr - IBS
[4] perfctr - perfctr
So we see that all cases exist. Unfortunately this is not evidence
that the approach works in *every* case, because we cannot see
potentially lost entries. This is hard to prove. But I think we can
assume it works as expected. I also asked the hw guys for
clarification. Will let you know if we must modify the algorithm.
-Robert
Some back-to-back traces:
<...>-2358 [002] 35.807818: perf_ibs_nmi_handler: b2b: seq 335: handled: 1 (#336), last handled: 1 (#335)
<...>-2358 [002] 35.808396: perf_ibs_nmi_handler: b2b: seq 340: handled: 1 (#341), last handled: 1 (#340) [1]
<...>-2358 [002] 35.814160: perf_ibs_nmi_handler: b2b: seq 391: handled: 1 (#392), last handled: 1 (#391)
<...>-2358 [002] 35.818585: perf_ibs_nmi_handler: b2b: seq 430: handled: 1 (#431), last handled: 1 (#430)
<...>-2349 [007] 36.026940: perf_ibs_nmi_handler: b2b: seq 2338: handled: 1 (#2339), last handled: 1 (#2338)
<...>-2349 [007] 36.027063: perf_ibs_nmi_handler: b2b: seq 2364: handled: 1 (#2365), last handled: 0 (#2364)
<...>-2349 [007] 36.027064: perf_event_nmi_handler: b2b: seq 168: handled: 0 (#2365), last handled: 1 (#2364)
<...>-2349 [007] 36.027066: perf_ibs_nmi_handler: b2b: seq 2365: handled: 0 (#2366), last handled: 1 (#2365) [2]
<...>-2349 [007] 36.027068: perf_event_nmi_handler: b2b: seq 169: handled: 1 (#2366), last handled: 0 (#2365) [2]
<...>-2349 [007] 36.027183: perf_ibs_nmi_handler: b2b: seq 2389: handled: 0 (#2390), last handled: 0 (#2389)
<...>-2349 [007] 36.027185: perf_event_nmi_handler: b2b: seq 193: handled: 1 (#2390), last handled: 1 (#2389)
<...>-2349 [007] 36.027189: perf_ibs_nmi_handler: b2b: seq 2391: handled: 1 (#2392), last handled: 0 (#2391) [3]
<...>-2349 [007] 36.027191: perf_event_nmi_handler: b2b: seq 195: handled: 0 (#2392), last handled: 1 (#2391) [3]
<...>-2349 [007] 36.027193: perf_ibs_nmi_handler: b2b: seq 2392: handled: 0 (#2393), last handled: 1 (#2392)
<...>-2349 [007] 36.027195: perf_event_nmi_handler: b2b: seq 196: handled: 1 (#2393), last handled: 0 (#2392)
<...>-2349 [007] 36.027206: perf_ibs_nmi_handler: b2b: seq 2395: handled: 0 (#2396), last handled: 0 (#2395)
<...>-2349 [007] 36.027208: perf_event_nmi_handler: b2b: seq 199: handled: 1 (#2396), last handled: 1 (#2395)
<...>-2349 [007] 36.027212: perf_ibs_nmi_handler: b2b: seq 2396: handled: 0 (#2397), last handled: 0 (#2396)
<...>-2349 [007] 36.027215: perf_event_nmi_handler: b2b: seq 200: handled: 1 (#2397), last handled: 1 (#2396)
<...>-2349 [007] 36.027218: perf_ibs_nmi_handler: b2b: seq 2397: handled: 0 (#2398), last handled: 0 (#2397)
<...>-2349 [007] 36.027221: perf_event_nmi_handler: b2b: seq 201: handled: 1 (#2398), last handled: 1 (#2397)
<...>-2349 [007] 36.027224: perf_ibs_nmi_handler: b2b: seq 2398: handled: 0 (#2399), last handled: 0 (#2398)
<...>-2349 [007] 36.027227: perf_event_nmi_handler: b2b: seq 202: handled: 1 (#2399), last handled: 1 (#2398)
<...>-2349 [007] 36.027302: perf_ibs_nmi_handler: b2b: seq 2414: handled: 0 (#2415), last handled: 0 (#2414)
<...>-2358 [002] 36.026053: perf_ibs_nmi_handler: b2b: seq 2306: handled: 0 (#2307), last handled: 0 (#2306)
<...>-2358 [002] 36.026056: perf_event_nmi_handler: b2b: seq 115: handled: 1 (#2307), last handled: 1 (#2306)
<...>-2358 [002] 36.026081: perf_ibs_nmi_handler: b2b: seq 2313: handled: 0 (#2314), last handled: 0 (#2313) [4]
<...>-2358 [002] 36.026083: perf_event_nmi_handler: b2b: seq 121: handled: 1 (#2314), last handled: 1 (#2313) [4]
<...>-2358 [002] 36.026115: perf_ibs_nmi_handler: b2b: seq 2320: handled: 0 (#2321), last handled: 0 (#2320)
<...>-2358 [002] 36.026117: perf_event_nmi_handler: b2b: seq 128: handled: 1 (#2321), last handled: 1 (#2320)
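The "b2b" lines above come from the check_nmi() helper in the test patch below. As far as can be read from the patch, it treats an NMI as back-to-back when it interrupts at the same instruction pointer as the previous NMI seen by that handler on that CPU. A minimal user-space sketch of that bookkeeping, assuming a plain struct instead of per-CPU variables (last_nmi_sketch and check_nmi_sketch() are hypothetical names, not part of the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the fields of struct last_nmi in the patch; one instance
 * would exist per handler and per CPU. */
struct last_nmi_sketch {
	unsigned long ip;		/* ip at which the last NMI hit */
	unsigned int nmi_count;		/* NMI counter at the last call */
	int handled;			/* result of the last handler run */
	unsigned int seq;		/* calls seen by this handler */
};

/*
 * Record one handler invocation.  Returns true when the interrupted
 * ip equals the stored one, which is taken as a hint that no
 * instruction ran between the two NMIs.  As in the patch, the stored
 * ip is only refreshed when it differs.
 */
static bool check_nmi_sketch(struct last_nmi_sketch *last, unsigned long ip,
			     unsigned int nmi_count, int handled)
{
	bool b2b;

	assert(last != NULL);

	last->seq++;
	b2b = (ip == last->ip);
	if (!b2b)
		last->ip = ip;

	last->handled = handled;
	last->nmi_count = nmi_count;
	return b2b;
}
```

Because each handler keeps its own state in the patch, the "seq" values of perf_ibs_nmi_handler and perf_event_nmi_handler count independently, while the "#" values come from the shared per-CPU NMI counter and can be used to pair up lines from the same NMI.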
>
> > My only question to you is the IBS stuff you were working on. Does that
> > generate a _real_ NMI or does it just piggy back off of the perf NMI?
>
> Yes, IBS generates real NMIs, there is an own interrupt vector for
> it.
>
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
From 215f2880d166489892865c3e9e2b46ee157a7e2d Mon Sep 17 00:00:00 2001
From: Robert Richter <robert.richter@amd.com>
Date: Thu, 15 Sep 2011 11:57:28 +0200
Subject: [PATCH] perf-nmi-test
Signed-off-by: Robert Richter <robert.richter@amd.com>
---
arch/x86/kernel/cpu/perf_event.c | 40 +++++++++++++++++++++++++++---
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 36 ++++++++++++++++++++++++++-
2 files changed, 71 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 64eeac3..5830a81 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1376,15 +1376,47 @@ struct pmu_nmi_state {
 static DEFINE_PER_CPU(struct pmu_nmi_state, pmu_nmi);
+#define irq_stats(x) (&per_cpu(irq_stat, x))
+
+struct last_nmi {
+	unsigned long rip;
+	unsigned int nmi_count;
+	int handled;
+	unsigned int seq;
+};
+
+static DEFINE_PER_CPU(struct last_nmi, last);
+
+static void check_nmi(struct pt_regs *regs, int handled)
+{
+	int cpu = smp_processor_id();
+	unsigned int nmi_count = irq_stats(cpu)->__nmi_count;
+
+	__this_cpu_inc(last.seq);
+	if (regs->ip == __this_cpu_read(last.rip)) {
+		trace_printk("b2b: seq %d: handled: %d (#%d), last handled: %d (#%d)\n",
+			     __this_cpu_read(last.seq),
+			     handled,
+			     nmi_count,
+			     __this_cpu_read(last.handled),
+			     __this_cpu_read(last.nmi_count));
+	} else {
+		__this_cpu_write(last.rip, regs->ip);
+	}
+
+	__this_cpu_write(last.handled, handled);
+	__this_cpu_write(last.nmi_count, nmi_count);
+}
+
 static int __kprobes
 perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 {
-	int handled;
+	int handled = NMI_DONE;
-	if (!atomic_read(&active_events))
-		return NMI_DONE;
+	if (atomic_read(&active_events))
+		handled = x86_pmu.handle_irq(regs);
-	handled = x86_pmu.handle_irq(regs);
+	check_nmi(regs, handled);
 	return handled;
 }
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 11da65b..ac10a94 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -422,6 +422,38 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
 	return 1;
 }
+#define irq_stats(x) (&per_cpu(irq_stat, x))
+
+struct last_nmi {
+	unsigned long rip;
+	unsigned int nmi_count;
+	int handled;
+	unsigned int seq;
+};
+
+static DEFINE_PER_CPU(struct last_nmi, last);
+
+static void check_nmi(struct pt_regs *regs, int handled)
+{
+	int cpu = smp_processor_id();
+	unsigned int nmi_count = irq_stats(cpu)->__nmi_count;
+
+	__this_cpu_inc(last.seq);
+	if (regs->ip == __this_cpu_read(last.rip)) {
+		trace_printk("b2b: seq %d: handled: %d (#%d), last handled: %d (#%d)\n",
+			     __this_cpu_read(last.seq),
+			     handled,
+			     nmi_count,
+			     __this_cpu_read(last.handled),
+			     __this_cpu_read(last.nmi_count));
+	} else {
+		__this_cpu_write(last.rip, regs->ip);
+	}
+
+	__this_cpu_write(last.handled, handled);
+	__this_cpu_write(last.nmi_count, nmi_count);
+}
+
 static int __kprobes
 perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 {
@@ -433,6 +465,8 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 	if (handled)
 		inc_irq_stat(apic_perf_irqs);
+	check_nmi(regs, handled);
+
 	return handled;
 }
@@ -463,7 +497,7 @@ static __init int perf_event_ibs_init(void)
 	perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch");
 	perf_ibs_pmu_init(&perf_ibs_op, "ibs_op");
-	register_nmi_handler(NMI_LOCAL, &perf_ibs_nmi_handler, 0, "perf_ibs");
+	register_nmi_handler(NMI_LOCAL, &perf_ibs_nmi_handler, NMI_FLAG_FIRST, "perf_ibs");
 	printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps);
 	return 0;
--
1.7.6.1
--
Advanced Micro Devices, Inc.
Operating System Research Center
[-- Attachment #2: perf-nmi-test.log.bz2 --]
[-- Type: application/x-bzip2, Size: 11355 bytes --]