From: "Lendacky, Thomas" <Thomas.Lendacky@amd.com>
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Namhyung Kim <namhyung@kernel.org>, Jiri Olsa <jolsa@redhat.com>
Subject: [PATCH v4 0/3] x86/perf/amd: AMD PMC counters and NMI latency
Date: Tue, 2 Apr 2019 15:21:12 +0000
Message-ID: <cover.1554218314.git.thomas.lendacky@amd.com>

This patch series addresses issues with increased NMI latency in newer
AMD processors that can result in unknown NMI messages when PMC counters
are active.

The following fixes are included in this series:

- Resolve a race condition when disabling an overflowed PMC counter,
  specifically when updating the PMC counter with a new value.
- Resolve handling of active PMC counter overflows in the perf NMI
  handler and when to report that the NMI is not related to a PMC.
- Remove earlier workaround for spurious NMIs by re-ordering the
  PMC stop sequence to disable the PMC first and then remove the PMC
  bit from the active_mask bitmap. As part of disabling the PMC, the
  code will wait for an overflow to be reset.
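
To illustrate the "wait for an overflow to be reset" step, here is a
minimal user-space sketch, not the code from the patches. It assumes a
48-bit counter that is programmed with a negative start value, so the
top bit stays set while counting and is clear after an overflow until
the NMI handler reloads the counter. CNTVAL_BITS, OVERFLOW_WAIT_COUNT,
read_counter_fn, fake_counter and fake_read are names and values made
up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the cover letter. */
#define CNTVAL_BITS         48  /* assumed counter width in bits */
#define OVERFLOW_WAIT_COUNT 50  /* assumed upper bound on polling iterations */

/* Hypothetical stand-in for reading the hardware counter. */
typedef uint64_t (*read_counter_fn)(void *ctx);

/* Top bit clear means the counter wrapped and has not been reloaded yet. */
static bool counter_overflowed(uint64_t value)
{
	return !(value & (1ULL << (CNTVAL_BITS - 1)));
}

/*
 * Poll the (already disabled) counter until a pending overflow has been
 * reset, but never wait forever. Returns true if the reset was observed.
 */
static bool wait_on_overflow(read_counter_fn read_counter, void *ctx)
{
	for (int i = 0; i < OVERFLOW_WAIT_COUNT; i++) {
		if (!counter_overflowed(read_counter(ctx)))
			return true;
		/* Kernel code would insert a short delay here. */
	}
	return false;
}

/* Fake counter: looks overflowed for the first reads_until_reset reads. */
struct fake_counter {
	int reads_until_reset;
	int reads;
};

static uint64_t fake_read(void *ctx)
{
	struct fake_counter *c = ctx;

	c->reads++;
	if (c->reads > c->reads_until_reset)
		return 1ULL << (CNTVAL_BITS - 1);  /* reloaded: top bit set */
	return 0x10;                               /* overflowed: top bit clear */
}
```

The bounded loop is the point of the sketch: the wait is expected to
end quickly, but a stuck counter must not hang the disable path.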

The last patch re-works the order of when the PMC is removed from the
active_mask. There was a comment from a long time ago about having
to clear the bit in active_mask before disabling the counter because
the perf NMI handler could re-enable the PMC again. Looking at the
handler today, I don't see that as possible, hence the reordering. The
question will be whether the Intel PMC support will now have issues.
There is still support for using x86_pmu_handle_irq() in the Intel
core.c file.

Also, I couldn't completely get rid of the "running" bit because it
is used by arch/x86/events/intel/p4.c. An old commit comment seems to
indicate that the p4 code suffered from spurious interrupts:
Commit 03e22198d237 ("perf, x86: Handle in flight NMIs on P4 platform").
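
The effect of the reordering can be shown with a toy user-space model.
This is a sketch under assumptions, not the kernel code: struct
cpu_state, nmi_would_examine() and the two stop functions are
hypothetical, and the model assumes the NMI handler only examines a
counter whose active_mask bit is set. Each stop function reports what
an NMI arriving between its two steps would see.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified per-CPU PMU state. */
struct cpu_state {
	uint64_t active_mask;  /* bit set: software considers the PMC in use */
	uint64_t enabled_hw;   /* bit set: PMC enabled in simulated hardware */
};

/* Assumed handler behavior: only counters in active_mask are examined. */
static bool nmi_would_examine(const struct cpu_state *s, int idx)
{
	return (s->active_mask & (1ULL << idx)) != 0;
}

/* Old order: clear the active bit first, then disable the counter. */
static bool stop_old_order(struct cpu_state *s, int idx)
{
	bool examined_in_window;

	s->active_mask &= ~(1ULL << idx);
	examined_in_window = nmi_would_examine(s, idx);  /* NMI lands here */
	s->enabled_hw &= ~(1ULL << idx);
	return examined_in_window;
}

/* New order: disable the counter first, then clear the active bit. */
static bool stop_new_order(struct cpu_state *s, int idx)
{
	bool examined_in_window;

	s->enabled_hw &= ~(1ULL << idx);  /* real code would also wait here */
	examined_in_window = nmi_would_examine(s, idx);  /* NMI lands here */
	s->active_mask &= ~(1ULL << idx);
	return examined_in_window;
}
```

In this model an NMI that lands in the window is ignored with the old
order, which is the case a "running" bit has to cover, and is still
attributed to the counter with the new order.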

---

Changes from v3:
- Changed nmi.h include from asm/nmi.h to linux/nmi.h.
- Left the information about the last patch in the cover letter, but
  removed the questions which were previously answered.
- Removed the RFC tag.

Changes from v2 (based on feedback from Peter Z):
- Simplified AMD specific disable_all callback by calling the common
  x86_pmu_disable_all() function and then checking and waiting for
  reset of any overflowed PMCs.
- Removed erroneous check for no active counters in the NMI latency
  mitigation patch, which effectively nullified commit 63e6be6d98e1.
- Reworked x86_pmu_stop() in order to remove 63e6be6d98e1.

Changes from v1 (based on feedback from Peter Z):
- Created an AMD specific disable_all callback function to handle the
  disabling of the counters and resolve the race condition.
- Created an AMD specific handle_irq callback function that invokes the
  common x86_pmu_handle_irq() function and then performs the NMI latency
  mitigation.
- Take into account the possibility of non-perf NMI sources when applying
  the mitigation.
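
One way such a wrapper could look, as a user-space sketch only: call
the common handler, and if it handled nothing, still claim the NMI
when a perf NMI was handled recently, so that a late NMI is not
reported as unknown. The time window, the tick-based clock and all
names below are assumptions made for this example and are not taken
from this cover letter. Letting the window expire is how the sketch
accounts for non-perf NMI sources.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum nmi_result { NMI_NOT_OURS = 0, NMI_CLAIMED = 1 };

/* Hypothetical per-CPU mitigation state, in abstract clock ticks. */
struct nmi_state {
	uint64_t window_end;  /* claim unhandled NMIs up to this tick */
	uint64_t window_len;  /* assumed window length, chosen arbitrarily */
};

/*
 * handled_by_common is the number of overflows that the common handler
 * processed for this NMI (0 if it found nothing to do).
 */
static enum nmi_result handle_nmi(struct nmi_state *st, uint64_t now,
				  int handled_by_common)
{
	if (handled_by_common > 0) {
		/* A real PMC overflow: open (or extend) the window. */
		st->window_end = now + st->window_len;
		return NMI_CLAIMED;
	}

	/* Nothing handled and no recent perf NMI: let other sources see it. */
	if (now > st->window_end)
		return NMI_NOT_OURS;

	/* Nothing handled, but a perf NMI was recent: assume a late NMI. */
	return NMI_CLAIMED;
}
```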

This patch series is based on the perf/core branch of tip:
  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core

  Commit 1a9df9e29c2a ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")

Tom Lendacky (3):
  x86/perf/amd: Resolve race condition when disabling PMC
  x86/perf/amd: Resolve NMI latency issues for active PMCs
  x86/perf/amd: Remove need to check "running" bit in NMI handler

 arch/x86/events/amd/core.c | 139 +++++++++++++++++++++++++++++++++++--
 arch/x86/events/core.c     |  13 +---
 2 files changed, 137 insertions(+), 15 deletions(-)

-- 
2.17.1



Thread overview: 5+ messages
2019-04-02 15:21 Lendacky, Thomas [this message]
2019-04-02 15:21 ` [PATCH v4 1/3] x86/perf/amd: Resolve race condition when disabling PMC Lendacky, Thomas
2019-04-02 15:21 ` [PATCH v4 2/3] x86/perf/amd: Resolve NMI latency issues for active PMCs Lendacky, Thomas
2019-04-02 15:21 ` [PATCH v4 3/3] x86/perf/amd: Remove need to check "running" bit in NMI handler Lendacky, Thomas
2019-04-03  7:54 ` [PATCH v4 0/3] x86/perf/amd: AMD PMC counters and NMI latency Peter Zijlstra
