From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Vince Weaver <vincent.weaver@maine.edu>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Jiri Olsa <jolsa@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Stephane Eranian <eranian@gmail.com>,
Stephane Eranian <eranian@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
"davej@codemonkey.org.uk" <davej@codemonkey.org.uk>,
"dvyukov@google.com" <dvyukov@google.com>,
Ingo Molnar <mingo@kernel.org>
Subject: [PATCH 4.8 21/37] perf/x86/intel: Cure bogus unwind from PEBS entries
Date: Wed, 30 Nov 2016 10:29:58 +0100
Message-ID: <20161130092730.751903772@linuxfoundation.org>
In-Reply-To: <20161130092729.623248210@linuxfoundation.org>
4.8-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra <peterz@infradead.org>
commit b8000586c90b4804902058a38d3a59ce5708e695 upstream.
Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event
unwinds sometimes do 'weird' things. In particular, we seemed to be
ending up unwinding from random places on the NMI stack.
While it was somewhat expected that the event record BP,SP would not
match the interrupt BP,SP in that the interrupt is strictly later than
the record event, it was overlooked that it could be on an already
overwritten stack.
Therefore, don't copy the recorded BP,SP over the interrupted BP,SP
when we need stack unwinds.
Note that it's still possible the unwind doesn't fully match the actual
event, as it's entirely possible to have done an (I)RET between record
and interrupt, but on average it should still point in the general
direction of where the event came from. Also, it's the best we can do,
considering.
The particular scenario that triggered the bogus NMI stack unwind was
a PEBS event with a very short period: upon enabling the event at the
tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly
triggers a record (while still on the NMI stack), which in turn
triggers the next PMI. This then causes back-to-back NMIs, and we'll
try to unwind the stack frame from the last NMI, which obviously is
now overwritten by our own.
Analyzed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk>
Cc: dvyukov@google.com <dvyukov@google.com>
Fixes: ca037701a025 ("perf, x86: Add PEBS infrastructure")
Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/events/intel/ds.c | 35 +++++++++++++++++++++++------------
arch/x86/events/perf_event.h | 2 +-
2 files changed, 24 insertions(+), 13 deletions(-)
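For reviewers who want to poke at the new register selection outside the
kernel, here is a minimal userspace sketch of the logic this patch leaves in
setup_pebs_sample_data(). It is only a model under stated assumptions: the
struct, the MODEL_* constants, and the model_* functions are hypothetical
stand-ins invented for illustration (they are not kernel APIs), and only the
IP, BP, SP and flags handling is reproduced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for both pt_regs and the PEBS record. */
struct model_regs {
	uint64_t ip, bp, sp, flags;
};

/* Stand-ins for PERF_SAMPLE_CALLCHAIN / PERF_SAMPLE_REGS_INTR (values arbitrary). */
#define MODEL_SAMPLE_CALLCHAIN	(1u << 0)
#define MODEL_SAMPLE_REGS_INTR	(1u << 1)

/* Stand-ins for X86_VM_MASK and PERF_EFLAGS_VM (assumed bit positions). */
#define MODEL_X86_VM_MASK	(1ull << 17)
#define MODEL_EFLAGS_VM		(1ull << 5)

/* Models set_linear_ip(): recompute the software VM bit, then set the IP. */
void model_set_linear_ip(struct model_regs *regs, uint64_t ip)
{
	regs->flags &= ~MODEL_EFLAGS_VM;
	if (regs->flags & MODEL_X86_VM_MASK)
		regs->flags |= MODEL_EFLAGS_VM;
	regs->ip = ip;
}

/* Models the patched fixup: start from the interrupt regs, take the IP from
 * the record, and take BP,SP from the record only when no callchain is
 * requested, so the unwinder always starts from the interrupt frame. */
void model_setup_regs(struct model_regs *regs,
		      const struct model_regs *iregs,
		      const struct model_regs *pebs,
		      unsigned int sample_type)
{
	assert(regs && iregs && pebs);

	*regs = *iregs;
	regs->flags = pebs->flags;
	model_set_linear_ip(regs, pebs->ip);

	if (sample_type & MODEL_SAMPLE_REGS_INTR) {
		if (!(sample_type & MODEL_SAMPLE_CALLCHAIN)) {
			regs->bp = pebs->bp;
			regs->sp = pebs->sp;
		}
		/* Keep the VM bit computed by model_set_linear_ip(). */
		regs->flags = pebs->flags | (regs->flags & MODEL_EFLAGS_VM);
	}
}
```

With a callchain requested, BP,SP in the result come from the interrupt regs
even when the record carries different values; without one, a REGS_INTR sample
still reports the recorded BP,SP.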
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1070,20 +1070,20 @@ static void setup_pebs_sample_data(struc
}
/*
- * We use the interrupt regs as a base because the PEBS record
- * does not contain a full regs set, specifically it seems to
- * lack segment descriptors, which get used by things like
- * user_mode().
+ * We use the interrupt regs as a base because the PEBS record does not
+ * contain a full regs set, specifically it seems to lack segment
+ * descriptors, which get used by things like user_mode().
*
- * In the simple case fix up only the IP and BP,SP regs, for
- * PERF_SAMPLE_IP and PERF_SAMPLE_CALLCHAIN to function properly.
- * A possible PERF_SAMPLE_REGS will have to transfer all regs.
+ * In the simple case fix up only the IP for PERF_SAMPLE_IP.
+ *
+ * We must however always use BP,SP from iregs for the unwinder to stay
+ * sane; the record BP,SP can point into thin air when the record is
+ * from a previous PMI context or an (I)RET happend between the record
+ * and PMI.
*/
*regs = *iregs;
regs->flags = pebs->flags;
set_linear_ip(regs, pebs->ip);
- regs->bp = pebs->bp;
- regs->sp = pebs->sp;
if (sample_type & PERF_SAMPLE_REGS_INTR) {
regs->ax = pebs->ax;
@@ -1092,10 +1092,21 @@ static void setup_pebs_sample_data(struc
regs->dx = pebs->dx;
regs->si = pebs->si;
regs->di = pebs->di;
- regs->bp = pebs->bp;
- regs->sp = pebs->sp;
- regs->flags = pebs->flags;
+ /*
+ * Per the above; only set BP,SP if we don't need callchains.
+ *
+ * XXX: does this make sense?
+ */
+ if (!(sample_type & PERF_SAMPLE_CALLCHAIN)) {
+ regs->bp = pebs->bp;
+ regs->sp = pebs->sp;
+ }
+
+ /*
+ * Preserve PERF_EFLAGS_VM from set_linear_ip().
+ */
+ regs->flags = pebs->flags | (regs->flags & PERF_EFLAGS_VM);
#ifndef CONFIG_X86_32
regs->r8 = pebs->r8;
regs->r9 = pebs->r9;
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -113,7 +113,7 @@ struct debug_store {
* Per register state.
*/
struct er_account {
- raw_spinlock_t lock; /* per-core: protect structure */
+ raw_spinlock_t lock; /* per-core: protect structure */
u64 config; /* extra MSR config */
u64 reg; /* extra MSR number */
atomic_t ref; /* reference count */