All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Jiri Olsa <jolsa@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	David Arcari <darcari@redhat.com>, Jiri Olsa <jolsa@redhat.com>,
	Lendacky Thomas <Thomas.Lendacky@amd.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Stephane Eranian <eranian@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vince Weaver <vincent.weaver@maine.edu>,
	Ingo Molnar <mingo@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.19 25/25] perf/x86/intel: Fix race in intel_pmu_disable_event()
Date: Thu, 16 May 2019 07:40:28 -0400	[thread overview]
Message-ID: <20190516114029.8682-25-sashal@kernel.org> (raw)
In-Reply-To: <20190516114029.8682-1-sashal@kernel.org>

From: Jiri Olsa <jolsa@kernel.org>

[ Upstream commit 6f55967ad9d9752813e36de6d5fdbd19741adfc7 ]

New race in x86_pmu_stop() was introduced by replacing the
atomic __test_and_clear_bit() of cpuc->active_mask by separate
test_bit() and __clear_bit() calls in the following commit:

  3966c3feca3f ("x86/perf/amd: Remove need to check "running" bit in NMI handler")

The race causes panic for PEBS events with enabled callchains:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  ...
  RIP: 0010:perf_prepare_sample+0x8c/0x530
  Call Trace:
   <NMI>
   perf_event_output_forward+0x2a/0x80
   __perf_event_overflow+0x51/0xe0
   handle_pmi_common+0x19e/0x240
   intel_pmu_handle_irq+0xad/0x170
   perf_event_nmi_handler+0x2e/0x50
   nmi_handle+0x69/0x110
   default_do_nmi+0x3e/0x100
   do_nmi+0x11a/0x180
   end_repeat_nmi+0x16/0x1a
  RIP: 0010:native_write_msr+0x6/0x20
  ...
   </NMI>
   intel_pmu_disable_event+0x98/0xf0
   x86_pmu_stop+0x6e/0xb0
   x86_pmu_del+0x46/0x140
   event_sched_out.isra.97+0x7e/0x160
  ...

The event is configured to make samples from PEBS drain code,
but when it's disabled, we'll go through NMI path instead,
where data->callchain will not get allocated and we'll crash:

          x86_pmu_stop
            test_bit(hwc->idx, cpuc->active_mask)
            intel_pmu_disable_event(event)
            {
              ...
              intel_pmu_pebs_disable(event);
              ...

EVENT OVERFLOW ->  <NMI>
                     intel_pmu_handle_irq
                       handle_pmi_common
   TEST PASSES ->        test_bit(bit, cpuc->active_mask))
                           perf_event_overflow
                             perf_prepare_sample
                             {
                               ...
                               if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
                                     data->callchain = perf_callchain(event, regs);

         CRASH ->              size += data->callchain->nr;
                             }
                   </NMI>
              ...
              x86_pmu_disable_event(event)
            }

            __clear_bit(hwc->idx, cpuc->active_mask);

Fixing this by disabling the event itself before setting
off the PEBS bit.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Arcari <darcari@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Lendacky Thomas <Thomas.Lendacky@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 3966c3feca3f ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/events/intel/core.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index a759e59990fbd..09c53bcbd497d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2074,15 +2074,19 @@ static void intel_pmu_disable_event(struct perf_event *event)
 	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
 	cpuc->intel_cp_status &= ~(1ull << hwc->idx);
 
-	if (unlikely(event->attr.precise_ip))
-		intel_pmu_pebs_disable(event);
-
 	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
 		intel_pmu_disable_fixed(hwc);
 		return;
 	}
 
 	x86_pmu_disable_event(event);
+
+	/*
+	 * Needs to be called after x86_pmu_disable_event,
+	 * so we don't trigger the event without PEBS bit set.
+	 */
+	if (unlikely(event->attr.precise_ip))
+		intel_pmu_pebs_disable(event);
 }
 
 static void intel_pmu_del_event(struct perf_event *event)
-- 
2.20.1


      parent reply	other threads:[~2019-05-16 11:41 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-16 11:40 [PATCH AUTOSEL 4.19 01/25] xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 02/25] xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 03/25] vti4: ipip tunnel deregistration fixes Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 04/25] xfrm: clean up xfrm protocol checks Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 05/25] esp4: add length check for UDP encapsulation Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 06/25] xfrm: Honor original L3 slave device in xfrmi policy lookup Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 07/25] xfrm4: Fix uninitialized memory read in _decode_session4 Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 08/25] clk: sunxi-ng: nkmp: Avoid GENMASK(-1, 0) Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 09/25] power: supply: cpcap-battery: Fix division by zero Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 10/25] securityfs: fix use-after-free on symlink traversal Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 11/25] apparmorfs: " Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 12/25] PCI: Fix issue with "pci=disable_acs_redir" parameter being ignored Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 13/25] x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012 Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 14/25] mac80211: Fix kernel panic due to use of txq after free Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 15/25] net: ieee802154: fix missing checks for regmap_update_bits Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 16/25] KVM: arm/arm64: Ensure vcpu target is unset on reset failure Sasha Levin
2019-05-16 11:40   ` Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 17/25] power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 18/25] bpf: Fix preempt_enable_no_resched() abuse Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 19/25] qmi_wwan: new Wistron, ZTE and D-Link devices Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 20/25] iwlwifi: mvm: check for length correctness in iwl_mvm_create_skb() Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 21/25] sched/cpufreq: Fix kobject memleak Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 22/25] x86/mm/mem_encrypt: Disable all instrumentation for early SME setup Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 23/25] ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour Sasha Levin
2019-05-16 11:40 ` [PATCH AUTOSEL 4.19 24/25] perf bench numa: Add define for RUSAGE_THREAD if not present Sasha Levin
2019-05-16 11:40   ` Sasha Levin
2019-05-16 11:40 ` Sasha Levin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190516114029.8682-25-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=Thomas.Lendacky@amd.com \
    --cc=acme@redhat.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=darcari@redhat.com \
    --cc=eranian@google.com \
    --cc=jolsa@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.weaver@maine.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.