From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Jiri Olsa <jolsa@redhat.com>, Jiri Olsa <jolsa@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andrew Vagin <avagin@openvz.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Namhyung Kim <namhyung@kernel.org>,
Stephane Eranian <eranian@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
Vince Weaver <vincent.weaver@maine.edu>,
Ingo Molnar <mingo@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.4 22/25] perf/ring_buffer: Prevent concurent ring buffer access
Date: Tue, 16 Oct 2018 00:16:03 -0400
Message-ID: <20181016041606.135876-22-sashal@kernel.org>
In-Reply-To: <20181016041606.135876-1-sashal@kernel.org>
From: Jiri Olsa <jolsa@redhat.com>
[ Upstream commit cd6fb677ce7e460c25bdd66f689734102ec7d642 ]
Some of the scheduling tracepoints allow the perf_tp_event
code to write to the ring buffer of a different CPU than the
one the code is running on.
This results in corrupted ring buffer data, as demonstrated by
the following perf commands:
# perf record -e 'sched:sched_switch,sched:sched_wakeup' perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.383 [sec]
[ perf record: Woken up 8 times to write data ]
0x42b890 [0]: failed to process type: -1765585640
[ perf record: Captured and wrote 4.825 MB perf.data (29669 samples) ]
# perf report --stdio
0x42b890 [0]: failed to process type: -1765585640
The corruption is caused by those scheduling tracepoints that
have __perf_task defined and thus allow data to be stored to
another CPU's ring buffer:
sched_waking
sched_wakeup
sched_wakeup_new
sched_stat_wait
sched_stat_sleep
sched_stat_iowait
sched_stat_blocked
The perf_tp_event() function first stores samples for the events
defined for the tracepoint on the current CPU:
    hlist_for_each_entry_rcu(event, head, hlist_entry)
      perf_swevent_event(event, count, &data, regs);
It then iterates over the events of the 'task' and stores the sample
for each of the task's events that passes the tracepoint checks:
    ctx = rcu_dereference(task->perf_event_ctxp[perf_sw_context]);
    list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
      if (event->attr.type != PERF_TYPE_TRACEPOINT)
        continue;
      if (event->attr.config != entry->type)
        continue;
      perf_swevent_event(event, count, &data, regs);
    }
The above code can race with the same code running on another CPU,
ending up with two CPUs trying to store to the same ring
buffer, which is specifically not allowed.
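As a rough illustration of why two writers on one buffer are a problem,
here is a simplified userspace model, not kernel code. The names
model_rb, model_read_head, model_publish and model_overlap are
hypothetical, and the two-step "read head, then publish" reservation is
an assumption made for the illustration, not the kernel's actual
reservation logic:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a ring buffer that only tracks its write head. */
struct model_rb {
	size_t head;
};

/* Step 1 of an unsynchronized reservation: sample the current head. */
static size_t model_read_head(const struct model_rb *rb)
{
	assert(rb != NULL);
	return rb->head;
}

/*
 * Step 2: publish a new head computed from the value sampled in step 1.
 * If another writer ran in between, seen_head is stale.
 */
static void model_publish(struct model_rb *rb, size_t seen_head, size_t len)
{
	assert(rb != NULL);
	rb->head = seen_head + len;
}

/* Nonzero if the byte ranges [a, a + alen) and [b, b + blen) overlap. */
static int model_overlap(size_t a, size_t alen, size_t b, size_t blen)
{
	return a < b + blen && b < a + alen;
}
```

If two writers both call model_read_head() before either calls
model_publish(), they reserve the same offset, their records overlap,
and the final head accounts for only one of them.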
This patch prevents the problem by allowing only events whose CPU
matches the current CPU to receive the event.
NOTE: this requires the use of (per-task-)per-cpu buffers for this
feature to work; perf-record does this.
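The effect of the added check can be sketched in userspace as follows.
This is a minimal model under stated assumptions, not the kernel
implementation: struct model_event, MODEL_TYPE_TRACEPOINT and
model_deliver are hypothetical stand-ins for struct perf_event,
PERF_TYPE_TRACEPOINT and the task-event loop in perf_tp_event(), and
this_cpu stands in for smp_processor_id():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PERF_TYPE_TRACEPOINT. */
#define MODEL_TYPE_TRACEPOINT 2u

/* Hypothetical stand-in for the fields of struct perf_event used here. */
struct model_event {
	int cpu;               /* CPU whose buffer this event writes to */
	unsigned int type;     /* models event->attr.type */
	unsigned long config;  /* models event->attr.config */
	unsigned long samples; /* samples stored so far */
};

/*
 * Deliver one tracepoint hit to the task's events, skipping every event
 * that is not bound to the CPU the code is running on. Returns the
 * number of events that received a sample.
 */
static size_t model_deliver(struct model_event *events, size_t n,
			    int this_cpu, unsigned long tp_id)
{
	size_t delivered = 0;

	assert(events != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct model_event *e = &events[i];

		if (e->cpu != this_cpu)	/* the check added by this patch */
			continue;
		if (e->type != MODEL_TYPE_TRACEPOINT)
			continue;
		if (e->config != tp_id)
			continue;

		e->samples++;
		delivered++;
	}
	return delivered;
}
```

With one event per CPU for the task, a hit on CPU 1 reaches only the
CPU 1 event, so each buffer keeps a single writer.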
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
[peterz: small edits to Changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: e6dab5ffab59 ("perf/trace: Add ability to set a target task for events")
Link: http://lkml.kernel.org/r/20180923161343.GB15054@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/events/core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 990ac41d8a5f..330fcd1b1822 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7018,6 +7018,8 @@ void perf_tp_event(u64 addr, u64 count, void *record, int entry_size,
 			goto unlock;
 
 		list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
+			if (event->cpu != smp_processor_id())
+				continue;
 			if (event->attr.type != PERF_TYPE_TRACEPOINT)
 				continue;
 			if (event->attr.config != entry->type)
--
2.17.1