public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Will Deacon <will.deacon@arm.com>
To: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: perf: WARNING: kernel/events/core.c:4893 perf_mmap_close
Date: Wed, 24 Aug 2016 10:19:05 +0100	[thread overview]
Message-ID: <20160824091905.GA16944@arm.com> (raw)
In-Reply-To: <87vayrl8fs.fsf@ashishki-desk.ger.corp.intel.com>

Hi Alex,

On Tue, Aug 23, 2016 at 08:08:23PM +0300, Alexander Shishkin wrote:
> Will Deacon <will.deacon@arm.com> writes:
> > On Fri, Aug 12, 2016 at 08:54:49PM +0300, Alexander Shishkin wrote:
> >> Yes, I tracked to a race between unmapping and set_output, trying to
> >> come up with a good fix now.
> >
> > Did you get anywhere with this? I think I just hit the same issue with
> > the intel-pt driver on my broadwell laptop with -rc3+PREEMPT (dump
> > below) and I'm more than happy to test a fix.
> 
> Yes, I posted a patchset [1] this morning that should fix this; if this
> is PT, 3/3 should suffice.

Unfortunately, I still see the same issue with those three patches applied,
so perhaps it's not quite the same as the problem reported by Vince.

A little bit of digging suggests that the preempt count isn't balanced
across the call to perf_pmu_output_stop from perf_mmap_close. Further
digging revealed the unbalanced get_cpu_ptr in __perf_pmu_output_stop.
Fix below!

I also noticed that we go to some effort to return an error code from
__perf_pmu_output_stop, but it's completly ignored by the cpu_function_call
machinery :(

> What's "pt-poker", btw? :)

A very boring application I wrote that just opens a pt event with config=0,
mmaps the AUX buffer and then spins waiting for the data that never comes.
The issue above occurs when I ctrl+c the application under a preemptible
kernel.

Will

--->8

>From 0d59936e07aa9b00f2a5640661bd1153f66914dc Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Wed, 24 Aug 2016 10:07:14 +0100
Subject: [PATCH] perf/core: Use this_cpu_ptr when stopping AUX events

When tearing down an aux buf for an event via perf_mmap_close,
__perf_event_output_stop is called on the event's CPU to ensure that
trace generation is halted before the process of unmapping and
freeing the buffer pages begins.

The callback is performed via cpu_function_call, which ensures that it
runs with interrupts disabled and is therefore not preemptible.
Unfortunately, the current code grabs the per-cpu context pointer using
get_cpu_ptr, which unnecessarily disables preemption and doesn't pair
the call with put_cpu_ptr, leading to a preempt_count() imbalance and
a BUG when freeing the aux buffer later on:

[   92.159611] ------------[ cut here ]------------
[   92.159617] WARNING: CPU: 1 PID: 2249 at kernel/events/ring_buffer.c:539 __rb_free_aux+0x10c/0x120
[   92.159618] Modules linked in:
[   92.159620] CPU: 1 PID: 2249 Comm: spe-poker Tainted: G        W       4.8.0-rc3 #2
[   92.159621] Hardware name: Dell Inc. XPS 13 9343/0F5KF3, BIOS A07 11/11/2015
[   92.159622]  0000000000000000 ffff88020f91fad8 ffffffff813379dd 0000000000000000
[   92.159624]  0000000000000000 ffff88020f91fb18 ffffffff81059ff6 0000021b0f91fb98
[   92.159625]  ffff8802100fe300 ffff88020de1da80 ffff88020de1d800 ffff8802100b8228
[   92.159627] Call Trace:
[   92.159630]  [<ffffffff813379dd>] dump_stack+0x4f/0x72
[   92.159632]  [<ffffffff81059ff6>] __warn+0xc6/0xe0
[   92.159633]  [<ffffffff8105a0c8>] warn_slowpath_null+0x18/0x20
[   92.159634]  [<ffffffff8112761c>] __rb_free_aux+0x10c/0x120
[   92.159636]  [<ffffffff81128163>] rb_free_aux+0x13/0x20
[   92.159637]  [<ffffffff8112515e>] perf_mmap_close+0x29e/0x2f0
[   92.159638]  [<ffffffff8111da30>] ? perf_iterate_ctx+0xe0/0xe0
[   92.159640]  [<ffffffff8115f685>] remove_vma+0x25/0x60
[   92.159641]  [<ffffffff81161796>] exit_mmap+0x106/0x140
[   92.159643]  [<ffffffff8105725c>] mmput+0x1c/0xd0
[   92.159644]  [<ffffffff8105cac3>] do_exit+0x253/0xbf0
[   92.159645]  [<ffffffff8105e32e>] do_group_exit+0x3e/0xb0
[   92.159647]  [<ffffffff81068d49>] get_signal+0x249/0x640
[   92.159648]  [<ffffffff8101c273>] do_signal+0x23/0x640
[   92.159651]  [<ffffffff81905f42>] ? _raw_write_unlock_irq+0x12/0x30
[   92.159652]  [<ffffffff81905f69>] ? _raw_spin_unlock_irq+0x9/0x10
[   92.159653]  [<ffffffff81901896>] ? __schedule+0x2c6/0x710
[   92.159655]  [<ffffffff810022a4>] exit_to_usermode_loop+0x74/0x90
[   92.159656]  [<ffffffff81002a56>] prepare_exit_to_usermode+0x26/0x30
[   92.159658]  [<ffffffff81906d1b>] retint_user+0x8/0x10
[   92.159659] ---[ end trace 926e713af6f1575e ]---

This patch uses this_cpu_ptr instead of get_cpu_ptr, since preemption is
already disabled by the caller.

Fixes: 95ff4ca26c49 ("perf/core: Free AUX pages in unmap path")
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 kernel/events/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5650f5317e0c..3cfabdf7b942 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6166,7 +6166,7 @@ static int __perf_pmu_output_stop(void *info)
 {
 	struct perf_event *event = info;
 	struct pmu *pmu = event->pmu;
-	struct perf_cpu_context *cpuctx = get_cpu_ptr(pmu->pmu_cpu_context);
+	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
 	struct remote_output ro = {
 		.rb	= event->rb,
 	};
-- 
2.5.0

  reply	other threads:[~2016-08-24  9:20 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-08 20:22 perf: WARNING: kernel/events/core.c:4893 perf_mmap_close Vince Weaver
2016-08-09  5:15 ` Alexander Shishkin
2016-08-09 12:43   ` Vince Weaver
2016-08-09 12:53   ` Vince Weaver
2016-08-12 17:17   ` Vince Weaver
2016-08-12 17:54     ` Alexander Shishkin
2016-08-23 15:12       ` Will Deacon
2016-08-23 17:06         ` Peter Zijlstra
2016-08-23 17:08         ` Alexander Shishkin
2016-08-24  9:19           ` Will Deacon [this message]
2016-08-24 11:44             ` Alexander Shishkin
2016-08-24 13:13             ` [tip:perf/urgent] perf/core: Use this_cpu_ptr() when stopping AUX events tip-bot for Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160824091905.GA16944@arm.com \
    --to=will.deacon@arm.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=vincent.weaver@maine.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox