From: Ravi Bangoria <ravi.bangoria@amd.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "mingo@kernel.org" <mingo@kernel.org>,
"lucas.demarchi@intel.com" <lucas.demarchi@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"willy@infradead.org" <willy@infradead.org>,
"acme@kernel.org" <acme@kernel.org>,
"namhyung@kernel.org" <namhyung@kernel.org>,
"mark.rutland@arm.com" <mark.rutland@arm.com>,
"alexander.shishkin@linux.intel.com"
<alexander.shishkin@linux.intel.com>,
"jolsa@kernel.org" <jolsa@kernel.org>,
"irogers@google.com" <irogers@google.com>,
"adrian.hunter@intel.com" <adrian.hunter@intel.com>,
"kan.liang@linux.intel.com" <kan.liang@linux.intel.com>,
Ravi Bangoria <ravi.bangoria@amd.com>
Subject: Re: [PATCH v2 24/24] perf: Make perf_pmu_unregister() useable
Date: Wed, 19 Feb 2025 18:53:46 +0530 [thread overview]
Message-ID: <e8dfc3df-69d6-46ba-81db-4a04484fb7c3@amd.com> (raw)
In-Reply-To: <d585f337-5ce7-4a02-b890-5f3888e59ad0@amd.com>
Hi Peter,
>>>> Apparently not, it ends up with:
>>>>
>>>> ------------[ cut here ]------------
>>>> WARNING: CPU: 145 PID: 5459 at kernel/events/core.c:281 event_function+0xd2/0xf0
>>>> WARNING: CPU: 145 PID: 5459 at kernel/events/core.c:286 event_function+0xd6/0xf0
>>>
>>>> remote_function+0x4f/0x70
>>>> generic_exec_single+0x7f/0x160
>>>> smp_call_function_single+0x110/0x160
>>>> event_function_call+0x98/0x1d0
>>>> _perf_event_disable+0x41/0x70
>>>> perf_event_for_each_child+0x40/0x90
>>>> _perf_ioctl+0xac/0xb00
>>>> perf_ioctl+0x45/0x80
>>>
>>> Took me a long while trying to blame this on the 'event->parent =
>>> NULL;', but AFAICT this is a new, unrelated issue.
>>>
>>> What I think happens is this perf_ioctl(DISABLE) vs pmu_detach_events()
>>> race, where the crux is that perf_ioctl() path does not take
>>> event2->mutex which allows the following interleave:
>>
>> This one was only with perf_fuzzer, so pmu_detach_events() code path was
>> not invoked.
>
> I think the issue is, unaccount_event() gets called for the child event
> after the child is detached. Since event->parent is NULL, unaccount_event()
> abruptly corrupts the perf_sched_work.
A different manifestation of the same underlying issue (perf_sched_work
is getting corrupted because of event->parent = NULL):
WARNING: CPU: 0 PID: 5799 at kernel/events/core.c:286 event_function+0xd6/0xf0
CPU: 0 UID: 1002 PID: 5799 Comm: perf_fuzzer Kdump: loaded Not tainted 6.14.0-rc1-pmu-unregister+ #263
RIP: 0010:event_function+0xd6/0xf0
RSP: 0018:ffa0000006a87828 EFLAGS: 00010086
RAX: 0000000000000007 RBX: ff11001008830ee0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffa0000006a87850 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffa0000006a87918
R13: 0000000000000000 R14: ff1100010e347c00 R15: ff110001226c9a40
FS: 0000000000000000(0000) GS:ff11001008800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000595fed7d3e12 CR3: 000000016954c005 CR4: 0000000000f71ef0
PKRU: 55555554
Call Trace:
<TASK>
? event_function+0xd6/0xf0
? event_function+0x3d/0xf0
remote_function+0x4c/0x70
generic_exec_single+0x7c/0x160
smp_call_function_single+0x110/0x160
event_function_call+0x98/0x1d0
perf_remove_from_context+0x46/0xa0
perf_event_release_kernel+0x1f3/0x210
perf_release+0x12/0x20
With the above trace as reference, something like below is what is
happening. (Code paths are based on my understanding of traces, I've
not verified each an every bit :P)
Assume:
task1: event1
task2 (child of task1): event2 (child of event1)
task3: event3
CPU 1 CPU 2 CPU 3 CPU 4
/* task3 sched in */
perf_event_task_sched_in()
if (static_branch_unlikely(&perf_sched_events)) /* true */
__perf_event_task_sched_in()
...
task2:
perf_event_exit_event() /* event2 */
event->parent = NULL /* by perf_perf_child_detach() */
free_event()
_free_event()
unaccount_event()
if (event->parent) /* false */
return; /* won't get executed */
perf_sched_work <== OFF ------.
|
| /* task3 sched out */
| perf_event_task_sched_out()
'-----> if (static_branch_unlikely(&perf_sched_events)) /* false */
| /* __perf_event_task_sched_out() won't get called */
|
| /* task3 sched in */
| perf_event_task_sched_in()
'--------------------------------------------------------------------------> if (static_branch_unlikely(&perf_sched_events)) /* false */
/* __perf_event_task_sched_in() won't get called
So cpuctx->task_ctx will remains NULL; */
close(event3)
perf_release()
perf_event_release_kernel()
perf_remove_from_context()
event_function_call()
/* IPI to CPU 3 ---> */
/* ---> IPI from CPU 4 */
event_function()
if (ctx->task) {
/* task_ctx is NULL whereas ctx is !NULL */
WARN_ON_ONCE(task_ctx != ctx);
^^^^^^^^^^^^
IIUC, event->parent is the only way to detect whether the event is a
child or not, so it makes sense to preserve the ->parent pointer even
after the child is detached.
Thanks,
Ravi
next prev parent reply other threads:[~2025-02-19 13:23 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-05 10:21 [PATCH v2 00/24] perf: Make perf_pmu_unregister() usable Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 01/24] lockdep: Fix might_fault() Peter Zijlstra
2025-02-06 18:19 ` David Hildenbrand
2025-02-05 10:21 ` [PATCH v2 02/24] perf: Ensure bpf_perf_link path is properly serialized Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 03/24] perf: Simplify child event tear-down Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 04/24] perf: Simplify perf_event_free_task() wait Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 05/24] perf: Simplify perf_event_release_kernel() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 06/24] perf: Fix pmus_lock vs pmus_srcu ordering Peter Zijlstra
2025-02-27 16:59 ` Lucas De Marchi
2025-02-05 10:21 ` [PATCH v2 07/24] perf: Fix perf_pmu_register() vs perf_init_event() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 08/24] perf: Cleanup perf_try_init_event() Peter Zijlstra
2025-03-05 11:29 ` [tip: perf/core] perf/core: Clean up perf_try_init_event() tip-bot2 for Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 09/24] perf: Simplify perf_event_alloc() error path Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 10/24] perf: Simplify perf_pmu_register() " Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 11/24] perf: Simplify perf_pmu_register() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 12/24] perf: Simplify perf_init_event() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 13/24] perf: Simplify perf_event_alloc() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 14/24] perf: Merge pmu_disable_count into cpu_pmu_context Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 15/24] perf: Add this_cpc() helper Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 16/24] perf: Detach perf_cpu_pmu_context and pmu lifetimes Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 17/24] perf: Introduce perf_free_addr_filters() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 18/24] perf: Robustify perf_event_free_bpf_prog() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 19/24] perf: Simplify perf_mmap() control flow Peter Zijlstra
2025-03-03 5:39 ` Ravi Bangoria
2025-03-03 11:19 ` Ingo Molnar
2025-03-03 13:36 ` Ravi Bangoria
2025-03-04 8:44 ` Ingo Molnar
2025-02-05 10:21 ` [PATCH v2 20/24] perf: Fix perf_mmap() failure path Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 21/24] perf: Further simplify perf_mmap() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 22/24] perf: Remove retry loop from perf_mmap() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 23/24] perf: Lift event->mmap_mutex in perf_mmap() Peter Zijlstra
2025-02-05 10:21 ` [PATCH v2 24/24] perf: Make perf_pmu_unregister() useable Peter Zijlstra
2025-02-10 6:39 ` Ravi Bangoria
2025-02-11 15:46 ` Peter Zijlstra
2025-02-10 6:42 ` Ravi Bangoria
2025-02-12 12:49 ` Peter Zijlstra
2025-02-13 7:52 ` Ravi Bangoria
2025-02-13 13:08 ` Peter Zijlstra
2025-02-14 3:57 ` Ravi Bangoria
2025-02-14 20:24 ` Peter Zijlstra
2025-02-17 8:24 ` Ravi Bangoria
2025-02-17 16:31 ` Ravi Bangoria
2025-02-19 13:23 ` Ravi Bangoria [this message]
2025-02-19 14:30 ` Ravi Bangoria
2025-02-10 6:59 ` Ravi Bangoria
2025-02-13 13:07 ` Peter Zijlstra
2025-03-03 6:01 ` [PATCH v2 00/24] perf: Make perf_pmu_unregister() usable Ravi Bangoria
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e8dfc3df-69d6-46ba-81db-4a04484fb7c3@amd.com \
--to=ravi.bangoria@amd.com \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=irogers@google.com \
--cc=jolsa@kernel.org \
--cc=kan.liang@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lucas.demarchi@intel.com \
--cc=mark.rutland@arm.com \
--cc=mingo@kernel.org \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox