linux-perf-users.vger.kernel.org archive mirror
From: "David Wang" <00107082@163.com>
To: "Yeoreum Yun" <yeoreum.yun@arm.com>
Cc: peterz@infradead.org, mingo@redhat.com, mingo@kernel.org,
	acme@kernel.org, namhyung@kernel.org, leo.yan@arm.com,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@kernel.org, irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re:[PATCH 1/1] perf/core: fix dangling cgroup pointer in cpuctx
Date: Tue, 3 Jun 2025 10:01:41 +0800 (CST)
Message-ID: <3842c9a5.20ee.1973385e209.Coremail.00107082@163.com>
In-Reply-To: <20250602184049.4010919-1-yeoreum.yun@arm.com>


At 2025-06-03 02:40:49, "Yeoreum Yun" <yeoreum.yun@arm.com> wrote:
>commit a3c3c6667("perf/core: Fix child_total_time_enabled accounting bug at task exit")
>moves the event->state update ahead of list_del_event().
>As a result, perf_cgroup_event_disable() is no longer called, so
>cpuctx->cgrp is never cleared and is left as a dangling cgroup pointer.
>
>Because of this problem, some machines hit the panic below [0]:
>
>[863.881960] sysvec_call_function_single+0x4c/0xc0
>[863.881301] asm_sysvec_call_function_single+0x16/0x20
>[863.881344] RIP: 0033:0x7f9a1cea3367
>[863.881373] Code: 00 66 99 b8 ff ff ff ff c3 66 ....
>[863.881524] RSP: 002b:00007fffa526fcf8 EFLAGS: 00000246
>[863.881567] RAX: 0000562060c962d0 RBX: 0000000000000002 RCX: 00007f9a1cff1c60
>[863.881625] RDX: 00007f9a0c000030 RSI: 00007f9a1cff1c60 RDI: 00007f9a1ca91c20
>[863.881682] RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f9a1d6217a0
>[863.881740] R10: 00007f9a1ca91c10 R11: 0000000000000246 R12: 00007f9a1d70c020
>[863.881798] R13: 00007fffa5270030 R14: 00007fffa526fd00 R15: 0000000000000000
>[863.881860] </TASK>
>[863.881876] Modules linked in: snd_seq_dummy(E) snd_hrtimer(E)...
>...
>[863.887142] button(E)
>[863.912127] CR2: ffffe4afcc079650
>[863.914593] ---[ end trace 0000000000000000 ]---
>[864.042750] RIP: 0010:ctx_sched_out+0x1ce/0x210
>[864.045214] Code: 89 c6 4c 8b b9 de 00 00 00 48 ...
>[864.050343] RSP: 0000:ffffaa4ec0f3fe60 EFLAGS: 00010086
>[864.052929] RAX: 0000000000000002 RBX: ffff8e8eeed2a580 RCX: ffff8e8bded9bf00
>[864.055518] RDX: 000000c92340b051 RSI: 000000c92340b051 RDI: ffff
>[864.058093] RBP: 0000000000000000 R08: 0000000000000002 R09: 00
>[864.060654] R10: 0000000000000000 R11: 0000000000000000 R12: 000
>[864.063183] R13: ffff8e8eeed2a580 R14: 0000000000000007 R15: ffffe4afcc079650
>[864.065729] FS:  00007f9a1ca91940(0000) GS:ffff8e8f6b1c3000(0000) knlGS:0000000000000000
>[864.068312] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[864.070898] CR2: ffffe4afcc079650 CR3: 00000001136d8000 CR4: 0000000000350ef0
>[864.073523] Kernel panic - not syncing: Fatal exception in interrupt
>[864.076410] Kernel Offset: 0xc00000 from 0xffffffff81000000 (relocation range: 0xff
>[864.205401] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
>
>To address this, call perf_cgroup_event_disable() explicitly before
>list_del_event() in __perf_remove_from_context().
>
>Link: https://lore.kernel.org/all/aD2TspKH%2F7yvfYoO@e129823.arm.com/ [0]
>Fixes: a3c3c6667("perf/core: Fix child_total_time_enabled accounting bug at task exit")
>Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
>Tested-by: David Wang <00107082@163.com>
>---
> kernel/events/core.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
>diff --git a/kernel/events/core.c b/kernel/events/core.c
>index f34c99f8ce8f..909b9d5a65c1 100644
>--- a/kernel/events/core.c
>+++ b/kernel/events/core.c
>@@ -2498,6 +2498,10 @@ __perf_remove_from_context(struct perf_event *event,
> 		state = PERF_EVENT_STATE_DEAD;
> 	}
> 	event_sched_out(event, ctx);
>+
>+	if (event->state > PERF_EVENT_STATE_OFF)
>+		perf_cgroup_event_disable(event, ctx);
>+
> 	perf_event_set_state(event, min(event->state, state));
> 
> 	if (flags & DETACH_GROUP)
>-- 
>LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}

I think this patch is no better than my patch in the original report:
https://lore.kernel.org/all/20250601173603.3920-1-00107082@163.com/


This patch is more aggressive: it adds more changes to the original logic, which is
the same practice as in the offending commit, and it would raise more concerns
about hidden side effects.

For example, this code in list_del_event() should raise concerns about this patch:

	/*
	 * We can have double detach due to exit/hot-unplug + close.
	 */
	if (!(event->attach_state & PERF_ATTACH_CONTEXT))
		return;


Thanks
David
