* [tip:perf/core] [perf]  da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
@ 2025-04-14  1:59 kernel test robot
  2025-04-14 19:01 ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: kernel test robot @ 2025-04-14  1:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
	oliver.sang



Hello,

kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:

commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core

[test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]

in testcase: trinity
version: trinity-x86_64-ba2360ed-1_20241228
with following parameters:

	runtime: 300s
	group: group-02
	nr_groups: 5



config: x86_64-randconfig-078-20250407
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250413/202504131701.941039cd-lkp@intel.com


[  100.647813][ T3900] ==================================================================
[  100.648676][ T3900] BUG: KASAN: null-ptr-deref in put_event+0x2a/0x730
[  100.649303][ T3900] Write of size 8 at addr 0000000000000237 by task trinity-c1/3900
[  100.650021][ T3900] 
[  100.650314][ T3900] CPU: 1 UID: 65534 PID: 3900 Comm: trinity-c1 Tainted: G                T   6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary) 
[  100.650323][ T3900] Tainted: [T]=RANDSTRUCT
[  100.650325][ T3900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  100.650328][ T3900] Call Trace:
[  100.650332][ T3900]  <TASK>
[  100.650334][ T3900]  __dump_stack+0x19/0x30
[  100.650345][ T3900]  dump_stack_lvl+0xaf/0x118
[  100.650350][ T3900]  print_report+0x41/0x2d0
[  100.650359][ T3900]  kasan_report+0x15c/0x1a0
[  100.650367][ T3900]  ? put_event+0x2a/0x730
[  100.650373][ T3900]  ? put_event+0x2a/0x730
[  100.650379][ T3900]  kasan_check_range+0x2b3/0x2c0
[  100.650383][ T3900]  __kasan_check_write+0x18/0x20
[  100.650389][ T3900]  put_event+0x2a/0x730
[  100.650392][ T3900]  ? __free_event+0x707/0x7f0
[  100.650398][ T3900]  put_event+0x69f/0x730
[  100.650401][ T3900]  ? perf_event_wakeup+0x66/0x2c0
[  100.650404][ T3900]  ? perf_event_wakeup+0x1b3/0x2c0
[  100.650408][ T3900]  perf_event_exit_event+0xa6/0xd0
[  100.650417][ T3900]  perf_event_exit_task_context+0x44e/0x550
[  100.650424][ T3900]  perf_event_exit_task+0x1dd/0x2a0
[  100.650428][ T3900]  ? fpu__drop+0x131/0x390
[  100.650432][ T3900]  ? preempt_count_sub+0x218/0x2f0
[  100.650441][ T3900]  ? fpu__drop+0x131/0x390
[  100.650445][ T3900]  do_exit+0xa4d/0x2490
[  100.650449][ T3900]  ? _raw_spin_unlock_irq+0x38/0x90
[  100.650454][ T3900]  ? do_group_exit+0x1ae/0x290
[  100.650459][ T3900]  ? _raw_spin_unlock_irq+0x38/0x90
[  100.650463][ T3900]  ? trace_preempt_on+0x179/0x2e0
[  100.650473][ T3900]  do_group_exit+0x1be/0x290
[  100.650478][ T3900]  __x64_sys_exit_group+0x48/0x50
[  100.650481][ T3900]  x64_sys_call+0x2c68/0x2c70
[  100.650484][ T3900]  do_syscall_64+0xff/0x220
[  100.650493][ T3900]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[  100.650499][ T3900] RIP: 0033:0x7fc7ce262349
[  100.650503][ T3900] Code: Unable to access opcode bytes at 0x7fc7ce26231f.
[  100.650505][ T3900] RSP: 002b:00007ffdecd6a3e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[  100.650513][ T3900] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc7ce262349
[  100.650515][ T3900] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[  100.650517][ T3900] RBP: 00007fc7ccbbe058 R08: ffffffffffffff80 R09: fffffffffffffff8
[  100.650522][ T3900] R10: 00007fc7ce1a0200 R11: 0000000000000206 R12: 0000000000000128
[  100.650524][ T3900] R13: 00007fc7ce18b6c0 R14: 00007fc7ccbbe058 R15: 00007fc7ccbbe000
[  100.650530][ T3900]  </TASK>
[  100.650532][ T3900] ==================================================================
[  100.673381][ T3900] BUG: kernel NULL pointer dereference, address: 0000000000000237
[  100.674119][ T3900] #PF: supervisor write access in kernel mode
[  100.674687][ T3900] #PF: error_code(0x0002) - not-present page
[  100.675251][ T3900] PGD 0 P4D 0 
[  100.675618][ T3900] Oops: Oops: 0002 [#1] SMP KASAN
[  100.676091][ T3900] CPU: 1 UID: 65534 PID: 3900 Comm: trinity-c1 Tainted: G    B           T   6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary) 
[  100.677189][ T3900] Tainted: [B]=BAD_PAGE, [T]=RANDSTRUCT
[  100.677704][ T3900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  100.678670][ T3900] RIP: 0010:put_event+0x2a/0x730
[  100.679152][ T3900] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[  100.680761][ T3900] RSP: 0018:ffffc90004d67b70 EFLAGS: 00010246
[  100.681342][ T3900] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[  100.682061][ T3900] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  100.682766][ T3900] RBP: ffffc90004d67bd0 R08: 0000000000000000 R09: 0000000000000000
[  100.686319][ T3900] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
[  100.687356][ T3900] R13: 1ffff11024ef368a R14: dffffc0000000000 R15: ffff88812779b618
[  100.694079][ T3900] FS:  00007fc7ce18b740(0000) GS:ffff888428dd7000(0000) knlGS:0000000000000000
[  100.695168][ T3900] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.695958][ T3900] CR2: 0000000000000237 CR3: 0000000004cd7000 CR4: 00000000000406b0
[  100.696932][ T3900] DR0: 00007fc7cc290000 DR1: 0000000000000000 DR2: 0000000000000000
[  100.702041][ T3900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[  100.703069][ T3900] Call Trace:
[  100.703563][ T3900]  <TASK>
[  100.704019][ T3900]  ? __free_event+0x707/0x7f0
[  100.704649][ T3900]  put_event+0x69f/0x730
[  100.705985][ T3900]  ? perf_event_wakeup+0x66/0x2c0
[  100.706648][ T3900]  ? perf_event_wakeup+0x1b3/0x2c0
[  100.707305][ T3900]  perf_event_exit_event+0xa6/0xd0
[  100.708809][ T3900]  perf_event_exit_task_context+0x44e/0x550
[  100.709677][ T3900]  perf_event_exit_task+0x1dd/0x2a0
[  100.710349][ T3900]  ? fpu__drop+0x131/0x390
[  100.710922][ T3900]  ? preempt_count_sub+0x218/0x2f0
[  100.711576][ T3900]  ? fpu__drop+0x131/0x390
[  100.712162][ T3900]  do_exit+0xa4d/0x2490
[  100.712728][ T3900]  ? _raw_spin_unlock_irq+0x38/0x90
[  100.717506][ T3900]  ? do_group_exit+0x1ae/0x290
[  100.718138][ T3900]  ? _raw_spin_unlock_irq+0x38/0x90
[  100.718797][ T3900]  ? trace_preempt_on+0x179/0x2e0
[  100.719458][ T3900]  do_group_exit+0x1be/0x290
[  100.720074][ T3900]  __x64_sys_exit_group+0x48/0x50
[  100.720714][ T3900]  x64_sys_call+0x2c68/0x2c70
[  100.721340][ T3900]  do_syscall_64+0xff/0x220
[  100.721946][ T3900]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[  100.722683][ T3900] RIP: 0033:0x7fc7ce262349
[  100.723272][ T3900] Code: Unable to access opcode bytes at 0x7fc7ce26231f.
[  100.724109][ T3900] RSP: 002b:00007ffdecd6a3e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[  100.725125][ T3900] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc7ce262349
[  100.726139][ T3900] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[  100.727073][ T3900] RBP: 00007fc7ccbbe058 R08: ffffffffffffff80 R09: fffffffffffffff8
[  100.727935][ T3900] R10: 00007fc7ce1a0200 R11: 0000000000000206 R12: 0000000000000128
[  100.728806][ T3900] R13: 00007fc7ce18b6c0 R14: 00007fc7ccbbe058 R15: 00007fc7ccbbe000
[  100.729722][ T3900]  </TASK>
[  100.730174][ T3900] Modules linked in: tiny_power_button button pcspkr evdev input_leds loop
[  100.731230][ T3900] CR2: 0000000000000237
[  100.731791][ T3900] ---[ end trace 0000000000000000 ]---
[  100.731795][ T3903] BUG: kernel NULL pointer dereference, address: 0000000000000237
[  100.732199][ T3900] RIP: 0010:put_event+0x2a/0x730
[  100.732817][ T3903] #PF: supervisor write access in kernel mode
[  100.733228][ T3900] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[  100.733677][ T3903] #PF: error_code(0x0002) - not-present page
[  100.733686][ T3903] PGD 0 
[  100.735120][ T3900] RSP: 0018:ffffc90004d67b70 EFLAGS: 00010246
[  100.735548][ T3903] P4D 0 
[  100.735789][ T3900] 
[  100.736225][ T3903] Oops: Oops: 0002 [#2] SMP KASAN
[  100.736469][ T3900] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[  100.736653][ T3903] CPU: 0 UID: 65534 PID: 3903 Comm: trinity-c4 Tainted: G    B D         T   6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary) 
[  100.737053][ T3900] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  100.737624][ T3903] Tainted: [B]=BAD_PAGE, [D]=DIE, [T]=RANDSTRUCT
[  100.737634][ T3903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  100.738628][ T3900] RBP: ffffc90004d67bd0 R08: 0000000000000000 R09: 0000000000000000
[  100.739246][ T3903] RIP: 0010:put_event+0x2a/0x730
[  100.739740][ T3900] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
[  100.740493][ T3903] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[  100.741089][ T3900] R13: 1ffff11024ef368a R14: dffffc0000000000 R15: ffff88812779b618
[  100.741461][ T3903] RSP: 0018:ffffc90004d97b70 EFLAGS: 00010246
[  100.741877][ T3900] FS:  00007fc7ce18b740(0000) GS:ffff888428dd7000(0000) knlGS:0000000000000000
[  100.743537][ T3903] 
[  100.744014][ T3900] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.744534][ T3903] RAX: 0000000000000001 RBX: ffffffffffffffff RCX: 0000000000000000
[  100.745071][ T3900] CR2: 0000000000000237 CR3: 0000000004cd7000 CR4: 00000000000406b0
[  100.745279][ T3903] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  100.745681][ T3900] DR0: 00007fc7cc290000 DR1: 0000000000000000 DR2: 0000000000000000
[  100.746350][ T3903] RBP: ffffc90004d97bd0 R08: 0000000000000000 R09: 0000000000000000
[  100.746828][ T3900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[  100.746840][ T3900] Kernel panic - not syncing: Fatal exception
[  101.916945][ T3900] Shutting down cpus with NMI
[  101.929723][ T3900] Kernel Offset: disabled


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


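Annotation on the trace above (not part of the original report): the faulting address 0x237 is what one would expect if put_event() were handed an all-ones pointer. RBX holds ffffffffffffffff, and the faulting instruction in the Code: line (f0 48 ff 8b 38 02 00 00) is a locked 64-bit decrement at displacement 0x238 from RBX. The sketch below redoes that arithmetic and decodes the 0x0002 page-fault error code. It assumes EVENT_TOMBSTONE is the all-ones pointer, as the register dump suggests; the macro and helper names are made up for illustration, and the 0x238 offset is specific to this RANDSTRUCT build.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: EVENT_TOMBSTONE is the all-ones pointer value, matching
 * RBX = ffffffffffffffff in the register dump. */
#define TOMBSTONE_VALUE UINT64_MAX

/* Displacement of the decremented field, read from the bytes
 * "38 02 00 00" in the Code: line. Specific to this randconfig build. */
#define REFCOUNT_OFFSET UINT64_C(0x238)

/* Address touched by "lock dec qword [base + offset]". Unsigned
 * arithmetic wraps modulo 2^64, like the CPU's address calculation. */
static uint64_t fault_address(uint64_t base, uint64_t offset)
{
	return base + offset;
}

/* Low three bits of the x86 page-fault error code. */
struct pf_error {
	bool present;	/* bit 0: 0 = not-present page, 1 = protection fault */
	bool write;	/* bit 1: 0 = read access, 1 = write access */
	bool user;	/* bit 2: 0 = supervisor mode, 1 = user mode */
};

static struct pf_error decode_pf_error(uint64_t code)
{
	struct pf_error e = {
		.present = code & 1,
		.write = (code >> 1) & 1,
		.user = (code >> 2) & 1,
	};
	return e;
}
```

fault_address(TOMBSTONE_VALUE, REFCOUNT_OFFSET) evaluates to 0x237, the address shown in both the KASAN line and CR2. decode_pf_error(0x0002) describes a supervisor-mode write to a not-present page, which agrees with the two #PF lines.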

* Re: [tip:perf/core] [perf]  da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-14  1:59 [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event kernel test robot
@ 2025-04-14 19:01 ` Peter Zijlstra
  2025-04-15  4:46   ` Oliver Sang
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2025-04-14 19:01 UTC (permalink / raw)
  To: kernel test robot
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
	Mark Rutland

Does this help?

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2eb9cd5d86a1..528b679aaf7e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
 	_free_event(event);
 
 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
 


* Re: [tip:perf/core] [perf]  da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-14 19:01 ` Peter Zijlstra
@ 2025-04-15  4:46   ` Oliver Sang
  2025-04-15  9:14     ` James Clark
  0 siblings, 1 reply; 12+ messages in thread
From: Oliver Sang @ 2025-04-15  4:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
	Mark Rutland, oliver.sang

hi, Peter Zijlstra,

On Mon, Apr 14, 2025 at 09:01:38PM +0200, Peter Zijlstra wrote:
> Does this help?

yes, below patch fixes the issues we observed for da916e96e2. thanks

Tested-by: kernel test robot <oliver.sang@intel.com>

> 
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2eb9cd5d86a1..528b679aaf7e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
>  	_free_event(event);
>  
>  	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>  		put_event(parent);
>  }
>  


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15  4:46   ` Oliver Sang
@ 2025-04-15  9:14     ` James Clark
  2025-04-15 10:08       ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: James Clark @ 2025-04-15  9:14 UTC (permalink / raw)
  To: Oliver Sang, Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
	Mark Rutland



On 15/04/2025 5:46 am, Oliver Sang wrote:
> hi, Peter Zijlstra,
> 
> On Mon, Apr 14, 2025 at 09:01:38PM +0200, Peter Zijlstra wrote:
>> Does this help?
> 
> yes, below patch fixes the issues we observed for da916e96e2. thanks
> 
> Tested-by: kernel test robot <oliver.sang@intel.com>
> 

Also fixes the same issues we were seeing:

Tested-by: James Clark <james.clark@linaro.org>

>>
>> ---
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 2eb9cd5d86a1..528b679aaf7e 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
>>   	_free_event(event);
>>   
>>   	/* Matches the refcount bump in inherit_event() */
>> -	if (parent)
>> +	if (parent && parent != EVENT_TOMBSTONE)
>>   		put_event(parent);
>>   }
>>   
> 



* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15  9:14     ` James Clark
@ 2025-04-15 10:08       ` Peter Zijlstra
  2025-04-15 13:14         ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2025-04-15 10:08 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland

On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
> On 15/04/2025 5:46 am, Oliver Sang wrote:

> > yes, below patch fixes the issues we observed for da916e96e2. thanks
> > 
> > Tested-by: kernel test robot <oliver.sang@intel.com>
> > 
> 
> Also fixes the same issues we were seeing:
> 
> Tested-by: James Clark <james.clark@linaro.org>

Excellent, thank you both! Now I gotta go write me a Changelog :-)


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 10:08       ` Peter Zijlstra
@ 2025-04-15 13:14         ` Peter Zijlstra
  2025-04-15 15:52           ` James Clark
                             ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Peter Zijlstra @ 2025-04-15 13:14 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker

On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
> > On 15/04/2025 5:46 am, Oliver Sang wrote:
> 
> > > yes, below patch fixes the issues we observed for da916e96e2. thanks
> > > 
> > > Tested-by: kernel test robot <oliver.sang@intel.com>
> > > 
> > 
> > Also fixes the same issues we were seeing:
> > 
> > Tested-by: James Clark <james.clark@linaro.org>
> 
> Excellent, thank you both! Now I gotta go write me a Changelog :-)

Hmm, so while writing Changelog, I noticed something else was off. The
case where event->parent was set to EVENT_TOMBSTONE now didn't have a
put_event(parent) anymore. So that needs to be put back in as well.

Frederic, afaict this should still be okay, since if we're detached,
then nothing will try and access event->parent in the free path.

Also, nothing in perf_pending_task() will try and access either
event->parent or event->pmu.

---
Subject: perf: Fix event->parent life-time issue
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue Apr 15 12:12:52 CEST 2025

Due to an oversight in merging da916e96e2de ("perf: Make
perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
hang while freeing sigtrap event"), it is now possible to hit
put_event(EVENT_TOMBSTONE), which makes the computer sad.

This also means that for the event->parent == EVENT_TOMBSTONE, the
put_event() matching inherit_event() has gone missing.

Previously this was done in perf_event_release_kernel() after calling
perf_remove_from_context(), but with it delegated to put_event(), this
case is now entirely missed, leading to leaks.

Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
 	 * not being a child event. See for example unaccount_event().
 	 */
 	event->parent = EVENT_TOMBSTONE;
+	put_event(parent_event);
 }
 
 static bool is_orphaned_event(struct perf_event *event)
@@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
 	_free_event(event);
 
 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
 

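Annotation on the patch above (a reading aid, not part of the patch): the reference pairing described in the changelog can be modeled in userspace. The struct, the TOMBSTONE macro, and the model_* functions below are simplified stand-ins invented for this sketch, not kernel code; the real put_event() uses atomic reference counts and does much more when freeing. The model only shows why both hunks are needed: detaching a child must drop the reference taken at inherit time, and the final put of the child must not follow a tombstoned parent pointer.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct perf_event. */
struct event {
	long refcount;
	struct event *parent;
	int freed;
};

/* Assumption: the tombstone is an all-ones pointer, never dereferenced. */
#define TOMBSTONE ((struct event *)-1L)

static void model_put_event(struct event *event);

/* Models the refcount bump in inherit_event(): the child pins its parent. */
static void model_inherit_event(struct event *child, struct event *parent)
{
	parent->refcount++;
	child->refcount = 1;
	child->parent = parent;
	child->freed = 0;
}

/* Models perf_child_detach() with the fix: mark the child as detached
 * and drop the reference on the parent that inherit took. */
static void model_child_detach(struct event *event)
{
	struct event *parent_event = event->parent;

	if (!parent_event || parent_event == TOMBSTONE)
		return;

	event->parent = TOMBSTONE;
	model_put_event(parent_event);
}

/* Models put_event() with the fix: on the last reference, free the
 * event and drop the parent only if it is a real, still-attached one. */
static void model_put_event(struct event *event)
{
	struct event *parent;

	if (--event->refcount)
		return;

	parent = event->parent;
	event->freed = 1;

	if (parent && parent != TOMBSTONE)
		model_put_event(parent);
}
```

In this model, inherit followed by detach followed by the final put of the child leaves the parent with exactly its own reference. Without the put in the detach step the parent would keep a stale reference (the leak the changelog mentions), and without the tombstone check the final put would recurse into the tombstone value (the crash in the report).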

* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 13:14         ` Peter Zijlstra
@ 2025-04-15 15:52           ` James Clark
  2025-04-16  8:46             ` Peter Zijlstra
  2025-04-16  8:36           ` Venkat Rao Bagalkote
  2025-04-17 13:01           ` [tip: perf/core] perf/core: Fix event->parent life-time issue tip-bot2 for Peter Zijlstra
  2 siblings, 1 reply; 12+ messages in thread
From: James Clark @ 2025-04-15 15:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker



On 15/04/2025 2:14 pm, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
>>> On 15/04/2025 5:46 am, Oliver Sang wrote:
>>
>>>> yes, below patch fixes the issues we observed for da916e96e2. thanks
>>>>
>>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>>
>>> Also fixes the same issues we were seeing:
>>>
>>> Tested-by: James Clark <james.clark@linaro.org>
>>
>> Excellent, thank you both! Now I gotta go write me a Changelog :-)
> 
> Hmm, so while writing Changelog, I noticed something else was off. The
> case where event->parent was set to EVENT_TOMBSTONE now didn't have a
> put_event(parent) anymore. So that needs to be put back in as well.
> 
> Frederic, afaict this should still be okay, since if we're detached,
> then nothing will try and access event->parent in the free path.
> 
> Also, nothing in perf_pending_task() will try and access either
> event->parent or event->pmu.
> 
> ---
> Subject: perf: Fix event->parent life-time issue
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Apr 15 12:12:52 CEST 2025
> 
> Due to an oversight in merging da916e96e2de ("perf: Make
> perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
> hang while freeing sigtrap event"), it is now possible to hit
> put_event(EVENT_TOMBSTONE), which makes the computer sad.
> 
> This also means that for the event->parent == EVENT_TOMBSTONE, the
> put_event() matching inherit_event() has gone missing.
> 
> Previously this was done in perf_event_release_kernel() after calling
> perf_remove_from_context(), but with it delegated to put_event(), this
> case is now entirely missed, leading to leaks.
> 
> Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>   kernel/events/core.c |    3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
>   	 * not being a child event. See for example unaccount_event().
>   	 */
>   	event->parent = EVENT_TOMBSTONE;
> +	put_event(parent_event);
>   }
>   
>   static bool is_orphaned_event(struct perf_event *event)
> @@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
>   	_free_event(event);
>   
>   	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>   		put_event(parent);
>   }
>   

Hi Peter,

Unrelated to the pointer deref issue, I'm also seeing perf stat not 
working due to this commit. And that's both with and without this fixup:

  -> perf stat -- true

  Performance counter stats for 'true':

      <not counted> msec task-clock 

      <not counted>      context-switches 

      <not counted>      cpu-migrations 

      <not counted>      page-faults 

      <not counted>      armv8_cortex_a53/instructions/ 

      <not counted>      armv8_cortex_a57/instructions/ 

      <not counted>      armv8_cortex_a53/cycles/ 

      <not counted>      armv8_cortex_a57/cycles/ 

      <not counted>      armv8_cortex_a53/branches/ 

      <not counted>      armv8_cortex_a53/branch-misses/ 

      <not counted>      armv8_cortex_a57/branch-misses/ 


        0.074139992 seconds time elapsed

        0.000000000 seconds user
        0.054797000 seconds sys

Didn't look into it more other than bisecting it to this commit, but I 
can dig more unless the issue is obvious. This is on Arm big.LITTLE, 
although I didn't test it elsewhere so I'm not sure if that's relevant 
or not.

Thanks
James



* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 13:14         ` Peter Zijlstra
  2025-04-15 15:52           ` James Clark
@ 2025-04-16  8:36           ` Venkat Rao Bagalkote
  2025-04-17 13:01           ` [tip: perf/core] perf/core: Fix event->parent life-time issue tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 12+ messages in thread
From: Venkat Rao Bagalkote @ 2025-04-16  8:36 UTC (permalink / raw)
  To: Peter Zijlstra, James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker, venkat88,
	Athira Rajeev, Madhavan Srinivasan


On 15/04/25 6:44 pm, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
>>> On 15/04/2025 5:46 am, Oliver Sang wrote:
>>>> yes, below patch fixes the issues we observed for da916e96e2. thanks
>>>>
>>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>> Also fixes the same issues we were seeing:
>>>
>>> Tested-by: James Clark <james.clark@linaro.org>
>> Excellent, thank you both! Now I gotta go write me a Changelog :-)
> Hmm, so while writing Changelog, I noticed something else was off. The
> case where event->parent was set to EVENT_TOMBSTONE now didn't have a
> put_event(parent) anymore. So that needs to be put back in as well.
>
> Frederic, afaict this should still be okay, since if we're detached,
> then nothing will try and access event->parent in the free path.
>
> Also, nothing in perf_pending_task() will try and access either
> event->parent or event->pmu.
>
> ---
> Subject: perf: Fix event->parent life-time issue
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Apr 15 12:12:52 CEST 2025
>
> Due to an oversight in merging da916e96e2de ("perf: Make
> perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
> hang while freeing sigtrap event"), it is now possible to hit
> put_event(EVENT_TOMBSTONE), which makes the computer sad.
>
> This also means that for the event->parent == EVENT_TOMBSTONE, the
> put_event() matching inherit_event() has gone missing.
>
> Previously this was done in perf_event_release_kernel() after calling
> perf_remove_from_context(), but with it delegated to put_event(), this
> case is now entirely missed, leading to leaks.
>
> Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>   kernel/events/core.c |    3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
>   	 * not being a child event. See for example unaccount_event().
>   	 */
>   	event->parent = EVENT_TOMBSTONE;
> +	put_event(parent_event);
>   }
>   
>   static bool is_orphaned_event(struct perf_event *event)
> @@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
>   	_free_event(event);
>   
>   	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>   		put_event(parent);
>   }
>   


This issue is reported on IBM Power9 servers also. Tested the above 
patch, and issue is fixed. Hence,


Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>


Regards,

Venkat.



* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 15:52           ` James Clark
@ 2025-04-16  8:46             ` Peter Zijlstra
  2025-04-16 19:08               ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2025-04-16  8:46 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker

On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
> Unrelated to the pointer deref issue, I'm also seeing perf stat not working
> due to this commit. And that's both with and without this fixup:
> 
>  -> perf stat -- true
> 
>  Performance counter stats for 'true':
> 
>      <not counted> msec task-clock
> 
>      <not counted>      context-switches
> 
>      <not counted>      cpu-migrations
> 
>      <not counted>      page-faults
> 
>      <not counted>      armv8_cortex_a53/instructions/
> 
>      <not counted>      armv8_cortex_a57/instructions/
> 
>      <not counted>      armv8_cortex_a53/cycles/
> 
>      <not counted>      armv8_cortex_a57/cycles/
> 
>      <not counted>      armv8_cortex_a53/branches/
> 
>      <not counted>      armv8_cortex_a53/branch-misses/
> 
>      <not counted>      armv8_cortex_a57/branch-misses/
> 
> 
>        0.074139992 seconds time elapsed
> 
>        0.000000000 seconds user
>        0.054797000 seconds sys
> 
> Didn't look into it more other than bisecting it to this commit, but I can
> dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
> didn't test it elsewhere so I'm not sure if that's relevant or not.

I can reproduce on x86 alderlake (first machine I tried), so let me go
have a quick poke.


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-16  8:46             ` Peter Zijlstra
@ 2025-04-16 19:08               ` Peter Zijlstra
  2025-04-17  8:58                 ` James Clark
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2025-04-16 19:08 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker

On Wed, Apr 16, 2025 at 10:46:10AM +0200, Peter Zijlstra wrote:
> I can reproduce on x86 alderlake (first machine I tried), so let me go
> have a quick poke.

Could you please try queue.git/perf/core ? I've fixed this and found
another problem.

I'll post the patches tomorrow, after the robot has had a go.


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-16 19:08               ` Peter Zijlstra
@ 2025-04-17  8:58                 ` James Clark
  0 siblings, 0 replies; 12+ messages in thread
From: James Clark @ 2025-04-17  8:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker



On 16/04/2025 8:08 pm, Peter Zijlstra wrote:
> On Wed, Apr 16, 2025 at 10:46:10AM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
>>> Unrelated to the pointer deref issue, I'm also seeing perf stat not working
>>> due to this commit. And that's both with and without this fixup:
>>>
>>>   -> perf stat -- true
>>>
>>>   Performance counter stats for 'true':
>>>
>>>       <not counted> msec task-clock
>>>
>>>       <not counted>      context-switches
>>>
>>>       <not counted>      cpu-migrations
>>>
>>>       <not counted>      page-faults
>>>
>>>       <not counted>      armv8_cortex_a53/instructions/
>>>
>>>       <not counted>      armv8_cortex_a57/instructions/
>>>
>>>       <not counted>      armv8_cortex_a53/cycles/
>>>
>>>       <not counted>      armv8_cortex_a57/cycles/
>>>
>>>       <not counted>      armv8_cortex_a53/branches/
>>>
>>>       <not counted>      armv8_cortex_a53/branch-misses/
>>>
>>>       <not counted>      armv8_cortex_a57/branch-misses/
>>>
>>>
>>>         0.074139992 seconds time elapsed
>>>
>>>         0.000000000 seconds user
>>>         0.054797000 seconds sys
>>>
>>> Didn't look into it more other than bisecting it to this commit, but I can
>>> dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
>>> didn't test it elsewhere so I'm not sure if that's relevant or not.
>>
>> I can reproduce on x86 alderlake (first machine I tried), so let me go
>> have a quick poke.
> 
> Could you please try queue.git/perf/core ? I've fixed this and found
> another problem.
> 
> I'll post the patches tomorrow, after the robot has had a go.

Yep that's all working now, thanks.



* [tip: perf/core] perf/core: Fix event->parent life-time issue
  2025-04-15 13:14         ` Peter Zijlstra
  2025-04-15 15:52           ` James Clark
  2025-04-16  8:36           ` Venkat Rao Bagalkote
@ 2025-04-17 13:01           ` tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 12+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2025-04-17 13:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kernel test robot, James Clark, Venkat Rao Bagalkote,
	Peter Zijlstra (Intel), Ingo Molnar, x86, linux-kernel

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     162c9e3faf58eef653c74d0c774e6583d9225467
Gitweb:        https://git.kernel.org/tip/162c9e3faf58eef653c74d0c774e6583d9225467
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Tue, 15 Apr 2025 12:12:52 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Thu, 17 Apr 2025 14:21:15 +02:00

perf/core: Fix event->parent life-time issue

Due to an oversight in merging:

  da916e96e2de ("perf: Make perf_pmu_unregister() useable")

on top of:

  56799bc03565 ("perf: Fix hang while freeing sigtrap event")

.. it is now possible to hit put_event(EVENT_TOMBSTONE), which makes
the computer sad.

This also means that for the event->parent == EVENT_TOMBSTONE case, the
put_event() matching inherit_event() has gone missing.

Previously this was done in perf_event_release_kernel() after calling
perf_remove_from_context(), but with it delegated to put_event(), this
case is now entirely missed, leading to leaks.

Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: kernel test robot <oliver.sang@intel.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
Link: https://lkml.kernel.org/r/20250415131446.GN5600@noisy.programming.kicks-ass.net
---
 kernel/events/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1a19df9..43d87de 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2343,6 +2343,7 @@ static void perf_child_detach(struct perf_event *event)
 	 * not being a child event. See for example unaccount_event().
 	 */
 	event->parent = EVENT_TOMBSTONE;
+	put_event(parent_event);
 }
 
 static bool is_orphaned_event(struct perf_event *event)
@@ -5688,7 +5689,7 @@ static void put_event(struct perf_event *event)
 	_free_event(event);
 
 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
 

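As a rough user-space illustration of the life-time rule the patch above
restores (not the kernel code itself), the sketch below models a child event
that holds one reference on its parent. All toy_* names, the struct layout
and the sentinel value are assumptions made up for this example; only the
shape of the logic follows the diff: detaching the child drops the parent
reference and leaves a tombstone behind, and the final put must skip the
tombstone instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_event; not the kernel layout. */
struct toy_event {
	int refcount;
	struct toy_event *parent;
	int freed;		/* set instead of calling free(), for inspection */
};

/* Assumed sentinel: a non-NULL pointer value that must never be dereferenced. */
#define TOY_TOMBSTONE ((struct toy_event *)-1L)

/* Drop one reference; on the last one, release the event and its parent ref. */
static void toy_put_event(struct toy_event *event)
{
	struct toy_event *parent;

	assert(event->refcount > 0);
	if (--event->refcount > 0)
		return;

	parent = event->parent;
	event->freed = 1;

	/* Matches the reference taken in toy_inherit_event(). */
	if (parent && parent != TOY_TOMBSTONE)
		toy_put_event(parent);
}

/* Create a child: it pins the parent with one extra reference. */
static void toy_inherit_event(struct toy_event *child, struct toy_event *parent)
{
	parent->refcount++;
	child->refcount = 1;
	child->parent = parent;
	child->freed = 0;
}

/* Detach the child: mark it with the tombstone and drop the parent ref here. */
static void toy_child_detach(struct toy_event *child)
{
	struct toy_event *parent = child->parent;

	if (!parent || parent == TOY_TOMBSTONE)
		return;

	child->parent = TOY_TOMBSTONE;
	toy_put_event(parent);
}
```

In this model, calling toy_inherit_event(), then toy_child_detach(), then
toy_put_event() on the child leaves the parent with exactly the reference
it started with. Without the put in the detach step that reference would
leak, and without the tombstone check the final put would follow the
sentinel pointer.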

Thread overview: 12+ messages
2025-04-14  1:59 [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event kernel test robot
2025-04-14 19:01 ` Peter Zijlstra
2025-04-15  4:46   ` Oliver Sang
2025-04-15  9:14     ` James Clark
2025-04-15 10:08       ` Peter Zijlstra
2025-04-15 13:14         ` Peter Zijlstra
2025-04-15 15:52           ` James Clark
2025-04-16  8:46             ` Peter Zijlstra
2025-04-16 19:08               ` Peter Zijlstra
2025-04-17  8:58                 ` James Clark
2025-04-16  8:36           ` Venkat Rao Bagalkote
2025-04-17 13:01           ` [tip: perf/core] perf/core: Fix event->parent life-time issue tip-bot2 for Peter Zijlstra
