linux-perf-users.vger.kernel.org archive mirror
* [tip:perf/core] [perf]  da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
@ 2025-04-14  1:59 kernel test robot
  2025-04-14 19:01 ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: kernel test robot @ 2025-04-14  1:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
	oliver.sang



Hello,

kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:

commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core

[test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]

in testcase: trinity
version: trinity-x86_64-ba2360ed-1_20241228
with following parameters:

	runtime: 300s
	group: group-02
	nr_groups: 5



config: x86_64-randconfig-078-20250407
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250413/202504131701.941039cd-lkp@intel.com


[  100.647813][ T3900] ==================================================================
[  100.648676][ T3900] BUG: KASAN: null-ptr-deref in put_event+0x2a/0x730
[  100.649303][ T3900] Write of size 8 at addr 0000000000000237 by task trinity-c1/3900
[  100.650021][ T3900] 
[  100.650314][ T3900] CPU: 1 UID: 65534 PID: 3900 Comm: trinity-c1 Tainted: G                T   6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary) 
[  100.650323][ T3900] Tainted: [T]=RANDSTRUCT
[  100.650325][ T3900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  100.650328][ T3900] Call Trace:
[  100.650332][ T3900]  <TASK>
[  100.650334][ T3900]  __dump_stack+0x19/0x30
[  100.650345][ T3900]  dump_stack_lvl+0xaf/0x118
[  100.650350][ T3900]  print_report+0x41/0x2d0
[  100.650359][ T3900]  kasan_report+0x15c/0x1a0
[  100.650367][ T3900]  ? put_event+0x2a/0x730
[  100.650373][ T3900]  ? put_event+0x2a/0x730
[  100.650379][ T3900]  kasan_check_range+0x2b3/0x2c0
[  100.650383][ T3900]  __kasan_check_write+0x18/0x20
[  100.650389][ T3900]  put_event+0x2a/0x730
[  100.650392][ T3900]  ? __free_event+0x707/0x7f0
[  100.650398][ T3900]  put_event+0x69f/0x730
[  100.650401][ T3900]  ? perf_event_wakeup+0x66/0x2c0
[  100.650404][ T3900]  ? perf_event_wakeup+0x1b3/0x2c0
[  100.650408][ T3900]  perf_event_exit_event+0xa6/0xd0
[  100.650417][ T3900]  perf_event_exit_task_context+0x44e/0x550
[  100.650424][ T3900]  perf_event_exit_task+0x1dd/0x2a0
[  100.650428][ T3900]  ? fpu__drop+0x131/0x390
[  100.650432][ T3900]  ? preempt_count_sub+0x218/0x2f0
[  100.650441][ T3900]  ? fpu__drop+0x131/0x390
[  100.650445][ T3900]  do_exit+0xa4d/0x2490
[  100.650449][ T3900]  ? _raw_spin_unlock_irq+0x38/0x90
[  100.650454][ T3900]  ? do_group_exit+0x1ae/0x290
[  100.650459][ T3900]  ? _raw_spin_unlock_irq+0x38/0x90
[  100.650463][ T3900]  ? trace_preempt_on+0x179/0x2e0
[  100.650473][ T3900]  do_group_exit+0x1be/0x290
[  100.650478][ T3900]  __x64_sys_exit_group+0x48/0x50
[  100.650481][ T3900]  x64_sys_call+0x2c68/0x2c70
[  100.650484][ T3900]  do_syscall_64+0xff/0x220
[  100.650493][ T3900]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[  100.650499][ T3900] RIP: 0033:0x7fc7ce262349
[  100.650503][ T3900] Code: Unable to access opcode bytes at 0x7fc7ce26231f.
[  100.650505][ T3900] RSP: 002b:00007ffdecd6a3e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[  100.650513][ T3900] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc7ce262349
[  100.650515][ T3900] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[  100.650517][ T3900] RBP: 00007fc7ccbbe058 R08: ffffffffffffff80 R09: fffffffffffffff8
[  100.650522][ T3900] R10: 00007fc7ce1a0200 R11: 0000000000000206 R12: 0000000000000128
[  100.650524][ T3900] R13: 00007fc7ce18b6c0 R14: 00007fc7ccbbe058 R15: 00007fc7ccbbe000
[  100.650530][ T3900]  </TASK>
[  100.650532][ T3900] ==================================================================
[  100.673381][ T3900] BUG: kernel NULL pointer dereference, address: 0000000000000237
[  100.674119][ T3900] #PF: supervisor write access in kernel mode
[  100.674687][ T3900] #PF: error_code(0x0002) - not-present page
[  100.675251][ T3900] PGD 0 P4D 0 
[  100.675618][ T3900] Oops: Oops: 0002 [#1] SMP KASAN
[  100.676091][ T3900] CPU: 1 UID: 65534 PID: 3900 Comm: trinity-c1 Tainted: G    B           T   6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary) 
[  100.677189][ T3900] Tainted: [B]=BAD_PAGE, [T]=RANDSTRUCT
[  100.677704][ T3900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  100.678670][ T3900] RIP: 0010:put_event+0x2a/0x730
[  100.679152][ T3900] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[  100.680761][ T3900] RSP: 0018:ffffc90004d67b70 EFLAGS: 00010246
[  100.681342][ T3900] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[  100.682061][ T3900] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  100.682766][ T3900] RBP: ffffc90004d67bd0 R08: 0000000000000000 R09: 0000000000000000
[  100.686319][ T3900] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
[  100.687356][ T3900] R13: 1ffff11024ef368a R14: dffffc0000000000 R15: ffff88812779b618
[  100.694079][ T3900] FS:  00007fc7ce18b740(0000) GS:ffff888428dd7000(0000) knlGS:0000000000000000
[  100.695168][ T3900] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.695958][ T3900] CR2: 0000000000000237 CR3: 0000000004cd7000 CR4: 00000000000406b0
[  100.696932][ T3900] DR0: 00007fc7cc290000 DR1: 0000000000000000 DR2: 0000000000000000
[  100.702041][ T3900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[  100.703069][ T3900] Call Trace:
[  100.703563][ T3900]  <TASK>
[  100.704019][ T3900]  ? __free_event+0x707/0x7f0
[  100.704649][ T3900]  put_event+0x69f/0x730
[  100.705985][ T3900]  ? perf_event_wakeup+0x66/0x2c0
[  100.706648][ T3900]  ? perf_event_wakeup+0x1b3/0x2c0
[  100.707305][ T3900]  perf_event_exit_event+0xa6/0xd0
[  100.708809][ T3900]  perf_event_exit_task_context+0x44e/0x550
[  100.709677][ T3900]  perf_event_exit_task+0x1dd/0x2a0
[  100.710349][ T3900]  ? fpu__drop+0x131/0x390
[  100.710922][ T3900]  ? preempt_count_sub+0x218/0x2f0
[  100.711576][ T3900]  ? fpu__drop+0x131/0x390
[  100.712162][ T3900]  do_exit+0xa4d/0x2490
[  100.712728][ T3900]  ? _raw_spin_unlock_irq+0x38/0x90
[  100.717506][ T3900]  ? do_group_exit+0x1ae/0x290
[  100.718138][ T3900]  ? _raw_spin_unlock_irq+0x38/0x90
[  100.718797][ T3900]  ? trace_preempt_on+0x179/0x2e0
[  100.719458][ T3900]  do_group_exit+0x1be/0x290
[  100.720074][ T3900]  __x64_sys_exit_group+0x48/0x50
[  100.720714][ T3900]  x64_sys_call+0x2c68/0x2c70
[  100.721340][ T3900]  do_syscall_64+0xff/0x220
[  100.721946][ T3900]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[  100.722683][ T3900] RIP: 0033:0x7fc7ce262349
[  100.723272][ T3900] Code: Unable to access opcode bytes at 0x7fc7ce26231f.
[  100.724109][ T3900] RSP: 002b:00007ffdecd6a3e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[  100.725125][ T3900] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc7ce262349
[  100.726139][ T3900] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[  100.727073][ T3900] RBP: 00007fc7ccbbe058 R08: ffffffffffffff80 R09: fffffffffffffff8
[  100.727935][ T3900] R10: 00007fc7ce1a0200 R11: 0000000000000206 R12: 0000000000000128
[  100.728806][ T3900] R13: 00007fc7ce18b6c0 R14: 00007fc7ccbbe058 R15: 00007fc7ccbbe000
[  100.729722][ T3900]  </TASK>
[  100.730174][ T3900] Modules linked in: tiny_power_button button pcspkr evdev input_leds loop
[  100.731230][ T3900] CR2: 0000000000000237
[  100.731791][ T3900] ---[ end trace 0000000000000000 ]---
[  100.731795][ T3903] BUG: kernel NULL pointer dereference, address: 0000000000000237
[  100.732199][ T3900] RIP: 0010:put_event+0x2a/0x730
[  100.732817][ T3903] #PF: supervisor write access in kernel mode
[  100.733228][ T3900] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[  100.733677][ T3903] #PF: error_code(0x0002) - not-present page
[  100.733686][ T3903] PGD 0 
[  100.735120][ T3900] RSP: 0018:ffffc90004d67b70 EFLAGS: 00010246
[  100.735548][ T3903] P4D 0 
[  100.735789][ T3900] 
[  100.736225][ T3903] Oops: Oops: 0002 [#2] SMP KASAN
[  100.736469][ T3900] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[  100.736653][ T3903] CPU: 0 UID: 65534 PID: 3903 Comm: trinity-c4 Tainted: G    B D         T   6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary) 
[  100.737053][ T3900] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  100.737624][ T3903] Tainted: [B]=BAD_PAGE, [D]=DIE, [T]=RANDSTRUCT
[  100.737634][ T3903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  100.738628][ T3900] RBP: ffffc90004d67bd0 R08: 0000000000000000 R09: 0000000000000000
[  100.739246][ T3903] RIP: 0010:put_event+0x2a/0x730
[  100.739740][ T3900] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
[  100.740493][ T3903] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[  100.741089][ T3900] R13: 1ffff11024ef368a R14: dffffc0000000000 R15: ffff88812779b618
[  100.741461][ T3903] RSP: 0018:ffffc90004d97b70 EFLAGS: 00010246
[  100.741877][ T3900] FS:  00007fc7ce18b740(0000) GS:ffff888428dd7000(0000) knlGS:0000000000000000
[  100.743537][ T3903] 
[  100.744014][ T3900] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.744534][ T3903] RAX: 0000000000000001 RBX: ffffffffffffffff RCX: 0000000000000000
[  100.745071][ T3900] CR2: 0000000000000237 CR3: 0000000004cd7000 CR4: 00000000000406b0
[  100.745279][ T3903] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  100.745681][ T3900] DR0: 00007fc7cc290000 DR1: 0000000000000000 DR2: 0000000000000000
[  100.746350][ T3903] RBP: ffffc90004d97bd0 R08: 0000000000000000 R09: 0000000000000000
[  100.746828][ T3900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[  100.746840][ T3900] Kernel panic - not syncing: Fatal exception
[  101.916945][ T3900] Shutting down cpus with NMI
[  101.929723][ T3900] Kernel Offset: disabled


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [tip:perf/core] [perf]  da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-14  1:59 [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event kernel test robot
@ 2025-04-14 19:01 ` Peter Zijlstra
  2025-04-15  4:46   ` Oliver Sang
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-14 19:01 UTC (permalink / raw)
  To: kernel test robot
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
	Mark Rutland

On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
> 
> commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
> 
> [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
> 
> in testcase: trinity
> version: trinity-x86_64-ba2360ed-1_20241228
> with following parameters:
> 
> 	runtime: 300s
> 	group: group-02
> 	nr_groups: 5
> 
> 
> 
> config: x86_64-randconfig-078-20250407
> compiler: clang-20
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
> 

Does this help?

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2eb9cd5d86a1..528b679aaf7e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
 	_free_event(event);
 
 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
 


* Re: [tip:perf/core] [perf]  da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-14 19:01 ` Peter Zijlstra
@ 2025-04-15  4:46   ` Oliver Sang
  2025-04-15  9:14     ` James Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Oliver Sang @ 2025-04-15  4:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
	Mark Rutland, oliver.sang

hi, Peter Zijlstra,

On Mon, Apr 14, 2025 at 09:01:38PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
> > 
> > 
> > Hello,
> > 
> > kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
> > 
> > commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
> > https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
> > 
> > [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
> > 
> > in testcase: trinity
> > version: trinity-x86_64-ba2360ed-1_20241228
> > with following parameters:
> > 
> > 	runtime: 300s
> > 	group: group-02
> > 	nr_groups: 5
> > 
> > 
> > 
> > config: x86_64-randconfig-078-20250407
> > compiler: clang-20
> > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > 
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > 
> > 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
> > 
> 
> Does this help?

yes, below patch fixes the issues we observed for da916e96e2. thanks

Tested-by: kernel test robot <oliver.sang@intel.com>

> 
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2eb9cd5d86a1..528b679aaf7e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
>  	_free_event(event);
>  
>  	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>  		put_event(parent);
>  }
>  


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15  4:46   ` Oliver Sang
@ 2025-04-15  9:14     ` James Clark
  2025-04-15 10:08       ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: James Clark @ 2025-04-15  9:14 UTC (permalink / raw)
  To: Oliver Sang, Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
	Mark Rutland



On 15/04/2025 5:46 am, Oliver Sang wrote:
> hi, Peter Zijlstra,
> 
> On Mon, Apr 14, 2025 at 09:01:38PM +0200, Peter Zijlstra wrote:
>> On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
>>>
>>>
>>> Hello,
>>>
>>> kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
>>>
>>> commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
>>> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
>>>
>>> [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
>>>
>>> in testcase: trinity
>>> version: trinity-x86_64-ba2360ed-1_20241228
>>> with following parameters:
>>>
>>> 	runtime: 300s
>>> 	group: group-02
>>> 	nr_groups: 5
>>>
>>>
>>>
>>> config: x86_64-randconfig-078-20250407
>>> compiler: clang-20
>>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>>>
>>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>>
>>>
>>>
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <oliver.sang@intel.com>
>>> | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
>>>
>>
>> Does this help?
> 
> yes, below patch fixes the issues we observed for da916e96e2. thanks
> 
> Tested-by: kernel test robot <oliver.sang@intel.com>
> 

Also fixes the same issues we were seeing:

Tested-by: James Clark <james.clark@linaro.org>

>>
>> ---
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 2eb9cd5d86a1..528b679aaf7e 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
>>   	_free_event(event);
>>   
>>   	/* Matches the refcount bump in inherit_event() */
>> -	if (parent)
>> +	if (parent && parent != EVENT_TOMBSTONE)
>>   		put_event(parent);
>>   }
>>   
> 



* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15  9:14     ` James Clark
@ 2025-04-15 10:08       ` Peter Zijlstra
  2025-04-15 13:14         ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-15 10:08 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland

On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
> On 15/04/2025 5:46 am, Oliver Sang wrote:

> > yes, below patch fixes the issues we observed for da916e96e2. thanks
> > 
> > Tested-by: kernel test robot <oliver.sang@intel.com>
> > 
> 
> Also fixes the same issues we were seeing:
> 
> Tested-by: James Clark <james.clark@linaro.org>

Excellent, thank you both! Now I gotta go write me a Changelog :-)


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 10:08       ` Peter Zijlstra
@ 2025-04-15 13:14         ` Peter Zijlstra
  2025-04-15 15:52           ` James Clark
  2025-04-16  8:36           ` Venkat Rao Bagalkote
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-15 13:14 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker

On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
> > On 15/04/2025 5:46 am, Oliver Sang wrote:
> 
> > > yes, below patch fixes the issues we observed for da916e96e2. thanks
> > > 
> > > Tested-by: kernel test robot <oliver.sang@intel.com>
> > > 
> > 
> > Also fixes the same issues we were seeing:
> > 
> > Tested-by: James Clark <james.clark@linaro.org>
> 
> Excellent, thank you both! Now I gotta go write me a Changelog :-)

Hmm, so while writing Changelog, I noticed something else was off. The
case where event->parent was set to EVENT_TOMBSTONE now didn't have a
put_event(parent) anymore. So that needs to be put back in as well.

Frederic, afaict this should still be okay, since if we're detached,
then nothing will try and access event->parent in the free path.

Also, nothing in perf_pending_task() will try and access either
event->parent or event->pmu.

---
Subject: perf: Fix event->parent life-time issue
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue Apr 15 12:12:52 CEST 2025

Due to an oversight in merging da916e96e2de ("perf: Make
perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
hang while freeing sigtrap event"), it is now possible to hit
put_event(EVENT_TOMBSTONE), which makes the computer sad.

This also means that for the event->parent == EVENT_TOMBSTONE, the
put_event() matching inherit_event() has gone missing.

Previously this was done in perf_event_release_kernel() after calling
perf_remove_from_context(), but with it delegated to put_event(), this
case is now entirely missed, leading to leaks.

Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
 	 * not being a child event. See for example unaccount_event().
 	 */
 	event->parent = EVENT_TOMBSTONE;
+	put_event(parent_event);
 }
 
 static bool is_orphaned_event(struct perf_event *event)
@@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
 	_free_event(event);
 
 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
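
The reference accounting described in the changelog above can be modelled in
user space. This is a toy sketch, not kernel code: struct toy_event and every
toy_* helper are hypothetical stand-ins, the refcount is a plain int, and
"freeing" only sets a flag. It shows the two halves of the fix together: the
detach path drops the reference on the parent when it installs the tombstone,
and the put path never recurses into the tombstone.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types and helpers are hypothetical, not kernel code. */
struct toy_event {
	int refcount;              /* stands in for the event's reference count */
	int freed;                 /* set instead of actually freeing memory */
	struct toy_event *parent;  /* NULL, a real parent, or TOY_TOMBSTONE */
};

/* Sentinel meaning "used to be a child"; it must never be dereferenced. */
#define TOY_TOMBSTONE ((struct toy_event *)-1L)

static void toy_put_event(struct toy_event *event)
{
	struct toy_event *parent;

	if (--event->refcount != 0)
		return;

	parent = event->parent;
	event->freed = 1;

	/* Only a real parent still holds the reference taken at inherit time. */
	if (parent && parent != TOY_TOMBSTONE)
		toy_put_event(parent);
}

/* The child pins its parent with one extra reference. */
static void toy_inherit(struct toy_event *child, struct toy_event *parent)
{
	parent->refcount++;
	child->refcount = 1;
	child->freed = 0;
	child->parent = parent;
}

/* Detach: install the tombstone and drop the parent reference right here. */
static void toy_child_detach(struct toy_event *child)
{
	struct toy_event *parent = child->parent;

	if (!parent || parent == TOY_TOMBSTONE)
		return;

	child->parent = TOY_TOMBSTONE;
	toy_put_event(parent);
}

/* Walks one detach-then-release sequence and checks the counts balance. */
static void toy_demo(void)
{
	struct toy_event parent = { .refcount = 1, .freed = 0, .parent = NULL };
	struct toy_event child;

	toy_inherit(&child, &parent);
	toy_child_detach(&child);      /* parent reference dropped here */
	toy_put_event(&child);         /* must not touch the tombstone */
	assert(child.freed && parent.refcount == 1);

	toy_put_event(&parent);        /* last reference: no leak */
	assert(parent.freed);
}
```

In this model, removing the toy_put_event(parent) call from toy_child_detach()
leaves the parent with a count of 1 after its own final put (the leak the
changelog mentions), and removing the tombstone check makes toy_put_event()
dereference the sentinel (the crash in the original report).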
 


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 13:14         ` Peter Zijlstra
@ 2025-04-15 15:52           ` James Clark
  2025-04-16  8:46             ` Peter Zijlstra
  2025-04-16  8:36           ` Venkat Rao Bagalkote
  1 sibling, 1 reply; 11+ messages in thread
From: James Clark @ 2025-04-15 15:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker



On 15/04/2025 2:14 pm, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
>>> On 15/04/2025 5:46 am, Oliver Sang wrote:
>>
>>>> yes, below patch fixes the issues we observed for da916e96e2. thanks
>>>>
>>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>>
>>> Also fixes the same issues we were seeing:
>>>
>>> Tested-by: James Clark <james.clark@linaro.org>
>>
>> Excellent, thank you both! Now I gotta go write me a Changelog :-)
> 
> Hmm, so while writing Changelog, I noticed something else was off. The
> case where event->parent was set to EVENT_TOMBSTONE now didn't have a
> put_event(parent) anymore. So that needs to be put back in as well.
> 
> Frederic, afaict this should still be okay, since if we're detached,
> then nothing will try and access event->parent in the free path.
> 
> Also, nothing in perf_pending_task() will try and access either
> event->parent or event->pmu.
> 
> ---
> Subject: perf: Fix event->parent life-time issue
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Apr 15 12:12:52 CEST 2025
> 
> Due to an oversight in merging da916e96e2de ("perf: Make
> perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
> hang while freeing sigtrap event"), it is now possible to hit
> put_event(EVENT_TOMBSTONE), which makes the computer sad.
> 
> This also means that for the event->parent == EVENT_TOMBSTONE, the
> put_event() matching inherit_event() has gone missing.
> 
> Previously this was done in perf_event_release_kernel() after calling
> perf_remove_from_context(), but with it delegated to put_event(), this
> case is now entirely missed, leading to leaks.
> 
> Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>   kernel/events/core.c |    3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
>   	 * not being a child event. See for example unaccount_event().
>   	 */
>   	event->parent = EVENT_TOMBSTONE;
> +	put_event(parent_event);
>   }
>   
>   static bool is_orphaned_event(struct perf_event *event)
> @@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
>   	_free_event(event);
>   
>   	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>   		put_event(parent);
>   }
>   

Hi Peter,

Unrelated to the pointer deref issue, I'm also seeing perf stat not 
working due to this commit. And that's both with and without this fixup:

  -> perf stat -- true

  Performance counter stats for 'true':

      <not counted> msec task-clock 

      <not counted>      context-switches 

      <not counted>      cpu-migrations 

      <not counted>      page-faults 

      <not counted>      armv8_cortex_a53/instructions/ 

      <not counted>      armv8_cortex_a57/instructions/ 

      <not counted>      armv8_cortex_a53/cycles/ 

      <not counted>      armv8_cortex_a57/cycles/ 

      <not counted>      armv8_cortex_a53/branches/ 

      <not counted>      armv8_cortex_a53/branch-misses/ 

      <not counted>      armv8_cortex_a57/branch-misses/ 


        0.074139992 seconds time elapsed

        0.000000000 seconds user
        0.054797000 seconds sys

Didn't look into it more other than bisecting it to this commit, but I 
can dig more unless the issue is obvious. This is on Arm big.LITTLE, 
although I didn't test it elsewhere so I'm not sure if that's relevant 
or not.

Thanks
James



* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 13:14         ` Peter Zijlstra
  2025-04-15 15:52           ` James Clark
@ 2025-04-16  8:36           ` Venkat Rao Bagalkote
  1 sibling, 0 replies; 11+ messages in thread
From: Venkat Rao Bagalkote @ 2025-04-16  8:36 UTC (permalink / raw)
  To: Peter Zijlstra, James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker, venkat88,
	Athira Rajeev, Madhavan Srinivasan


On 15/04/25 6:44 pm, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
>>> On 15/04/2025 5:46 am, Oliver Sang wrote:
>>>> yes, below patch fixes the issues we observed for da916e96e2. thanks
>>>>
>>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>> Also fixes the same issues we were seeing:
>>>
>>> Tested-by: James Clark <james.clark@linaro.org>
>> Excellent, thank you both! Now I gotta go write me a Changelog :-)
> Hmm, so while writing Changelog, I noticed something else was off. The
> case where event->parent was set to EVENT_TOMBSTONE now didn't have a
> put_event(parent) anymore. So that needs to be put back in as well.
>
> Frederic, afaict this should still be okay, since if we're detached,
> then nothing will try and access event->parent in the free path.
>
> Also, nothing in perf_pending_task() will try and access either
> event->parent or event->pmu.
>
> ---
> Subject: perf: Fix event->parent life-time issue
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Apr 15 12:12:52 CEST 2025
>
> Due to an oversight in merging da916e96e2de ("perf: Make
> perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
> hang while freeing sigtrap event"), it is now possible to hit
> put_event(EVENT_TOMBSTONE), which makes the computer sad.
>
> This also means that for the event->parent == EVENT_TOMBSTONE, the
> put_event() matching inherit_event() has gone missing.
>
> Previously this was done in perf_event_release_kernel() after calling
> perf_remove_from_context(), but with it delegated to put_event(), this
> case is now entirely missed, leading to leaks.
>
> Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>   kernel/events/core.c |    3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
>   	 * not being a child event. See for example unaccount_event().
>   	 */
>   	event->parent = EVENT_TOMBSTONE;
> +	put_event(parent_event);
>   }
>   
>   static bool is_orphaned_event(struct perf_event *event)
> @@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
>   	_free_event(event);
>   
>   	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>   		put_event(parent);
>   }
>   


This issue is also reported on IBM Power9 servers. I tested the above
patch, and the issue is fixed. Hence,


Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>


Regards,

Venkat.



* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 15:52           ` James Clark
@ 2025-04-16  8:46             ` Peter Zijlstra
  2025-04-16 19:08               ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-16  8:46 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker

On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
> Unrelated to the pointer deref issue, I'm also seeing perf stat not working
> due to this commit. And that's both with and without this fixup:
> 
>  -> perf stat -- true
> 
>  Performance counter stats for 'true':
> 
>      <not counted> msec task-clock
> 
>      <not counted>      context-switches
> 
>      <not counted>      cpu-migrations
> 
>      <not counted>      page-faults
> 
>      <not counted>      armv8_cortex_a53/instructions/
> 
>      <not counted>      armv8_cortex_a57/instructions/
> 
>      <not counted>      armv8_cortex_a53/cycles/
> 
>      <not counted>      armv8_cortex_a57/cycles/
> 
>      <not counted>      armv8_cortex_a53/branches/
> 
>      <not counted>      armv8_cortex_a53/branch-misses/
> 
>      <not counted>      armv8_cortex_a57/branch-misses/
> 
> 
>        0.074139992 seconds time elapsed
> 
>        0.000000000 seconds user
>        0.054797000 seconds sys
> 
> Didn't look into it more other than bisecting it to this commit, but I can
> dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
> didn't test it elsewhere so I'm not sure if that's relevant or not.

I can reproduce on x86 alderlake (first machine I tried), so let me go
have a quick poke.


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-16  8:46             ` Peter Zijlstra
@ 2025-04-16 19:08               ` Peter Zijlstra
  2025-04-17  8:58                 ` James Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-16 19:08 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker

On Wed, Apr 16, 2025 at 10:46:10AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
> > Unrelated to the pointer deref issue, I'm also seeing perf stat not working
> > due to this commit. And that's both with and without this fixup:
> > 
> >  -> perf stat -- true
> > 
> >  Performance counter stats for 'true':
> > 
> >      <not counted> msec task-clock
> > 
> >      <not counted>      context-switches
> > 
> >      <not counted>      cpu-migrations
> > 
> >      <not counted>      page-faults
> > 
> >      <not counted>      armv8_cortex_a53/instructions/
> > 
> >      <not counted>      armv8_cortex_a57/instructions/
> > 
> >      <not counted>      armv8_cortex_a53/cycles/
> > 
> >      <not counted>      armv8_cortex_a57/cycles/
> > 
> >      <not counted>      armv8_cortex_a53/branches/
> > 
> >      <not counted>      armv8_cortex_a53/branch-misses/
> > 
> >      <not counted>      armv8_cortex_a57/branch-misses/
> > 
> > 
> >        0.074139992 seconds time elapsed
> > 
> >        0.000000000 seconds user
> >        0.054797000 seconds sys
> > 
> > Didn't look into it more other than bisecting it to this commit, but I can
> > dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
> > didn't test it elsewhere so I'm not sure if that's relevant or not.
> 
> I can reproduce on x86 alderlake (first machine I tried), so let me go
> have a quick poke.

Could you please try queue.git/perf/core ? I've fixed this and found
another problem.

I'll post the patches tomorrow, after the robot has had a go.


* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-16 19:08               ` Peter Zijlstra
@ 2025-04-17  8:58                 ` James Clark
  0 siblings, 0 replies; 11+ messages in thread
From: James Clark @ 2025-04-17  8:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
	linux-perf-users, Mark Rutland, Frederic Weisbecker



On 16/04/2025 8:08 pm, Peter Zijlstra wrote:
> On Wed, Apr 16, 2025 at 10:46:10AM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
>>> Unrelated to the pointer deref issue, I'm also seeing perf stat not working
>>> due to this commit. And that's both with and without this fixup:
>>>
>>>   -> perf stat -- true
>>>
>>>   Performance counter stats for 'true':
>>>
>>>       <not counted> msec task-clock
>>>
>>>       <not counted>      context-switches
>>>
>>>       <not counted>      cpu-migrations
>>>
>>>       <not counted>      page-faults
>>>
>>>       <not counted>      armv8_cortex_a53/instructions/
>>>
>>>       <not counted>      armv8_cortex_a57/instructions/
>>>
>>>       <not counted>      armv8_cortex_a53/cycles/
>>>
>>>       <not counted>      armv8_cortex_a57/cycles/
>>>
>>>       <not counted>      armv8_cortex_a53/branches/
>>>
>>>       <not counted>      armv8_cortex_a53/branch-misses/
>>>
>>>       <not counted>      armv8_cortex_a57/branch-misses/
>>>
>>>
>>>         0.074139992 seconds time elapsed
>>>
>>>         0.000000000 seconds user
>>>         0.054797000 seconds sys
>>>
>>> Didn't look into it more other than bisecting it to this commit, but I can
>>> dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
>>> didn't test it elsewhere so I'm not sure if that's relevant or not.
>>
>> I can reproduce on x86 alderlake (first machine I tried), so let me go
>> have a quick poke.
> 
> Could you please try queue.git/perf/core ? I've fixed this and found
> another problem.
> 
> I'll post the patches tomorrow, after the robot has had a go.

Yep that's all working now, thanks.




Thread overview: 11+ messages
2025-04-14  1:59 [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event kernel test robot
2025-04-14 19:01 ` Peter Zijlstra
2025-04-15  4:46   ` Oliver Sang
2025-04-15  9:14     ` James Clark
2025-04-15 10:08       ` Peter Zijlstra
2025-04-15 13:14         ` Peter Zijlstra
2025-04-15 15:52           ` James Clark
2025-04-16  8:46             ` Peter Zijlstra
2025-04-16 19:08               ` Peter Zijlstra
2025-04-17  8:58                 ` James Clark
2025-04-16  8:36           ` Venkat Rao Bagalkote
