* [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
@ 2025-04-14 1:59 kernel test robot
2025-04-14 19:01 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: kernel test robot @ 2025-04-14 1:59 UTC (permalink / raw)
To: Peter Zijlstra
Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
oliver.sang
Hello,

kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:

commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core

[test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]

in testcase: trinity
version: trinity-x86_64-ba2360ed-1_20241228
with following parameters:

	runtime: 300s
	group: group-02
	nr_groups: 5

config: x86_64-randconfig-078-20250407
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250413/202504131701.941039cd-lkp@intel.com

[ 100.647813][ T3900] ==================================================================
[ 100.648676][ T3900] BUG: KASAN: null-ptr-deref in put_event+0x2a/0x730
[ 100.649303][ T3900] Write of size 8 at addr 0000000000000237 by task trinity-c1/3900
[ 100.650021][ T3900]
[ 100.650314][ T3900] CPU: 1 UID: 65534 PID: 3900 Comm: trinity-c1 Tainted: G T 6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary)
[ 100.650323][ T3900] Tainted: [T]=RANDSTRUCT
[ 100.650325][ T3900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 100.650328][ T3900] Call Trace:
[ 100.650332][ T3900] <TASK>
[ 100.650334][ T3900] __dump_stack+0x19/0x30
[ 100.650345][ T3900] dump_stack_lvl+0xaf/0x118
[ 100.650350][ T3900] print_report+0x41/0x2d0
[ 100.650359][ T3900] kasan_report+0x15c/0x1a0
[ 100.650367][ T3900] ? put_event+0x2a/0x730
[ 100.650373][ T3900] ? put_event+0x2a/0x730
[ 100.650379][ T3900] kasan_check_range+0x2b3/0x2c0
[ 100.650383][ T3900] __kasan_check_write+0x18/0x20
[ 100.650389][ T3900] put_event+0x2a/0x730
[ 100.650392][ T3900] ? __free_event+0x707/0x7f0
[ 100.650398][ T3900] put_event+0x69f/0x730
[ 100.650401][ T3900] ? perf_event_wakeup+0x66/0x2c0
[ 100.650404][ T3900] ? perf_event_wakeup+0x1b3/0x2c0
[ 100.650408][ T3900] perf_event_exit_event+0xa6/0xd0
[ 100.650417][ T3900] perf_event_exit_task_context+0x44e/0x550
[ 100.650424][ T3900] perf_event_exit_task+0x1dd/0x2a0
[ 100.650428][ T3900] ? fpu__drop+0x131/0x390
[ 100.650432][ T3900] ? preempt_count_sub+0x218/0x2f0
[ 100.650441][ T3900] ? fpu__drop+0x131/0x390
[ 100.650445][ T3900] do_exit+0xa4d/0x2490
[ 100.650449][ T3900] ? _raw_spin_unlock_irq+0x38/0x90
[ 100.650454][ T3900] ? do_group_exit+0x1ae/0x290
[ 100.650459][ T3900] ? _raw_spin_unlock_irq+0x38/0x90
[ 100.650463][ T3900] ? trace_preempt_on+0x179/0x2e0
[ 100.650473][ T3900] do_group_exit+0x1be/0x290
[ 100.650478][ T3900] __x64_sys_exit_group+0x48/0x50
[ 100.650481][ T3900] x64_sys_call+0x2c68/0x2c70
[ 100.650484][ T3900] do_syscall_64+0xff/0x220
[ 100.650493][ T3900] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 100.650499][ T3900] RIP: 0033:0x7fc7ce262349
[ 100.650503][ T3900] Code: Unable to access opcode bytes at 0x7fc7ce26231f.
[ 100.650505][ T3900] RSP: 002b:00007ffdecd6a3e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[ 100.650513][ T3900] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc7ce262349
[ 100.650515][ T3900] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[ 100.650517][ T3900] RBP: 00007fc7ccbbe058 R08: ffffffffffffff80 R09: fffffffffffffff8
[ 100.650522][ T3900] R10: 00007fc7ce1a0200 R11: 0000000000000206 R12: 0000000000000128
[ 100.650524][ T3900] R13: 00007fc7ce18b6c0 R14: 00007fc7ccbbe058 R15: 00007fc7ccbbe000
[ 100.650530][ T3900] </TASK>
[ 100.650532][ T3900] ==================================================================
[ 100.673381][ T3900] BUG: kernel NULL pointer dereference, address: 0000000000000237
[ 100.674119][ T3900] #PF: supervisor write access in kernel mode
[ 100.674687][ T3900] #PF: error_code(0x0002) - not-present page
[ 100.675251][ T3900] PGD 0 P4D 0
[ 100.675618][ T3900] Oops: Oops: 0002 [#1] SMP KASAN
[ 100.676091][ T3900] CPU: 1 UID: 65534 PID: 3900 Comm: trinity-c1 Tainted: G B T 6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary)
[ 100.677189][ T3900] Tainted: [B]=BAD_PAGE, [T]=RANDSTRUCT
[ 100.677704][ T3900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 100.678670][ T3900] RIP: 0010:put_event+0x2a/0x730
[ 100.679152][ T3900] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[ 100.680761][ T3900] RSP: 0018:ffffc90004d67b70 EFLAGS: 00010246
[ 100.681342][ T3900] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[ 100.682061][ T3900] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 100.682766][ T3900] RBP: ffffc90004d67bd0 R08: 0000000000000000 R09: 0000000000000000
[ 100.686319][ T3900] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
[ 100.687356][ T3900] R13: 1ffff11024ef368a R14: dffffc0000000000 R15: ffff88812779b618
[ 100.694079][ T3900] FS: 00007fc7ce18b740(0000) GS:ffff888428dd7000(0000) knlGS:0000000000000000
[ 100.695168][ T3900] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 100.695958][ T3900] CR2: 0000000000000237 CR3: 0000000004cd7000 CR4: 00000000000406b0
[ 100.696932][ T3900] DR0: 00007fc7cc290000 DR1: 0000000000000000 DR2: 0000000000000000
[ 100.702041][ T3900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 100.703069][ T3900] Call Trace:
[ 100.703563][ T3900] <TASK>
[ 100.704019][ T3900] ? __free_event+0x707/0x7f0
[ 100.704649][ T3900] put_event+0x69f/0x730
[ 100.705985][ T3900] ? perf_event_wakeup+0x66/0x2c0
[ 100.706648][ T3900] ? perf_event_wakeup+0x1b3/0x2c0
[ 100.707305][ T3900] perf_event_exit_event+0xa6/0xd0
[ 100.708809][ T3900] perf_event_exit_task_context+0x44e/0x550
[ 100.709677][ T3900] perf_event_exit_task+0x1dd/0x2a0
[ 100.710349][ T3900] ? fpu__drop+0x131/0x390
[ 100.710922][ T3900] ? preempt_count_sub+0x218/0x2f0
[ 100.711576][ T3900] ? fpu__drop+0x131/0x390
[ 100.712162][ T3900] do_exit+0xa4d/0x2490
[ 100.712728][ T3900] ? _raw_spin_unlock_irq+0x38/0x90
[ 100.717506][ T3900] ? do_group_exit+0x1ae/0x290
[ 100.718138][ T3900] ? _raw_spin_unlock_irq+0x38/0x90
[ 100.718797][ T3900] ? trace_preempt_on+0x179/0x2e0
[ 100.719458][ T3900] do_group_exit+0x1be/0x290
[ 100.720074][ T3900] __x64_sys_exit_group+0x48/0x50
[ 100.720714][ T3900] x64_sys_call+0x2c68/0x2c70
[ 100.721340][ T3900] do_syscall_64+0xff/0x220
[ 100.721946][ T3900] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 100.722683][ T3900] RIP: 0033:0x7fc7ce262349
[ 100.723272][ T3900] Code: Unable to access opcode bytes at 0x7fc7ce26231f.
[ 100.724109][ T3900] RSP: 002b:00007ffdecd6a3e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[ 100.725125][ T3900] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc7ce262349
[ 100.726139][ T3900] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[ 100.727073][ T3900] RBP: 00007fc7ccbbe058 R08: ffffffffffffff80 R09: fffffffffffffff8
[ 100.727935][ T3900] R10: 00007fc7ce1a0200 R11: 0000000000000206 R12: 0000000000000128
[ 100.728806][ T3900] R13: 00007fc7ce18b6c0 R14: 00007fc7ccbbe058 R15: 00007fc7ccbbe000
[ 100.729722][ T3900] </TASK>
[ 100.730174][ T3900] Modules linked in: tiny_power_button button pcspkr evdev input_leds loop
[ 100.731230][ T3900] CR2: 0000000000000237
[ 100.731791][ T3900] ---[ end trace 0000000000000000 ]---
[ 100.731795][ T3903] BUG: kernel NULL pointer dereference, address: 0000000000000237
[ 100.732199][ T3900] RIP: 0010:put_event+0x2a/0x730
[ 100.732817][ T3903] #PF: supervisor write access in kernel mode
[ 100.733228][ T3900] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[ 100.733677][ T3903] #PF: error_code(0x0002) - not-present page
[ 100.733686][ T3903] PGD 0
[ 100.735120][ T3900] RSP: 0018:ffffc90004d67b70 EFLAGS: 00010246
[ 100.735548][ T3903] P4D 0
[ 100.735789][ T3900]
[ 100.736225][ T3903] Oops: Oops: 0002 [#2] SMP KASAN
[ 100.736469][ T3900] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[ 100.736653][ T3903] CPU: 0 UID: 65534 PID: 3903 Comm: trinity-c4 Tainted: G B D T 6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary)
[ 100.737053][ T3900] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 100.737624][ T3903] Tainted: [B]=BAD_PAGE, [D]=DIE, [T]=RANDSTRUCT
[ 100.737634][ T3903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 100.738628][ T3900] RBP: ffffc90004d67bd0 R08: 0000000000000000 R09: 0000000000000000
[ 100.739246][ T3903] RIP: 0010:put_event+0x2a/0x730
[ 100.739740][ T3900] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
[ 100.740493][ T3903] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[ 100.741089][ T3900] R13: 1ffff11024ef368a R14: dffffc0000000000 R15: ffff88812779b618
[ 100.741461][ T3903] RSP: 0018:ffffc90004d97b70 EFLAGS: 00010246
[ 100.741877][ T3900] FS: 00007fc7ce18b740(0000) GS:ffff888428dd7000(0000) knlGS:0000000000000000
[ 100.743537][ T3903]
[ 100.744014][ T3900] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 100.744534][ T3903] RAX: 0000000000000001 RBX: ffffffffffffffff RCX: 0000000000000000
[ 100.745071][ T3900] CR2: 0000000000000237 CR3: 0000000004cd7000 CR4: 00000000000406b0
[ 100.745279][ T3903] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 100.745681][ T3900] DR0: 00007fc7cc290000 DR1: 0000000000000000 DR2: 0000000000000000
[ 100.746350][ T3903] RBP: ffffc90004d97bd0 R08: 0000000000000000 R09: 0000000000000000
[ 100.746828][ T3900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 100.746840][ T3900] Kernel panic - not syncing: Fatal exception
[ 101.916945][ T3900] Shutting down cpus with NMI
[ 101.929723][ T3900] Kernel Offset: disabled
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-14 1:59 [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event kernel test robot
@ 2025-04-14 19:01 ` Peter Zijlstra
2025-04-15 4:46 ` Oliver Sang
0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-14 19:01 UTC (permalink / raw)
To: kernel test robot
Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
Mark Rutland
On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
>
> commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
>
> [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
>
> in testcase: trinity
> version: trinity-x86_64-ba2360ed-1_20241228
> with following parameters:
>
> runtime: 300s
> group: group-02
> nr_groups: 5
>
>
>
> config: x86_64-randconfig-078-20250407
> compiler: clang-20
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
>
Does this help?
---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2eb9cd5d86a1..528b679aaf7e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
 	_free_event(event);

 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
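
For readers decoding the oops above: the fault address 0x237 is
consistent with put_event() having been called on EVENT_TOMBSTONE. In
the register dump RBX is ffffffffffffffff, and the faulting instruction
bytes (f0 48 ff 8b 38 02 00 00) decode to a locked 64-bit decrement at
displacement 0x238 from RBX. The userspace sketch below only checks that
arithmetic. It assumes the tombstone is the all-ones pointer value, which
the thread does not spell out; all names in it are illustrative and are
not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the tombstone sentinel is the all-ones pointer value,
 * matching RBX = ffffffffffffffff in the register dump. */
#define TOMBSTONE_VALUE ((uintptr_t)-1)

/* Displacement read from the faulting instruction bytes in the oops
 * ("f0 48 ff 8b 38 02 00 00"); not taken from any kernel header. */
#define FIELD_DISPLACEMENT ((uintptr_t)0x238)

/* Address the CPU would touch for "base + displacement".
 * Unsigned arithmetic wraps, so an all-ones base yields a small address. */
static uintptr_t fault_address(uintptr_t base, uintptr_t disp)
{
	return base + disp;
}

/* Returns 1 if the computed address matches the one KASAN reported. */
static int matches_reported_address(void)
{
	return fault_address(TOMBSTONE_VALUE, FIELD_DISPLACEMENT) == 0x237;
}
```

Because the sum wraps into the first page, the report is labelled a
null-ptr-deref even though the pointer that was passed in was not NULL.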
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-14 19:01 ` Peter Zijlstra
@ 2025-04-15 4:46 ` Oliver Sang
2025-04-15 9:14 ` James Clark
0 siblings, 1 reply; 11+ messages in thread
From: Oliver Sang @ 2025-04-15 4:46 UTC (permalink / raw)
To: Peter Zijlstra
Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
Mark Rutland, oliver.sang
hi, Peter Zijlstra,
On Mon, Apr 14, 2025 at 09:01:38PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
> >
> > commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
> > https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
> >
> > [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
> >
> > in testcase: trinity
> > version: trinity-x86_64-ba2360ed-1_20241228
> > with following parameters:
> >
> > runtime: 300s
> > group: group-02
> > nr_groups: 5
> >
> >
> >
> > config: x86_64-randconfig-078-20250407
> > compiler: clang-20
> > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> >
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
> >
>
> Does this help?
yes, below patch fixes the issues we observed for da916e96e2. thanks
Tested-by: kernel test robot <oliver.sang@intel.com>
>
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2eb9cd5d86a1..528b679aaf7e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
> _free_event(event);
>
> /* Matches the refcount bump in inherit_event() */
> - if (parent)
> + if (parent && parent != EVENT_TOMBSTONE)
> put_event(parent);
> }
>
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-15 4:46 ` Oliver Sang
@ 2025-04-15 9:14 ` James Clark
2025-04-15 10:08 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: James Clark @ 2025-04-15 9:14 UTC (permalink / raw)
To: Oliver Sang, Peter Zijlstra
Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users,
Mark Rutland
On 15/04/2025 5:46 am, Oliver Sang wrote:
> hi, Peter Zijlstra,
>
> On Mon, Apr 14, 2025 at 09:01:38PM +0200, Peter Zijlstra wrote:
>> On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
>>>
>>>
>>> Hello,
>>>
>>> kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
>>>
>>> commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
>>> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
>>>
>>> [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
>>>
>>> in testcase: trinity
>>> version: trinity-x86_64-ba2360ed-1_20241228
>>> with following parameters:
>>>
>>> runtime: 300s
>>> group: group-02
>>> nr_groups: 5
>>>
>>>
>>>
>>> config: x86_64-randconfig-078-20250407
>>> compiler: clang-20
>>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>>>
>>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>>
>>>
>>>
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <oliver.sang@intel.com>
>>> | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
>>>
>>
>> Does this help?
>
> yes, below patch fixes the issues we observed for da916e96e2. thanks
>
> Tested-by: kernel test robot <oliver.sang@intel.com>
>
Also fixes the same issues we were seeing:
Tested-by: James Clark <james.clark@linaro.org>
>>
>> ---
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 2eb9cd5d86a1..528b679aaf7e 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
>> _free_event(event);
>>
>> /* Matches the refcount bump in inherit_event() */
>> - if (parent)
>> + if (parent && parent != EVENT_TOMBSTONE)
>> put_event(parent);
>> }
>>
>
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-15 9:14 ` James Clark
@ 2025-04-15 10:08 ` Peter Zijlstra
2025-04-15 13:14 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-15 10:08 UTC (permalink / raw)
To: James Clark
Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
linux-perf-users, Mark Rutland
On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
> On 15/04/2025 5:46 am, Oliver Sang wrote:
> > yes, below patch fixes the issues we observed for da916e96e2. thanks
> >
> > Tested-by: kernel test robot <oliver.sang@intel.com>
> >
>
> Also fixes the same issues we were seeing:
>
> Tested-by: James Clark <james.clark@linaro.org>
Excellent, thank you both! Now I gotta go write me a Changelog :-)
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-15 10:08 ` Peter Zijlstra
@ 2025-04-15 13:14 ` Peter Zijlstra
2025-04-15 15:52 ` James Clark
2025-04-16 8:36 ` Venkat Rao Bagalkote
0 siblings, 2 replies; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-15 13:14 UTC (permalink / raw)
To: James Clark
Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
linux-perf-users, Mark Rutland, Frederic Weisbecker
On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
> > On 15/04/2025 5:46 am, Oliver Sang wrote:
>
> > > yes, below patch fixes the issues we observed for da916e96e2. thanks
> > >
> > > Tested-by: kernel test robot <oliver.sang@intel.com>
> > >
> >
> > Also fixes the same issues we were seeing:
> >
> > Tested-by: James Clark <james.clark@linaro.org>
>
> Excellent, thank you both! Now I gotta go write me a Changelog :-)

Hmm, so while writing Changelog, I noticed something else was off. The
case where event->parent was set to EVENT_TOMBSTONE now didn't have a
put_event(parent) anymore. So that needs to be put back in as well.

Frederic, afaict this should still be okay, since if we're detached,
then nothing will try and access event->parent in the free path.

Also, nothing in perf_pending_task() will try and access either
event->parent or event->pmu.

---
Subject: perf: Fix event->parent life-time issue
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue Apr 15 12:12:52 CEST 2025

Due to an oversight in merging da916e96e2de ("perf: Make
perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
hang while freeing sigtrap event"), it is now possible to hit
put_event(EVENT_TOMBSTONE), which makes the computer sad.

This also means that for the event->parent == EVENT_TOMBSTONE, the
put_event() matching inherit_event() has gone missing.

Previously this was done in perf_event_release_kernel() after calling
perf_remove_from_context(), but with it delegated to put_event(), this
case is now entirely missed, leading to leaks.

Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
 	 * not being a child event. See for example unaccount_event().
 	 */
 	event->parent = EVENT_TOMBSTONE;
+	put_event(parent_event);
 }

 static bool is_orphaned_event(struct perf_event *event)
@@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
 	_free_event(event);

 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
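
The reference accounting described in the changelog can be illustrated
with a toy userspace model. This is only a sketch under stated
assumptions, not the kernel implementation: struct toy_event and the
toy_* helpers are hypothetical, there is no locking or atomics, and the
"freed" flag stands in for the real free path. It shows the two halves
of the patch working together: detaching a child swaps its parent
pointer for a tombstone and drops the reference taken at inherit time,
and the final put skips the tombstone instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel modelled on the tombstone idea in the patch. */
#define TOY_TOMBSTONE ((struct toy_event *)-1L)

struct toy_event {
	long refcount;
	struct toy_event *parent;
	int freed;		/* set instead of freeing, so callers can inspect it */
};

/* Drop one reference; on the last one, "free" the event and release
 * the parent reference unless a detach already released it. */
static void toy_put_event(struct toy_event *event)
{
	struct toy_event *parent;

	assert(event != NULL && event != TOY_TOMBSTONE);
	if (--event->refcount != 0)
		return;

	parent = event->parent;
	event->freed = 1;

	if (parent && parent != TOY_TOMBSTONE)
		toy_put_event(parent);
}

/* A child pins its parent with one extra reference. */
static void toy_inherit(struct toy_event *child, struct toy_event *parent)
{
	parent->refcount++;
	child->parent = parent;
	child->refcount = 1;
	child->freed = 0;
}

/* Detach: mark the child as parentless and drop the pinned reference
 * here, because the final put will no longer see the real parent. */
static void toy_child_detach(struct toy_event *child)
{
	struct toy_event *parent = child->parent;

	if (!parent || parent == TOY_TOMBSTONE)
		return;
	child->parent = TOY_TOMBSTONE;
	toy_put_event(parent);
}
```

In this model, dropping the put from toy_child_detach() would leave the
parent with a reference nobody releases (the leak the changelog
mentions), while dropping the tombstone check from toy_put_event() would
recurse into the sentinel (the crash in the original report).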
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-15 13:14 ` Peter Zijlstra
@ 2025-04-15 15:52 ` James Clark
2025-04-16 8:46 ` Peter Zijlstra
2025-04-16 8:36 ` Venkat Rao Bagalkote
1 sibling, 1 reply; 11+ messages in thread
From: James Clark @ 2025-04-15 15:52 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
linux-perf-users, Mark Rutland, Frederic Weisbecker
On 15/04/2025 2:14 pm, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
>>> On 15/04/2025 5:46 am, Oliver Sang wrote:
>>
>>>> yes, below patch fixes the issues we observed for da916e96e2. thanks
>>>>
>>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>>
>>> Also fixes the same issues we were seeing:
>>>
>>> Tested-by: James Clark <james.clark@linaro.org>
>>
>> Excellent, thank you both! Now I gotta go write me a Changelog :-)
>
> Hmm, so while writing Changelog, I noticed something else was off. The
> case where event->parent was set to EVENT_TOMBSTONE now didn't have a
> put_event(parent) anymore. So that needs to be put back in as well.
>
> Frederic, afaict this should still be okay, since if we're detached,
> then nothing will try and access event->parent in the free path.
>
> Also, nothing in perf_pending_task() will try and access either
> event->parent or event->pmu.
>
> ---
> Subject: perf: Fix event->parent life-time issue
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Apr 15 12:12:52 CEST 2025
>
> Due to an oversight in merging da916e96e2de ("perf: Make
> perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
> hang while freeing sigtrap event"), it is now possible to hit
> put_event(EVENT_TOMBSTONE), which makes the computer sad.
>
> This also means that for the event->parent == EVENT_TOMBSTONE, the
> put_event() matching inherit_event() has gone missing.
>
> Previously this was done in perf_event_release_kernel() after calling
> perf_remove_from_context(), but with it delegated to put_event(), this
> case is now entirely missed, leading to leaks.
>
> Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> kernel/events/core.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
> * not being a child event. See for example unaccount_event().
> */
> event->parent = EVENT_TOMBSTONE;
> + put_event(parent_event);
> }
>
> static bool is_orphaned_event(struct perf_event *event)
> @@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
> _free_event(event);
>
> /* Matches the refcount bump in inherit_event() */
> - if (parent)
> + if (parent && parent != EVENT_TOMBSTONE)
> put_event(parent);
> }
>

Hi Peter,

Unrelated to the pointer deref issue, I'm also seeing perf stat not
working due to this commit. And that's both with and without this fixup:

  -> perf stat -- true

  Performance counter stats for 'true':

    <not counted> msec task-clock
    <not counted>      context-switches
    <not counted>      cpu-migrations
    <not counted>      page-faults
    <not counted>      armv8_cortex_a53/instructions/
    <not counted>      armv8_cortex_a57/instructions/
    <not counted>      armv8_cortex_a53/cycles/
    <not counted>      armv8_cortex_a57/cycles/
    <not counted>      armv8_cortex_a53/branches/
    <not counted>      armv8_cortex_a53/branch-misses/
    <not counted>      armv8_cortex_a57/branch-misses/

    0.074139992 seconds time elapsed

    0.000000000 seconds user
    0.054797000 seconds sys

Didn't look into it more other than bisecting it to this commit, but I
can dig more unless the issue is obvious. This is on Arm big.LITTLE,
although I didn't test it elsewhere so I'm not sure if that's relevant
or not.

Thanks
James
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-15 13:14 ` Peter Zijlstra
2025-04-15 15:52 ` James Clark
@ 2025-04-16 8:36 ` Venkat Rao Bagalkote
1 sibling, 0 replies; 11+ messages in thread
From: Venkat Rao Bagalkote @ 2025-04-16 8:36 UTC (permalink / raw)
To: Peter Zijlstra, James Clark
Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
linux-perf-users, Mark Rutland, Frederic Weisbecker, venkat88,
Athira Rajeev, Madhavan Srinivasan
On 15/04/25 6:44 pm, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
>>> On 15/04/2025 5:46 am, Oliver Sang wrote:
>>>> yes, below patch fixes the issues we observed for da916e96e2. thanks
>>>>
>>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>> Also fixes the same issues we were seeing:
>>>
>>> Tested-by: James Clark <james.clark@linaro.org>
>> Excellent, thank you both! Now I gotta go write me a Changelog :-)
> Hmm, so while writing Changelog, I noticed something else was off. The
> case where event->parent was set to EVENT_TOMBSTONE now didn't have a
> put_event(parent) anymore. So that needs to be put back in as well.
>
> Frederic, afaict this should still be okay, since if we're detached,
> then nothing will try and access event->parent in the free path.
>
> Also, nothing in perf_pending_task() will try and access either
> event->parent or event->pmu.
>
> ---
> Subject: perf: Fix event->parent life-time issue
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Apr 15 12:12:52 CEST 2025
>
> Due to an oversight in merging da916e96e2de ("perf: Make
> perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
> hang while freeing sigtrap event"), it is now possible to hit
> put_event(EVENT_TOMBSTONE), which makes the computer sad.
>
> This also means that for the event->parent == EVENT_TOMBSTONE, the
> put_event() matching inherit_event() has gone missing.
>
> Previously this was done in perf_event_release_kernel() after calling
> perf_remove_from_context(), but with it delegated to put_event(), this
> case is now entirely missed, leading to leaks.
>
> Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> kernel/events/core.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
> * not being a child event. See for example unaccount_event().
> */
> event->parent = EVENT_TOMBSTONE;
> + put_event(parent_event);
> }
>
> static bool is_orphaned_event(struct perf_event *event)
> @@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
> _free_event(event);
>
> /* Matches the refcount bump in inherit_event() */
> - if (parent)
> + if (parent && parent != EVENT_TOMBSTONE)
> put_event(parent);
> }
>

This issue is also seen on IBM Power9 servers. I tested the above
patch and the issue is fixed. Hence,

Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>

Regards,
Venkat.
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-15 15:52 ` James Clark
@ 2025-04-16 8:46 ` Peter Zijlstra
2025-04-16 19:08 ` Peter Zijlstra
0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-16 8:46 UTC (permalink / raw)
To: James Clark
Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
linux-perf-users, Mark Rutland, Frederic Weisbecker
On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
> Unrelated to the pointer deref issue, I'm also seeing perf stat not working
> due to this commit. And that's both with and without this fixup:
>
> -> perf stat -- true
>
> Performance counter stats for 'true':
>
> <not counted> msec task-clock
>
> <not counted> context-switches
>
> <not counted> cpu-migrations
>
> <not counted> page-faults
>
> <not counted> armv8_cortex_a53/instructions/
>
> <not counted> armv8_cortex_a57/instructions/
>
> <not counted> armv8_cortex_a53/cycles/
>
> <not counted> armv8_cortex_a57/cycles/
>
> <not counted> armv8_cortex_a53/branches/
>
> <not counted> armv8_cortex_a53/branch-misses/
>
> <not counted> armv8_cortex_a57/branch-misses/
>
>
> 0.074139992 seconds time elapsed
>
> 0.000000000 seconds user
> 0.054797000 seconds sys
>
> Didn't look into it more other than bisecting it to this commit, but I can
> dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
> didn't test it elsewhere so I'm not sure if that's relevant or not.
I can reproduce on x86 alderlake (first machine I tried), so let me go
have a quick poke.
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-16 8:46 ` Peter Zijlstra
@ 2025-04-16 19:08 ` Peter Zijlstra
2025-04-17 8:58 ` James Clark
0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-16 19:08 UTC (permalink / raw)
To: James Clark
Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
linux-perf-users, Mark Rutland, Frederic Weisbecker
On Wed, Apr 16, 2025 at 10:46:10AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
> > Unrelated to the pointer deref issue, I'm also seeing perf stat not working
> > due to this commit. And that's both with and without this fixup:
> >
> > -> perf stat -- true
> >
> > Performance counter stats for 'true':
> >
> > <not counted> msec task-clock
> >
> > <not counted> context-switches
> >
> > <not counted> cpu-migrations
> >
> > <not counted> page-faults
> >
> > <not counted> armv8_cortex_a53/instructions/
> >
> > <not counted> armv8_cortex_a57/instructions/
> >
> > <not counted> armv8_cortex_a53/cycles/
> >
> > <not counted> armv8_cortex_a57/cycles/
> >
> > <not counted> armv8_cortex_a53/branches/
> >
> > <not counted> armv8_cortex_a53/branch-misses/
> >
> > <not counted> armv8_cortex_a57/branch-misses/
> >
> >
> > 0.074139992 seconds time elapsed
> >
> > 0.000000000 seconds user
> > 0.054797000 seconds sys
> >
> > Didn't look into it more other than bisecting it to this commit, but I can
> > dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
> > didn't test it elsewhere so I'm not sure if that's relevant or not.
>
> I can reproduce on x86 alderlake (first machine I tried), so let me go
> have a quick poke.
Could you please try queue.git/perf/core ? I've fixed this and found
another problem.
I'll post the patches tomorrow, after the robot has had a go.
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
2025-04-16 19:08 ` Peter Zijlstra
@ 2025-04-17 8:58 ` James Clark
0 siblings, 0 replies; 11+ messages in thread
From: James Clark @ 2025-04-17 8:58 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria,
linux-perf-users, Mark Rutland, Frederic Weisbecker
On 16/04/2025 8:08 pm, Peter Zijlstra wrote:
> On Wed, Apr 16, 2025 at 10:46:10AM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
>>> Unrelated to the pointer deref issue, I'm also seeing perf stat not working
>>> due to this commit. And that's both with and without this fixup:
>>>
>>> -> perf stat -- true
>>>
>>> Performance counter stats for 'true':
>>>
>>> <not counted> msec task-clock
>>>
>>> <not counted> context-switches
>>>
>>> <not counted> cpu-migrations
>>>
>>> <not counted> page-faults
>>>
>>> <not counted> armv8_cortex_a53/instructions/
>>>
>>> <not counted> armv8_cortex_a57/instructions/
>>>
>>> <not counted> armv8_cortex_a53/cycles/
>>>
>>> <not counted> armv8_cortex_a57/cycles/
>>>
>>> <not counted> armv8_cortex_a53/branches/
>>>
>>> <not counted> armv8_cortex_a53/branch-misses/
>>>
>>> <not counted> armv8_cortex_a57/branch-misses/
>>>
>>>
>>> 0.074139992 seconds time elapsed
>>>
>>> 0.000000000 seconds user
>>> 0.054797000 seconds sys
>>>
>>> Didn't look into it more other than bisecting it to this commit, but I can
>>> dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
>>> didn't test it elsewhere so I'm not sure if that's relevant or not.
>>
>> I can reproduce on x86 alderlake (first machine I tried), so let me go
>> have a quick poke.
>
> Could you please try queue.git/perf/core ? I've fixed this and found
> another problem.
>
> I'll post the patches tomorrow, after the robot has had a go.
Yep that's all working now, thanks.