* [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
@ 2025-04-14 1:59 kernel test robot
  2025-04-14 19:01 ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: kernel test robot @ 2025-04-14 1:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, oliver.sang

Hello,

kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:

commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core

[test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]

in testcase: trinity
version: trinity-x86_64-ba2360ed-1_20241228
with following parameters:

	runtime: 300s
	group: group-02
	nr_groups: 5

config: x86_64-randconfig-078-20250407
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250413/202504131701.941039cd-lkp@intel.com

[ 100.647813][ T3900] ==================================================================
[ 100.648676][ T3900] BUG: KASAN: null-ptr-deref in put_event+0x2a/0x730
[ 100.649303][ T3900] Write of size 8 at addr 0000000000000237 by task trinity-c1/3900
[ 100.650021][ T3900]
[ 100.650314][ T3900] CPU: 1 UID: 65534 PID: 3900 Comm: trinity-c1 Tainted: G T 6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary)
[ 100.650323][ T3900] Tainted: [T]=RANDSTRUCT
[ 100.650325][ T3900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 100.650328][ T3900] Call Trace:
[ 100.650332][ T3900] <TASK>
[ 100.650334][ T3900] __dump_stack+0x19/0x30
[ 100.650345][ T3900] dump_stack_lvl+0xaf/0x118
[ 100.650350][ T3900] print_report+0x41/0x2d0
[ 100.650359][ T3900] kasan_report+0x15c/0x1a0
[ 100.650367][ T3900] ? put_event+0x2a/0x730
[ 100.650373][ T3900] ? put_event+0x2a/0x730
[ 100.650379][ T3900] kasan_check_range+0x2b3/0x2c0
[ 100.650383][ T3900] __kasan_check_write+0x18/0x20
[ 100.650389][ T3900] put_event+0x2a/0x730
[ 100.650392][ T3900] ? __free_event+0x707/0x7f0
[ 100.650398][ T3900] put_event+0x69f/0x730
[ 100.650401][ T3900] ? perf_event_wakeup+0x66/0x2c0
[ 100.650404][ T3900] ? perf_event_wakeup+0x1b3/0x2c0
[ 100.650408][ T3900] perf_event_exit_event+0xa6/0xd0
[ 100.650417][ T3900] perf_event_exit_task_context+0x44e/0x550
[ 100.650424][ T3900] perf_event_exit_task+0x1dd/0x2a0
[ 100.650428][ T3900] ? fpu__drop+0x131/0x390
[ 100.650432][ T3900] ? preempt_count_sub+0x218/0x2f0
[ 100.650441][ T3900] ? fpu__drop+0x131/0x390
[ 100.650445][ T3900] do_exit+0xa4d/0x2490
[ 100.650449][ T3900] ? _raw_spin_unlock_irq+0x38/0x90
[ 100.650454][ T3900] ? do_group_exit+0x1ae/0x290
[ 100.650459][ T3900] ? _raw_spin_unlock_irq+0x38/0x90
[ 100.650463][ T3900] ? trace_preempt_on+0x179/0x2e0
[ 100.650473][ T3900] do_group_exit+0x1be/0x290
[ 100.650478][ T3900] __x64_sys_exit_group+0x48/0x50
[ 100.650481][ T3900] x64_sys_call+0x2c68/0x2c70
[ 100.650484][ T3900] do_syscall_64+0xff/0x220
[ 100.650493][ T3900] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 100.650499][ T3900] RIP: 0033:0x7fc7ce262349
[ 100.650503][ T3900] Code: Unable to access opcode bytes at 0x7fc7ce26231f.
[ 100.650505][ T3900] RSP: 002b:00007ffdecd6a3e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[ 100.650513][ T3900] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc7ce262349
[ 100.650515][ T3900] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[ 100.650517][ T3900] RBP: 00007fc7ccbbe058 R08: ffffffffffffff80 R09: fffffffffffffff8
[ 100.650522][ T3900] R10: 00007fc7ce1a0200 R11: 0000000000000206 R12: 0000000000000128
[ 100.650524][ T3900] R13: 00007fc7ce18b6c0 R14: 00007fc7ccbbe058 R15: 00007fc7ccbbe000
[ 100.650530][ T3900] </TASK>
[ 100.650532][ T3900] ==================================================================
[ 100.673381][ T3900] BUG: kernel NULL pointer dereference, address: 0000000000000237
[ 100.674119][ T3900] #PF: supervisor write access in kernel mode
[ 100.674687][ T3900] #PF: error_code(0x0002) - not-present page
[ 100.675251][ T3900] PGD 0 P4D 0
[ 100.675618][ T3900] Oops: Oops: 0002 [#1] SMP KASAN
[ 100.676091][ T3900] CPU: 1 UID: 65534 PID: 3900 Comm: trinity-c1 Tainted: G B T 6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary)
[ 100.677189][ T3900] Tainted: [B]=BAD_PAGE, [T]=RANDSTRUCT
[ 100.677704][ T3900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 100.678670][ T3900] RIP: 0010:put_event+0x2a/0x730
[ 100.679152][ T3900] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[ 100.680761][ T3900] RSP: 0018:ffffc90004d67b70 EFLAGS: 00010246
[ 100.681342][ T3900] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[ 100.682061][ T3900] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 100.682766][ T3900] RBP: ffffc90004d67bd0 R08: 0000000000000000 R09: 0000000000000000
[ 100.686319][ T3900] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
[ 100.687356][ T3900] R13: 1ffff11024ef368a R14: dffffc0000000000 R15: ffff88812779b618
[ 100.694079][ T3900] FS: 00007fc7ce18b740(0000) GS:ffff888428dd7000(0000) knlGS:0000000000000000
[ 100.695168][ T3900] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 100.695958][ T3900] CR2: 0000000000000237 CR3: 0000000004cd7000 CR4: 00000000000406b0
[ 100.696932][ T3900] DR0: 00007fc7cc290000 DR1: 0000000000000000 DR2: 0000000000000000
[ 100.702041][ T3900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 100.703069][ T3900] Call Trace:
[ 100.703563][ T3900] <TASK>
[ 100.704019][ T3900] ? __free_event+0x707/0x7f0
[ 100.704649][ T3900] put_event+0x69f/0x730
[ 100.705985][ T3900] ? perf_event_wakeup+0x66/0x2c0
[ 100.706648][ T3900] ? perf_event_wakeup+0x1b3/0x2c0
[ 100.707305][ T3900] perf_event_exit_event+0xa6/0xd0
[ 100.708809][ T3900] perf_event_exit_task_context+0x44e/0x550
[ 100.709677][ T3900] perf_event_exit_task+0x1dd/0x2a0
[ 100.710349][ T3900] ? fpu__drop+0x131/0x390
[ 100.710922][ T3900] ? preempt_count_sub+0x218/0x2f0
[ 100.711576][ T3900] ? fpu__drop+0x131/0x390
[ 100.712162][ T3900] do_exit+0xa4d/0x2490
[ 100.712728][ T3900] ? _raw_spin_unlock_irq+0x38/0x90
[ 100.717506][ T3900] ? do_group_exit+0x1ae/0x290
[ 100.718138][ T3900] ? _raw_spin_unlock_irq+0x38/0x90
[ 100.718797][ T3900] ? trace_preempt_on+0x179/0x2e0
[ 100.719458][ T3900] do_group_exit+0x1be/0x290
[ 100.720074][ T3900] __x64_sys_exit_group+0x48/0x50
[ 100.720714][ T3900] x64_sys_call+0x2c68/0x2c70
[ 100.721340][ T3900] do_syscall_64+0xff/0x220
[ 100.721946][ T3900] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 100.722683][ T3900] RIP: 0033:0x7fc7ce262349
[ 100.723272][ T3900] Code: Unable to access opcode bytes at 0x7fc7ce26231f.
[ 100.724109][ T3900] RSP: 002b:00007ffdecd6a3e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[ 100.725125][ T3900] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc7ce262349
[ 100.726139][ T3900] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
[ 100.727073][ T3900] RBP: 00007fc7ccbbe058 R08: ffffffffffffff80 R09: fffffffffffffff8
[ 100.727935][ T3900] R10: 00007fc7ce1a0200 R11: 0000000000000206 R12: 0000000000000128
[ 100.728806][ T3900] R13: 00007fc7ce18b6c0 R14: 00007fc7ccbbe058 R15: 00007fc7ccbbe000
[ 100.729722][ T3900] </TASK>
[ 100.730174][ T3900] Modules linked in: tiny_power_button button pcspkr evdev input_leds loop
[ 100.731230][ T3900] CR2: 0000000000000237
[ 100.731791][ T3900] ---[ end trace 0000000000000000 ]---
[ 100.731795][ T3903] BUG: kernel NULL pointer dereference, address: 0000000000000237
[ 100.732199][ T3900] RIP: 0010:put_event+0x2a/0x730
[ 100.732817][ T3903] #PF: supervisor write access in kernel mode
[ 100.733228][ T3900] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[ 100.733677][ T3903] #PF: error_code(0x0002) - not-present page
[ 100.733686][ T3903] PGD 0
[ 100.735120][ T3900] RSP: 0018:ffffc90004d67b70 EFLAGS: 00010246
[ 100.735548][ T3903] P4D 0
[ 100.735789][ T3900]
[ 100.736225][ T3903] Oops: Oops: 0002 [#2] SMP KASAN
[ 100.736469][ T3900] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[ 100.736653][ T3903] CPU: 0 UID: 65534 PID: 3903 Comm: trinity-c4 Tainted: G B D T 6.15.0-rc1-00011-gda916e96e2de #1 PREEMPT(voluntary)
[ 100.737053][ T3900] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 100.737624][ T3903] Tainted: [B]=BAD_PAGE, [D]=DIE, [T]=RANDSTRUCT
[ 100.737634][ T3903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 100.738628][ T3900] RBP: ffffc90004d67bd0 R08: 0000000000000000 R09: 0000000000000000
[ 100.739246][ T3903] RIP: 0010:put_event+0x2a/0x730
[ 100.739740][ T3900] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
[ 100.740493][ T3903] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 38 48 89 fb 48 81 c7 38 02 00 00 be 08 00 00 00 e8 06 14 22 00 <f0> 48 ff 8b 38 02 00 00 0f 85 67 06 00 00 49 be 00 00 00 00 00 fc
[ 100.741089][ T3900] R13: 1ffff11024ef368a R14: dffffc0000000000 R15: ffff88812779b618
[ 100.741461][ T3903] RSP: 0018:ffffc90004d97b70 EFLAGS: 00010246
[ 100.741877][ T3900] FS: 00007fc7ce18b740(0000) GS:ffff888428dd7000(0000) knlGS:0000000000000000
[ 100.743537][ T3903]
[ 100.744014][ T3900] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 100.744534][ T3903] RAX: 0000000000000001 RBX: ffffffffffffffff RCX: 0000000000000000
[ 100.745071][ T3900] CR2: 0000000000000237 CR3: 0000000004cd7000 CR4: 00000000000406b0
[ 100.745279][ T3903] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 100.745681][ T3900] DR0: 00007fc7cc290000 DR1: 0000000000000000 DR2: 0000000000000000
[ 100.746350][ T3903] RBP: ffffc90004d97bd0 R08: 0000000000000000 R09: 0000000000000000
[ 100.746828][ T3900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 100.746840][ T3900] Kernel panic - not syncing: Fatal exception
[ 101.916945][ T3900] Shutting down cpus with NMI
[ 101.929723][ T3900] Kernel Offset: disabled

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply [flat|nested] 11+ messages in thread
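A note on the faulting address, offered as a minimal sketch and not as part of the original report. The "Code:" bytes show put_event() copying its argument into %rbx (48 89 fb) and then doing a locked decrement at 0x238(%rbx) (f0 48 ff 8b 38 02 00 00), and the register dump shows RBX: ffffffffffffffff. Assuming EVENT_TOMBSTONE is the all-ones pointer, which is how kernel/events/core.c appears to define it, the access lands on address 0x237, matching the KASAN report. The 0x238 offset is specific to this RANDSTRUCT build, and field_address() is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors the sentinel in kernel/events/core.c (all-ones pointer). */
#define EVENT_TOMBSTONE ((void *)-1L)

/* Taken from the oops "Code:" bytes (lock decq 0x238(%rbx)); build-specific. */
#define REFCOUNT_OFFSET ((uintptr_t)0x238)

/*
 * Hypothetical helper: the address a field access would touch for a given
 * base pointer. Unsigned arithmetic wraps, so an all-ones base plus a small
 * offset ends up just below the offset itself.
 */
static uintptr_t field_address(const void *base, uintptr_t offset)
{
	return (uintptr_t)base + offset;
}
```

Calling field_address(EVENT_TOMBSTONE, REFCOUNT_OFFSET) yields 0x237, which is why the fault is reported as a NULL pointer dereference even though the pointer passed in was the sentinel and not NULL.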
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-14 1:59 [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event kernel test robot
@ 2025-04-14 19:01 ` Peter Zijlstra
  2025-04-15 4:46 ` Oliver Sang
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-14 19:01 UTC (permalink / raw)
  To: kernel test robot
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland

On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
>
> commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
>
> [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
>
> in testcase: trinity
> version: trinity-x86_64-ba2360ed-1_20241228
> with following parameters:
>
> 	runtime: 300s
> 	group: group-02
> 	nr_groups: 5
>
>
>
> config: x86_64-randconfig-078-20250407
> compiler: clang-20
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
>

Does this help?

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2eb9cd5d86a1..528b679aaf7e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
 	_free_event(event);
 
 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
 

^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-14 19:01 ` Peter Zijlstra
@ 2025-04-15 4:46 ` Oliver Sang
  2025-04-15 9:14 ` James Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Oliver Sang @ 2025-04-15 4:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland, oliver.sang

hi, Peter Zijlstra,

On Mon, Apr 14, 2025 at 09:01:38PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
> >
> > commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
> > https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
> >
> > [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
> >
> > in testcase: trinity
> > version: trinity-x86_64-ba2360ed-1_20241228
> > with following parameters:
> >
> > 	runtime: 300s
> > 	group: group-02
> > 	nr_groups: 5
> >
> >
> >
> > config: x86_64-randconfig-078-20250407
> > compiler: clang-20
> > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> >
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
> >
>
> Does this help?

yes, below patch fixes the issues we observed for da916e96e2. thanks

Tested-by: kernel test robot <oliver.sang@intel.com>

>
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2eb9cd5d86a1..528b679aaf7e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
>  	_free_event(event);
>
>  	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>  		put_event(parent);
>  }
>

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 4:46 ` Oliver Sang
@ 2025-04-15 9:14 ` James Clark
  2025-04-15 10:08 ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: James Clark @ 2025-04-15 9:14 UTC (permalink / raw)
  To: Oliver Sang, Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland

On 15/04/2025 5:46 am, Oliver Sang wrote:
> hi, Peter Zijlstra,
>
> On Mon, Apr 14, 2025 at 09:01:38PM +0200, Peter Zijlstra wrote:
>> On Mon, Apr 14, 2025 at 09:59:25AM +0800, kernel test robot wrote:
>>>
>>>
>>> Hello,
>>>
>>> kernel test robot noticed "BUG:KASAN:null-ptr-deref_in_put_event" on:
>>>
>>> commit: da916e96e2dedcb2d40de77a7def833d315b81a6 ("perf: Make perf_pmu_unregister() useable")
>>> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
>>>
>>> [test failed on linux-next/master 29e7bf01ed8033c9a14ed0dc990dfe2736dbcd18]
>>>
>>> in testcase: trinity
>>> version: trinity-x86_64-ba2360ed-1_20241228
>>> with following parameters:
>>>
>>> 	runtime: 300s
>>> 	group: group-02
>>> 	nr_groups: 5
>>>
>>>
>>>
>>> config: x86_64-randconfig-078-20250407
>>> compiler: clang-20
>>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>>>
>>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>>
>>>
>>>
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <oliver.sang@intel.com>
>>> | Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
>>>
>>
>> Does this help?
>
> yes, below patch fixes the issues we observed for da916e96e2. thanks
>
> Tested-by: kernel test robot <oliver.sang@intel.com>
>

Also fixes the same issues we were seeing:

Tested-by: James Clark <james.clark@linaro.org>

>>
>> ---
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 2eb9cd5d86a1..528b679aaf7e 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -5687,7 +5687,7 @@ static void put_event(struct perf_event *event)
>>  	_free_event(event);
>>
>>  	/* Matches the refcount bump in inherit_event() */
>> -	if (parent)
>> +	if (parent && parent != EVENT_TOMBSTONE)
>>  		put_event(parent);
>>  }
>>
>

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 9:14 ` James Clark
@ 2025-04-15 10:08 ` Peter Zijlstra
  2025-04-15 13:14 ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-15 10:08 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland

On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
> On 15/04/2025 5:46 am, Oliver Sang wrote:

> > yes, below patch fixes the issues we observed for da916e96e2. thanks
> >
> > Tested-by: kernel test robot <oliver.sang@intel.com>
> >
>
> Also fixes the same issues we were seeing:
>
> Tested-by: James Clark <james.clark@linaro.org>

Excellent, thank you both! Now I gotta go write me a Changelog :-)

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 10:08 ` Peter Zijlstra
@ 2025-04-15 13:14 ` Peter Zijlstra
  2025-04-15 15:52 ` James Clark
  2025-04-16 8:36 ` Venkat Rao Bagalkote
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-15 13:14 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland, Frederic Weisbecker

On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
> > On 15/04/2025 5:46 am, Oliver Sang wrote:
>
> > > yes, below patch fixes the issues we observed for da916e96e2. thanks
> > >
> > > Tested-by: kernel test robot <oliver.sang@intel.com>
> > >
> >
> > Also fixes the same issues we were seeing:
> >
> > Tested-by: James Clark <james.clark@linaro.org>
>
> Excellent, thank you both! Now I gotta go write me a Changelog :-)

Hmm, so while writing Changelog, I noticed something else was off. The
case where event->parent was set to EVENT_TOMBSTONE now didn't have a
put_event(parent) anymore. So that needs to be put back in as well.

Frederic, afaict this should still be okay, since if we're detached,
then nothing will try and access event->parent in the free path.

Also, nothing in perf_pending_task() will try and access either
event->parent or event->pmu.

---
Subject: perf: Fix event->parent life-time issue
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue Apr 15 12:12:52 CEST 2025

Due to an oversight in merging da916e96e2de ("perf: Make
perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
hang while freeing sigtrap event"), it is now possible to hit
put_event(EVENT_TOMBSTONE), which makes the computer sad.

This also means that for the event->parent == EVENT_TOMBSTONE, the
put_event() matching inherit_event() has gone missing.

Previously this was done in perf_event_release_kernel() after calling
perf_remove_from_context(), but with it delegated to put_event(), this
case is now entirely missed, leading to leaks.

Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
 	 * not being a child event. See for example unaccount_event().
 	 */
 	event->parent = EVENT_TOMBSTONE;
+	put_event(parent_event);
 }
 
 static bool is_orphaned_event(struct perf_event *event)
@@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
 	_free_event(event);
 
 	/* Matches the refcount bump in inherit_event() */
-	if (parent)
+	if (parent && parent != EVENT_TOMBSTONE)
 		put_event(parent);
 }
 

^ permalink raw reply [flat|nested] 11+ messages in thread
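The two hunks in the patch above can be modelled in userspace. What follows is a toy sketch under stated assumptions, not the kernel code: struct event, the plain long refcount, the freed flag and the drop_parent_ref switch are all hypothetical simplifications (the kernel uses struct perf_event with an atomic refcount and frees the memory). It shows the guard in put_event() skipping the sentinel, and the parent reference either being dropped at detach time or leaking.

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel marking a detached child; it must never be dereferenced. */
#define EVENT_TOMBSTONE ((struct event *)-1L)

struct event {
	long refcount;
	struct event *parent;
	int freed;		/* stand-in for actually freeing the object */
};

static void put_event(struct event *event)
{
	struct event *parent;

	if (--event->refcount != 0)
		return;

	parent = event->parent;
	event->freed = 1;

	/* Drop the reference taken at inherit time, unless detach already did. */
	if (parent && parent != EVENT_TOMBSTONE)
		put_event(parent);
}

/* The child holds one reference on its parent for as long as it is attached. */
static void inherit_event(struct event *child, struct event *parent)
{
	child->refcount = 1;
	child->parent = parent;
	child->freed = 0;
	parent->refcount++;
}

/* drop_parent_ref == 0 models the state before the first hunk was added. */
static void child_detach(struct event *child, int drop_parent_ref)
{
	struct event *parent = child->parent;

	if (!parent || parent == EVENT_TOMBSTONE)
		return;

	child->parent = EVENT_TOMBSTONE;
	if (drop_parent_ref)
		put_event(parent);
}

/* Returns the parent's refcount after a full child lifetime. */
static long parent_refs_after_child(int drop_parent_ref)
{
	struct event parent = { .refcount = 1 };
	struct event child;

	inherit_event(&child, &parent);		/* parent.refcount == 2 */
	child_detach(&child, drop_parent_ref);
	put_event(&child);			/* sentinel is skipped, no deref */
	return parent.refcount;
}
```

In this model parent_refs_after_child(1) returns 1, so only the original owner's reference remains, while parent_refs_after_child(0) returns 2, which corresponds to the leak the changelog describes. Without the guard in put_event(), the final put_event(&child) would recurse into the sentinel pointer instead.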
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 13:14 ` Peter Zijlstra
@ 2025-04-15 15:52 ` James Clark
  2025-04-16 8:46 ` Peter Zijlstra
  2025-04-16 8:36 ` Venkat Rao Bagalkote
  1 sibling, 1 reply; 11+ messages in thread
From: James Clark @ 2025-04-15 15:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland, Frederic Weisbecker

On 15/04/2025 2:14 pm, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
>>> On 15/04/2025 5:46 am, Oliver Sang wrote:
>>
>>>> yes, below patch fixes the issues we observed for da916e96e2. thanks
>>>>
>>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>>
>>> Also fixes the same issues we were seeing:
>>>
>>> Tested-by: James Clark <james.clark@linaro.org>
>>
>> Excellent, thank you both! Now I gotta go write me a Changelog :-)
>
> Hmm, so while writing Changelog, I noticed something else was off. The
> case where event->parent was set to EVENT_TOMBSTONE now didn't have a
> put_event(parent) anymore. So that needs to be put back in as well.
>
> Frederic, afaict this should still be okay, since if we're detached,
> then nothing will try and access event->parent in the free path.
>
> Also, nothing in perf_pending_task() will try and access either
> event->parent or event->pmu.
>
> ---
> Subject: perf: Fix event->parent life-time issue
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Apr 15 12:12:52 CEST 2025
>
> Due to an oversight in merging da916e96e2de ("perf: Make
> perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
> hang while freeing sigtrap event"), it is now possible to hit
> put_event(EVENT_TOMBSTONE), which makes the computer sad.
>
> This also means that for the event->parent == EVENT_TOMBSTONE, the
> put_event() matching inherit_event() has gone missing.
>
> Previously this was done in perf_event_release_kernel() after calling
> perf_remove_from_context(), but with it delegated to put_event(), this
> case is now entirely missed, leading to leaks.
>
> Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/events/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
>  	 * not being a child event. See for example unaccount_event().
>  	 */
>  	event->parent = EVENT_TOMBSTONE;
> +	put_event(parent_event);
>  }
>
>  static bool is_orphaned_event(struct perf_event *event)
> @@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
>  	_free_event(event);
>
>  	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>  		put_event(parent);
>  }
>

Hi Peter,

Unrelated to the pointer deref issue, I'm also seeing perf stat not working
due to this commit. And that's both with and without this fixup:

 -> perf stat -- true

 Performance counter stats for 'true':

     <not counted> msec task-clock

     <not counted> context-switches

     <not counted> cpu-migrations

     <not counted> page-faults

     <not counted> armv8_cortex_a53/instructions/

     <not counted> armv8_cortex_a57/instructions/

     <not counted> armv8_cortex_a53/cycles/

     <not counted> armv8_cortex_a57/cycles/

     <not counted> armv8_cortex_a53/branches/

     <not counted> armv8_cortex_a53/branch-misses/

     <not counted> armv8_cortex_a57/branch-misses/


       0.074139992 seconds time elapsed

       0.000000000 seconds user
       0.054797000 seconds sys

Didn't look into it more other than bisecting it to this commit, but I can
dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
didn't test it elsewhere so I'm not sure if that's relevant or not.

Thanks
James

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 15:52 ` James Clark
@ 2025-04-16 8:46 ` Peter Zijlstra
  2025-04-16 19:08 ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-16 8:46 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland, Frederic Weisbecker

On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
> Unrelated to the pointer deref issue, I'm also seeing perf stat not working
> due to this commit. And that's both with and without this fixup:
>
>  -> perf stat -- true
>
>  Performance counter stats for 'true':
>
>      <not counted> msec task-clock
>
>      <not counted> context-switches
>
>      <not counted> cpu-migrations
>
>      <not counted> page-faults
>
>      <not counted> armv8_cortex_a53/instructions/
>
>      <not counted> armv8_cortex_a57/instructions/
>
>      <not counted> armv8_cortex_a53/cycles/
>
>      <not counted> armv8_cortex_a57/cycles/
>
>      <not counted> armv8_cortex_a53/branches/
>
>      <not counted> armv8_cortex_a53/branch-misses/
>
>      <not counted> armv8_cortex_a57/branch-misses/
>
>
>        0.074139992 seconds time elapsed
>
>        0.000000000 seconds user
>        0.054797000 seconds sys
>
> Didn't look into it more other than bisecting it to this commit, but I can
> dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
> didn't test it elsewhere so I'm not sure if that's relevant or not.

I can reproduce on x86 alderlake (first machine I tried), so let me go
have a quick poke.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-16 8:46 ` Peter Zijlstra
@ 2025-04-16 19:08 ` Peter Zijlstra
  2025-04-17 8:58 ` James Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2025-04-16 19:08 UTC (permalink / raw)
  To: James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland, Frederic Weisbecker

On Wed, Apr 16, 2025 at 10:46:10AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
> > Unrelated to the pointer deref issue, I'm also seeing perf stat not working
> > due to this commit. And that's both with and without this fixup:
> >
> >  -> perf stat -- true
> >
> >  Performance counter stats for 'true':
> >
> >      <not counted> msec task-clock
> >
> >      <not counted> context-switches
> >
> >      <not counted> cpu-migrations
> >
> >      <not counted> page-faults
> >
> >      <not counted> armv8_cortex_a53/instructions/
> >
> >      <not counted> armv8_cortex_a57/instructions/
> >
> >      <not counted> armv8_cortex_a53/cycles/
> >
> >      <not counted> armv8_cortex_a57/cycles/
> >
> >      <not counted> armv8_cortex_a53/branches/
> >
> >      <not counted> armv8_cortex_a53/branch-misses/
> >
> >      <not counted> armv8_cortex_a57/branch-misses/
> >
> >
> >        0.074139992 seconds time elapsed
> >
> >        0.000000000 seconds user
> >        0.054797000 seconds sys
> >
> > Didn't look into it more other than bisecting it to this commit, but I can
> > dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
> > didn't test it elsewhere so I'm not sure if that's relevant or not.
>
> I can reproduce on x86 alderlake (first machine I tried), so let me go
> have a quick poke.

Could you please try queue.git/perf/core ? I've fixed this and found
another problem.

I'll post the patches tomorrow, after the robot has had a go.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-16 19:08 ` Peter Zijlstra
@ 2025-04-17 8:58 ` James Clark
  0 siblings, 0 replies; 11+ messages in thread
From: James Clark @ 2025-04-17 8:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland, Frederic Weisbecker

On 16/04/2025 8:08 pm, Peter Zijlstra wrote:
> On Wed, Apr 16, 2025 at 10:46:10AM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 04:52:56PM +0100, James Clark wrote:
>>> Unrelated to the pointer deref issue, I'm also seeing perf stat not working
>>> due to this commit. And that's both with and without this fixup:
>>>
>>>  -> perf stat -- true
>>>
>>>  Performance counter stats for 'true':
>>>
>>>      <not counted> msec task-clock
>>>
>>>      <not counted> context-switches
>>>
>>>      <not counted> cpu-migrations
>>>
>>>      <not counted> page-faults
>>>
>>>      <not counted> armv8_cortex_a53/instructions/
>>>
>>>      <not counted> armv8_cortex_a57/instructions/
>>>
>>>      <not counted> armv8_cortex_a53/cycles/
>>>
>>>      <not counted> armv8_cortex_a57/cycles/
>>>
>>>      <not counted> armv8_cortex_a53/branches/
>>>
>>>      <not counted> armv8_cortex_a53/branch-misses/
>>>
>>>      <not counted> armv8_cortex_a57/branch-misses/
>>>
>>>
>>>        0.074139992 seconds time elapsed
>>>
>>>        0.000000000 seconds user
>>>        0.054797000 seconds sys
>>>
>>> Didn't look into it more other than bisecting it to this commit, but I can
>>> dig more unless the issue is obvious. This is on Arm big.LITTLE, although I
>>> didn't test it elsewhere so I'm not sure if that's relevant or not.
>>
>> I can reproduce on x86 alderlake (first machine I tried), so let me go
>> have a quick poke.
>
> Could you please try queue.git/perf/core ? I've fixed this and found
> another problem.
>
> I'll post the patches tomorrow, after the robot has had a go.

Yep that's all working now, thanks.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tip:perf/core] [perf] da916e96e2: BUG:KASAN:null-ptr-deref_in_put_event
  2025-04-15 13:14 ` Peter Zijlstra
  2025-04-15 15:52 ` James Clark
@ 2025-04-16 8:36 ` Venkat Rao Bagalkote
  1 sibling, 0 replies; 11+ messages in thread
From: Venkat Rao Bagalkote @ 2025-04-16 8:36 UTC (permalink / raw)
  To: Peter Zijlstra, James Clark
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, x86, Ravi Bangoria, linux-perf-users, Mark Rutland, Frederic Weisbecker, venkat88, Athira Rajeev, Madhavan Srinivasan

On 15/04/25 6:44 pm, Peter Zijlstra wrote:
> On Tue, Apr 15, 2025 at 12:08:40PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 15, 2025 at 10:14:05AM +0100, James Clark wrote:
>>> On 15/04/2025 5:46 am, Oliver Sang wrote:
>>>> yes, below patch fixes the issues we observed for da916e96e2. thanks
>>>>
>>>> Tested-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>> Also fixes the same issues we were seeing:
>>>
>>> Tested-by: James Clark <james.clark@linaro.org>
>> Excellent, thank you both! Now I gotta go write me a Changelog :-)
> Hmm, so while writing Changelog, I noticed something else was off. The
> case where event->parent was set to EVENT_TOMBSTONE now didn't have a
> put_event(parent) anymore. So that needs to be put back in as well.
>
> Frederic, afaict this should still be okay, since if we're detached,
> then nothing will try and access event->parent in the free path.
>
> Also, nothing in perf_pending_task() will try and access either
> event->parent or event->pmu.
>
> ---
> Subject: perf: Fix event->parent life-time issue
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Apr 15 12:12:52 CEST 2025
>
> Due to an oversight in merging da916e96e2de ("perf: Make
> perf_pmu_unregister() useable") on top of 56799bc03565 ("perf: Fix
> hang while freeing sigtrap event"), it is now possible to hit
> put_event(EVENT_TOMBSTONE), which makes the computer sad.
>
> This also means that for the event->parent == EVENT_TOMBSTONE, the
> put_event() matching inherit_event() has gone missing.
>
> Previously this was done in perf_event_release_kernel() after calling
> perf_remove_from_context(), but with it delegated to put_event(), this
> case is now entirely missed, leading to leaks.
>
> Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/events/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2343,6 +2343,7 @@ static void perf_child_detach(struct per
>  	 * not being a child event. See for example unaccount_event().
>  	 */
>  	event->parent = EVENT_TOMBSTONE;
> +	put_event(parent_event);
>  }
>
>  static bool is_orphaned_event(struct perf_event *event)
> @@ -5688,7 +5689,7 @@ static void put_event(struct perf_event
>  	_free_event(event);
>
>  	/* Matches the refcount bump in inherit_event() */
> -	if (parent)
> +	if (parent && parent != EVENT_TOMBSTONE)
>  		put_event(parent);
>  }
>

This issue is reported on IBM Power9 servers also. Tested the above patch, and issue is fixed. Hence,

Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>

Regards,

Venkat.

^ permalink raw reply [flat|nested] 11+ messages in thread