All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pengfei Xu <pengfei.xu@intel.com>
To: <linux-kernel@vger.kernel.org>
Cc: <linux-tip-commits@vger.kernel.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Kan Liang <kan.liang@linux.intel.com>,
	Namhyung Kim <namhyung@kernel.org>, <x86@kernel.org>,
	<syzkaller-bugs@googlegroups.com>
Subject: Re: [tip: perf/core] perf: Fix event_function_call() locking
Date: Tue, 13 Aug 2024 09:34:53 +0800	[thread overview]
Message-ID: <Zrq4PRAVxjlnvFnb@xpf.sh.intel.com> (raw)
In-Reply-To: <172311312801.2215.8572777928731385731.tip-bot2@tip-bot2>

Hi Peter and perf experts,

There is "BUG: using smp_processor_id() in preemptible code in
event_function_call" in v6.11-rc3 Intel internal with linux-next patches in
syzkaller testing.
And this issue was reproduced in next-20240812 also.

Bisected and found related commit in Intel internal repo:
558abc7e3f89 perf: Fix event_function_call() locking

After revert above commit on top of Intel internal kernel, this issue
was gone.

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240812_194918_event_function_call
Syzkaller repro code: https://github.com/xupengfe/syzkaller_logs/blob/main/240812_194918_event_function_call/repro.c
Syzkaller repro syscall steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240812_194918_event_function_call/repro.prog
Syzkaller analysis report: https://github.com/xupengfe/syzkaller_logs/blob/main/240812_194918_event_function_call/repro.report
Kconfig(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240812_194918_event_function_call/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240812_194918_event_function_call/bisect_info.log
Issue bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240812_194918_event_function_call/bzImage_dcdef334e8852ca433e908db67744965101f3fc1.tar.gz
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240812_194918_event_function_call/dcdef334e8852ca433e908db67744965101f3fc1_dmesg.log

"
[   17.071069] BUG: using smp_processor_id() in preemptible [00000000] code: repro/726
[   17.071875] caller is debug_smp_processor_id+0x20/0x30
[   17.072369] CPU: 0 UID: 0 PID: 726 Comm: repro Not tainted 6.11.0-rc3-next-20240812-dcdef334e885+ #1
[   17.073225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   17.074205] Call Trace:
[   17.074436]  <TASK>
[   17.074646]  dump_stack_lvl+0x121/0x150
[   17.075005]  dump_stack+0x19/0x20
[   17.075313]  check_preemption_disabled+0x168/0x180
[   17.075753]  debug_smp_processor_id+0x20/0x30
[   17.076156]  event_function_call+0xd7/0x5c0
[   17.076520]  ? __pfx___perf_remove_from_context+0x10/0x10
[   17.076986]  ? __this_cpu_preempt_check+0x21/0x30
[   17.077404]  ? __pfx_event_function_call+0x10/0x10
[   17.077850]  ? __this_cpu_preempt_check+0x21/0x30
[   17.078276]  ? _raw_spin_unlock_irq+0x2c/0x60
[   17.078681]  ? lockdep_hardirqs_on+0x89/0x110
[   17.079084]  perf_remove_from_context+0xf7/0x1d0
[   17.079506]  perf_event_release_kernel+0x186/0x870
[   17.079944]  ? lock_is_held_type+0xef/0x150
[   17.080326]  ? locks_remove_file+0x3bb/0x5e0
[   17.080727]  ? __pfx_perf_event_release_kernel+0x10/0x10
[   17.081202]  ? __this_cpu_preempt_check+0x21/0x30
[   17.081634]  ? __sanitizer_cov_trace_const_cmp2+0x1c/0x30
[   17.082116]  ? evm_file_release+0x193/0x1f0
[   17.082502]  ? __pfx_perf_release+0x10/0x10
[   17.082880]  perf_release+0x40/0x60
[   17.083203]  __fput+0x426/0xbc0
[   17.083509]  ____fput+0x1f/0x30
[   17.083806]  task_work_run+0x19c/0x2b0
[   17.084159]  ? __pfx_task_work_run+0x10/0x10
[   17.084553]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[   17.085036]  ? switch_task_namespaces+0xc6/0x110
[   17.085463]  do_exit+0xb19/0x2aa0
[   17.085778]  ? lock_release+0x441/0x870
[   17.086138]  ? __pfx_do_exit+0x10/0x10
[   17.086485]  ? __this_cpu_preempt_check+0x21/0x30
[   17.087007]  ? _raw_spin_unlock_irq+0x2c/0x60
[   17.087219]  ? lockdep_hardirqs_on+0x89/0x110
[   17.087426]  do_group_exit+0xe4/0x2c0
[   17.087607]  __x64_sys_exit_group+0x4d/0x60
[   17.087799]  x64_sys_call+0x20c4/0x20d0
[   17.087974]  do_syscall_64+0x6d/0x140
[   17.088144]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   17.088374] RIP: 0033:0x7f6c5a518a4d
[   17.088541] Code: Unable to access opcode bytes at 0x7f6c5a518a23.
[   17.088832] RSP: 002b:00007fffca44fe68 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[   17.089166] RAX: ffffffffffffffda RBX: 00007f6c5a5f69e0 RCX: 00007f6c5a518a4d
[   17.089476] RDX: 00000000000000e7 RSI: ffffffffffffff80 RDI: 0000000000000000
[   17.089790] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000020
[   17.090100] R10: 00007fffca44fd10 R11: 0000000000000246 R12: 00007f6c5a5f69e0
[   17.090411] R13: 00007f6c5a5fbf00 R14: 0000000000000001 R15: 00007f6c5a5fbee8
[   17.090740]  </TASK>
[   17.162307] BUG: using smp_processor_id() in preemptible [00000000] code: repro/727
[   17.163284] caller is debug_smp_processor_id+0x20/0x30
[   17.163765] CPU: 0 UID: 0 PID: 727 Comm: repro Not tainted 6.11.0-rc3-next-20240812-dcdef334e885+ #1
[   17.164571] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   17.165606] Call Trace:
[   17.165840]  <TASK>
[   17.166045]  dump_stack_lvl+0x121/0x150
[   17.166407]  dump_stack+0x19/0x20
[   17.166732]  check_preemption_disabled+0x168/0x180
[   17.166944]  debug_smp_processor_id+0x20/0x30
[   17.167153]  event_function_call+0xd7/0x5c0
[   17.167355]  ? __pfx___perf_remove_from_context+0x10/0x10
[   17.167611]  ? __this_cpu_preempt_check+0x21/0x30
[   17.167835]  ? __pfx_event_function_call+0x10/0x10
[   17.168072]  ? __this_cpu_preempt_check+0x21/0x30
[   17.168297]  ? _raw_spin_unlock_irq+0x2c/0x60
[   17.168505]  ? lockdep_hardirqs_on+0x89/0x110
[   17.168718]  perf_remove_from_context+0xf7/0x1d0
[   17.168939]  perf_event_release_kernel+0x186/0x870
[   17.169164]  ? lock_is_held_type+0xef/0x150
[   17.169362]  ? locks_remove_file+0x3bb/0x5e0
[   17.169546]  ? __pfx_perf_event_release_kernel+0x10/0x10
[   17.169791]  ? __this_cpu_preempt_check+0x21/0x30
[   17.169994]  ? __sanitizer_cov_trace_const_cmp2+0x1c/0x30
[   17.170219]  ? evm_file_release+0x193/0x1f0
[   17.170421]  ? __pfx_perf_release+0x10/0x10
[   17.170620]  perf_release+0x40/0x60
[   17.170790]  __fput+0x426/0xbc0
[   17.170938]  ____fput+0x1f/0x30
[   17.171082]  task_work_run+0x19c/0x2b0
[   17.171271]  ? __pfx_task_work_run+0x10/0x10
[   17.171478]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[   17.171732]  ? switch_task_namespaces+0xc6/0x110
[   17.171957]  do_exit+0xb19/0x2aa0
[   17.172122]  ? lock_release+0x441/0x870
[   17.172314]  ? __pfx_do_exit+0x10/0x10
[   17.172476]  ? __this_cpu_preempt_check+0x21/0x30
[   17.172700]  ? _raw_spin_unlock_irq+0x2c/0x60
[   17.172910]  ? lockdep_hardirqs_on+0x89/0x110
[   17.173126]  do_group_exit+0xe4/0x2c0
[   17.173307]  __x64_sys_exit_group+0x4d/0x60
[   17.173509]  x64_sys_call+0x20c4/0x20d0
[   17.173679]  do_syscall_64+0x6d/0x140
[   17.173854]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   17.174091] RIP: 0033:0x7fc803b18a4d
[   17.174245] Code: Unable to access opcode bytes at 0x7fc803b18a23.
[   17.174522] RSP: 002b:00007ffd949207a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[   17.174892] RAX: ffffffffffffffda RBX: 00007fc803bf69e0 RCX: 00007fc803b18a4d
[   17.175208] RDX: 00000000000000e7 RSI: ffffffffffffff80 RDI: 0000000000000000
[   17.175526] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000020
[   17.175849] R10: 00007ffd94920650 R11: 0000000000000246 R12: 00007fc803bf69e0
[   17.176170] R13: 00007fc803bfbf00 R14: 0000000000000001 R15: 00007fc803bfbee8
[   17.176508]  </TASK>
"

I hope it's helpful.

Thanks!

---

If you don't need the following environment to reproduce the problem or if you
already have one reproduced environment, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
  // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
  // You could change the bzImage_xxx as you want
  // Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
You could use below command to log in, there is no password for root.
ssh -p 10023 root@localhost

After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel in vm.


Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install

Best Regards,
Thanks!


On 2024-08-08 at 10:32:08 -0000, tip-bot2 for Peter Zijlstra wrote:
> The following commit has been merged into the perf/core branch of tip:
> 
> Commit-ID:     558abc7e3f895049faa46b08656be4c60dc6e9fd
> Gitweb:        https://git.kernel.org/tip/558abc7e3f895049faa46b08656be4c60dc6e9fd
> Author:        Peter Zijlstra <peterz@infradead.org>
> AuthorDate:    Wed, 07 Aug 2024 13:29:27 +02:00
> Committer:     Peter Zijlstra <peterz@infradead.org>
> CommitterDate: Thu, 08 Aug 2024 12:27:31 +02:00
> 
> perf: Fix event_function_call() locking
> 
> All the event_function/@func call context already uses perf_ctx_lock()
> except for the !ctx->is_active case. Make it all consistent.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
> Reviewed-by: Namhyung Kim <namhyung@kernel.org>
> Link: https://lore.kernel.org/r/20240807115550.138301094@infradead.org
> ---
>  kernel/events/core.c |  9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index eb03c9a..ab49dea 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -263,6 +263,7 @@ unlock:
>  static void event_function_call(struct perf_event *event, event_f func, void *data)
>  {
>  	struct perf_event_context *ctx = event->ctx;
> +	struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
>  	struct task_struct *task = READ_ONCE(ctx->task); /* verified in event_function */
>  	struct event_function_struct efs = {
>  		.event = event,
> @@ -291,22 +292,22 @@ again:
>  	if (!task_function_call(task, event_function, &efs))
>  		return;
>  
> -	raw_spin_lock_irq(&ctx->lock);
> +	perf_ctx_lock(cpuctx, ctx);
>  	/*
>  	 * Reload the task pointer, it might have been changed by
>  	 * a concurrent perf_event_context_sched_out().
>  	 */
>  	task = ctx->task;
>  	if (task == TASK_TOMBSTONE) {
> -		raw_spin_unlock_irq(&ctx->lock);
> +		perf_ctx_unlock(cpuctx, ctx);
>  		return;
>  	}
>  	if (ctx->is_active) {
> -		raw_spin_unlock_irq(&ctx->lock);
> +		perf_ctx_unlock(cpuctx, ctx);
>  		goto again;
>  	}
>  	func(event, NULL, ctx, data);
> -	raw_spin_unlock_irq(&ctx->lock);
> +	perf_ctx_unlock(cpuctx, ctx);
>  }
>  
>  /*

  reply	other threads:[~2024-08-13  1:34 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-07 11:29 [PATCH 0/5] perf: Per PMU context reschedule and misc Peter Zijlstra
2024-08-07 11:29 ` [PATCH 1/5] perf: Optimize context reschedule for single PMU cases Peter Zijlstra
2024-08-08 10:32   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2024-08-07 11:29 ` [PATCH 2/5] perf: Extract a few helpers Peter Zijlstra
2024-08-08 10:32   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2024-08-07 11:29 ` [PATCH 3/5] perf: Fix event_function_call() locking Peter Zijlstra
2024-08-08 10:32   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2024-08-13  1:34     ` Pengfei Xu [this message]
2024-08-13 15:19       ` Naresh Kamboju
2024-08-13 18:28         ` Namhyung Kim
2024-08-13 21:02           ` Peter Zijlstra
2024-08-14  2:35             ` Pengfei Xu
2024-08-07 11:29 ` [PATCH 4/5] perf: Add context time freeze Peter Zijlstra
2024-08-07 15:17   ` Liang, Kan
2024-08-07 19:09     ` Peter Zijlstra
2024-08-08 10:32   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2024-08-07 11:29 ` [PATCH 5/5] perf: Optimize __pmu_ctx_sched_out() Peter Zijlstra
2024-08-08 10:32   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2024-08-07 15:19 ` [PATCH 0/5] perf: Per PMU context reschedule and misc Liang, Kan
2024-08-07 18:54 ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zrq4PRAVxjlnvFnb@xpf.sh.intel.com \
    --to=pengfei.xu@intel.com \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.