All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pengfei Xu <pengfei.xu@intel.com>
To: <lihuafei1@huawei.com>
Cc: <rostedt@goodmis.org>, <linux-kernel@vger.kernel.org>,
	<lkp@intel.com>, <frederic@kernel.org>,
	<quic_neeraju@quicinc.com>, <paulmck@kernel.org>,
	<heng.su@intel.com>
Subject: [Syzkaller & bisect] There is "task hung in synchronize_rcu bisect" in v6.3-rc2 kernel
Date: Wed, 15 Mar 2023 20:20:44 +0800	[thread overview]
Message-ID: <ZBG4HOCQIlGFFcIn@xpf.sh.intel.com> (raw)

Hi Li Huafei and kernel experts,

Greeting!

Platform: x86 platforms
There is "task hung in synchronize_rcu bisect" in v6.3-rc2 kernel in guest:

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230315_023443_synchronize_rcu
Reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230315_023443_synchronize_rcu/repro.c
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230315_023443_synchronize_rcu/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230315_023443_synchronize_rcu/bisect_info.log
v6.3-rc2 issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230315_023443_synchronize_rcu/eeac8ede17557680855031c6f305ece2378af326_dmesg.log
"
[   24.844767] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=330 'systemd'
[   31.392668] hrtimer: interrupt took 10726570 ns
[   62.552360] rcu_tasks_wait_gp: rcu_tasks grace period number 9 (since boot) is 2942 jiffies old.
[   92.760386] rcu_tasks_wait_gp: rcu_tasks grace period number 9 (since boot) is 10494 jiffies old.
[  182.808389] rcu_tasks_wait_gp: rcu_tasks grace period number 9 (since boot) is 33006 jiffies old.
[  300.120396] INFO: task repro:540 blocked for more than 147 seconds.
[  300.120937]       Not tainted 6.3.0-rc2-eeac8ede1755+ #1
[  300.121351] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  300.121888] task:repro           state:D stack:0     pid:540   ppid:538    flags:0x00004006
[  300.122552] Call Trace:
[  300.122835]  <TASK>
[  300.123267]  __schedule+0x40a/0xc90
[  300.124351]  ? wait_for_completion+0x7b/0x180
[  300.124772]  schedule+0x5b/0xe0
[  300.125156]  schedule_timeout+0x50b/0x670
[  300.125829]  ? schedule_timeout+0x9/0x670
[  300.126414]  ? wait_for_completion+0x7b/0x180
[  300.126826]  wait_for_completion+0xa6/0x180
[  300.127478]  __wait_rcu_gp+0x136/0x160
[  300.127922]  ? __pfx_arch_ftrace_ops_list_func+0x10/0x10
[  300.128639]  synchronize_rcu_tasks_generic.part.24+0x3b/0x60
[  300.129092]  ? __pfx_call_rcu_tasks+0x10/0x10
[  300.129499]  ? __pfx_wakeme_after_rcu+0x10/0x10
[  300.130115]  ? verify_cpu+0x10/0x100
[  300.130574]  synchronize_rcu_tasks_generic+0x24/0x70
[  300.131020]  synchronize_rcu_tasks+0x19/0x20
[  300.131438]  ftrace_shutdown+0x1cc/0x410
[  300.132206]  unregister_ftrace_function+0x35/0x230
[  300.132722]  ? __sanitizer_cov_trace_switch+0x57/0xa0
[  300.133311]  perf_ftrace_event_register+0x95/0xf0
[  300.133750]  ? __pfx_tp_perf_event_destroy+0x10/0x10
[  300.134190]  perf_trace_destroy+0x3a/0xa0
[  300.134590]  ? __pfx_tp_perf_event_destroy+0x10/0x10
[  300.135028]  tp_perf_event_destroy+0x1e/0x30
[  300.135473]  _free_event+0x101/0x810
[  300.136191]  put_event+0x3c/0x50
[  300.136605]  perf_event_release_kernel+0x2de/0x360
[  300.137011]  ? perf_event_release_kernel+0x9/0x360
[  300.137577]  ? __pfx_perf_release+0x10/0x10
[  300.138031]  perf_release+0x22/0x30
[  300.138434]  __fput+0x11f/0x450
[  300.139038]  ____fput+0x1e/0x30
[  300.139425]  task_work_run+0xb6/0x120
[  300.140164]  do_exit+0x547/0x1360
[  300.140632]  ? write_comp_data+0x2f/0x90
[  300.141223]  do_group_exit+0x5e/0xf0
[  300.141732]  get_signal+0x15d1/0x1600
[  300.142596]  arch_do_signal_or_restart+0x33/0x280
[  300.143463]  exit_to_user_mode_prepare+0x13b/0x210
[  300.143955]  syscall_exit_to_user_mode+0x2d/0x60
[  300.144623]  do_syscall_64+0x4a/0x90
[  300.145032]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  300.145459] RIP: 0033:0x7f2a7abd859d
[  300.145823] RSP: 002b:00007ffcc732b338 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
[  300.146429] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 00007f2a7abd859d
[  300.146930] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 00000000200004c0
[  300.147436] RBP: 00007ffcc732b350 R08: 0000000000000000 R09: 00007ffcc732b350
[  300.147938] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000004011e0
[  300.148645] R13: 00007ffcc732b470 R14: 0000000000000000 R15: 0000000000000000
[  300.149713]  </TASK>
[  300.149978] 
[  300.149978] Showing all locks held in the system:
[  300.150435] 1 lock held by rcu_tasks_kthre/11:
[  300.150808]  #0: ffffffff83f63490 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x420
[  300.151849] 1 lock held by rcu_tasks_rude_/12:
[  300.152593]  #0: ffffffff83f63210 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x420
[  300.153645] 1 lock held by rcu_tasks_trace/13:
[  300.154013]  #0: ffffffff83f62f50 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x420
[  300.155078] 1 lock held by khungtaskd/29:
[  300.155425]  #0: ffffffff83f63ea0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x1b/0x1e0
[  300.156644] 2 locks held by repro/540:
[  300.156977]  #0: ffffffff840206a8 (event_mutex){+.+.}-{3:3}, at: perf_trace_destroy+0x21/0xa0
[  300.157971]  #1: ffffffff8401cd68 (ftrace_lock){+.+.}-{3:3}, at: unregister_ftrace_function+0x2b/0x230
[  300.159005] 
[  300.159233] =============================================
[  300.159233] 
"
Bisected and found first bad commit:
"
0e792b89e6800cd9cb4757a76a96f7ef3e8b6294
ftrace: Fix use-after-free for dynamic ftrace_ops
"
After reverted the above commit on top of v6.3-rc2 kernel, this issue was gone.

And this issue could be reproduced in 2100s.

There was no this issue report in syzbot link:
https://syzkaller.appspot.com/upstream

I hope it's helpful.

Thanks!

---

If you don't need the following environment to reproduce the problem or if you
already have one, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
   // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
   // You could change the bzImage_xxx as you want
You could use below command to log in, there is no password for root.
ssh -p 10023 root@localhost

After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel in vm.


Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
make
make install

Thanks!
BR.

             reply	other threads:[~2023-03-15 12:20 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-15 12:20 Pengfei Xu [this message]
2023-04-13  2:45 ` [Syzkaller & bisect] There is "task hung in synchronize_rcu bisect" in v6.3-rc2 kernel Pengfei Xu
2023-04-13 11:19   ` Frederic Weisbecker
2023-04-13 11:51     ` Zhang, Qiang1
2023-04-13 12:09     ` Pengfei Xu
2023-04-13 12:40     ` Jens Axboe
2023-04-13 13:55       ` Frederic Weisbecker
     [not found] ` <20230413070042.1479-1-hdanton@sina.com>
2023-04-13  8:03   ` Pengfei Xu
     [not found]   ` <20230413083017.1518-1-hdanton@sina.com>
2023-04-13  9:08     ` Pengfei Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZBG4HOCQIlGFFcIn@xpf.sh.intel.com \
    --to=pengfei.xu@intel.com \
    --cc=frederic@kernel.org \
    --cc=heng.su@intel.com \
    --cc=lihuafei1@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=paulmck@kernel.org \
    --cc=quic_neeraju@quicinc.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.