Linux RCU subsystem development
From: Pengfei Xu <pengfei.xu@intel.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Jens Axboe <axboe@fb.com>, <boqun.feng@gmail.com>,
	<quic_neeraju@quicinc.com>, <paulmck@kernel.org>,
	<heng.su@intel.com>, <lkp@intel.com>, <peterz@infradead.org>,
	<rcu@vger.kernel.org>
Subject: Re: [Syzkaller & bisect] There is "sys_perf_event_open" soft lockup BUG in v6.3-rc2 kernel
Date: Tue, 21 Mar 2023 13:53:15 +0800
Message-ID: <ZBlGS9SlKzQkbbSF@xpf.sh.intel.com>
In-Reply-To: <ZBiOdCtHy1I2irFT@lothringen>

Hi Frederic Weisbecker,

On 2023-03-20 at 17:48:52 +0100, Frederic Weisbecker wrote:
> On Sat, Mar 18, 2023 at 10:32:17AM +0800, Pengfei Xu wrote:
> > Hi Frederic Weisbecker,
> > 
> > On 2023-03-17 at 15:09:44 +0100, Frederic Weisbecker wrote:
> > > On Fri, Mar 17, 2023 at 03:48:33PM +0800, Pengfei Xu wrote:
> > > > Hi Frederic Weisbecker and kernel experts,
> > > > 
> > > > Platform: x86 platforms
> > > > There is a "sys_perf_event_open" soft lockup BUG in the v6.3-rc2 kernel in a guest.
> > > 
> > > I can reproduce with your tests, which are based on v6.2-rc5. However when
> > > I forward port your .config to a v6.3-rc2, the issue doesn't trigger anymore.
> > > 
> > > Did you manage to reproduce on v6.3-rc2? And if so do you still have the related
> > > .config ?
> > > 
> >   Ah, I forgot to say: kconfig_origin is changed by "make olddefconfig";
> >   many items in .config changed after "make olddefconfig" in v6.3-rc2.
> > 
> >  I built the .config as follows.
> >  1. Copy the kconfig origin to .config: https://github.com/xupengfe/syzkaller_logs/blob/main/230316_062127_sys_perf_event_open/kconfig_origin
> >  2. I forgot that the bisect script changes .config: CONFIG_LOCALVERSION="-kvm"  ->  CONFIG_LOCALVERSION="-eeac8ede1755"; this seems to have little effect.
> >  3. make olddefconfig  // This changes .config in the v6.3-rc2 kernel tree.
> >     The .config after make olddefconfig is at:
> >     https://github.com/xupengfe/syzkaller_logs/blob/main/230316_062127_sys_perf_event_open/kconfig_v6.3-rc2_after_make_olddefconfig
> >  4. make -jx bzImage   // x should be equal to or less than the number of CPUs your PC has
> > 
> >     The v6.3-rc2 bzImage is at:
> >     https://github.com/xupengfe/syzkaller_logs/blob/main/230316_062127_sys_perf_event_open/bzImage_eeac8ede17557680855031c6f305ece2378af326
> > 
> > And it could be reproduced in a manual test within 150s.
> > v6.3-rc2 reproduced dmesg:
> > https://github.com/xupengfe/syzkaller_logs/blob/main/230316_062127_sys_perf_event_open/v6.3-rc2_perf_related_problem_dmesg.log
> > 
> > And it could be reproduced in a guest on our ADL-N x86 client PC.
> 
> Thanks!
> 
> Now it triggers but I get something a bit different:
> 
> [  299.258474] INFO: task kworker/u4:1:30 blocked for more than 147 seconds.
> [  299.259223]       Not tainted 6.3.0-rc2-kvm-dirty #1
> [  299.259657] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  299.260529] task:kworker/u4:1    state:D stack:0     pid:30    ppid:2      flags:0x00004000
> [  299.261484] Workqueue: events_unbound io_ring_exit_work
> [  299.262163] Call Trace:
> [  299.262514]  <TASK>
> [  299.262826]  __schedule+0x414/0xcb0
> [  299.263303]  ? wait_for_completion+0x77/0x170
> [  299.263753]  schedule+0x63/0xd0
> [  299.264120]  schedule_timeout+0x2fe/0x530
> [  299.264635]  ? __this_cpu_preempt_check+0x1c/0x30
> [  299.265169]  ? _raw_spin_unlock_irq+0x27/0x60
> [  299.265621]  ? lockdep_hardirqs_on+0x88/0x120
> [  299.266054]  ? wait_for_completion+0x77/0x170
> [  299.266686]  wait_for_completion+0x9e/0x170
> [  299.267198]  io_ring_exit_work+0x2b0/0x810
> [  299.267669]  ? __pfx_io_tctx_exit_cb+0x10/0x10
> [  299.268176]  process_one_work+0x34e/0x810
> [  299.268620]  ? __pfx_io_ring_exit_work+0x10/0x10
> [  299.269061]  ? process_one_work+0x34e/0x810
> [  299.269561]  worker_thread+0x4e/0x530
> [  299.270052]  ? __pfx_worker_thread+0x10/0x10
> [  299.270635]  kthread+0x128/0x160
> [  299.270962]  ? __pfx_kthread+0x10/0x10
> [  299.271405]  ret_from_fork+0x2c/0x50
> [  299.271850]  </TASK>

Thanks for the info!
It seems this issue behaves differently on different platforms.

Your behavior looks like the other problem reported at the link below:
https://lore.kernel.org/lkml/5ff2b3c0-eb96-c423-dcee-1bdf6604e9df@kernel.dk/

I found that this issue can be reproduced on our ADL-N and RPL-S client platforms.
The related commit is only a suspect; it may not be the root cause
of the issue.
I hope the above info is helpful.

Thanks!
BR.
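
For reference, the build steps quoted above (copy kconfig_origin to .config,
run make olddefconfig, then build bzImage with a job count no larger than the
CPU count) can be sketched as a small shell helper. This is only an
illustration, not part of the original report: the function names cap_jobs
and plan_build are hypothetical, and plan_build only prints the commands so
they can be reviewed before being run from the top of a v6.3-rc2 tree.

```shell
#!/bin/sh
# Sketch of the quoted build steps. cap_jobs and plan_build are
# hypothetical helper names introduced for illustration only.

# Print the job count to pass to "make -j": the requested value,
# capped at the number of CPUs (step 4: x <= number of CPUs).
cap_jobs() {
    requested=$1
    cpus=$2
    if [ "$requested" -gt "$cpus" ]; then
        echo "$cpus"
    else
        echo "$requested"
    fi
}

# Print, without executing, the commands for steps 1, 3 and 4.
# $1: path to the saved kconfig_origin file (assumed to contain no spaces)
# $2: job count for make
plan_build() {
    kconfig=$1
    jobs=$2
    echo "cp $kconfig .config"     # step 1: start from the original config
    echo "make olddefconfig"       # step 3: rewrites .config for this tree
    echo "make -j$jobs bzImage"    # step 4: build the kernel image
}
```

A possible use, assuming nproc from coreutils is available, is
plan_build kconfig_origin "$(cap_jobs 8 "$(nproc)")" to print the three
commands for review. Step 2 (the CONFIG_LOCALVERSION change made by the
bisect script) is left out because it was reported to have little effect.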

