public inbox for linuxppc-dev@ozlabs.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: rcu <rcu@vger.kernel.org>, Zhouyi Zhou <zhouzhouyi@gmail.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Subject: Re: rcu_sched self-detected stall on CPU
Date: Fri, 8 Apr 2022 08:52:38 -0700	[thread overview]
Message-ID: <20220408155238.GG4285@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <87k0bz7i1s.fsf@mpe.ellerman.id.au>

On Sat, Apr 09, 2022 at 12:42:39AM +1000, Michael Ellerman wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> > "Paul E. McKenney" <paulmck@kernel.org> writes:
> >> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> >>> Hi
> >>> 
> >>> I can reproduce it in a ppc virtual cloud server provided by Oregon
> >>> State University.  Following is what I do:
> >>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> >>> -o linux-5.18-rc1.tar.gz
> >>> 2) tar zxf linux-5.18-rc1.tar.gz
> >>> 3) cp config linux-5.18-rc1/.config
> >>> 4) cd linux-5.18-rc1
> >>> 5) make vmlinux -j 8
> >>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> >>> -smp 2 (QEMU 4.2.1)
> >>> 7) after 12 rounds, the bug got reproduced:
> >>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
> >>
> >> Just to make sure, are you both seeing the same thing?  Last I knew,
> >> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> >> built with CONFIG_PROVE_RCU=y, which Miguel does not have set.  Or did
> >> I miss something?
> >>
> >> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> >> kthread slept for three milliseconds, but did not wake up for more than
> >> 20 seconds.  This kthread would normally have awakened on CPU 1, but
> >> CPU 1 looks to me to be very unhealthy, as can be seen in your console
> >> output below (but maybe my idea of what is healthy for powerpc systems
> >> is outdated).  Please see also the inline annotations.
> >>
> >> Thoughts from the PPC guys?
> >
> > I haven't seen it in my testing. But using Miguel's config I can
> > reproduce it seemingly on every boot.
> >
> > For me it bisects to:
> >
> >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
> >
> > Which seems plausible.
> >
> > Reverting that on mainline makes the bug go away.
> >
> > I don't see an obvious bug in the diff, but I could be wrong, or the old
> > code was papering over an existing bug?
> >
> > I'll try and work out what it is about Miguel's config that exposes
> > this vs our defconfig, that might give us a clue.
> 
> It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.
> 
> I can reproduce just with:
> 
>   $ make ppc64le_guest_defconfig
>   $ ./scripts/config -d HIGH_RES_TIMERS
> 
> We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
> realise you could disable it TBH :)
> 
> The Rust CI has it disabled because I copied that from the x86 defconfig
> they were using back when I added the Rust support. I think that was
> meant to be a stripped down fast config for CI, but the result is it's
> just using a badly tested combination which is not helpful.
> 
> So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
> can debug this further without blocking them.

Would it make sense to select HIGH_RES_TIMERS from one of the Kconfig*
files in arch/powerpc?  Asking for a friend.  ;-)

							Thanx, Paul

  reply	other threads:[~2022-04-08 15:53 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-05 21:41 rcu_sched self-detected stall on CPU Miguel Ojeda
2022-04-06  9:31 ` Zhouyi Zhou
2022-04-06 17:00   ` Paul E. McKenney
2022-04-06 18:25     ` Zhouyi Zhou
2022-04-06 19:50       ` Paul E. McKenney
2022-04-07  2:26         ` Zhouyi Zhou
2022-04-07 10:07           ` Miguel Ojeda
2022-04-07 15:15             ` Paul E. McKenney
2022-04-07 17:05               ` Miguel Ojeda
2022-04-07 17:55                 ` Paul E. McKenney
2022-04-07 23:14                   ` Zhouyi Zhou
2022-04-08  1:43                     ` Paul E. McKenney
2022-04-08  7:23     ` Michael Ellerman
2022-04-08 10:02       ` Zhouyi Zhou
2022-04-08 14:07         ` Paul E. McKenney
2022-04-08 14:25           ` Zhouyi Zhou
2022-04-10 11:33             ` Michael Ellerman
2022-04-11  3:05               ` Paul E. McKenney
2022-04-12  6:53                 ` Michael Ellerman
2022-04-12 13:36                   ` Paul E. McKenney
2022-04-08 13:52       ` Miguel Ojeda
2022-04-08 14:06       ` Paul E. McKenney
2022-04-08 14:42       ` Michael Ellerman
2022-04-08 15:52         ` Paul E. McKenney [this message]
2022-04-08 17:02         ` Miguel Ojeda
2022-04-13  5:11         ` Nicholas Piggin
2022-04-13  6:10           ` Low-res tick handler device not going to ONESHOT_STOPPED when tick is stopped (was: rcu_sched self-detected stall on CPU) Nicholas Piggin
2022-04-14 17:15             ` Paul E. McKenney
2022-04-22 15:53           ` Thomas Gleixner
2022-04-23  2:29             ` Re: Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220408155238.GG4285@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=miguel.ojeda.sandonis@gmail.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=rcu@vger.kernel.org \
    --cc=zhouzhouyi@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox