From: Michael Ellerman <mpe@ellerman.id.au>
To: Zhouyi Zhou <zhouzhouyi@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>
Cc: rcu <rcu@vger.kernel.org>,
	Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Nicholas Piggin <npiggin@gmail.com>
Subject: Re: rcu_sched self-detected stall on CPU
Date: Sun, 10 Apr 2022 21:33:43 +1000
Message-ID: <871qy56ulk.fsf@mpe.ellerman.id.au>
In-Reply-To: <CAABZP2zEU8eULq30ZLcUeqxjXuLTKO4b3wm_Jo458Nq_JJ7pEw@mail.gmail.com>

Zhouyi Zhou <zhouzhouyi@gmail.com> writes:
> On Fri, Apr 8, 2022 at 10:07 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>> On Fri, Apr 08, 2022 at 06:02:19PM +0800, Zhouyi Zhou wrote:
>> > On Fri, Apr 8, 2022 at 3:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
...
>> > > I haven't seen it in my testing. But using Miguel's config I can
>> > > reproduce it seemingly on every boot.
>> > >
>> > > For me it bisects to:
>> > >
>> > >   35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>> > >
>> > > Which seems plausible.
>> > I also bisect to 35de589cb879 ("powerpc/time: improve decrementer
>> > clockevent processing")
...
>>
>> > > Reverting that on mainline makes the bug go away.

>> > I also revert that on the mainline, and am currently doing a pressure
>> > test (by repeatedly invoking qemu and checking the console.log) on PPC
>> > VM in Oregon State University.

> After 306 rounds of stress test on mainline without triggering the bug
> (last for 4 hours and 27 minutes), I think the bug is indeed caused by
> 35de589cb879 ("powerpc/time: improve decrementer clockevent
> processing") and stop the test for now.
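
For anyone who wants to repeat this kind of test, the loop described above
(boot the guest, then check console.log for the stall report) could be
sketched roughly as follows. This is only an illustration, not the script
that was actually run: the boot command, the log path, the timeout and the
function names are all assumptions, and the marker string is simply the
subject line of this thread.

```python
import subprocess
from pathlib import Path

# Assumed marker: the stall message named in this thread's subject line.
STALL_MARKER = "rcu_sched self-detected stall on CPU"


def log_has_stall(log_text, marker=STALL_MARKER):
    """Return True if the console output contains the stall report."""
    return marker in log_text


def stress_test(boot_cmd, log_path, rounds, timeout_s=120):
    """Run boot_cmd up to `rounds` times.

    Returns the 1-based round in which the stall first showed up in the
    log, or None if every round finished without it.
    """
    log_file = Path(log_path)
    for round_no in range(1, rounds + 1):
        log_file.unlink(missing_ok=True)  # start each round with a fresh log
        try:
            subprocess.run(boot_cmd, timeout=timeout_s, check=False)
        except subprocess.TimeoutExpired:
            pass  # a hung guest may still have written a useful log
        text = log_file.read_text(errors="replace") if log_file.exists() else ""
        if log_has_stall(text):
            return round_no
    return None
```

boot_cmd is expected to be whatever qemu invocation writes the guest console
to log_path; a None result after N rounds corresponds to "N rounds without
triggering the bug".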

Thanks for testing, that's pretty conclusive.

I'm not inclined to actually revert it yet.

We need to understand if there's actually a bug in the patch, or if it's
just exposing some existing bug/bad behavior we have. The fact that it
only appears with CONFIG_HIGH_RES_TIMERS=n is suspicious.

Do we have some code that inadvertently relies on something enabled by
HIGH_RES_TIMERS=y, or do we have a bug that is hidden by HIGH_RES_TIMERS=y?
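
Since the behaviour depends on this one option, it is worth confirming what
a given test build actually has set. A minimal sketch, assuming the usual
.config text format ("CONFIG_FOO=y", or "# CONFIG_FOO is not set" for a
disabled option); the helper name config_value is made up for illustration:

```python
def config_value(config_text, option):
    """Look up a Kconfig option in .config text.

    Returns the value after '=' (for example 'y' or 'm'), or None when the
    option is marked "is not set" or does not appear at all.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return None
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None
```

For example, config_value(open(".config").read(), "CONFIG_HIGH_RES_TIMERS")
returning None would match the configuration under which the stall appears.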

cheers
