From: Roy Hopkins <rhopkins@suse.de>
To: Peter Zijlstra <peterz@infradead.org>,
Guenter Roeck <linux@roeck-us.net>
Cc: Joel Fernandes <joel@joelfernandes.org>,
paulmck@kernel.org, Pavel Machek <pavel@denx.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, patches@lists.linux.dev,
linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
akpm@linux-foundation.org, shuah@kernel.org,
patches@kernelci.org, lkft-triage@lists.linaro.org,
jonathanh@nvidia.com, f.fainelli@gmail.com,
sudipm.mukherjee@gmail.com, srw@sladewatkins.net, rwarsow@gmx.de,
conor@kernel.org, rcu@vger.kernel.org,
Ingo Molnar <mingo@kernel.org>
Subject: Re: scheduler problems in -next (was: Re: [PATCH 6.4 000/227] 6.4.7-rc1 review)
Date: Mon, 31 Jul 2023 17:08:29 +0100
Message-ID: <7ff2a2393d78275b14ff867f3af902b5d4b93ea2.camel@suse.de>
In-Reply-To: <20230731145232.GM29590@hirez.programming.kicks-ass.net>
On Mon, 2023-07-31 at 16:52 +0200, Peter Zijlstra wrote:
> On Mon, Jul 31, 2023 at 07:48:19AM -0700, Guenter Roeck wrote:
>
> > > I've taken your config above, and the rootfs.ext2 and run-sh from x86/.
> > > I've then modified run-sh to use:
> > >
> > > qemu-system-x86_64 -enable-kvm -cpu host
> > >
> > > What I'm seeing is that some boots get stuck at:
> > >
> > > [ 0.608230] Running RCU-tasks wait API self tests
> > >
> > > Is this the right 'problem' ?
> > >
> >
> >
> > Yes, exactly.
>
> Excellent! Let me prod that with something sharp, see what comes
> creeping out.
In an effort to get up to speed with this area of the kernel, I've been playing
around with this too today and managed to reproduce the problem using the same
configuration. I'm completely new to this code but I think I may have found the
root of the problem.
What I've found is a race between starting the RCU tasks grace-period kthread
in rcu_spawn_tasks_kthread_generic() and a subsequent call to
synchronize_rcu_tasks_generic(). When the race is lost, rtp->tasks_gp_mutex is
locked by the initial thread, which then blocks the newly started grace-period
kthread.
The problem is that although synchronize_rcu_tasks_generic() checks whether the
grace-period kthread is running, it does so by testing rtp->kthread_ptr. That
pointer is set only in the thread entry point, not when the thread is created,
so it is set only after the creating thread yields or is preempted. If that has
not happened before the next call to synchronize_rcu_tasks_generic(), a
deadlock occurs.
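
To make the window concrete, here is a small single-threaded userspace model of
the ordering described above. It is only a sketch, not kernel code: struct
model_rtp, the kthread_created field and every model_* function are
hypothetical names invented for illustration, and the "thread entry" is an
ordinary function call standing in for the moment the new kthread first runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model of the ordering problem. Nothing here is kernel code:
 * struct model_rtp and all model_* names are hypothetical.
 */
struct model_rtp {
	bool  kthread_created; /* creator has spawned the grace-period thread */
	void *kthread_ptr;     /* published only by the thread's entry point */
};

enum model_path {
	MODEL_USE_KTHREAD, /* hand the grace period to the kthread */
	MODEL_RUN_INLINE,  /* caller drives the grace period itself */
};

/* Creator side: spawning the thread does not publish kthread_ptr. */
void model_spawn(struct model_rtp *rtp)
{
	rtp->kthread_created = true;
}

/* Thread side: runs only once the new thread is first scheduled. */
void model_thread_entry(struct model_rtp *rtp, void *self)
{
	rtp->kthread_ptr = self;
}

/* Mirrors the check described above: decide based on kthread_ptr alone. */
enum model_path model_sync_path(const struct model_rtp *rtp)
{
	return rtp->kthread_ptr ? MODEL_USE_KTHREAD : MODEL_RUN_INLINE;
}

/* True while a thread exists but the caller would still run inline. */
bool model_in_race_window(const struct model_rtp *rtp)
{
	return rtp->kthread_created &&
	       model_sync_path(rtp) == MODEL_RUN_INLINE;
}

/* Walk through the three states in order; returns 0 if all checks hold. */
int model_demo(void)
{
	struct model_rtp rtp = { false, NULL };
	int self = 0; /* stands in for the new thread's identity */

	assert(!model_in_race_window(&rtp)); /* no thread yet: inline is fine */
	model_spawn(&rtp);
	assert(model_in_race_window(&rtp));  /* created, not yet scheduled */
	model_thread_entry(&rtp, &self);
	assert(!model_in_race_window(&rtp)); /* pointer published */
	return 0;
}
```

The middle state is the interesting one: a grace-period thread exists, but a
caller that looks only at kthread_ptr still chooses the inline path.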
I've created a debug patch that adds a new flag to rcu_tasks, set when the
kthread is created, and uses that flag in synchronize_rcu_tasks_generic() in
place of READ_ONCE(rtp->kthread_ptr). This fixes the issue in my test
environment.
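
The idea can be sketched in userspace as follows. This is not the debug patch
itself: struct sketch_rtp, the flag name kthread_created and the sketch_*
functions are all assumptions made up for this illustration, and the model is
single-threaded, so it leaves out the READ_ONCE()/WRITE_ONCE()-style accesses
that real concurrent code would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace sketch of "flag set at creation time".
 * kthread_created is an illustrative name, not taken from the patch.
 */
struct sketch_rtp {
	void *kthread_ptr;     /* set later, by the thread's entry point */
	bool  kthread_created; /* set by the creator, at creation time */
};

/* Creator side: record creation before any later sync call can run. */
void sketch_spawn(struct sketch_rtp *rtp)
{
	rtp->kthread_created = true;
}

/* Old-style check: true only after the new thread has actually run. */
bool sketch_use_kthread_old(const struct sketch_rtp *rtp)
{
	return rtp->kthread_ptr != NULL;
}

/* Flag-based check: true as soon as the thread has been created. */
bool sketch_use_kthread_new(const struct sketch_rtp *rtp)
{
	return rtp->kthread_created;
}

/* Compare both checks right after creation; returns 0 if they differ. */
int sketch_demo(void)
{
	struct sketch_rtp rtp = { NULL, false };

	sketch_spawn(&rtp);
	assert(!sketch_use_kthread_old(&rtp)); /* still picks the inline path */
	assert(sketch_use_kthread_new(&rtp));  /* defers to the new thread */
	return 0;
}
```

In this model, the flag-based check already reports the thread as present in
the interval where the pointer-based check does not, which is the interval the
race depends on.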
I'm happy to have a go at submitting a patch for this if it helps.