From: Tejun Heo <tj@kernel.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
jiangshanlai@gmail.com, linux-kernel@vger.kernel.org,
Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: Workqueues splat due to ending up on wrong CPU
Date: Tue, 3 Dec 2019 10:13:59 -0800
Message-ID: <20191203181359.GD2196666@devbig004.ftw2.facebook.com>
In-Reply-To: <20191203174547.GG2889@paulmck-ThinkPad-P72>

Hello, Paul.

On Tue, Dec 03, 2019 at 09:45:47AM -0800, Paul E. McKenney wrote:
> Good point, and yes, you have told me this before.
>
> Furthermore, in all of these cases, the process was supposed to be
> running on CPU 0, which cannot be taken offline on any of the systems
> under test. Which is leading me to wonder if the workqueue CPU-online
> notifier is sometimes moving more kthreads to the newly onlined CPU than
> it is supposed to. Tejun, could that be happening?

All the warnings that you posted are from rescuers, and a rescuer jumps
around different CPUs so that it's on the correct CPU for the specific
work item being rescued. This is completely separate from the usual
worker management, and rescuers don't interact with the hot[un]plug
callbacks in any way. I think something like the following is what's
happening:

* A work item is queued to CPU 5 but it hasn't been dispatched for a
  bit so rescuer gets summoned. The rescuer executes the work item
  and stays there.

* CPU 5 goes down. The rescuer is asleep and doesn't get affected.

* CPU 5 is coming up. It has online set but the stopper hasn't been
  enabled yet.

* A work item was queued on CPU 0 but hasn't been dispatched for a
  bit, so rescuer is woken up.

* Rescuer wakes up fine on CPU 5 as it's online. Seeing the CPU 0 work
  item, the rescuer tries to migrate to CPU 0 by calling
  set_cpus_allowed_ptr(); however, stopper isn't up yet and migration
  doesn't actually happen.

* Boom. Rescuer is now executing CPU 0 work item on CPU 5.
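
The sequence above can be replayed in a small toy model. This is only a
sketch in Python, not kernel code: Cpu, Rescuer, try_migrate(), rescue()
and replay() are made-up names, and the rule that a migration takes
effect only when the stopper on the task's current CPU is enabled is a
simplifying assumption taken from the description above, not a statement
of how set_cpus_allowed_ptr() is implemented.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    online: bool = True
    stopper_enabled: bool = True


@dataclass
class Rescuer:
    cpu: int  # CPU the rescuer task is currently sitting on


def try_migrate(rescuer, cpus, target):
    """Toy stand-in for set_cpus_allowed_ptr().

    Model assumption: the move only takes effect if the stopper on the
    task's current CPU is enabled; otherwise the task stays put.
    """
    if cpus[rescuer.cpu].stopper_enabled:
        rescuer.cpu = target
    return rescuer.cpu == target


def rescue(rescuer, cpus, work_cpu):
    """Run a work item bound to work_cpu; return the CPU it actually ran on."""
    try_migrate(rescuer, cpus, work_cpu)  # result deliberately not checked
    return rescuer.cpu


def replay(stopper_ready_before_wakeup):
    """Replay the scenario; return the CPU the CPU 0 work item runs on."""
    cpus = {0: Cpu(), 5: Cpu()}
    rescuer = Rescuer(cpu=0)

    # Work queued on CPU 5 stalls; the rescuer runs it and stays there.
    assert rescue(rescuer, cpus, 5) == 5

    # CPU 5 goes down; the sleeping rescuer is not affected.
    cpus[5] = Cpu(online=False, stopper_enabled=False)

    # CPU 5 comes back: online is set, but the stopper may lag behind.
    cpus[5] = Cpu(online=True, stopper_enabled=stopper_ready_before_wakeup)

    # Work queued on CPU 0 stalls; the rescuer wakes on CPU 5 and tries
    # to move to CPU 0 before executing the work item.
    return rescue(rescuer, cpus, 0)
```

Under these assumptions, replay(False) returns 5 (the CPU 0 work item
runs on CPU 5, which is the reported splat), while replay(True) returns
0 because the migration goes through once the stopper is enabled.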
Thanks.
--
tejun