From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: jiangshanlai@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: WARN_ON_ONCE() in process_one_work()?
Date: Mon, 2 Jul 2018 21:05:18 -0700
Message-ID: <20180703040518.GV3593@linux.vnet.ibm.com>
In-Reply-To: <20180702210540.GL533219@devbig577.frc2.facebook.com>

On Mon, Jul 02, 2018 at 02:05:40PM -0700, Tejun Heo wrote:
> Hello, Paul.
>
> Sorry about the late reply.
>
> On Wed, Jun 20, 2018 at 12:29:01PM -0700, Paul E. McKenney wrote:
> > I have hit this WARN_ON_ONCE() in process_one_work:
> >
> > WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
> > raw_smp_processor_id() != pool->cpu);
> >
> > This looks like it is my rcu_gp workqueue (see splat below), and it
> > appears to be intermittent. This happens on rcutorture scenario SRCU-N,
> > which does random CPU-hotplug operations (in case that helps).
> >
> > Is this related to the recent addition of WQ_MEM_RECLAIM? Either way,
> > what should I do to further debug this?
>
> Hmm... I checked the code paths but couldn't spot anything suspicious.
> Can you please apply the following patch and see whether it triggers
> before hitting the warn and if so report what it says?

I will apply this, but be advised that I have not seen that WARN_ON_ONCE()
trigger since.  :-/

							Thanx, Paul

> Thanks.
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 0db8938fbb23..81caab9643b2 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -79,6 +79,15 @@ static struct lockdep_map cpuhp_state_up_map =
>  static struct lockdep_map cpuhp_state_down_map =
>  	STATIC_LOCKDEP_MAP_INIT("cpuhp_state-down", &cpuhp_state_down_map);
>
> +int cpuhp_current_state(int cpu)
> +{
> +	return per_cpu_ptr(&cpuhp_state, cpu)->state;
> +}
> +
> +int cpuhp_target_state(int cpu)
> +{
> +	return per_cpu_ptr(&cpuhp_state, cpu)->target;
> +}
>
>  static inline void cpuhp_lock_acquire(bool bringup)
>  {
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 78b192071ef7..365cf6342808 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1712,6 +1712,9 @@ static struct worker *alloc_worker(int node)
>  	return worker;
>  }
>
> +int cpuhp_current_state(int cpu);
> +int cpuhp_target_state(int cpu);
> +
>  /**
>   * worker_attach_to_pool() - attach a worker to a pool
>   * @worker: worker to be attached
> @@ -1724,13 +1727,20 @@ static struct worker *alloc_worker(int node)
>  static void worker_attach_to_pool(struct worker *worker,
>  				   struct worker_pool *pool)
>  {
> +	int ret;
> +
>  	mutex_lock(&wq_pool_attach_mutex);
>
>  	/*
>  	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>  	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
>  	 */
> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +	ret = set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +	if (ret && pool->cpu >= 0 && worker->rescue_wq)
> +		printk("XXX rescuer failed to attach: ret=%d pool=%d this_cpu=%d target_cpu=%d cpuhp_state=%d chuhp_target=%d\n",
> +		       ret, pool->id, raw_smp_processor_id(), pool->cpu,
> +		       cpuhp_current_state(pool->cpu),
> +		       cpuhp_target_state(pool->cpu));
>
>  	/*
>  	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>