From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: jiangshanlai@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: WARN_ON_ONCE() in process_one_work()?
Date: Mon, 2 Jul 2018 21:05:18 -0700
Message-ID: <20180703040518.GV3593@linux.vnet.ibm.com>
In-Reply-To: <20180702210540.GL533219@devbig577.frc2.facebook.com>

On Mon, Jul 02, 2018 at 02:05:40PM -0700, Tejun Heo wrote:
> Hello, Paul.
>
> Sorry about the late reply.
>
> On Wed, Jun 20, 2018 at 12:29:01PM -0700, Paul E. McKenney wrote:
> > I have hit this WARN_ON_ONCE() in process_one_work:
> >
> > WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
> > raw_smp_processor_id() != pool->cpu);
> >
> > This looks like it is my rcu_gp workqueue (see splat below), and it
> > appears to be intermittent. This happens on rcutorture scenario SRCU-N,
> > which does random CPU-hotplug operations (in case that helps).
> >
> > Is this related to the recent addition of WQ_MEM_RECLAIM? Either way,
> > what should I do to further debug this?
>
> Hmm... I checked the code paths but couldn't spot anything suspicious.
> Can you please apply the following patch and see whether it triggers
> before hitting the warn and if so report what it says?

I will apply this, but be advised that I have not seen that WARN_ON_ONCE()
trigger since. :-/

							Thanx, Paul

> Thanks.
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 0db8938fbb23..81caab9643b2 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -79,6 +79,15 @@ static struct lockdep_map cpuhp_state_up_map =
>  static struct lockdep_map cpuhp_state_down_map =
>  	STATIC_LOCKDEP_MAP_INIT("cpuhp_state-down", &cpuhp_state_down_map);
>
> +int cpuhp_current_state(int cpu)
> +{
> +	return per_cpu_ptr(&cpuhp_state, cpu)->state;
> +}
> +
> +int cpuhp_target_state(int cpu)
> +{
> +	return per_cpu_ptr(&cpuhp_state, cpu)->target;
> +}
>
>  static inline void cpuhp_lock_acquire(bool bringup)
>  {
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 78b192071ef7..365cf6342808 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1712,6 +1712,9 @@ static struct worker *alloc_worker(int node)
>  	return worker;
>  }
>
> +int cpuhp_current_state(int cpu);
> +int cpuhp_target_state(int cpu);
> +
>  /**
>   * worker_attach_to_pool() - attach a worker to a pool
>   * @worker: worker to be attached
> @@ -1724,13 +1727,20 @@ static struct worker *alloc_worker(int node)
>  static void worker_attach_to_pool(struct worker *worker,
>  				   struct worker_pool *pool)
>  {
> +	int ret;
> +
>  	mutex_lock(&wq_pool_attach_mutex);
>
>  	/*
>  	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>  	 * online CPUs. It'll be re-applied when any of the CPUs come up.
>  	 */
> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +	ret = set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +	if (ret && pool->cpu >= 0 && worker->rescue_wq)
> +		printk("XXX rescuer failed to attach: ret=%d pool=%d this_cpu=%d target_cpu=%d cpuhp_state=%d cpuhp_target=%d\n",
> +		       ret, pool->id, raw_smp_processor_id(), pool->cpu,
> +		       cpuhp_current_state(pool->cpu),
> +		       cpuhp_target_state(pool->cpu));
>
>  	/*
>  	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>