public inbox for linux-kernel@vger.kernel.org
From: Heiko Carstens <heiko.carstens@de.ibm.com>
To: Ming Lei <tom.leiming@gmail.com>, Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Michael Holzheu <holzheu@linux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>
Subject: Re: [bisected] "sched: Allow per-cpu kernel threads to run on online && !active" causes warning
Date: Mon, 15 Aug 2016 13:19:08 +0200
Message-ID: <20160815111908.GA3903@osiris>
In-Reply-To: <CACVXFVNrMjk46pB_E=5fQP2njN8cntSKJ_BMnR-Z4ZmxsMpqyg@mail.gmail.com>

On Mon, Aug 08, 2016 at 03:45:05PM +0800, Ming Lei wrote:
> On Sat, Jul 30, 2016 at 7:25 PM, Heiko Carstens
> <heiko.carstens@de.ibm.com> wrote:
> > On Wed, Jul 27, 2016 at 05:23:05PM +0200, Thomas Gleixner wrote:
> >> On Wed, 27 Jul 2016, Heiko Carstens wrote:
> >> > [    3.162961] ([<0000000000176c30>] select_task_rq+0xc0/0x1a8)
> >> > [    3.162963] ([<0000000000177d64>] try_to_wake_up+0x2e4/0x478)
> >> > [    3.162968] ([<000000000015d46c>] create_worker+0x174/0x1c0)
> >> > [    3.162971] ([<0000000000161a98>] alloc_unbound_pwq+0x360/0x438)
> >>
> >> > For some unknown reason select_task_rq() gets called with a task that has
> >> > nr_cpus_allowed == 0. Hence "cpu = cpumask_any(tsk_cpus_allowed(p));"
> >> > within select_task_rq() will set cpu to nr_cpu_ids which in turn causes the
> >> > warning later on.
> >> >
> >> > It only happens with more than one node, otherwise it seems to work fine.
> >> >
> >> > Any idea what could be wrong here?
> >>
> >> create_worker()
> >>     tsk = kthread_create_on_node();
> >>     kthread_bind_mask(tsk, pool->attrs->cpumask);
> >>         do_set_cpus_allowed(tsk, mask);
> >>             set_cpus_allowed_common(tsk, mask);
> >>                 cpumask_copy(&tsk->cpus_allowed, mask);
> >>                 tsk->nr_cpus_allowed = cpumask_weight(mask);
> >>     wake_up_process(task);
> >>
> >> So this looks like pool->attrs->cpumask is simply empty.....
> >
> > Just had some time to look into this a bit more. Looks like we initialize
> > the cpu_to_node_masks (way) too late on s390 for fake numa. So Peter's
> > patch just revealed that problem.
> >
> > I'll see if initializing the masks earlier will fix this, but I think it
> > will.
> 
> Hello,
> 
> Is there any fix for this issue?  I can see the issue on arm64 running
> v4.7 kernel too.  And the oops can be avoided by reverting commit
> e9d867a(sched: Allow per-cpu kernel threads to run on online && !active).

I don't know about the arm64 issue. The s390 problem is a result of
initializing the cpu_to_node mapping too late.

However, the workqueue code seems to assume that we know the cpu_to_node
mapping for all _possible_ cpus very early and apparently it assumes that
this mapping is stable and doesn't change anymore.

This assumption however contradicts the purpose of 346404682434 ("numa, cpu
hotplug: change links of CPU and node when changing node number by onlining
CPU").

So something is wrong here...
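
To illustrate the conflict, here is a minimal userspace model, not kernel
code. It assumes a consumer that snapshots per-node cpu masks once, early,
and a mapping that is changed afterwards. All names (cpu_to_node_map,
node_mask_snapshot, snapshot_node_masks, set_cpu_node, mask_weight) and the
sizes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS  4	/* hypothetical number of possible cpus */
#define NR_NODES 2	/* hypothetical number of nodes */

/* One bit per cpu; stands in for a struct cpumask. */
typedef uint32_t mask_t;

/* Live cpu-to-node mapping; stands in for cpu_to_node(). All zero at start. */
static int cpu_to_node_map[NR_CPUS];

/* Per-node cpu masks, computed once from the mapping above. */
static mask_t node_mask_snapshot[NR_NODES];

/* Rebuild the per-node masks from whatever the mapping says right now. */
static void snapshot_node_masks(void)
{
	for (int node = 0; node < NR_NODES; node++)
		node_mask_snapshot[node] = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		int node = cpu_to_node_map[cpu];

		assert(node >= 0 && node < NR_NODES);
		node_mask_snapshot[node] |= (mask_t)1 << cpu;
	}
}

/* Change the mapping later, e.g. when the real topology becomes known. */
static void set_cpu_node(int cpu, int node)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(node >= 0 && node < NR_NODES);
	cpu_to_node_map[cpu] = node;
}

/* Number of cpus in a mask; an empty mask has weight 0. */
static int mask_weight(mask_t m)
{
	int w = 0;

	for (; m; m &= m - 1)
		w++;
	return w;
}
```

If snapshot_node_masks() runs while every cpu still maps to node 0 and
set_cpu_node() moves cpus to node 1 afterwards, the snapshot for node 1
stays empty until somebody rebuilds it. A thread bound to such a mask would
end up with no allowed cpus.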

On s390 with fake numa we wouldn't even know the mapping of all _possible_
cpus at boot time. When establishing the node mapping we try hard to map
our existing cpu topology into a sane node mapping. However we simply don't
know where non-present cpus are located topology-wise.  Even for present
cpus the answer is not always there since present cpus can be in either the
state "configured" (topology location known - cpu online possible) or
"deconfigured" (topology location unknown - cpu online not possible).

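The three cases above can be sketched as follows. This is only a model
under the assumption that a lookup may legitimately have no answer; the
enum, struct cpu_info, NODE_UNKNOWN and cpu_node_or_unknown() are
hypothetical names and do not exist in the s390 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states, following the description above. */
enum cpu_state {
	CPU_NOT_PRESENT,	/* topology location unknown */
	CPU_DECONFIGURED,	/* present, topology location unknown */
	CPU_CONFIGURED,		/* present, topology location known */
};

#define NODE_UNKNOWN (-1)	/* hypothetical "no answer" value */

struct cpu_info {
	enum cpu_state state;
	int topo_node;		/* only meaningful if CPU_CONFIGURED */
};

/* Return the node of a cpu, or NODE_UNKNOWN if it cannot be known yet. */
static int cpu_node_or_unknown(const struct cpu_info *cpu)
{
	assert(cpu != NULL);

	if (cpu->state != CPU_CONFIGURED)
		return NODE_UNKNOWN;
	return cpu->topo_node;
}
```

A caller that needs a node for every _possible_ cpu has to decide what to
do with NODE_UNKNOWN, and has to cope with the answer changing once the cpu
becomes configured.
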
I can imagine several ways to fix this for s390, but before doing that I'm
wondering if the workqueue code is correct with

a) assuming that the cpu_to_node() mapping is valid for all _possible_ cpus
   that early

and

b) assuming that the cpu_to_node() mapping never changes

Tejun?


Thread overview: 16+ messages
2016-07-27 12:54 [bisected] "sched: Allow per-cpu kernel threads to run on online && !active" causes warning Heiko Carstens
2016-07-27 15:23 ` Thomas Gleixner
2016-07-30 11:25   ` Heiko Carstens
2016-08-08  7:45     ` Ming Lei
2016-08-15 11:19       ` Heiko Carstens [this message]
2016-08-15 22:48         ` Tejun Heo
2016-08-16  7:55           ` Heiko Carstens
2016-08-16 15:20             ` Tejun Heo
2016-08-16 15:29               ` Peter Zijlstra
2016-08-16 15:42                 ` Tejun Heo
2016-08-16 22:19                   ` Heiko Carstens
2016-08-17  9:20                     ` Michael Holzheu
2016-08-17 13:58                     ` Tejun Heo
2016-08-18  9:30                       ` Michael Holzheu
2016-08-18 14:42                         ` Tejun Heo
2016-08-19  9:52                           ` Michael Holzheu
