linuxppc-dev.lists.ozlabs.org archive mirror
From: Balbir Singh <bsingharora@gmail.com>
To: Michael Ellerman <mpe@ellerman.id.au>, Tejun Heo <tj@kernel.org>
Cc: "linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	jiangshanlai@gmail.com, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, torvalds@linux-foundation.org,
	kernel-team@fb.com, Peter Zijlstra <peterz@infradead.org>
Subject: Re: Oops on Power8 (was Re: [PATCH v2 1/7] workqueue: make workqueue available early during boot)
Date: Tue, 11 Oct 2016 23:21:09 +1100
Message-ID: <3f7ddc83-fcd3-79c4-81b6-ec3c4de53be6@gmail.com>
In-Reply-To: <87a8eb5dwa.fsf@concordia.ellerman.id.au>



On 11/10/16 22:22, Michael Ellerman wrote:
> Tejun Heo <tj@kernel.org> writes:
> 
>> Hello, Michael.
>>
>> On Mon, Oct 10, 2016 at 09:22:55PM +1100, Michael Ellerman wrote:
>>> This patch seems to be causing one of my Power8 boxes not to boot.
>>>
>>> Specifically commit 3347fa092821 ("workqueue: make workqueue available
>>> early during boot") in linux-next.
>>>
>>> If I revert this on top of next-20161005 then the machine boots again.
>>>
>>> I've attached the oops below. It looks like the cfs_rq of p->se is NULL?
>>
>> Hah, weird that it's arch dependent, or maybe it's just different
>> config options.  Most likely, it's caused by workqueue_init() call
>> being moved too early.  Can you please try the following patch and see
>> whether the problem goes away?
> 
> No that doesn't help.
> 
> What does is this:
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 94732d1ab00a..4e79549d242f 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1614,7 +1614,8 @@ int select_task_rq(struct task_struct *p, int cpu, int sd_flags, int wake_flags)
>  	 * [ this allows ->select_task() to simply return task_cpu(p) and
>  	 *   not worry about this generic constraint ]
>  	 */
> -	if (unlikely(!cpumask_test_cpu(cpu, tsk_cpus_allowed(p)) ||
> +	if (unlikely(cpu >= nr_cpu_ids ||
> +		     !cpumask_test_cpu(cpu, tsk_cpus_allowed(p)) ||
>  		     !cpu_online(cpu)))
>  		cpu = select_fallback_rq(task_cpu(p), p);
>  
> 
> The oops happens because we're in enqueue_task_fair() and p->se->cfs_rq
> is NULL.
> 
> The cfs_rq is NULL because we did set_task_rq(p, 2048), where 2048 is
> NR_CPUS. That causes us to index past the end of the tg->cfs_rq array in
> set_task_rq() and happen to get NULL.
> 
> We never should have done set_task_rq(p, 2048), because 2048 is >=
> nr_cpu_ids, which means it's not a valid CPU number, and set_task_rq()
> doesn't cope with that.
> 
> The reason we're calling set_task_rq() with CPU 2048 is because
> in select_task_rq() we had tsk_nr_cpus_allowed() = 0, because
> tsk_cpus_allowed(p) is an empty cpu mask.
> 
> That means we do in select_task_rq():
>   cpu = cpumask_any(tsk_cpus_allowed(p));
> 
> And when tsk_cpus_allowed(p) is empty cpumask_any() returns nr_cpu_ids,
> causing cpu to be set to 2048 in my case.
> 
> select_task_rq() then does the check to see if it should use a fallback
> rq:
> 
> if (unlikely(!cpumask_test_cpu(cpu, tsk_cpus_allowed(p)) ||
> 	     !cpu_online(cpu)))
> 	cpu = select_fallback_rq(task_cpu(p), p);
> 
> 
> But in both those checks we end up indexing off the end of the cpu mask,
> because cpu is >= nr_cpu_ids. At least on my system they both return
> true and so we return cpu == 2048.
> 
> The patch above is pretty clearly not the right fix, though maybe it's a
> good safety measure.
> 
> Presumably we shouldn't be ending up with tsk_cpus_allowed() being
> empty, but I haven't had time to track down why that's happening.
> 
> cheers
> 

+peterz

FYI: I see the same thing on my cpu as well; it's just that I get lucky
and cpu_online(cpu) returns false.

From a functional perspective, I think we may want to add some
debug checks for places where the cpumask is empty early during boot.
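
To illustrate the kind of check meant here, a minimal user-space sketch
follows. It is not kernel code: every model_* name is made up for
illustration, and model_cpumask_any() only imitates the behaviour described
above, where an empty mask yields nr_cpu_ids. A real check would presumably
use the kernel's own cpumask and WARN helpers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 2048	/* assumed compile-time maximum, as in the report */
#define BITS_PER_WORD (8 * sizeof(unsigned long))
#define MASK_WORDS    (MODEL_NR_CPUS / BITS_PER_WORD)

/* Hypothetical stand-in for struct cpumask. */
struct model_cpumask {
	unsigned long bits[MASK_WORDS];
};

/*
 * Model of cpumask_any(): return the first set bit below nr_ids,
 * or nr_ids itself when no bit is set (the empty-mask case).
 */
static unsigned int model_cpumask_any(const struct model_cpumask *m,
				      unsigned int nr_ids)
{
	for (unsigned int cpu = 0; cpu < nr_ids; cpu++)
		if (m->bits[cpu / BITS_PER_WORD] &
		    (1UL << (cpu % BITS_PER_WORD)))
			return cpu;
	return nr_ids;
}

/*
 * Hypothetical debug helper: print a warning and return true when the
 * mask is empty, so callers can notice before using the bogus CPU id.
 */
static bool model_warn_if_empty(const struct model_cpumask *m,
				unsigned int nr_ids, const char *who)
{
	if (model_cpumask_any(m, nr_ids) < nr_ids)
		return false;
	fprintf(stderr, "WARN: %s: empty cpumask\n", who);
	return true;
}
```

Calling model_warn_if_empty() on a task's allowed mask before picking a CPU
would flag the empty mask at the point where it is first consumed, rather
than later in enqueue_task_fair().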

Looks like there is a dependency between cpumasks and cpus coming online.
I wonder if we can hit similar issues during hotplug.

FWIW, your patch looks correct to me, though one might argue that
cpumask_test_cpu() is a better place to fix it.
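
As a rough sketch of that alternative, the user-space model below puts the
range check inside the bit test itself, so a caller like select_task_rq()
needs no extra condition. This is only an illustration under assumptions:
the model_* names are hypothetical, and the real cpumask_test_cpu() has a
different signature and implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2048	/* assumed compile-time maximum */
#define BITS_PER_WORD (8 * sizeof(unsigned long))

/* Hypothetical stand-in for struct cpumask. */
struct model_cpumask {
	unsigned long bits[MODEL_NR_CPUS / BITS_PER_WORD];
};

/*
 * Bounds-checked bit test: a CPU id >= nr_ids is never "in" the mask,
 * so we never index past the end of the bitmap.
 */
static bool model_test_cpu_checked(unsigned int cpu,
				   const struct model_cpumask *m,
				   unsigned int nr_ids)
{
	if (cpu >= nr_ids)
		return false;
	return (m->bits[cpu / BITS_PER_WORD] >> (cpu % BITS_PER_WORD)) & 1UL;
}

/*
 * Model of the fallback decision quoted above: keep cpu only if it is
 * both allowed and online, otherwise return the fallback CPU.
 */
static unsigned int model_select_cpu(unsigned int cpu,
				     const struct model_cpumask *allowed,
				     const struct model_cpumask *online,
				     unsigned int nr_ids,
				     unsigned int fallback)
{
	if (!model_test_cpu_checked(cpu, allowed, nr_ids) ||
	    !model_test_cpu_checked(cpu, online, nr_ids))
		return fallback;
	return cpu;
}
```

With the check in the test helper, an out-of-range id such as nr_cpu_ids
always takes the fallback path instead of depending on whatever memory sits
past the end of the mask.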

Balbir Singh
