From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Josh Boyer <jwboyer@redhat.com>, Bruno Wolff III <bruno@wolff.to>
Cc: "mingo@redhat.com" <mingo@redhat.com>,
"peterz@infradead.org" <peterz@infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c
Date: Wed, 16 Jul 2014 21:17:32 +0200
Message-ID: <53C6CFCC.2050300@arm.com>
In-Reply-To: <20140716151748.GC2460@hansolo.jdub.homelinux.org>
Hi Bruno and Josh,
On 16/07/14 17:17, Josh Boyer wrote:
> Adding Dietmar in since he is the original author.
>
> josh
>
> On Wed, Jul 16, 2014 at 09:55:46AM -0500, Bruno Wolff III wrote:
>> caffcdd8d27ba78730d5540396ce72ad022aff2c has been causing crashes
>> early in the boot process on one of three machines I have been
>> testing the kernel on. On that one machine it happens every boot. It
>> happens before netconsole is functional.
I tested this patch on two platforms (ARM TC2 and Intel i5 M520) by
replacing the two removed lines (already using the new sg->sgc->capacity
instead of the old sg->sgp->power) with:

	BUG_ON(!cpumask_empty(sched_group_cpus(sg)));
	BUG_ON(sg->sgc->capacity);
The memory for sg is allocated and zeroed out in __sdt_alloc() with:

	sgc = kzalloc_node(sizeof(struct sched_group_capacity) + cpumask_size(),
			   GFP_KERNEL, cpu_to_node(j));

The related call chain:

	build_sched_domains()
	    __visit_domain_allocation_hell()
	        __sdt_alloc()
	    build_sched_groups()
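
To illustrate the reasoning above, here is a minimal userspace sketch, not
kernel code. It assumes calloc() as a stand-in for kzalloc_node(), and every
model_* / MODEL_* name is hypothetical and only loosely mirrors the kernel
structures. It shows why a freshly zero-allocated group should pass both
BUG_ON checks, and why a group that already carries state would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mask width; the kernel derives this from nr_cpu_ids. */
#define MODEL_MASK_WORDS 4

/* Userspace stand-in for BUG_ON(): abort if the condition is true. */
#define MODEL_BUG_ON(cond) assert(!(cond))

/* Simplified model of struct sched_group_capacity (assumption). */
struct model_sgc {
	unsigned int capacity;
	unsigned long cpumask[];	/* trailing mask, like sched_group_mask() */
};

/* Simplified model of struct sched_group (assumption). */
struct model_sg {
	struct model_sgc *sgc;
	unsigned long cpumask[];	/* trailing mask, like sched_group_cpus() */
};

static size_t model_cpumask_size(void)
{
	return MODEL_MASK_WORDS * sizeof(unsigned long);
}

/* Models the zeroing allocation in __sdt_alloc(); calloc() replaces kzalloc_node(). */
static struct model_sg *model_sdt_alloc(void)
{
	struct model_sg *sg = calloc(1, sizeof(*sg) + model_cpumask_size());

	if (!sg)
		return NULL;
	sg->sgc = calloc(1, sizeof(*sg->sgc) + model_cpumask_size());
	if (!sg->sgc) {
		free(sg);
		return NULL;
	}
	return sg;
}

static bool model_cpumask_empty(const unsigned long *mask)
{
	for (size_t i = 0; i < MODEL_MASK_WORDS; i++)
		if (mask[i])
			return false;
	return true;
}

/* True if the group still looks exactly as the zeroing allocation left it. */
static bool model_group_is_pristine(const struct model_sg *sg)
{
	return model_cpumask_empty(sg->cpumask) && sg->sgc->capacity == 0;
}

/* The two checks from the mail, applied to the model structures. */
static void model_check_group(const struct model_sg *sg)
{
	MODEL_BUG_ON(!model_cpumask_empty(sg->cpumask));
	MODEL_BUG_ON(sg->sgc->capacity);
}

static void model_sg_free(struct model_sg *sg)
{
	if (sg) {
		free(sg->sgc);
		free(sg);
	}
}
```

Calling model_check_group() on the result of model_sdt_alloc() does not
abort. Setting a bit in sg->cpumask first makes model_group_is_pristine()
return false, which is the situation the BUG_ON lines are meant to catch.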
>>
>> A partial revert of the commit fixes the problem. I do not know why
>> the commit is broken though.
>>
>> I have filed https://bugzilla.kernel.org/show_bug.cgi?id=80251 for
>> this issue.
From the bug report, I see that the machine causing trouble is a Xeon (2
processors w/ hyper-threading).

Could you please share:

	cat /proc/cpuinfo and
	cat /proc/schedstat (kernel config w/ CONFIG_SCHEDSTATS=y)

from this machine.

I don't think it is SMT, since SMT is also present on my Intel i5 M520
(arch/x86/configs/x86_64_defconfig).

Could you also put the two BUG_ON lines into build_sched_groups()
[kernel/sched/core.c], without the cpumask_clear() and without setting
sg->sgc->capacity to 0, and share any resulting crash output as well?
>>
>> The problem happens on both Fedora and Linus kernels.
>>
>> git diff caffcdd8d27ba78730d5540396ce72ad022aff2c^ caffcdd8d27ba78730d5540396ce72ad022aff2c
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 45d077ed24fb..6340c601475d 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -5794,8 +5794,6 @@ build_sched_groups(struct sched_domain *sd, int cpu)
>> continue;
>>
>> group = get_group(i, sdd, &sg);
>> - cpumask_clear(sched_group_cpus(sg));
>> - sg->sgp->power = 0;
>> cpumask_setall(sched_group_mask(sg));
>>
>> for_each_cpu(j, span) {
>>
>> By rc5 the second line can't be added back because the structure has
>> changed. However adding back cpumask_clear(sched_group_cpus(sg)); to
>> rc5 got things working for me again.
That's because 'sched: Let 'struct sched_group_power' care about CPU
capacity' (commit id 63b2ca30bdb3) changes the struct sched_group member
from struct sched_group_power *sgp to struct sched_group_capacity *sgc.
I.e. the second line becomes

	sg->sgc->capacity = 0;
Thanks,
-- Dietmar