From: Mike Travis <travis@sgi.com>
To: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>,
Andrew Morton <akpm@linux-foundation.org>,
Stephen Rothwell <sfr@canb.auug.org.au>,
linux-next@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: linux-next: Tree for June 5
Date: Fri, 06 Jun 2008 07:13:00 -0700
Message-ID: <484945EC.3020508@sgi.com>
In-Reply-To: <19f34abd0806060650q203bef48rd3b20c0cabec4774@mail.gmail.com>

Vegard Nossum wrote:
> On Fri, Jun 6, 2008 at 3:33 PM, Mike Travis <travis@sgi.com> wrote:
>> Vegard Nossum wrote:
>>> I reproduced it with gcc 4.1.2. I think the error is somewhere in kernel/sched.c.
>>>
>>> static int __build_sched_domains(const cpumask_t *cpu_map,
>>>                                  struct sched_domain_attr *attr)
>>> {
>>>         ...
>>>         for (i = 0; i < MAX_NUMNODES; i++) {
>>>                 ...
>>>                 sg = kmalloc_node(sizeof(struct sched_group), GFP_KERNEL, i);
>>>                 ...
>>>
>>> This code is calling into the allocator with a spurious value of i,
>>> which causes SLAB to use an index (of 4 in my case) that is out of
>>> bounds for its nodelist array (at least it hasn't been initialized).
>>>
>>> This bit of code (a bit further down, inside the same loop) is also dubious:
>>>
>>>         sg = kmalloc_node(sizeof(struct sched_group),
>>>                           GFP_KERNEL, i);
>>>         if (!sg) {
>>>                 printk(KERN_WARNING
>>>                        "Can not alloc domain group for node %d\n", j);
>>>                 goto error;
>>>         }
>>>
>>> Where it passes i to kmalloc_node() but reports an allocation for node
>>> j. Which one is correct?
>>>
>
> Hm, I think I'm wrong and the code is correct. However...
>
>>> Hope this helps, will send an update if I find out more.
>>>
>>>
>>> Vegard
>>>
>> Thanks Vegard for tracking this down. My thoughts were along the same
>> wavelength... ;-)
>
> I applied this patch:
>
> @@ -7133,6 +7133,14 @@ static int __build_sched_domains(const cpumask_t *cpu_map,
>                  cpus_clear(*covered);
>
>                  cpus_and(*nodemask, *nodemask, *cpu_map);
> +
> +                printk("node %d\n", i);
> +                for (j = 0; j < NR_CPUS; ++j)
> +                        printk("%c", cpu_isset(j, *nodemask) ? 'X' : '.');
> +                printk("\n");
> +
> +                printk("empty = %d\n", cpus_empty(*nodemask));
> +
>                  if (cpus_empty(*nodemask)) {
>                          sched_group_nodes[i] = NULL;
>                          continue;
>
> and it shows some really strange output, maybe it makes sense to you:
>
> (the X means cpu is in the node)
>
> Total of 2 processors activated (11976.24 BogoMIPS).
> node 0
> XX..............................................................................
> ................................................................................
> ................................................................................
> ...............
> empty = 0
> node 1
> XX..............................................................................
> ................................................................................
> ................................................................................
> ...............
> empty = 0
> l3 = cachep->nodelists[0] (size-64) = ffff81003f824340
> node 2
> ................................................................................
> ................................................................................
> ................................................................................
> ...............
> empty = 1
> node 3
> ................................................................................
> ................................................................................
> ................................................................................
> ...............
> empty = 1
> node 4
> X...............................................................................
> ................................................................................
> ................................................................................
> ...............
> empty = 0
>
> This is a P4 3.0GHz with 1 physical CPU (but HT, so two logical CPUs).
> Yet node 4 is claimed to have a cpu too. That's bogus!
>
> (But I don't think it's an error in sched.c any more, probably the
> code that sets up the node maps.)
>
>
> Vegard
>
Could you send me the full console log and your config file? The setup of
the node_to_cpumask map depends on the early CPU discovery (usually in the
APIC code), and there have been some changes in that area recently.
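
A consistency check along the following lines could help confirm whether the
map itself is wrong. This is only a user-space sketch, not kernel code: the
names sketch_cpumask, sketch_count_bogus, SKETCH_NR_CPUS and SKETCH_MAX_NODES
are hypothetical, and a plain unsigned int stands in for cpumask_t. It assumes
that a CPU should be claimed by at most one node and only by a node that is
marked online.

```c
#include <stdio.h>

#define SKETCH_NR_CPUS   8  /* hypothetical; the real NR_CPUS comes from the kernel config */
#define SKETCH_MAX_NODES 8  /* hypothetical stand-in for MAX_NUMNODES */

/* One bit per CPU: bit c set in map[n] means "CPU c belongs to node n". */
typedef unsigned int sketch_cpumask;

/*
 * Count inconsistencies in a node-to-CPU map:
 *   - one for every (cpu, node) pair where the node is not marked online,
 *   - one for every CPU that is claimed by more than one node.
 * online_nodes has bit n set when node n is considered online.
 * Returns 0 when the map is consistent under these assumptions.
 */
static int sketch_count_bogus(const sketch_cpumask map[SKETCH_MAX_NODES],
                              unsigned int online_nodes)
{
    int bogus = 0;

    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        int owners = 0;

        for (int node = 0; node < SKETCH_MAX_NODES; node++) {
            if (!(map[node] & (1u << cpu)))
                continue;
            owners++;
            if (!(online_nodes & (1u << node))) {
                printf("cpu %d claimed by offline node %d\n", cpu, node);
                bogus++;
            }
        }
        if (owners > 1) {
            printf("cpu %d claimed by %d nodes\n", cpu, owners);
            bogus++;
        }
    }
    return bogus;
}
```

Feeding it the masks from the output above (node 0 = CPUs 0-1, node 1 = CPUs
0-1, node 4 = CPU 0), with only node 0 assumed online, flags every entry for
nodes 1 and 4.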
Thanks,
Mike