All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
	Mike Travis <travis@sgi.com>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH -tip/cpus4096-v2] cpumask: fix cpumask of	call_function_data
Date: Mon, 27 Oct 2008 16:07:26 -0700	[thread overview]
Message-ID: <490649AE.6050905@ct.jp.nec.com> (raw)
In-Reply-To: <20081027133248.GA1007@elte.hu>

Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
>> in any case, i've started testing tip/cpus4096-v2 again on x86 - the 
>> problem with d4de5a above was the only outstanding known issue, right?
> 
> the sched_init() slab corruption bug is still there, i just triggered it 
> on two separate test-systems:
> 
> [    0.510620] CPU1 attaching sched-domain:
> [    0.512007]  domain 0: span 0-1 level CPU
> [    0.517730]   groups: 1 0
> [    0.520528] =============================================================================
> [    0.524002] BUG kmalloc-8: Wrong object count. Counter is 11 but counted were 50
> [    0.524002] -----------------------------------------------------------------------------
> [    0.524002] 

Hm,

I think kmalloc-8 is too small.
In this case, struct cpumask is defined;

struct cpumask {
        DECLARE_BITMAP(bits, NR_CPUS);
};

So, storing cpumask such as cpu_core_map, cpu_sibling_map and sd->span etc.
requires NR_CPUS bits. In Ingo's config, it needs 4096 bits.

At alloc_cpumask_var uses cpumask_size() for kmalloc(),

bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags)
{
        if (likely(slab_is_available()))
                *mask = kmalloc(cpumask_size(), flags);

cpumask_size() looks nr_cpumask_bits and it defined as follows;

#define nr_cpumask_bits nr_cpu_ids

it's CONFIG_NR_CPUS > BITS_PER_LONG case.
And now nr_cpu_ids is 2 on this boot log.

...
> [    0.000000] PERCPU: Allocating 1900544 bytes of per cpu data
> [    0.000000] NR_CPUS:4096 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1

So, kmalloc(8, flags) for cpumask_var_t at alloc_cpumask_var().
But the content is treated as cpumask_t, it causes slab corruption
with overwritten when the mask data is copied.

For example, cpu_to_core_group()

static int
cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg,
                  cpumask_t *mask)
{
        int group;

        *mask = per_cpu(cpu_sibling_map, cpu);

this copies 0x200 bytes (= 4096 bits), compiled my environment as follows;
ffffffff80251c56 <cpu_to_core_group>:
cpu_to_core_group():
ffffffff80251c56:       55                      push   %rbp
ffffffff80251c57:       48 63 ff                movslq %edi,%rdi
ffffffff80251c5a:       48 89 e5                mov    %rsp,%rbp
ffffffff80251c5d:       41 55                   push   %r13
ffffffff80251c5f:       49 89 d5                mov    %rdx,%r13
ffffffff80251c62:       ba 00 02 00 00          mov    $0x200,%edx
ffffffff80251c67:       41 54                   push   %r12
ffffffff80251c69:       49 89 f4                mov    %rsi,%r12
ffffffff80251c6c:       48 c7 c6 00 c1 c8 81    mov    $0xffffffff81c8c100,%rsi
ffffffff80251c73:       53                      push   %rbx
ffffffff80251c74:       48 89 cb                mov    %rcx,%rbx
ffffffff80251c77:       48 83 ec 08             sub    $0x8,%rsp
ffffffff80251c7b:       48 8b 05 3e d0 98 01    mov    0x198d03e(%rip),%rax        # ffffffff81bdecc0 <_cpu_pda>
ffffffff80251c82:       48 8b 04 f8             mov    (%rax,%rdi,8),%rax
ffffffff80251c86:       48 89 cf                mov    %rcx,%rdi
ffffffff80251c89:       48 03 70 08             add    0x8(%rax),%rsi
ffffffff80251c8d:       e8 de 29 25 00          callq  ffffffff804a4670 <__memcpy>

the 3rd parameter of __memcpy is rdx = 0x200.

So, I guess, we need
kmalloc(BITS_TO_LONGS(NR_CPUS), flags)
at alloc_cpumask_var().

Or change cpumask handling in sched.c etc?
I've no idea for this more, now.

thanks,
Hiroshi Shimamoto

  reply	other threads:[~2008-10-27 23:07 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-24  4:47 [PATCH -tip/cpus4096-v2] cpumask: fix cpumask of call_function_data Hiroshi Shimamoto
2008-10-24 11:05 ` Rusty Russell
2008-10-24 21:46   ` Hiroshi Shimamoto
2008-10-26 22:40     ` Rusty Russell
2008-10-30 17:44       ` Hiroshi Shimamoto
2008-10-24 11:15 ` Rusty Russell
2008-10-27 12:55   ` Ingo Molnar
2008-10-27 12:59     ` Ingo Molnar
2008-10-27 13:02       ` Ingo Molnar
2008-10-27 13:32       ` Ingo Molnar
2008-10-27 23:07         ` Hiroshi Shimamoto [this message]
2008-10-28  0:46           ` Rusty Russell
2008-10-27 22:21     ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=490649AE.6050905@ct.jp.nec.com \
    --to=h-shimamoto@ct.jp.nec.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=rusty@rustcorp.com.au \
    --cc=travis@sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.