From: Jiang Liu <liuj97@gmail.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: linux-kernel@vger.kernel.org, a.p.zijlstra@chello.nl,
mingo@kernel.org, pjt@google.com, tglx@linutronix.de,
seto.hidetoshi@jp.fujitsu.com
Subject: Re: [PATCH] sched_groups are expected to be circular linked list, make it so right after allocation
Date: Wed, 09 May 2012 18:21:30 +0800
Message-ID: <4FAA452A.1070909@gmail.com>
In-Reply-To: <1336559908-32533-1-git-send-email-imammedo@redhat.com>
Hi Igor,
Thanks for fixing this bug! We encountered the same issue on an
IA64 system too. That system boots with 2.6.32, but fails to boot with
any 3.x.x kernel. We found the root cause only today.
--gerry
On 05/09/2012 06:38 PM, Igor Mammedov wrote:
> If one CPU fails to boot and the boot CPU gives up waiting for it, and
> another CPU is then booted, the kernel might crash with the following oops:
>
> [ 723.865765] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> [ 723.866616] IP: [<ffffffff812c3630>] __bitmap_weight+0x30/0x80
> [ 723.866616] PGD 7ba91067 PUD 7a205067 PMD 0
> [ 723.866616] Oops: 0000 [#1] SMP
> [ 723.898527] CPU 1
> ...
> [ 723.898527] Pid: 1221, comm: offV2.sh Tainted: G W 3.4.0-rc4+ #213 Red Hat KVM
> [ 723.898527] RIP: 0010:[<ffffffff812c3630>] [<ffffffff812c3630>] __bitmap_weight+0x30/0x80
> [ 723.898527] RSP: 0018:ffff88007ab9dc18 EFLAGS: 00010246
> [ 723.898527] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000000
> [ 723.898527] RDX: 0000000000000018 RSI: 0000000000000100 RDI: 0000000000000018
> [ 723.898527] RBP: ffff88007ab9dc18 R08: 0000000000000000 R09: 0000000000000020
> [ 723.898527] R10: 0000000000000004 R11: 0000000000000000 R12: ffff88007c06ed60
> [ 723.898527] R13: ffff880037a94000 R14: 0000000000000003 R15: ffff88007c06ed60
> [ 723.898527] FS: 00007f1d6a7d8700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
> [ 723.898527] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 723.898527] CR2: 0000000000000018 CR3: 000000007bb7f000 CR4: 00000000000007e0
> [ 723.898527] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 723.898527] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 723.898527] Process offV2.sh (pid: 1221, threadinfo ffff88007ab9c000, task ffff88007b358000)
> [ 723.898527] Stack:
> [ 723.898527] ffff88007ab9dcc8 ffffffff8108b9b6 ffff88007ab9dc58 ffff88007b4f2a00
> [ 723.898527] ffff88007c06ed60 0000000000000003 000000037ab9dc58 0000000000010008
> [ 723.898527] ffffffff81a308e8 0000000000000003 ffff88007b489cc0 ffff880037b6bd20
> [ 723.898527] Call Trace:
> [ 723.898527] [<ffffffff8108b9b6>] build_sched_domains+0x7b6/0xa50
> [ 723.898527] [<ffffffff8108bea9>] partition_sched_domains+0x259/0x3f0
> [ 723.898527] [<ffffffff810c4485>] cpuset_update_active_cpus+0x85/0x90
> [ 723.898527] [<ffffffff81084f65>] cpuset_cpu_active+0x25/0x30
> [ 723.898527] [<ffffffff81545b45>] notifier_call_chain+0x55/0x80
> [ 723.898527] [<ffffffff8107e59e>] __raw_notifier_call_chain+0xe/0x10
> [ 723.898527] [<ffffffff81058be0>] __cpu_notify+0x20/0x40
> [ 723.898527] [<ffffffff8153af08>] _cpu_up+0xc7/0x10e
> [ 723.898527] [<ffffffff8153af9b>] cpu_up+0x4c/0x5c
>
> The crash happens in init_sched_groups_power(), which expects sched_groups
> to be a circular linked list. However, that is not always true: sched_groups
> preallocated in __sdt_alloc() are initialized in build_sched_groups(), which
> may exit early
>
> if (cpu != cpumask_first(sched_domain_span(sd)))
> return 0;
>
> without initializing the sd->groups->next field.
>
> Fix the bug by initializing the next field right after the sched_group is allocated.
>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> kernel/sched/core.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0533a68..e5212ae 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6382,6 +6382,8 @@ static int __sdt_alloc(const struct cpumask *cpu_map)
> if (!sg)
> return -ENOMEM;
>
> + sg->next = sg;
> +
> *per_cpu_ptr(sdd->sg, j) = sg;
>
> sgp = kzalloc_node(sizeof(struct sched_group_power),
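For anyone skimming the thread, here is a minimal userspace sketch of why the
one-line fix above is enough. It is not kernel code: struct group,
group_alloc() and ring_total_weight() are hypothetical stand-ins for
struct sched_group, the allocation in __sdt_alloc() and the kind of
do/while walk over ->next that a consumer of a circular list would perform.
A zeroed allocation leaves next == NULL, so such a walk would dereference
NULL on its second step; linking the node to itself at allocation time makes
a lone group a valid one-element ring.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sched_group; only the link matters. */
struct group {
	struct group *next;
	int weight;
};

/*
 * Allocate a zeroed group and immediately make it a one-element
 * circular list, mirroring the "sg->next = sg" line in the patch.
 */
static struct group *group_alloc(int weight)
{
	struct group *g = calloc(1, sizeof(*g));

	if (!g)
		return NULL;

	g->next = g;	/* without this, next stays NULL after calloc() */
	g->weight = weight;
	return g;
}

/*
 * Walk the ring once and sum the weights. The loop assumes the list
 * is circular: it stops only when it arrives back at the head.
 */
static int ring_total_weight(const struct group *head)
{
	const struct group *g = head;
	int total = 0;

	assert(head != NULL);
	do {
		total += g->weight;
		g = g->next;
	} while (g != head);

	return total;
}
```

With this in place, a group that is never linked to any sibling can still be
walked safely: the loop visits it once and stops. Groups that are later
chained together simply have their next pointers overwritten.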