From: Russell King <rmk+lkml@arm.linux.org.uk>
To: Hubertus Franke <frankeh@watson.ibm.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@osdl.org>, Andrew Morton <akpm@osdl.org>
Subject: Re: SMP BUG
Date: Wed, 15 Feb 2006 23:07:01 +0000
Message-ID: <20060215230701.GD1508@flint.arm.linux.org.uk>
In-Reply-To: <43F12207.9010507@watson.ibm.com>

On Mon, Feb 13, 2006 at 07:19:19PM -0500, Hubertus Franke wrote:
> Folks, the changes introduced in 2.6.16-rc2 over 2.6.15
> with respect to SMP initialization are wrong.
> Please apply the patch below to back out the change.
> 
> Here is the logic ...
> sched_init is called from start_kernel before the
> architecture specific function cpu_check_smp() is called
> which is done as part of rest_init().
> 
> On s390 this actually sets the cpu_possible_map, which
> is now used in sched_init through the for_each_cpu without
> properly being initialized.
> As a result bringing 2nd and subsequent cpu online
> breaks.
> 
> This should be a quick fix, until this chicken and egg
> problem is solved otherwise.
> 
> -- Hubertus
> 
> --- kernel/sched.c.orig 2006-02-13 19:08:28.000000000 -0500
> +++ kernel/sched.c      2006-02-13 19:09:08.000000000 -0500
> @@ -6111,7 +6111,7 @@ void __init sched_init(void)
>         runqueue_t *rq;
>         int i, j, k;
> 
> -       for_each_cpu(i) {
> +       for (i = 0; i < NR_CPUS; i++ ) {
>                 prio_array_t *array;
> 
>                 rq = cpu_rq(i);

(I have left most of the message intact because it seems to have been
ignored. Copying Linus and akpm in the vague hope of a response.)

Yes, I'm also seeing an oops caused by exactly this on ARM:

<5>Linux version 2.6.16-rc3-rmk (rmk@dyn-67.arm.linux.org.uk) (gcc version 3.3 20030728 (Red Hat Linux 3.3-16)) #201 SMP Wed Feb 15 22:34:57 GMT 2006
CPU: Some Random V6 Processor [410fb020] revision 0 (ARMv6TEJ)
Machine: ARM-RealView EB
Memory policy: ECC disabled, Data cache writealloc
<7>On node 0 totalpages: 32768
<7>  DMA zone: 32768 pages, LIFO batch:7
<7>  DMA32 zone: 0 pages, LIFO batch:0
<7>  Normal zone: 0 pages, LIFO batch:0
<7>  HighMem zone: 0 pages, LIFO batch:0
CPU0: D VIPT write-back cache
CPU0: I cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets
CPU0: D cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets
Built 1 zonelists
<5>Kernel command line: root=/dev/nfs mem=128M console=ttyAMA0,38400 ip=dhcp cachepolicy=writealloc
PID hash table entries: 1024 (order: 10, 16384 bytes)
Console: colour dummy device 80x30
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
<6>Memory: 128MB = 128MB total
<5>Memory: 127104KB available (1992K code, 426K data, 100K init)
<7>Calibrating delay loop... 83.14 BogoMIPS (lpj=415744)
Mount-cache hash table entries: 512
<6>CPU: Testing write buffer coherency: ok
Calibrating local timer... 104.41MHz.
CPU1: Booted secondary processor
CPU1: D VIPT write-back cache
CPU1: I cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets
CPU1: D cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets
<7>Calibrating delay loop... 83.14 BogoMIPS (lpj=415744)
<1>Unable to handle kernel NULL pointer dereference at virtual address 0000001c
<1>pgd = c0004000
<1>[0000001c] *pgd=00000000
Internal error: Oops: 5 [#1]
Modules linked in:
CPU: 0
PC is at enqueue_task+0x1c/0x64
LR is at activate_task+0xcc/0xe4
pc : [<c0034c28>]    lr : [<c0034f80>]    Not tainted
sp : c7c05ebc  ip : c7c05ed0  fp : c7c05ecc
r10: 00000001  r9 : c001b160  r8 : c038a240
r7 : c03fe2e0  r6 : 00000000  r5 : 0095257a  r4 : 00000008
r3 : 00000018  r2 : c03fe308  r1 : 00000000  r0 : c03fe2e0
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  Segment kernel
Control: C5787F  Table: 0000400A  DAC: 00000017
Process swapper (pid: 1, stack limit = 0xc7c04194)
Stack: (0xc7c05ebc to 0xc7c06000)
5ea0:                                                                00000008
5ec0: c7c05ef0 c7c05ed0 c0034f80 c0034c18 c001b160 00000001 00000000 c03fe2e0
5ee0: c038a240 c7c05f48 c7c05ef4 c0035960 c0034ec0 c0065a28 c00657b4 c7c04000
5f00: c7c0c000 c038a240 00000000 00000001 00000000 00000000 0000000f 80000013
5f20: c021a55c 00000001 00000001 00000000 c0264f8c 00000000 00000000 c7c05f58
5f40: c7c05f4c c00359d4 c00355e0 c7c05f88 c7c05f5c c00395dc c00359c8 c002b0a4
5f60: c021a55c 00000001 00000002 00000000 c0264f8c 00000000 00000000 c7c05fa4
5f80: c7c05f8c c004d544 c00394d4 00000000 00000001 00000001 c7c05fc8 c7c05fa8
5fa0: c005b488 c004d518 00000001 c0264f8c 00000000 00000000 00000000 c7c05fe0
5fc0: c7c05fcc c0008874 c005b3b8 c0262efc 00000000 c7c05ff4 c7c05fe4 c0021120
5fe0: c0008818 00000000 00000000 c7c05ff8 c0040a44 c0021084 2020202d 2d2d2020
Backtrace:
[<c0034c0c>] (enqueue_task+0x0/0x64) from [<c0034f80>] (activate_task+0xcc/0xe4) r4 = 00000008
[<c0034eb4>] (activate_task+0x0/0xe4) from [<c0035960>] (try_to_wake_up+0x38c/0x3e8)
[<c00355d4>] (try_to_wake_up+0x0/0x3e8) from [<c00359d4>] (wake_up_process+0x18/0x1c)
[<c00359bc>] (wake_up_process+0x0/0x1c) from [<c00395dc>] (migration_call+0x114/0x328)
[<c00394c8>] (migration_call+0x0/0x328) from [<c004d544>] (notifier_call_chain+0x38/0x50)
[<c004d50c>] (notifier_call_chain+0x0/0x50) from [<c005b488>] (cpu_up+0xdc/0x104)
[<c005b3ac>] (cpu_up+0x0/0x104) from [<c0008874>] (smp_init+0x68/0xc4)
[<c000880c>] (smp_init+0x0/0xc4) from [<c0021120>] (init+0xa8/0x1c8)
[<c0021078>] (init+0x0/0x1c8) from [<c0040a44>] (do_exit+0x0/0x3f4)
Code: e5903020 e2802028 e0813183 e2833018 (e593c004)
 <0>Kernel panic - not syncing: Attempted to kill init!
 <2>CPU1: stopping
[<c0027478>] (dump_stack+0x0/0x14) from [<c0028af8>] (ipi_cpu_stop+0x2c/0x64)
[<c0028acc>] (ipi_cpu_stop+0x0/0x64) from [<c0028bec>] (do_IPI+0xbc/0xe8)
[<c0028b30>] (do_IPI+0x0/0xe8) from [<c00219b0>] (__irq_svc+0x30/0xc0)
[<c0023b88>] (default_idle+0x0/0x44) from [<c0023c2c>] (cpu_idle+0x60/0x80)
[<c0023bcc>] (cpu_idle+0x0/0x80) from [<c00285b4>] (secondary_start_kernel+0xc8/0xd8)
[<c00284ec>] (secondary_start_kernel+0x0/0xd8) from [<000080e0>] (0x80e0)
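
The fault address above (0000001c) is what a NULL prio array pointer would produce. A rough sketch of the arithmetic, assuming the 2.6.16 prio_array_t layout (an active count, a priority bitmap, then one list head per priority level) on a 32-bit machine, and assuming the task being queued has priority 0. The struct names, the constants and enqueue_fault_address() are illustrative stand-ins, not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PRIO     140  /* assumed: 100 real-time levels + 40 nice levels */
#define BITMAP_WORDS 5    /* assumed: 32-bit words needed for MAX_PRIO + 1 bits */

/* 32-bit model of struct list_head: two pointers of 4 bytes each. */
struct list_head32 {
    uint32_t next;
    uint32_t prev;
};

/* 32-bit model of the assumed prio_array_t layout. */
struct prio_array32 {
    uint32_t nr_active;
    uint32_t bitmap[BITMAP_WORDS];
    struct list_head32 queue[MAX_PRIO];
};

/*
 * Address first touched when a task is appended to array->queue[prio]:
 * a tail insertion has to read the list head's prev pointer.
 */
static uint32_t enqueue_fault_address(uint32_t array, unsigned int prio)
{
    assert(prio < MAX_PRIO);
    return array
         + (uint32_t)offsetof(struct prio_array32, queue)
         + prio * (uint32_t)sizeof(struct list_head32)
         + (uint32_t)offsetof(struct list_head32, prev);
}
```

With array = 0 and prio = 0 this gives 0x18 + 0 + 4 = 0x1c, and the 0x18 matches r3 in the register dump.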

enqueue_task is being called with p = c03fe2e0 and array = NULL (r0 and
r1 in the register dump above), leading to a NULL pointer dereference
because rq->array has not been initialised.
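
The ordering problem behind the uninitialised rq->array, as described in the quoted message, can be modelled in user space. This is only a toy sketch: NR_CPUS = 4, the toy_* names and the two-CPU machine are assumptions made for illustration, and the model assumes that only the boot CPU is marked possible by the time the scheduler is initialised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4  /* illustrative; in the kernel this is a build-time option */

/* Toy stand-ins for prio_array_t and runqueue_t. */
struct toy_prio_array { int nr_active; };

struct toy_runqueue {
    struct toy_prio_array *active;    /* NULL until the init loop visits this CPU */
    struct toy_prio_array arrays[2];
};

static struct toy_runqueue runqueues[NR_CPUS];
static bool cpu_possible[NR_CPUS];    /* models cpu_possible_map */

/*
 * walk_all_slots == false models the for_each_cpu() loop, which skips CPUs
 * not yet marked possible; true models the loop over 0..NR_CPUS-1 from the
 * quoted patch.
 */
static void toy_sched_init(bool walk_all_slots)
{
    for (int i = 0; i < NR_CPUS; i++) {
        if (!walk_all_slots && !cpu_possible[i])
            continue;
        runqueues[i].active = &runqueues[i].arrays[0];
    }
}

/* Models architecture code that fills in the possible map later in boot. */
static void toy_arch_mark_possible(int ncpus)
{
    assert(ncpus <= NR_CPUS);
    for (int i = 0; i < ncpus; i++)
        cpu_possible[i] = true;
}

/* Runs the boot sequence and reports whether CPU1's runqueue was set up. */
static bool toy_boot_cpu1_ready(bool walk_all_slots)
{
    for (int i = 0; i < NR_CPUS; i++) {
        cpu_possible[i] = false;
        runqueues[i].active = NULL;
    }
    cpu_possible[0] = true;           /* assumption: boot CPU is known early */
    toy_sched_init(walk_all_slots);   /* scheduler init runs first ...       */
    toy_arch_mark_possible(2);        /* ... the map is completed afterwards */
    return runqueues[1].active != NULL;
}
```

In this model toy_boot_cpu1_ready(false) returns false, so waking a task on CPU1 would follow a NULL array pointer, while toy_boot_cpu1_ready(true) returns true because every slot is initialised regardless of the map.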

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core
