From: Nathan Lynch <nathanl@austin.ibm.com>
To: Zwane Mwaikambo <zwane@linuxpower.ca>
Cc: Andrew Morton <akpm@osdl.org>, Ingo Molnar <mingo@elte.hu>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Rusty Russell <rusty@rustcorp.com.au>
Subject: Re: [PATCH] i386 CPU hotplug updated for -mm
Date: Tue, 12 Oct 2004 00:59:47 -0500 [thread overview]
Message-ID: <1097560787.6557.99.camel@biclops> (raw)
In-Reply-To: <Pine.LNX.4.61.0410102302170.2745@musoma.fsmlabs.com>
On Mon, 2004-10-11 at 03:19, Zwane Mwaikambo wrote:
> Hi Andrew,
> Find attached the i386 cpu hotplug patch updated for Ingo's latest
> round of goodies. In order to avoid dumping cpu hotplug code into
> kernel/irq/* i dropped the cpu_online check in do_IRQ() by modifying
> fixup_irqs(). The difference being that on cpu offline, fixup_irqs() is
> called before we clear the cpu from cpu_online_map and a long delay in
> order to ensure that we never have any queued external interrupts on the
> APICs. Due to my usual test victims being in boxes a continent away this
> hasn't been tested, but i'll cover bug reports (nudge, Nathan! ;)
Had to apply the patch to 2.6.9-rc4-mm1 (2.6.9-rc3-mm3 doesn't detect my
scsi controller). Tested on a dual Pentium 3. Simple offline/online
tests seem ok, except I see these warnings when taking a cpu down:
using smp_processor_id() in preemptible code: bash/3436
[<c0106f37>] dump_stack+0x17/0x20
[<c011a087>] smp_processor_id+0x87/0xa0
[<c0134d54>] cpu_down+0x134/0x240
[<c02770d9>] store_online+0x39/0x70
[<c0274487>] sysdev_store+0x37/0x50
[<c019339e>] flush_write_buffer+0x2e/0x40
[<c0193427>] sysfs_write_file+0x77/0x90
[<c015aa20>] vfs_write+0xe0/0x120
[<c015ab0d>] sys_write+0x3d/0x70
[<c0106123>] syscall_call+0x7/0xb
using smp_processor_id() in preemptible code: ksoftirqd/1/5233
[<c0106f37>] dump_stack+0x17/0x20
[<c011a087>] smp_processor_id+0x87/0xa0
[<c0124128>] ksoftirqd+0x68/0x140
[<c0133326>] kthread+0x96/0xe0
[<c0104275>] kernel_thread_helper+0x5/0x10
Under load (unpacking a kernel source tree while continuous
online/offline of a cpu) I got a panic within a couple of minutes:
Unable to handle kernel NULL pointer dereference at virtual address 00000004
printing eip:
c0145703
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 1
EIP: 0060:[<c0145703>] Not tainted VLI
EFLAGS: 00010082 (2.6.9-rc4-mm1i386cpuhp)
EIP is at kmem_cache_free+0x33/0x70
eax: cba2cd98 ebx: 00000000 ecx: 00000000 edx: cba2cd98
esi: c1561ac0 edi: cba2cd98 ebp: c058be8c esp: c058be7c
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c058b000 task=dff88540)
Stack: 00000082 c1559920 00001000 cba2cd9c c058bea4 c013e491 00000000 cf4abee0
00001000 c015f050 c058beb0 c015f92c cf4abee0 c058beb8 c015fbac c058bec4
c015f07d cf4abee0 c058beec c0160819 c0271a48 c0127c11 00000001 00000002
Call Trace:
[<c0106f0a>] show_stack+0x7a/0x90
[<c0107096>] show_registers+0x156/0x1d0
[<c01072b4>] die+0xf4/0x180
[<c0116bd7>] do_page_fault+0x457/0x66d
[<c0106b8d>] error_code+0x2d/0x38
[<c013e491>] mempool_free+0x81/0x90
[<c015f92c>] bio_destructor+0x3c/0x60
[<c015fbac>] bio_put+0x2c/0x40
[<c015f07d>] end_bio_bh_io_sync+0x2d/0x60
[<c0160819>] bio_endio+0x59/0x90
[<c027c578>] __end_that_request_first+0x188/0x250
[<c029db14>] __ide_end_request+0x54/0x150
[<c029dc68>] ide_end_request+0x58/0xc0
[<c02a3fe1>] task_end_request+0x31/0x70
[<c02a418b>] task_out_intr+0x7b/0x100
[<c029f6cd>] ide_intr+0xbd/0x160
[<c0139d44>] handle_IRQ_event+0x34/0x70
[<c0139e6c>] __do_IRQ+0xec/0x170
[<c0108570>] do_IRQ+0x60/0xa0
=======================
[<c0106a90>] common_interrupt+0x18/0x20
[<00000000>] 0x0
[<dff84fbc>] 0xdff84fbc
Code: f8 89 c6 89 5d f4 89 7d fc 9c 8f 45 f0 fa 89 d7 e8 13 49 fd ff 8b 1c 86 e8 6b ea ff ff 8b 4d 04 89 fa 89 f0 e8 2f f3 ff ff 89 c7 <8b> 43 04 39 03 73 20 f0 ff 86 c4 00 00 00 8b 03 89 7c 83 10 ff
<0>Kernel panic - not syncing: Fatal exception in interrupt
I fixed up the warning in cpu_down with the following patch and now am
running with that + 2.6.9-rc4-mm1 + your patch while doing continuous
online/offline and make -j8. It's been running for about 45 minutes and
I haven't seen the panic yet, although I'm at a loss to explain why the
change would fix it. Will let it run overnight and report back...
Nathan
--
Fix (harmless?) smp_processor_id() usage in preemptible section of
cpu_down.
Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
---
2.6.9-rc4-mm1-nathanl/kernel/cpu.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletion(-)
diff -puN kernel/cpu.c~cpu_down-fix-smp_processor_id-warning kernel/cpu.c
--- 2.6.9-rc4-mm1/kernel/cpu.c~cpu_down-fix-smp_processor_id-warning 2004-10-11 23:28:47.000000000 -0500
+++ 2.6.9-rc4-mm1-nathanl/kernel/cpu.c 2004-10-11 23:34:36.000000000 -0500
@@ -129,7 +129,8 @@ int cpu_down(unsigned int cpu)
__cpu_die(cpu);
/* Move it here so it can run. */
- kthread_bind(p, smp_processor_id());
+ kthread_bind(p, get_cpu());
+ put_cpu();
/* CPU is completely dead: tell everyone. Too late to complain. */
if (notifier_call_chain(&cpu_chain, CPU_DEAD, (void *)(long)cpu)
_
next prev parent reply other threads:[~2004-10-12 6:00 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-10-01 20:45 [patch 0/5, 2.6.9-rc3] generic irq subsystem, description Ingo Molnar
2004-10-01 20:46 ` [patch 1/5, 2.6.9-rc3] generic irq subsystem, core Ingo Molnar
2004-10-01 20:47 ` [patch 2/5, 2.6.9-rc3] generic irq subsystem, x86 port Ingo Molnar
2004-10-01 20:48 ` [patch 3/5, 2.6.9-rc3] generic irq subsystem, x64 port Ingo Molnar
2004-10-01 20:48 ` [patch 4/5, 2.6.9-rc3] generic irq subsystem, ppc port Ingo Molnar
2004-10-01 20:49 ` [patch 5/5, 2.6.9-rc3] generic irq subsystem, ppc64 port Ingo Molnar
[not found] ` <20041001143332.7e3a5aba.akpm@osdl.org>
[not found] ` <Pine.LNX.4.61.0410091550300.2870@musoma.fsmlabs.com>
2004-10-11 8:19 ` [PATCH] i386 CPU hotplug updated for -mm Zwane Mwaikambo
2004-10-12 5:59 ` Nathan Lynch [this message]
2004-10-12 6:04 ` Ingo Molnar
2004-10-12 14:16 ` Nathan Lynch
2004-10-12 14:38 ` Zwane Mwaikambo
2004-10-12 6:23 ` Dave Hansen
2004-10-12 14:47 ` Zwane Mwaikambo
2004-10-18 15:10 ` Pavel Machek
2004-10-21 14:47 ` Zwane Mwaikambo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1097560787.6557.99.camel@biclops \
--to=nathanl@austin.ibm.com \
--cc=akpm@osdl.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=rusty@rustcorp.com.au \
--cc=zwane@linuxpower.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox