From: Nathan Lynch <nathanl@austin.ibm.com>
To: Zwane Mwaikambo <zwane@linuxpower.ca>
Cc: Andrew Morton <akpm@osdl.org>, Ingo Molnar <mingo@elte.hu>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Rusty Russell <rusty@rustcorp.com.au>
Subject: Re: [PATCH] i386 CPU hotplug updated for -mm
Date: Tue, 12 Oct 2004 00:59:47 -0500
Message-ID: <1097560787.6557.99.camel@biclops>
In-Reply-To: <Pine.LNX.4.61.0410102302170.2745@musoma.fsmlabs.com>

On Mon, 2004-10-11 at 03:19, Zwane Mwaikambo wrote:
> Hi Andrew,
> 	Find attached the i386 cpu hotplug patch updated for Ingo's latest 
> round of goodies. In order to avoid dumping cpu hotplug code into 
> kernel/irq/* i dropped the cpu_online check in do_IRQ() by modifying 
> fixup_irqs(). The difference being that on cpu offline, fixup_irqs() is 
> called before we clear the cpu from cpu_online_map and a long delay in 
> order to ensure that we never have any queued external interrupts on the 
> APICs. Due to my usual test victims being in boxes a continent away this 
> hasn't been tested, but i'll cover bug reports (nudge, Nathan! ;)

Had to apply the patch to 2.6.9-rc4-mm1 (2.6.9-rc3-mm3 doesn't detect my
scsi controller).  Tested on a dual Pentium 3.  Simple offline/online
tests seem ok, except I see these warnings when taking a cpu down:

using smp_processor_id() in preemptible code: bash/3436
 [<c0106f37>] dump_stack+0x17/0x20
 [<c011a087>] smp_processor_id+0x87/0xa0
 [<c0134d54>] cpu_down+0x134/0x240
 [<c02770d9>] store_online+0x39/0x70
 [<c0274487>] sysdev_store+0x37/0x50
 [<c019339e>] flush_write_buffer+0x2e/0x40
 [<c0193427>] sysfs_write_file+0x77/0x90
 [<c015aa20>] vfs_write+0xe0/0x120
 [<c015ab0d>] sys_write+0x3d/0x70
 [<c0106123>] syscall_call+0x7/0xb

using smp_processor_id() in preemptible code: ksoftirqd/1/5233
 [<c0106f37>] dump_stack+0x17/0x20
 [<c011a087>] smp_processor_id+0x87/0xa0
 [<c0124128>] ksoftirqd+0x68/0x140
 [<c0133326>] kthread+0x96/0xe0
 [<c0104275>] kernel_thread_helper+0x5/0x10

Under load (unpacking a kernel source tree while continuously taking a
cpu offline and online), I got a panic within a couple of minutes:

Unable to handle kernel NULL pointer dereference at virtual address 00000004
 printing eip:
c0145703
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU:    1
EIP:    0060:[<c0145703>]    Not tainted VLI
EFLAGS: 00010082   (2.6.9-rc4-mm1i386cpuhp)
EIP is at kmem_cache_free+0x33/0x70
eax: cba2cd98   ebx: 00000000   ecx: 00000000   edx: cba2cd98
esi: c1561ac0   edi: cba2cd98   ebp: c058be8c   esp: c058be7c
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c058b000 task=dff88540)
Stack: 00000082 c1559920 00001000 cba2cd9c c058bea4 c013e491 00000000 cf4abee0
       00001000 c015f050 c058beb0 c015f92c cf4abee0 c058beb8 c015fbac c058bec4
       c015f07d cf4abee0 c058beec c0160819 c0271a48 c0127c11 00000001 00000002
Call Trace:
 [<c0106f0a>] show_stack+0x7a/0x90
 [<c0107096>] show_registers+0x156/0x1d0
 [<c01072b4>] die+0xf4/0x180
 [<c0116bd7>] do_page_fault+0x457/0x66d
 [<c0106b8d>] error_code+0x2d/0x38
 [<c013e491>] mempool_free+0x81/0x90
 [<c015f92c>] bio_destructor+0x3c/0x60
 [<c015fbac>] bio_put+0x2c/0x40
 [<c015f07d>] end_bio_bh_io_sync+0x2d/0x60
 [<c0160819>] bio_endio+0x59/0x90
 [<c027c578>] __end_that_request_first+0x188/0x250
 [<c029db14>] __ide_end_request+0x54/0x150
 [<c029dc68>] ide_end_request+0x58/0xc0
 [<c02a3fe1>] task_end_request+0x31/0x70
 [<c02a418b>] task_out_intr+0x7b/0x100
 [<c029f6cd>] ide_intr+0xbd/0x160
 [<c0139d44>] handle_IRQ_event+0x34/0x70
 [<c0139e6c>] __do_IRQ+0xec/0x170
 [<c0108570>] do_IRQ+0x60/0xa0
 =======================
 [<c0106a90>] common_interrupt+0x18/0x20
 [<00000000>] 0x0
 [<dff84fbc>] 0xdff84fbc
Code: f8 89 c6 89 5d f4 89 7d fc 9c 8f 45 f0 fa 89 d7 e8 13 49 fd ff 8b 1c 86 e8 6b ea ff ff 8b 4d 04 89 fa 89 f0 e8 2f f3 ff ff 89 c7 <8b> 43 04 39 03 73 20 f0 ff 86 c4 00 00 00 8b 03 89 7c 83 10 ff
 <0>Kernel panic - not syncing: Fatal exception in interrupt

I fixed up the warning in cpu_down with the following patch and now am
running with that + 2.6.9-rc4-mm1 + your patch while doing continuous
online/offline and make -j8.  It's been running for about 45 minutes and
I haven't seen the panic yet, although I'm at a loss to explain why the
change would fix it.  Will let it run overnight and report back...
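
For anyone unfamiliar with the warning, here is a rough userspace model
of what the debug check is complaining about and why get_cpu()/put_cpu()
silences it.  This is only a sketch, not kernel code: every model_*
name, the preempt_count variable and the two bind_* helpers are
stand-ins made up for illustration, and the real check has more
exemptions than a bare preempt count (this model ignores them).

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only; none of this is the kernel's implementation. */
static int preempt_count;   /* > 0 means preemption is disabled */
static int current_cpu;     /* cpu the modelled task is running on */
static int warnings;        /* how many times the debug check fired */
static int bound_cpu = -1;  /* last cpu passed to model_kthread_bind() */

/* Models the debug variant of smp_processor_id(): warn if preemptible. */
static int model_smp_processor_id(void)
{
	if (preempt_count == 0) {
		warnings++;
		fprintf(stderr,
			"using smp_processor_id() in preemptible code\n");
	}
	return current_cpu;
}

/* Models get_cpu(): disable preemption, then read the cpu number. */
static int model_get_cpu(void)
{
	preempt_count++;
	return model_smp_processor_id();
}

/* Models put_cpu(): re-enable preemption. */
static void model_put_cpu(void)
{
	preempt_count--;
}

/* Stand-in for kthread_bind(): only records the target cpu. */
static void model_kthread_bind(int cpu)
{
	bound_cpu = cpu;
}

/* The call as it looked before the patch: check fires. */
static void bind_before_patch(void)
{
	model_kthread_bind(model_smp_processor_id());
}

/* The call as it looks after the patch: check stays quiet. */
static void bind_after_patch(void)
{
	model_kthread_bind(model_get_cpu());
	model_put_cpu();
}
```

Calling bind_before_patch() trips the check once, while
bind_after_patch() does not and leaves preempt_count balanced.  Note
that the cpu number is still only a snapshot once put_cpu() has run,
which is why the changelog below says "harmless?".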

Nathan

--

Fix (harmless?) smp_processor_id() usage in preemptible section of
cpu_down.

Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
---

 2.6.9-rc4-mm1-nathanl/kernel/cpu.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

diff -puN kernel/cpu.c~cpu_down-fix-smp_processor_id-warning kernel/cpu.c
--- 2.6.9-rc4-mm1/kernel/cpu.c~cpu_down-fix-smp_processor_id-warning	2004-10-11 23:28:47.000000000 -0500
+++ 2.6.9-rc4-mm1-nathanl/kernel/cpu.c	2004-10-11 23:34:36.000000000 -0500
@@ -129,7 +129,8 @@ int cpu_down(unsigned int cpu)
 	__cpu_die(cpu);
 
 	/* Move it here so it can run. */
-	kthread_bind(p, smp_processor_id());
+	kthread_bind(p, get_cpu());
+	put_cpu();
 
 	/* CPU is completely dead: tell everyone.  Too late to complain. */
 	if (notifier_call_chain(&cpu_chain, CPU_DEAD, (void *)(long)cpu)
_


