* Oops in get_boot_pages at reboot
@ 2004-04-01 16:56 Olof Johansson
2004-04-01 17:11 ` Manfred Spraul
2004-04-01 18:55 ` Andrew Morton
0 siblings, 2 replies; 6+ messages in thread
From: Olof Johansson @ 2004-04-01 16:56 UTC (permalink / raw)
To: manfred; +Cc: torvalds, akpm, linux-kernel, anton
Hi,
We've started to see Oopses when rebooting some ppc64 systems built with
CONFIG_NUMA after the distribute-early-allocations-across-nodes patch went
in. This happens on the way down at a reboot:
Please stand by while rebooting the system...
md: stopping all md devices.
md: md0 switched to read-only mode.
Oops: Exception in kernel mode, sig: 4 [#1]
SMP NR_CPUS=32 NUMA
NIP: C00000000046D424 XER: 0000000020000000 LR: C0000000000D36CC
REGS: c000000033f8f6b0 TRAP: 0700 Tainted: GF (2.6.5-rc2-ames)
MSR: 8000000000089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK: c0000000018ff240[1] 'init' THREAD: c000000033f8c000 CPU: 7
GPR00: C0000000000D36CC C000000033F8F930 C000000000605100 00000000000000D0
GPR04: 0000000000000000 C000000033F8FAC0 0000000000000000 0000000000000000
GPR08: C000000033F8FAC0 0000000000001000 C00000003374B680 C000000000603000
GPR12: C0000000078A3BB0 C00000000049E000 0000000000000000 0000000000000400
GPR16: 0000000000000000 0000000000000000 C00000002FEBEB28 C00000002FEBEB18
GPR20: C00000002FEBEB20 C00000002FEBEB08 C00000002FEBEB10 C00000002FEBEB18
GPR24: 000000000000000B 0000000000000000 0000000000000400 C00000003374B680
GPR28: C000000033F8FAC0 C000000033ABFE80 C000000000549008 0000000000000000
NIP [c00000000046d424] .get_boot_pages+0x0/0x1ac
LR [c0000000000d36cc] .__pollwait+0x5c/0x108
Call Trace:
[c0000000000c9984] .pipe_poll+0xc4/0xd0
[c0000000000d3bac] .do_select+0x334/0x400
[c000000000020828] .sys32_select+0x440/0x654
[c000000000020a50] .ppc32_select+0x14/0x28
[c000000000011bdc] .ret_from_syscall_1+0x0/0xa4
So __pollwait() calls __get_free_page(), system_running is 0 so
get_boot_pages is called. Since get_boot_pages is labeled __init, badness
happens.
How about checking against mem_init_done instead of system_running? It
helps against the oops, but there might be some good reason not to do
it. I don't claim to know the intrisic details about the MM. :-)
I'm not sure yet why we don't see this on all kernels at all times. Most
likely seems to be that it's a timing issue with init doing select (it
seems to use it to sleep). Either way, the test of system_running still
seems risky.
Thanks,
Olof
===== mm/page_alloc.c 1.190 vs edited =====
--- 1.190/mm/page_alloc.c Sat Mar 20 05:56:20 2004
+++ edited/mm/page_alloc.c Wed Mar 31 18:22:48 2004
@@ -732,9 +732,10 @@
fastcall unsigned long __get_free_pages(unsigned int gfp_mask, unsigned int order)
{
struct page * page;
-
#ifdef CONFIG_NUMA
- if (unlikely(!system_running))
+ extern int mem_init_done;
+
+ if (unlikely(!mem_init_done))
return get_boot_pages(gfp_mask, order);
#endif
page = alloc_pages(gfp_mask, order);
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: Oops in get_boot_pages at reboot 2004-04-01 16:56 Oops in get_boot_pages at reboot Olof Johansson @ 2004-04-01 17:11 ` Manfred Spraul 2004-04-01 17:32 ` Olof Johansson 2004-04-01 18:55 ` Andrew Morton 1 sibling, 1 reply; 6+ messages in thread From: Manfred Spraul @ 2004-04-01 17:11 UTC (permalink / raw) To: Olof Johansson; +Cc: torvalds, akpm, linux-kernel, anton Olof Johansson wrote: >So __pollwait() calls __get_free_page(), system_running is 0 so >get_boot_pages is called. Since get_boot_pages is labeled __init, badness >happens. > > I didn't notice that the reboot code resets system_running to 0, sorry. >How about checking against mem_init_done instead of system_running? It >helps against the oops, but there might be some good reason not to do >it. > mem_init_done is ppc specific. Is there an equivalent to system_running that is not set to 0 during reboot? -- Manfred ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Oops in get_boot_pages at reboot 2004-04-01 17:11 ` Manfred Spraul @ 2004-04-01 17:32 ` Olof Johansson 0 siblings, 0 replies; 6+ messages in thread From: Olof Johansson @ 2004-04-01 17:32 UTC (permalink / raw) To: Manfred Spraul; +Cc: torvalds, akpm, linux-kernel, anton Manfred Spraul wrote: > mem_init_done is ppc specific. Is there an equivalent to system_running > that is not set to 0 during reboot? Doh, my bad. :( I'm not able to spot any other suitable variable myself. system_running is checked in call_usermodehelper(), so keeping it at 1 across devices_shutdown() might cause problems(?). -Olof -- Olof Johansson Office: 4F005/905 Linux on Power Development IBM Systems Group Email: olof@austin.ibm.com Phone: 512-838-9858 All opinions are my own and not those of IBM ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Oops in get_boot_pages at reboot 2004-04-01 16:56 Oops in get_boot_pages at reboot Olof Johansson 2004-04-01 17:11 ` Manfred Spraul @ 2004-04-01 18:55 ` Andrew Morton 2004-04-02 1:26 ` Olof Johansson 1 sibling, 1 reply; 6+ messages in thread From: Andrew Morton @ 2004-04-01 18:55 UTC (permalink / raw) To: Olof Johansson; +Cc: manfred, torvalds, linux-kernel, anton Olof Johansson <olof@austin.ibm.com> wrote: > > So __pollwait() calls __get_free_page(), system_running is 0 so > get_boot_pages is called. Since get_boot_pages is labeled __init, badness > happens. > > How about checking against mem_init_done instead of system_running? It > helps against the oops, but there might be some good reason not to do > it. I don't claim to know the intrisic details about the MM. :-) Do we really need to clear system_running at reboot? I'd always viewed it as "everything is initialised and usable", and that's generally true at reboot time. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Oops in get_boot_pages at reboot 2004-04-01 18:55 ` Andrew Morton @ 2004-04-02 1:26 ` Olof Johansson 2004-04-02 1:44 ` Andrew Morton 0 siblings, 1 reply; 6+ messages in thread From: Olof Johansson @ 2004-04-02 1:26 UTC (permalink / raw) To: Andrew Morton; +Cc: manfred, torvalds, linux-kernel, anton On Thu, 1 Apr 2004, Andrew Morton wrote: > Do we really need to clear system_running at reboot? I'd always viewed it > as "everything is initialised and usable", and that's generally true at > reboot time. Seems like it was added explicitly to avoid user mode helpers from being invoked during early boot and shutdown/reboot: http://linux.bkbits.net:8080/linux-2.5/cset@3d5c548dNM0Kqg9aP03NhtxXmntmKQ Well, how about making system_running a state instead of a flag, and defining states SYSTEM_BOOTING/RUNNING/SHUTDOWN? See patch below. -Olof Olof Johansson Office: 4F005/905 Linux on Power Development IBM Systems Group Email: olof@austin.ibm.com Phone: 512-838-9858 All opinions are my own and not those of IBM ===== arch/ppc/platforms/pmac_nvram.c 1.12 vs edited ===== --- 1.12/arch/ppc/platforms/pmac_nvram.c Wed Feb 4 23:20:40 2004 +++ edited/arch/ppc/platforms/pmac_nvram.c Thu Apr 1 19:13:02 2004 @@ -154,11 +154,11 @@ struct adb_request req; DECLARE_COMPLETION(req_complete); - req.arg = system_running ? &req_complete : NULL; + req.arg = system_running == SYSTEM_RUNNING ? &req_complete : NULL; if (pmu_request(&req, pmu_nvram_complete, 3, PMU_READ_NVRAM, (addr >> 8) & 0xff, addr & 0xff)) return 0xff; - if (system_running) + if (system_running == SYSTEM_RUNNING) wait_for_completion(&req_complete); while (!req.complete) pmu_poll(); @@ -170,11 +170,11 @@ struct adb_request req; DECLARE_COMPLETION(req_complete); - req.arg = system_running ? &req_complete : NULL; + req.arg = system_running == SYSTEM_RUNNING ? &req_complete : NULL; if (pmu_request(&req, pmu_nvram_complete, 4, PMU_WRITE_NVRAM, (addr >> 8) & 0xff, addr & 0xff, val)) return; - if (system_running) + if (system_running == SYSTEM_RUNNING) wait_for_completion(&req_complete); while (!req.complete) pmu_poll(); ===== include/linux/kernel.h 1.47 vs edited ===== --- 1.47/include/linux/kernel.h Wed Mar 10 10:45:49 2004 +++ edited/include/linux/kernel.h Thu Apr 1 19:11:16 2004 @@ -109,9 +109,15 @@ extern void bust_spinlocks(int yes); extern int oops_in_progress; /* If set, an oops, panic(), BUG() or die() is in progress */ extern int panic_on_oops; -extern int system_running; +extern int system_running; /* See values below */ extern int tainted; extern const char *print_tainted(void); + +/* Values used for system_running */ +#define SYSTEM_BOOTING 0 +#define SYSTEM_RUNNING 1 +#define SYSTEM_SHUTDOWN 2 + #define TAINT_PROPRIETARY_MODULE (1<<0) #define TAINT_FORCED_MODULE (1<<1) #define TAINT_UNSAFE_SMP (1<<2) ===== init/main.c 1.127 vs edited ===== --- 1.127/init/main.c Tue Mar 16 04:10:35 2004 +++ edited/init/main.c Thu Apr 1 19:11:17 2004 @@ -613,7 +613,7 @@ */ free_initmem(); unlock_kernel(); - system_running = 1; + system_running = SYSTEM_RUNNING; if (sys_open("/dev/console", O_RDWR, 0) < 0) printk("Warning: unable to open an initial console.\n"); ===== kernel/kmod.c 1.36 vs edited ===== --- 1.36/kernel/kmod.c Wed Feb 25 04:31:13 2004 +++ edited/kernel/kmod.c Thu Apr 1 19:11:17 2004 @@ -249,7 +249,7 @@ }; DECLARE_WORK(work, __call_usermodehelper, &sub_info); - if (!system_running) + if (system_running != SYSTEM_RUNNING) return -EBUSY; if (path[0] == '\0') ===== kernel/printk.c 1.35 vs edited ===== --- 1.35/kernel/printk.c Mon Mar 8 18:57:46 2004 +++ edited/kernel/printk.c Thu Apr 1 19:11:17 2004 @@ -522,7 +522,8 @@ log_level_unknown = 1; } - if (!cpu_online(smp_processor_id()) && !system_running) { + if (!cpu_online(smp_processor_id()) && + system_running != SYSTEM_RUNNING) { /* * Some console drivers may assume that per-cpu resources have * been allocated. So don't allow them to be called by this ===== kernel/sched.c 1.255 vs edited ===== --- 1.255/kernel/sched.c Thu Mar 18 22:54:57 2004 +++ edited/kernel/sched.c Thu Apr 1 19:11:17 2004 @@ -2982,7 +2982,8 @@ #if defined(in_atomic) static unsigned long prev_jiffy; /* ratelimiting */ - if ((in_atomic() || irqs_disabled()) && system_running) { + if ((in_atomic() || irqs_disabled()) && + system_running == SYSTEM_RUNNING) { if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy) return; prev_jiffy = jiffies; ===== kernel/sys.c 1.73 vs edited ===== --- 1.73/kernel/sys.c Mon Feb 23 13:46:54 2004 +++ edited/kernel/sys.c Thu Apr 1 19:11:17 2004 @@ -436,7 +436,7 @@ switch (cmd) { case LINUX_REBOOT_CMD_RESTART: notifier_call_chain(&reboot_notifier_list, SYS_RESTART, NULL); - system_running = 0; + system_running = SYSTEM_SHUTDOWN; device_shutdown(); printk(KERN_EMERG "Restarting system.\n"); machine_restart(NULL); @@ -452,7 +452,7 @@ case LINUX_REBOOT_CMD_HALT: notifier_call_chain(&reboot_notifier_list, SYS_HALT, NULL); - system_running = 0; + system_running = SYSTEM_SHUTDOWN; device_shutdown(); printk(KERN_EMERG "System halted.\n"); machine_halt(); @@ -462,7 +462,7 @@ case LINUX_REBOOT_CMD_POWER_OFF: notifier_call_chain(&reboot_notifier_list, SYS_POWER_OFF, NULL); - system_running = 0; + system_running = SYSTEM_SHUTDOWN; device_shutdown(); printk(KERN_EMERG "Power down.\n"); machine_power_off(); @@ -478,7 +478,7 @@ buffer[sizeof(buffer) - 1] = '\0'; notifier_call_chain(&reboot_notifier_list, SYS_RESTART, buffer); - system_running = 0; + system_running = SYSTEM_SHUTDOWN; device_shutdown(); printk(KERN_EMERG "Restarting system with command '%s'.\n", buffer); machine_restart(buffer); ===== mm/page_alloc.c 1.197 vs edited ===== --- 1.197/mm/page_alloc.c Tue Mar 16 20:10:10 2004 +++ edited/mm/page_alloc.c Thu Apr 1 19:11:18 2004 @@ -734,7 +734,7 @@ struct page * page; #ifdef CONFIG_NUMA - if (unlikely(!system_running)) + if (unlikely(system_running == SYSTEM_BOOTING)) return get_boot_pages(gfp_mask, order); #endif page = alloc_pages(gfp_mask, order); ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Oops in get_boot_pages at reboot 2004-04-02 1:26 ` Olof Johansson @ 2004-04-02 1:44 ` Andrew Morton 0 siblings, 0 replies; 6+ messages in thread From: Andrew Morton @ 2004-04-02 1:44 UTC (permalink / raw) To: Olof Johansson; +Cc: manfred, torvalds, linux-kernel, anton Olof Johansson <olof@austin.ibm.com> wrote: > > Well, how about making system_running a state instead of a flag, and > defining states SYSTEM_BOOTING/RUNNING/SHUTDOWN? Eminently sane, of course. I'll rename it to system_state, to reflect the new reality and to catch unconverted users. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2004-04-02 1:42 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2004-04-01 16:56 Oops in get_boot_pages at reboot Olof Johansson 2004-04-01 17:11 ` Manfred Spraul 2004-04-01 17:32 ` Olof Johansson 2004-04-01 18:55 ` Andrew Morton 2004-04-02 1:26 ` Olof Johansson 2004-04-02 1:44 ` Andrew Morton
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox