public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch] Race between init_idle and reschedule_idle
@ 2001-09-30  5:50 Martin J. Bligh
  2001-09-30 12:19 ` Richard Gooch
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Martin J. Bligh @ 2001-09-30  5:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, Alan Cox, Andrew Morton

On an SMP system, if the boot cpu calls reschedule_idle before the
secondary cpus have all called init_idle, we hit the following code
in reschedule_idle:

...
    for (i = 0; i < smp_num_cpus; i++) {
        cpu = cpu_logical_map(i);
        if (!can_schedule(p, cpu))
            continue;
        tsk = cpu_curr(cpu);
...

Because init_idle hasn't run for all cpus yet, the expression 
"cpu_curr(cpu)"
gives us NULL. When we derefernce an offset of 0x50 off tsk a few
nanoseconds later we panic, complaining vaddr 0x00000050 is invalid.

This is more likely to happen on larger systems, but could happen on any
SMP system (especially if I enable serial console, which really slows down
the secondary procs doing printk ;-) ). If you want to see the panic / 
analysis,
I can send it.

Thanks to Alan Cox & Andrew Morton for showing me how to serialise the
cpus to make the panic legible. The following patch holds back the boot
cpu at the end of smp_init until all the secondarys have done init_idle:

--- virgin-2.4.10/kernel/sched.c	Mon Sep 17 23:03:09 2001
+++ linux-2.4.10/kernel/sched.c	Sat Sep 29 19:57:10 2001
@@ -1309,6 +1309,8 @@
 	atomic_inc(&current->files->count);
 }

+extern volatile unsigned long wait_init_idle;
+
 void __init init_idle(void)
 {
 	struct schedule_data * sched_data;
@@ -1321,6 +1323,7 @@
 	}
 	sched_data->curr = current;
 	sched_data->last_schedule = get_cycles();
+	clear_bit(current->processor, &wait_init_idle);
 }

 extern void init_timervecs (void);
--- virgin-2.4.10/init/main.c	Thu Sep 20 21:02:01 2001
+++ linux-2.4.10/init/main.c	Sat Sep 29 21:11:52 2001
@@ -477,6 +477,8 @@
 extern void setup_arch(char **);
 extern void cpu_idle(void);

+volatile unsigned long wait_init_idle = 0UL;
+
 #ifndef CONFIG_SMP

 #ifdef CONFIG_X86_IO_APIC
@@ -490,13 +492,25 @@

 #else

+
 /* Called by boot processor to activate the rest. */
 static void __init smp_init(void)
 {
 	/* Get other processors into their bootup holding patterns. */
 	smp_boot_cpus();
+	wait_init_idle = cpu_online_map;
+	clear_bit(current->processor, &wait_init_idle); /* Don't wait on me! */
+	printk("Waiting on wait_init_idle (map = 0x%lx)\n", wait_init_idle);
 	smp_threads_ready=1;
 	smp_commence();
+
+	/* Wait for the other cpus to set up their idle processes */
+        while (1) {
+                if (!wait_init_idle)
+                        break;
+                rep_nop();
+        }
+	printk("All processors have done init_idle\n");
 }		

 #endif




^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2001-10-09 15:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-09-30  5:50 [patch] Race between init_idle and reschedule_idle Martin J. Bligh
2001-09-30 12:19 ` Richard Gooch
2001-10-02 18:43 ` george anzinger
2001-10-09 13:12 ` Hubertus Franke
2001-10-09 15:56   ` Martin J. Bligh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox